EgaitAdidas2014 - Healthy Participants with MoCap reference#

This dataset contains data from healthy participants walking at different speed levels through a motion capture volume. The dataset can be used to benchmark the performance of spatial parameter estimation methods based on foot-worn IMUs.

General Information#

The EgaitAdidas2014 dataset contains data from healthy participants walking through a Vicon motion capture system with one IMU attached to each foot.

For many participants, data for both the Shimmer3 and the Shimmer2R sensor is available. The Shimmer3 data is sampled at 204.8 Hz and the Shimmer2R data at 102.4 Hz. This also allows for a comparison of the two sensors.
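Because the two sensor generations are sampled at different rates, sample indices are not directly comparable between them. The small helper below (a hypothetical convenience function, not part of the dataset package) sketches how a sample index at one rate maps to the closest index at the other rate:

# Hypothetical helper (not part of gaitmap_datasets): map a sample index recorded
# at one sampling rate to the closest sample index at another rate.
SHIMMER3_HZ = 204.8
SHIMMER2R_HZ = 102.4


def convert_sample_index(sample: int, from_hz: float, to_hz: float) -> int:
    return round(sample * to_hz / from_hz)


# A sample index of 4096 in a Shimmer3 recording corresponds to roughly
# sample 2048 in a perfectly synchronized Shimmer2R recording.
print(convert_sample_index(4096, SHIMMER3_HZ, SHIMMER2R_HZ))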

For both IMUs we unify the coordinate system on loading as shown below:

coordinate system definition

Participants were instructed to walk with a specific stride length and velocity to create more variation in the data. For each trial, only a couple of strides were recorded within the motion capture volume. The IMU data contains the entire recording. Depending on the trial, this additional data can contain just a few additional strides or entirely different movements. We recommend inspecting the specific trial in case of issues.

The Vicon motion capture system was sampled at 200 Hz. The IMUs and the mocap system are synchronized using a wireless trigger, allowing for a proper comparison of the calculated trajectories.

Reference (expert labeled based on IMU data) stride borders are provided for all strides that are recorded by both systems.

In the following we will show how to interact with the dataset and how to make sense of the reference information.

Warning

For this example to work, you need to have a global config set containing the path to the dataset. Check the README.md for more information.
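As an alternative to the global config, the dataset path can usually be passed directly to the dataset class. The snippet below assumes the constructor accepts a data_folder argument (as is common for gaitmap_datasets classes); check the class documentation for the exact parameter name.

from gaitmap_datasets import EgaitAdidas2014

# Assumption: the constructor takes the path to the downloaded dataset as
# `data_folder`. If the parameter is named differently in your version, adapt accordingly.
dataset = EgaitAdidas2014(data_folder="/path/to/egait_adidas_2014")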

First we create a simple instance of the dataset class.

from gaitmap_datasets import EgaitAdidas2014
from gaitmap_datasets.utils import convert_segmented_stride_list

dataset = EgaitAdidas2014()
dataset

EgaitAdidas2014 [497 groups/rows]

participant sensor stride_length stride_velocity repetition
0 000 shimmer3 high high 1
1 000 shimmer3 high high 2
2 000 shimmer3 high low 1
3 000 shimmer3 high low 2
4 000 shimmer3 high low 3
... ... ... ... ... ...
492 019 shimmer3 low low 1
493 019 shimmer3 low low 2
494 019 shimmer3 low low 3
495 019 shimmer3 normal normal 1
496 019 shimmer3 normal normal 3

497 rows × 5 columns



We can see that we have 5 levels in the metadata.

  • participant

  • sensor (shimmer2r, shimmer3)

  • stride_length (low, normal, high)

  • stride_velocity (low, normal, high)

  • repetition (1, 2, 3)

The stride_length and stride_velocity are the instructions given to the participants. For each combination of these two parameters, 3 repetitions were recorded.

However, for many participants, data for at least some trials is missing due to various technical issues.
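To get an overview of which trials are actually available, you can group the dataset index (a plain pandas DataFrame on these tpcp-based dataset objects) by the metadata levels of interest:

# Count the available trials per participant and sensor to spot missing recordings.
trial_counts = dataset.index.groupby(["participant", "sensor"]).size()
print(trial_counts.head())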

For now, we select the data of one participant.

subset = dataset.get_subset(participant="008")
subset

EgaitAdidas2014 [28 groups/rows]

participant sensor stride_length stride_velocity repetition
0 008 shimmer2r high high 1
1 008 shimmer2r high high 2
2 008 shimmer2r high high 3
3 008 shimmer2r high low 1
4 008 shimmer2r high low 2
5 008 shimmer2r high low 3
6 008 shimmer2r low high 1
7 008 shimmer2r low high 2
8 008 shimmer2r low high 3
9 008 shimmer2r low low 1
10 008 shimmer2r low low 2
11 008 shimmer2r low low 3
12 008 shimmer2r normal normal 1
13 008 shimmer2r normal normal 2
14 008 shimmer2r normal normal 3
15 008 shimmer3 high high 1
16 008 shimmer3 high high 3
17 008 shimmer3 high low 2
18 008 shimmer3 high low 3
19 008 shimmer3 low high 1
20 008 shimmer3 low high 2
21 008 shimmer3 low high 3
22 008 shimmer3 low low 1
23 008 shimmer3 low low 2
24 008 shimmer3 low low 3
25 008 shimmer3 normal normal 1
26 008 shimmer3 normal normal 2
27 008 shimmer3 normal normal 3


For this participant, we will have a look at the trial with “normal” stride length and stride velocity recorded with the shimmer2r sensor.

trial = subset.get_subset(stride_length="normal", stride_velocity="normal", sensor="shimmer2r", repetition="1")
trial

EgaitAdidas2014 [1 groups/rows]

participant sensor stride_length stride_velocity repetition
0 008 shimmer2r normal normal 1


The IMU data is stored in the data attribute, which is a dictionary of pandas dataframes.

sensor = "left_sensor"
imu_data = trial.data[sensor]
imu_data
acc_x acc_y acc_z gyr_x gyr_y gyr_z
time [s]
-1.132812 -1.031322 -0.714131 10.073449 21.203699 3.153437 -1.379233
-1.123047 -0.801888 -0.589762 10.309767 21.199249 2.788104 -1.008510
-1.113281 -0.914539 -0.507418 10.151644 20.484126 2.047077 -1.715091
-1.103516 -0.494919 -0.410919 10.308554 20.105190 3.500215 -0.671877
-1.093750 -0.683339 -0.293256 10.190028 21.202506 4.248431 -1.408028
... ... ... ... ... ... ...
11.621094 -1.853383 -2.149424 11.225977 -228.677130 362.540696 -345.843558
11.630859 -11.278568 1.251262 22.068987 -205.424910 371.639560 -295.038964
11.640625 -18.892065 1.202731 21.769853 -111.777802 352.998654 -187.517113
11.650391 -14.857877 -2.406800 18.292615 -36.391265 265.777965 -36.757564
11.660156 -7.501331 2.424573 11.066938 -10.335423 181.796898 85.486964

1311 rows × 6 columns



The mocap data is stored in the marker_position_ attribute, which is also a dictionary of pandas dataframes. Note that sometimes there are NaN values at the start and end of the data. In these regions, the mocap system was recording, but none of the markers were in frame.

mocap_data = trial.marker_position_[sensor]
mocap_data
toe_l_x toe_l_y toe_l_z toe_m_x toe_m_y toe_m_z toe_2_x toe_2_y toe_2_z cal_m_x cal_m_y cal_m_z cal_l_x cal_l_y cal_l_z heel_x heel_y heel_z
time [s]
0.000 NaN NaN NaN -3.606326 0.284704 -0.011763 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.005 NaN NaN NaN -3.607212 0.285005 -0.011860 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.010 NaN NaN NaN -3.607656 0.285096 -0.011974 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.015 NaN NaN NaN -3.607677 0.285041 -0.012071 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.020 NaN NaN NaN -3.607355 0.284940 -0.012108 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
6.975 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
6.980 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
6.985 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
6.990 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
6.995 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

1400 rows × 18 columns
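If the leading and trailing frames without any visible marker get in the way of an analysis, they can be trimmed with a plain pandas call:

# Drop frames in which no marker was visible at all (NaN in every column).
mocap_data_trimmed = mocap_data.dropna(how="all")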



Both data sources use time as their index, so we can easily plot them together. The time axis is converted so that the start of the mocap data is the global 0. This means the IMU data will have negative time values for datapoints recorded before the mocap start.
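Because of this shared time axis, it is also easy to restrict the IMU data to the part of the recording that overlaps with the mocap measurement, for example:

# Keep only the IMU samples recorded while the mocap system was running.
imu_in_mocap_window = imu_data.loc[0 : mocap_data.index[-1]]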

import matplotlib.pyplot as plt

fig, (ax1, ax2, ax3) = plt.subplots(3, 1, sharex=True)
imu_data.filter(like="gyr").plot(ax=ax1, legend=False)
imu_data.filter(like="acc").plot(ax=ax2, legend=True)
mocap_data[["heel_z"]].plot(ax=ax3)

ax1.set_ylabel("Gyroscope [deg/s]")
ax2.set_ylabel("Acc. [m/s^2]")
ax3.set_ylabel("Pos. [m]")

fig.show()
egait adidas 2014

For the strides that are within the mocap volume, manually annotated stride labels based on the IMU data are available. They are provided in samples relative to the start of the IMU data stream.

segmented_strides = trial.segmented_stride_list_
segmented_strides[sensor]
start end
s_id
0 258 369
1 369 481
2 481 593
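As the stride borders are given in IMU samples, the IMU data of the individual strides can be extracted directly with positional indexing, for example:

# Cut the IMU data of each labeled stride out of the continuous recording.
strides_imu = {
    s_id: imu_data.iloc[int(s["start"]) : int(s["end"])]
    for s_id, s in segmented_strides[sensor].iterrows()
}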


To get the events relative to the mocap data (i.e. in mocap samples relative to the start of the mocap data), you can use the convert_events method.

trial.convert_events(segmented_strides, from_time_axis="imu", to_time_axis="mocap")[sensor]
start end
s_id
0 277 494
1 494 713
2 713 932


Similarly, you can convert the events to the same time axis as the data:

trial.convert_events(segmented_strides, from_time_axis="imu", to_time_axis="time")[sensor]
start end
s_id
0 1.386719 2.470703
1 2.470703 3.564453
2 3.564453 4.658203


In addition to the segmented strides, we also provide a reference event list calculated based on the mocap data. This has the same start and end per stride as the segmented strides, but has columns for the initial contact/heel strike (ic), final contact/toe off (tc) and mid-stance (min_vel). This information is provided in samples relative to the start of the mocap data stream. (Compare to the converted segmented strides above).

mocap_events = trial.mocap_events_
mocap_events[sensor]
start end min_vel tc ic
s_id
0 277 494 427 295 379
1 494 713 644 512 596
2 713 932 863 728 813
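With these events, temporal parameters can be derived directly. For example, the time between toe-off (tc) and the following initial contact (ic) within each stride gives the swing time, which roughly corresponds to the swing_time column of trial.mocap_parameters_ shown further below (the parameters are reported per min_vel stride, so the number of rows differs):

# Swing time per stride from the mocap events (tc to ic, converted to seconds).
events = mocap_events[sensor]
swing_time_s = (events["ic"] - events["tc"]) / trial.mocap_sampling_rate_hz_
print(swing_time_s)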


Like the segmented stride list, these events can be converted to the time axis of the data or to IMU samples.

trial.convert_events(mocap_events, from_time_axis="mocap", to_time_axis="time")[sensor]
start end min_vel tc ic
s_id
0 1.385 2.470 2.135 1.475 1.895
1 2.470 3.565 3.220 2.560 2.980
2 3.565 4.660 4.315 3.640 4.065


Below, we add the time-converted event list to the plot from above. In the mocap plot, we also add the mocap-derived gait events.

fig, (ax1, ax2, ax3) = plt.subplots(3, 1, sharex=True)
imu_data.filter(like="gyr").plot(ax=ax1, legend=False)
imu_data.filter(like="acc").plot(ax=ax2, legend=True)
mocap_data[["heel_z"]].plot(ax=ax3)
for ax in (ax1, ax2, ax3):
    for i, s in trial.convert_events(segmented_strides, from_time_axis="imu", to_time_axis="time")[sensor].iterrows():
        ax.axvspan(s["start"], s["end"], alpha=0.2, color="C1")

# We plot the mocap-derived gait events on the heel trajectory in ax3
for marker, event_name in zip(["o", "s", "*"], ["tc", "ic", "min_vel"]):
    mocap_data[["heel_z"]].iloc[mocap_events[sensor][event_name]].rename(columns={"heel_z": event_name}).plot(
        ax=ax3, style=marker, label=event_name, markersize=3
    )

ax1.set_ylabel("Gyroscope [deg/s]")
ax2.set_ylabel("Acc. [m/s^2]")
ax3.set_ylabel("Pos. [m]")

fig.show()
egait adidas 2014

As you can see, in this example three strides were properly captured by both systems. The stride borders are defined based on the signal maximum in the gyr_y (i.e. the gyr_ml) axis.

This definition works well for segmentation. However, for the calculation of gait parameters, the authors of the dataset defined strides from the midstance (i.e. the min_vel point) of one stride to the midstance of the consecutive stride. As a result, when looking at the parameters, there will be one stride fewer than in the segmented stride list.

trial.mocap_parameters_[sensor]
stride_length stride_time stance_time swing_time max_heel_clearance max_toe_clearance ic_angle tc_angle max_lateral_excursion
s_id
0 1.479162 1.085 0.66499 0.42001 0.287149 0.219740 -26.0842 73.4280 6.9791
1 1.474048 1.085 0.66499 0.42001 0.293851 0.217972 -25.2798 71.3141 7.3753


To better understand how this works, we can convert the mocap events from their segmented stride list form into a min_vel-stride list. In this form, the start and the end of each stride are defined by the min_vel event. In addition, a new pre_ic event is added. This marks the ic of the previous stride.

Overall, the min_vel stride list contains one stride fewer than the segmented stride list. The s_id of the new stride list is based on the s_id of the segmented stride that contains the pre_ic event.

mocap_min_vel_stride_list = convert_segmented_stride_list(mocap_events, target_stride_type="min_vel")
mocap_min_vel_stride_list[sensor]
start end min_vel tc ic pre_ic
s_id
0 427 644.0 427 512.0 596.0 379
1 644 863.0 644 728.0 813.0 596


Stride time is now calculated from the pre_ic to the ic event (compare trial.mocap_parameters_[sensor]).

stride_time = mocap_min_vel_stride_list[sensor]["ic"] - mocap_min_vel_stride_list[sensor]["pre_ic"]
stride_time / trial.mocap_sampling_rate_hz_
s_id
0    1.085
1    1.085
dtype: float64

For comparison, the pre-calculated stride time:

trial.mocap_parameters_[sensor]["stride_time"]
s_id
0    1.085
1    1.085
Name: stride_time, dtype: float64

Stride length is calculated as the displacement in the ground plane between the start and the end of the stride (i.e. between the two min_vel events).

starts = mocap_min_vel_stride_list[sensor]["start"]
ends = mocap_min_vel_stride_list[sensor]["end"]
stride_length_heel = (
    (
        mocap_data[["heel_x", "heel_y"]].iloc[ends].reset_index(drop=True)
        - mocap_data[["heel_x", "heel_y"]].iloc[starts].reset_index(drop=True)
    )
    .pow(2)
    .sum(axis=1)
    .pow(0.5)
)
stride_length_heel
0    1.478275
1    1.474500
dtype: float32

For comparison, the pre-calculated stride length. Note that this stride length differs slightly from the one calculated above, as the authors of the dataset provide the average stride length over all available markers.

trial.mocap_parameters_[sensor]["stride_length"]
s_id
0    1.479162
1    1.474048
Name: stride_length, dtype: float64
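To illustrate what this averaging could look like, the sketch below repeats the displacement calculation for every marker and averages the result over the markers. This is our interpretation of the description, not the authors' exact reference implementation:

import numpy as np

# Stride length per marker (ground-plane displacement between the two min_vel
# samples), averaged over the markers. Markers that are not visible at these
# samples produce NaN and are ignored by the nan-aware mean.
marker_names = sorted({c.rsplit("_", 1)[0] for c in mocap_data.columns})
per_marker_sl = []
for marker_name in marker_names:
    displacement = (
        mocap_data[[f"{marker_name}_x", f"{marker_name}_y"]].iloc[ends].reset_index(drop=True)
        - mocap_data[[f"{marker_name}_x", f"{marker_name}_y"]].iloc[starts].reset_index(drop=True)
    )
    per_marker_sl.append(displacement.pow(2).sum(axis=1, skipna=False).pow(0.5))
stride_length_all_markers = np.nanmean(per_marker_sl, axis=0)
stride_length_all_markers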

Usage as validation dataset#

To compare the reference parameters with the parameters of an IMU-based algorithm, you should use the segmented stride list as a starting point. From there, you can calculate gait events (e.g. ic) within these strides to compare temporal parameters. Ideally, store the events as a segmented stride list and then use the convert_segmented_stride_list function to bring them into the same format used to calculate the reference parameters.
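A minimal sketch of such a parameter-level comparison could look like this (my_stride_times is a hypothetical stand-in for the per-stride output of your own IMU pipeline):

import pandas as pd

# Compare hypothetical IMU-derived stride times against the mocap reference.
reference_stride_time = trial.mocap_parameters_[sensor]["stride_time"]
my_stride_times = pd.Series([1.08, 1.09], index=reference_stride_time.index)  # hypothetical values
error_s = my_stride_times - reference_stride_time
print(error_s.abs().mean())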

When calculating spatial parameters, you should calculate your own IMU-based min_vel points instead of using the mocap-derived ones. The mocap-derived events don't always align with real moments of no movement in the IMU data and hence might lead to issues with ZUPT-based algorithms.
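A very rough sketch of how such IMU-based resting regions could be found is shown below (a simple gyroscope-energy threshold; real pipelines typically use more robust ZUPT detectors, and the threshold here is purely illustrative):

# Mark samples with low gyroscope energy as potential "no movement" regions.
gyr_energy = imu_data.filter(like="gyr").pow(2).sum(axis=1)
is_resting = gyr_energy < 50  # deg^2/s^2, illustrative threshold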

For algorithms that rely on calculations over the entire signal (i.e. not just the strides within the mocap volume), keep in mind that the amount of additional movement in the data varies from trial to trial. Some trials contain only walking, others resting and walking, and some contain small jumps used as a fallback synchronization. Hence, if you see unexpected results for specific trials, you might want to check the raw data.

Further Notes#

In many cases, clear drift in the mocap data can be observed. The authors of the dataset corrected this drift with a linear drift model before calculating the reference parameters. For further information, see the two papers using the dataset [1] and [2].
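To illustrate the general idea of such a correction (this is only an illustration, not the authors' exact procedure), a linear trend can be fitted to a marker trajectory and subtracted:

import numpy as np

# Fit and remove a linear trend from the heel marker height as a simple drift model.
heel_z = mocap_data["heel_z"].dropna()
time = heel_z.index.to_numpy()
slope, intercept = np.polyfit(time, heel_z.to_numpy(), 1)
heel_z_detrended = heel_z - (slope * time + intercept)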
