EgaitAdidas2014 - Healthy Participants with MoCap reference#

This dataset contains data from healthy participants walking at different speed levels through a motion capture volume. The dataset can be used to benchmark the performance of spatial parameter estimation methods based on foot-worn IMUs.

General Information#

The EgaitAdidas2014 dataset contains data from healthy participants walking through a Vicon motion capture system with one IMU attached to each foot.

For many participants, data for both the Shimmer3 and the Shimmer2R sensor is available. The Shimmer3 data is sampled at 204.8 Hz and the Shimmer2R data at 102.4 Hz. This also allows for a comparison of the two sensors.
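Because the two sensor generations are sampled at different rates, sample indices are not directly comparable between them. The small helper below (a hypothetical convenience function, not part of the dataset package) sketches how a sample index at one rate maps to the closest index at the other rate:

# Hypothetical helper (not part of gaitmap_datasets): map a sample index recorded
# at one sampling rate to the closest sample index at another rate.
SHIMMER3_HZ = 204.8
SHIMMER2R_HZ = 102.4


def convert_sample_index(sample: int, from_hz: float, to_hz: float) -> int:
    return round(sample * to_hz / from_hz)


# A sample index of 4096 in a Shimmer3 recording corresponds to roughly
# sample 2048 in a perfectly synchronized Shimmer2R recording.
print(convert_sample_index(4096, SHIMMER3_HZ, SHIMMER2R_HZ))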

For both IMUs we unify the coordinate system on loading as shown below:

coordinate system definition

Participants were instructed to walk with a specific stride length and velocity to create more variation in the data. For each trial, only a couple of strides were recorded within the motion capture volume. The IMU data contains the entire recording. Depending on the trial, this additional data can contain just a few additional strides or entirely different movements. We recommend inspecting the specific trial in case of issues.

The Vicon motion capture system was sampled at 200 Hz. The IMUs and the mocap system are synchronized using a wireless trigger, allowing for a proper comparison of the calculated trajectories.

Reference (expert labeled based on IMU data) stride borders are provided for all strides that are recorded by both systems.

In the following we will show how to interact with the dataset and how to make sense of the reference information.

Warning

For this example to work, you need to have a global config set containing the path to the dataset. Check the README.md for more information.
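As an alternative to the global config, the dataset path can usually be passed directly to the dataset class. The snippet below assumes the constructor accepts a data_folder argument (as is common for gaitmap_datasets classes); check the class documentation for the exact parameter name.

from gaitmap_datasets import EgaitAdidas2014

# Assumption: the constructor takes the path to the downloaded dataset as
# `data_folder`. If the parameter is named differently in your version, adapt accordingly.
dataset = EgaitAdidas2014(data_folder="/path/to/egait_adidas_2014")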

First we create a simple instance of the dataset class.

from gaitmap_datasets import EgaitAdidas2014
from gaitmap_datasets.utils import convert_segmented_stride_list

dataset = EgaitAdidas2014()
dataset

EgaitAdidas2014 [497 groups/rows]

participant sensor stride_length stride_velocity repetition
0 000 shimmer3 high high 1
1 000 shimmer3 high high 2
2 000 shimmer3 high low 1
3 000 shimmer3 high low 2
4 000 shimmer3 high low 3
... ... ... ... ... ...
492 019 shimmer3 low low 1
493 019 shimmer3 low low 2
494 019 shimmer3 low low 3
495 019 shimmer3 normal normal 1
496 019 shimmer3 normal normal 3

497 rows × 5 columns



We can see that we have 5 levels in the metadata.

  • participant

  • sensor (shimmer2r, shimmer3)

  • stride_length (low, normal, high)

  • stride_velocity (low, normal, high)

  • repetition (1, 2, 3)

The stride_length and stride_velocity are the instructions given to the participants. For each combination of these two parameters, 3 repetitions were recorded.

However, for many participants, data for at least some trials is missing due to various technical issues.
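To get an overview of which trials are actually available, you can group the dataset index (a plain pandas DataFrame on these tpcp-based dataset objects) by the metadata levels of interest:

# Count the available trials per participant and sensor to spot missing recordings.
trial_counts = dataset.index.groupby(["participant", "sensor"]).size()
print(trial_counts.head())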

For now, we select the data of one participant.

subset = dataset.get_subset(participant="008")
subset

EgaitAdidas2014 [28 groups/rows]

participant sensor stride_length stride_velocity repetition
0 008 shimmer2r high high 1
1 008 shimmer2r high high 2
2 008 shimmer2r high high 3
3 008 shimmer2r high low 1
4 008 shimmer2r high low 2
5 008 shimmer2r high low 3
6 008 shimmer2r low high 1
7 008 shimmer2r low high 2
8 008 shimmer2r low high 3
9 008 shimmer2r low low 1
10 008 shimmer2r low low 2
11 008 shimmer2r low low 3
12 008 shimmer2r normal normal 1
13 008 shimmer2r normal normal 2
14 008 shimmer2r normal normal 3
15 008 shimmer3 high high 1
16 008 shimmer3 high high 3
17 008 shimmer3 high low 2
18 008 shimmer3 high low 3
19 008 shimmer3 low high 1
20 008 shimmer3 low high 2
21 008 shimmer3 low high 3
22 008 shimmer3 low low 1
23 008 shimmer3 low low 2
24 008 shimmer3 low low 3
25 008 shimmer3 normal normal 1
26 008 shimmer3 normal normal 2
27 008 shimmer3 normal normal 3


For this participant, we will have a look at the trial with “normal” stride length and stride velocity recorded with the shimmer2r sensor.

trial = subset.get_subset(stride_length="normal", stride_velocity="normal", sensor="shimmer2r", repetition="1")
trial

EgaitAdidas2014 [1 groups/rows]

participant sensor stride_length stride_velocity repetition
0 008 shimmer2r normal normal 1


The IMU data is stored in the data attribute, which is a dictionary of pandas dataframes.

sensor = "left_sensor"
imu_data = trial.data[sensor]
imu_data
acc_x acc_y acc_z gyr_x gyr_y gyr_z
time [s]
-1.132812 -1.031322 -0.714131 10.073449 21.203699 3.153437 -1.379233
-1.123047 -0.801888 -0.589762 10.309767 21.199249 2.788104 -1.008510
-1.113281 -0.914539 -0.507418 10.151644 20.484126 2.047077 -1.715091
-1.103516 -0.494919 -0.410919 10.308554 20.105190 3.500215 -0.671877
-1.093750 -0.683339 -0.293256 10.190028 21.202506 4.248431 -1.408028
... ... ... ... ... ... ...
11.621094 -1.853383 -2.149424 11.225977 -228.677130 362.540696 -345.843558
11.630859 -11.278568 1.251262 22.068987 -205.424910 371.639560 -295.038964
11.640625 -18.892065 1.202731 21.769853 -111.777802 352.998654 -187.517113
11.650391 -14.857877 -2.406800 18.292615 -36.391265 265.777965 -36.757564
11.660156 -7.501331 2.424573 11.066938 -10.335423 181.796898 85.486964

1311 rows × 6 columns



The mocap data is stored in the marker_position_ attribute, which is also a dictionary of pandas dataframes. Note that sometimes there are NaN values at the start and end of the data. In these regions, the mocap system was recording, but none of the markers were in frame.

mocap_data = trial.marker_position_[sensor]
mocap_data
toe_l_x toe_l_y toe_l_z toe_m_x toe_m_y toe_m_z toe_2_x toe_2_y toe_2_z cal_m_x cal_m_y cal_m_z cal_l_x cal_l_y cal_l_z heel_x heel_y heel_z
time [s]
0.000 NaN NaN NaN -3.606326 0.284704 -0.011763 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.005 NaN NaN NaN -3.607212 0.285005 -0.011860 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.010 NaN NaN NaN -3.607656 0.285096 -0.011974 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.015 NaN NaN NaN -3.607677 0.285041 -0.012071 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.020 NaN NaN NaN -3.607355 0.284940 -0.012108 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
6.975 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
6.980 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
6.985 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
6.990 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
6.995 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

1400 rows × 18 columns
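If the leading and trailing frames without any visible marker get in the way of an analysis, they can be trimmed with a plain pandas call:

# Drop frames in which no marker was visible at all (NaN in every column).
mocap_data_trimmed = mocap_data.dropna(how="all")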



Both data sources use time as their index, so we can easily plot them together. The time axis is converted so that the start of the mocap data is the global 0. This means the IMU data will have negative time values for datapoints recorded before the mocap start.
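Because of this shared time axis, it is also easy to restrict the IMU data to the part of the recording that overlaps with the mocap measurement, for example:

# Keep only the IMU samples recorded while the mocap system was running.
imu_in_mocap_window = imu_data.loc[0 : mocap_data.index[-1]]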

import matplotlib.pyplot as plt

fig, (ax1, ax2, ax3) = plt.subplots(3, 1, sharex=True)
imu_data.filter(like="gyr").plot(ax=ax1, legend=False)
imu_data.filter(like="acc").plot(ax=ax2, legend=True)
mocap_data[["heel_z"]].plot(ax=ax3)

ax1.set_ylabel("Gyroscope [deg/s]")
ax2.set_ylabel("Acc. [m/s^2]")
ax3.set_ylabel("Pos. [m]")

fig.show()
egait adidas 2014

For the strides that are within the mocap volume, manually annotated stride labels based on the IMU data are available. They are provided in samples relative to the start of the IMU data stream.

segmented_strides = trial.segmented_stride_list_
segmented_strides[sensor]
start end
s_id
0 258 369
1 369 481
2 481 593
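As the stride borders are given in IMU samples, the IMU data of the individual strides can be extracted directly with positional indexing, for example:

# Cut the IMU data of each labeled stride out of the continuous recording.
strides_imu = {
    s_id: imu_data.iloc[int(s["start"]) : int(s["end"])]
    for s_id, s in segmented_strides[sensor].iterrows()
}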


To get the events relative to the mocap data (i.e. in mocap samples relative to the start of the mocap data), you can use the convert_events method.

trial.convert_events(segmented_strides, from_time_axis="imu", to_time_axis="mocap")[sensor]
start end
s_id
0 277 494
1 494 713
2 713 932


Similarly, you can convert the events to the same time axis as the data:

trial.convert_events(segmented_strides, from_time_axis="imu", to_time_axis="time")[sensor]
start end
s_id
0 1.386719 2.470703
1 2.470703 3.564453
2 3.564453 4.658203


In addition to the segmented strides, we also provide a reference event list calculated based on the mocap data. This has the same start and end per stride as the segmented strides, but has columns for the initial contact/heel strike (ic), final contact/toe off (tc) and mid-stance (min_vel). This information is provided in samples relative to the start of the mocap data stream. (Compare to the converted segmented strides above).

mocap_events = trial.mocap_events_
mocap_events[sensor]
start end min_vel tc ic
s_id
0 277 494 427 295 379
1 494 713 644 512 596
2 713 932 863 728 813
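With these events, temporal parameters can be derived directly. For example, the time between toe-off (tc) and the following initial contact (ic) within each stride gives the swing time, which roughly corresponds to the swing_time column of trial.mocap_parameters_ shown further below (the parameters are reported per min_vel stride, so the number of rows differs):

# Swing time per stride from the mocap events (tc to ic, converted to seconds).
events = mocap_events[sensor]
swing_time_s = (events["ic"] - events["tc"]) / trial.mocap_sampling_rate_hz_
print(swing_time_s)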


Like the segmented stride list, these events can be converted to the time axis of the data or to IMU samples.

trial.convert_events(mocap_events, from_time_axis="mocap", to_time_axis="time")[sensor]
start end min_vel tc ic
s_id
0 1.385 2.470 2.135 1.475 1.895
1 2.470 3.565 3.220 2.560 2.980
2 3.565 4.660 4.315 3.640 4.065


Below, we add the time-converted event list to the plot from above. In the mocap plot, we also add the mocap-derived gait events.

fig, (ax1, ax2, ax3) = plt.subplots(3, 1, sharex=True)
imu_data.filter(like="gyr").plot(ax=ax1, legend=False)
imu_data.filter(like="acc").plot(ax=ax2, legend=True)
mocap_data[["heel_z"]].plot(ax=ax3)
for ax in (ax1, ax2, ax3):
    for i, s in trial.convert_events(segmented_strides, from_time_axis="imu", to_time_axis="time")[sensor].iterrows():
        ax.axvspan(s["start"], s["end"], alpha=0.2, color="C1")

# We plot the mocap-derived gait events on the heel trajectory in ax3
for marker, event_name in zip(["o", "s", "*"], ["tc", "ic", "min_vel"]):
    mocap_data[["heel_z"]].iloc[mocap_events[sensor][event_name]].rename(columns={"heel_z": event_name}).plot(
        ax=ax3, style=marker, label=event_name, markersize=3
    )

ax1.set_ylabel("Gyroscope [deg/s]")
ax2.set_ylabel("Acc. [m/s^2]")
ax3.set_ylabel("Pos. [m]")

fig.show()
egait adidas 2014

As you can see, in this example three strides were properly captured by both systems. The stride borders are defined based on the signal maximum in the gyr_y (i.e. the gyr_ml) axis.

This definition works well for segmentation. However, for the calculation of gait parameters, the authors of the dataset defined strides from the midstance (i.e. the min_vel point) of one stride to the midstance of the consecutive stride. As a result, when looking at the parameters, there will be one stride fewer than in the segmented stride list.

trial.mocap_parameters_[sensor]
stride_length stride_time stance_time swing_time max_heel_clearance max_toe_clearance ic_angle tc_angle max_lateral_excursion
s_id
0 1.479162 1.085 0.66499 0.42001 0.287149 0.219740 -26.0842 73.4280 6.9791
1 1.474048 1.085 0.66499 0.42001 0.293851 0.217972 -25.2798 71.3141 7.3753


To better understand how this works, we can convert the mocap events from their segmented stride list form into a min_vel-stride list. In this form, the start and the end of each stride are defined by the min_vel event. In addition, a new pre_ic event is added. This marks the ic of the previous stride.

Overall, the min_vel stride list contains one stride fewer than the segmented stride list. The s_id of the new stride list is based on the s_id of the segmented stride that contains the pre_ic event.

mocap_min_vel_stride_list = convert_segmented_stride_list(mocap_events, target_stride_type="min_vel")
mocap_min_vel_stride_list[sensor]
start end min_vel tc ic pre_ic
s_id
0 427 644.0 427 512.0 596.0 379
1 644 863.0 644 728.0 813.0 596


Stride time is now calculated from the pre_ic to the ic event (compare trial.mocap_parameters_[sensor]).

stride_time = mocap_min_vel_stride_list[sensor]["ic"] - mocap_min_vel_stride_list[sensor]["pre_ic"]
stride_time / trial.mocap_sampling_rate_hz_
s_id
0    1.085
1    1.085
dtype: float64

For comparison, the pre-calculated stride time:

trial.mocap_parameters_[sensor]["stride_time"]
s_id
0    1.085
1    1.085
Name: stride_time, dtype: float64

Stride length is calculated as the displacement in the ground plane between the start and the end of the stride (i.e. between the two min_vel events).

starts = mocap_min_vel_stride_list[sensor]["start"]
ends = mocap_min_vel_stride_list[sensor]["end"]
stride_length_heel = (
    (
        mocap_data[["heel_x", "heel_y"]].iloc[ends].reset_index(drop=True)
        - mocap_data[["heel_x", "heel_y"]].iloc[starts].reset_index(drop=True)
    )
    .pow(2)
    .sum(axis=1)
    .pow(0.5)
)
stride_length_heel
0    1.478275
1    1.474500
dtype: float32

For comparison, the pre-calculated stride length. Note that this stride length differs slightly from the one calculated above, as the authors of the dataset provide the average stride length over all available markers.

trial.mocap_parameters_[sensor]["stride_length"]
s_id
0    1.479162
1    1.474048
Name: stride_length, dtype: float64
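To illustrate what this averaging could look like, the sketch below repeats the displacement calculation for every marker and averages the result over the markers. This is our interpretation of the description, not the authors' exact reference implementation:

import numpy as np

# Stride length per marker (ground-plane displacement between the two min_vel
# samples), averaged over the markers. Markers that are not visible at these
# samples produce NaN and are ignored by the nan-aware mean.
marker_names = sorted({c.rsplit("_", 1)[0] for c in mocap_data.columns})
per_marker_sl = []
for marker_name in marker_names:
    displacement = (
        mocap_data[[f"{marker_name}_x", f"{marker_name}_y"]].iloc[ends].reset_index(drop=True)
        - mocap_data[[f"{marker_name}_x", f"{marker_name}_y"]].iloc[starts].reset_index(drop=True)
    )
    per_marker_sl.append(displacement.pow(2).sum(axis=1, skipna=False).pow(0.5))
stride_length_all_markers = np.nanmean(per_marker_sl, axis=0)
stride_length_all_markers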

Usage as validation dataset#

To compare the reference parameters with the parameters of an IMU-based algorithm, you should use the segmented stride list as a starting point. From there, you can calculate gait events (e.g. ic) within these strides to compare temporal parameters. Ideally, store the events as a segmented stride list and then use the convert_segmented_stride_list function to bring them into the same format used to calculate the reference parameters.
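A minimal sketch of such a parameter-level comparison could look like this (my_stride_times is a hypothetical stand-in for the per-stride output of your own IMU pipeline):

import pandas as pd

# Compare hypothetical IMU-derived stride times against the mocap reference.
reference_stride_time = trial.mocap_parameters_[sensor]["stride_time"]
my_stride_times = pd.Series([1.08, 1.09], index=reference_stride_time.index)  # hypothetical values
error_s = my_stride_times - reference_stride_time
print(error_s.abs().mean())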

When calculating spatial parameters, you should calculate your own IMU-based min_vel points instead of using the mocap-derived ones. The mocap-derived events don't always align with real moments of no movement in the IMU data and hence might lead to issues with ZUPT-based algorithms.
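A very rough sketch of how such IMU-based resting regions could be found is shown below (a simple gyroscope-energy threshold; real pipelines typically use more robust ZUPT detectors, and the threshold here is purely illustrative):

# Mark samples with low gyroscope energy as potential "no movement" regions.
gyr_energy = imu_data.filter(like="gyr").pow(2).sum(axis=1)
is_resting = gyr_energy < 50  # deg^2/s^2, illustrative threshold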

For algorithms that rely on calculations over the entire signal (i.e. not just the strides within the mocap volume), keep in mind that the amount of additional movement in the data varies from trial to trial. Some trials contain only walking, others resting and walking, and some contain small jumps used as a fallback synchronization. Hence, if you see unexpected results for specific trials, you might want to check the raw data.

Further Notes#

In many cases, clear drift in the mocap data can be observed. The authors of the dataset corrected this drift with a linear drift model before calculating the reference parameters. For further information, see the two papers using the dataset [1] and [2].
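To illustrate the general idea of such a correction (this is only an illustration, not the authors' exact procedure), a linear trend can be fitted to a marker trajectory and subtracted:

import numpy as np

# Fit and remove a linear trend from the heel marker height as a simple drift model.
heel_z = mocap_data["heel_z"].dropna()
time = heel_z.index.to_numpy()
slope, intercept = np.polyfit(time, heel_z.to_numpy(), 1)
heel_z_detrended = heel_z - (slope * time + intercept)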
