Kluge2017 - A dataset for in-lab parameter validation in healthy and PD.#

The Kluge 2017 dataset was recorded to test the validity and test-retest reliability of spatio-temporal gait parameters extracted from IMUs. The dataset contains data from healthy participants and participants with Parkinson’s disease (PD).

General information#

Each participant performed 4x10m walk trials at a slow, preferred and fast speed. Five of the participants performed the trials twice with two weeks in between.

The dataset contains IMU data from each foot recorded with the eGait system (sensor attached laterally to the shoe) and reference data from a marker-less Simi Motion system. Gait events were annotated manually based on the camera recordings, and multiple points on the foot were tracked using the Simi Motion system.

For each 10 m walk, only the middle 3 m are covered by the Simi Motion system, resulting on average in 2-4 strides per walk-through.

The eGait system in this dataset used Shimmer 3 sensors with a sampling rate of 102.4 Hz. For the data loaded using the dataset class, we adjust the coordinate system as shown in the figure below. While the sensor configuration is identical to other Shimmer 3 datasets, the coordinate system of the exported data was already rotated to align both feet. Hence, we need to perform a different coordinate transformation than for the other datasets:

Warning

For this example to work, you need to have a global config set containing the path to the dataset. Check the README.md for more information.

from gaitmap_datasets import Kluge2017

First, we will create a simple instance of the dataset class. We can see that for each participant we have 3 different walking speeds, a flag indicating whether they are healthy or have PD (patient = True), and whether the data was recorded in the first or second session (repetition).

dataset = Kluge2017()
dataset

Kluge2017 [75 groups/rows]

	patient	participant	repetition	speed
0	False	216060	0	fast
1	False	216060	0	normal
2	False	216060	0	slow
3	False	216061	1	fast
4	False	216061	1	normal
...	...	...	...	...
70	True	215001	0	normal
71	True	215001	0	slow
72	True	413050	0	fast
73	True	413050	0	normal
74	True	413050	0	slow

75 rows × 4 columns

Based on the index, we will select just the PD cohort

pd_cohort = dataset.get_subset(patient=True)
pd_cohort

Kluge2017 [12 groups/rows]

	patient	participant	speed
0	True	115053	fast
1	True	115053	normal
2	True	115053	slow
3	True	214019	fast
4	True	214019	normal
5	True	214019	slow
6	True	215001	fast
7	True	215001	normal
8	True	215001	slow
9	True	413050	fast
10	True	413050	normal
11	True	413050	slow

We will simply select the first participant in the dataset for the following analysis

participant = pd_cohort.groupby("participant")[0].groupby(None)
participant

Kluge2017 [3 groups/rows]

	patient	participant	speed
0	True	115053	fast
1	True	115053	normal
2	True	115053	slow

Let’s have a look at the slow walk data

slow_walk = participant.get_subset(speed="slow")
slow_walk

Kluge2017 [1 groups/rows]

	patient	participant	repetition	speed
0	True	115053	0	slow

For each gait test, we have access to the raw IMU data, the marker position data and the gait events. Let’s have a look at the raw IMU data. This is a dictionary with the sensor name as key and a pandas DataFrame as value. Note that the index is the time in seconds already aligned with the mocap data.

foot = "right"

imu_data = slow_walk.data[f"{foot}_sensor"]
imu_data

	acc_x	acc_y	acc_z	gyr_x	gyr_y	gyr_z
time after start [s]
0.007812	-0.606944	-0.806221	9.749680	-0.233754	-2.983602	1.094100
0.017578	-0.529068	-0.844837	9.633973	0.691206	-3.561525	0.420545
0.027344	-0.490796	-0.768402	9.555994	0.989510	-4.397582	0.238766
0.037109	-0.452116	-0.768090	9.556089	2.782382	-4.679503	0.113708
0.046875	-0.490779	-0.843053	9.827733	4.531894	-4.103905	-0.197483
...	...	...	...	...	...	...
65.544922	-0.721370	-1.037000	9.789950	0.419637	11.173907	3.775545
65.554688	-0.837001	-1.114059	9.867739	3.130379	12.023181	4.900520
65.564453	-0.953323	-1.076638	9.867153	3.852850	12.442962	5.295863
65.574219	-0.952116	-1.230358	9.829633	4.541235	12.803124	5.783288
65.583984	-0.991486	-1.116189	9.751162	5.986108	12.502267	6.087537

6716 rows × 6 columns
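As a quick sanity check, the sampling rate can be estimated from the spacing of the time index. The sketch below uses only the first five index values printed above (hard-coded here, so it runs without the dataset); with the real data you would pass `imu_data.index` instead.

```python
import numpy as np

# First five values of the IMU time index shown above (seconds).
time_s = np.array([0.007812, 0.017578, 0.027344, 0.037109, 0.046875])

# The median is robust against single irregular sample intervals.
sampling_rate_hz = 1.0 / np.median(np.diff(time_s))
print(f"Estimated sampling rate: {sampling_rate_hz:.2f} Hz")
```

The estimate should be close to the 102.4 Hz stated for the Shimmer 3 sensors; the small deviation comes from the rounded index values.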

Then let’s also load the marker position data. As all marker trajectories have the same length, this is just a simple pandas DataFrame, with the marker as the top-most column level. We can see that we get position, velocity and acceleration data for the ankle and the tip of the foot. Note that these trajectories are not from “actual” markers, but from virtual markers defined by the biomechanical model fitted by the Simi system.

Many values in the dataframe will be NaN, as the mocap system only tracks the foot for a short period of time.

mocap_data = slow_walk.marker_position_[foot]
mocap_data

marker	ankle									foot_tip
metric	pos_x	pos_y	pos_z	vel_x	vel_y	vel_z	acc_x	acc_y	acc_z	pos_x	pos_y	pos_z	vel_x	vel_y	vel_z	acc_x	acc_y	acc_z
time after start [s]
1.000000e-06	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
9.995000e-03	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1.999800e-02	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
3.000000e-02	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
3.999400e-02	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
6.555000e+01	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
6.555999e+01	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
6.557000e+01	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
6.557999e+01	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
6.558999e+01	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

6560 rows × 18 columns
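Because most rows are NaN, it can help to first locate the time windows in which the foot was actually tracked. The following is a minimal sketch on synthetic data; the helper `tracked_segments` is hypothetical and not part of gaitmap_datasets. With the real data you could pass a single column such as `mocap_data[("ankle", "pos_z")]`.

```python
import numpy as np
import pandas as pd


def tracked_segments(trajectory: pd.Series) -> list:
    """Return (start, end) index labels of each contiguous run of non-NaN values."""
    valid = trajectory.notna()
    # A new run starts wherever the validity flag changes.
    run_id = (valid != valid.shift()).cumsum()
    segments = []
    for _, run in trajectory[valid].groupby(run_id[valid]):
        segments.append((run.index[0], run.index[-1]))
    return segments


# Synthetic stand-in for one marker axis: two tracked windows in a short recording.
time = np.arange(10) / 10
values = [np.nan, 0.1, 0.2, np.nan, np.nan, 0.3, 0.4, 0.5, np.nan, np.nan]
pos_z = pd.Series(values, index=pd.Index(time, name="time after start [s]"))

segments = tracked_segments(pos_z)
print(segments)
```

Each returned tuple marks the first and last time stamp of one tracked window, which should roughly correspond to one pass through the capture volume.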

Let’s plot the position of the foot marker and the IMU data together.

import matplotlib.pyplot as plt

fig, (ax1, ax2, ax3, ax4, ax5) = plt.subplots(nrows=5, sharex=True, figsize=(10, 10))
fig.suptitle(f"{slow_walk.group} - {foot} foot")
# Plot the IMU data
ax1.set_title("Gyro")
ax1.set_ylabel("deg/s")
imu_data.filter(like="gyr").plot(ax=ax1)
ax2.set_title("Acc")
ax2.set_ylabel("m/s²")
imu_data.filter(like="acc").plot(ax=ax2)

# Plot the mocap data
ax3.set_title("Marker position X")
ax3.set_ylabel("m")
mocap_data.xs("pos_x", level=1, axis=1).plot(ax=ax3)
ax4.set_title("Marker position Y")
ax4.set_ylabel("m")
mocap_data.xs("pos_y", level=1, axis=1).plot(ax=ax4)
ax5.set_title("Marker position Z")
ax5.set_ylabel("m")
mocap_data.xs("pos_z", level=1, axis=1).plot(ax=ax5)

plt.tight_layout()
plt.show()

[Figure: Kluge2017(patient=True, participant='115053', repetition=0, speed='slow') - right foot. Panels: Gyro, Acc, Marker position X, Marker position Y, Marker position Z]

We can also plot the gait events on top of the data. The gait events are stored in a dictionary with the foot as key and a pandas DataFrame as value. Each row in the DataFrame represents one stride. Each stride starts and ends with a heel strike/initial contact (IC) of the same foot. The IC event listed for each stride is its start event. In addition, we have the toe-off (TO) and the heel-off (HO). The terminal contact (TC) is either the TO or the HO, depending on which event occurs last.

All events are provided as samples in the Mocap data.

gait_events = slow_walk.mocap_events_[foot]
gait_events

event	ho	ic	to	start	end	tc
s_id
54	508	438	527	438	576	527
56	650	576	669	576	722	669
58	788	722	811	722	854	811
67	2121	2047	2146	2047	2196	2146
69	2256	2196	2279	2196	2330	2279
71	2402	2330	2418	2330	2467	2418
73	2532	2467	2558	2467	2607	2558
84	3813	3749	3841	3749	3892	3841
86	3964	3892	3986	3892	4037	3986
88	4105	4037	4129	4037	4182	4129
90	4241	4182	4265	4182	4314	4265
97	5618	5553	5644	5553	5681	5644
99	5737	5681	5766	5681	5820	5766
101	5884	5820	5910	5820	5961	5910

Let’s plot the gait events on top of the data.

import matplotlib.pyplot as plt


def plot_with_marker():
    fig, (ax1, ax2, ax3) = plt.subplots(nrows=3, sharex=True, figsize=(10, 6))
    fig.suptitle(f"{slow_walk.group} - {foot} foot")

    # Plot the IMU data
    ax1.set_title("Gyro")
    ax1.set_ylabel("deg/s")
    imu_data.filter(like="gyr").plot(ax=ax1)
    ax2.set_title("Acc")
    ax2.set_ylabel("m/s²")
    imu_data.filter(like="acc").plot(ax=ax2)

    # Plot the mocap data
    ax3.set_title("Marker position ankle Z")
    ax3.set_ylabel("m")
    mocap_data[("ankle", "pos_z")].plot(ax=ax3)

    # Plot the gait events
    for i, (_, stride_mocap) in enumerate(gait_events.iterrows()):
        stride_time = stride_mocap / slow_walk.mocap_sampling_rate_hz_
        stride_imu = (stride_time * slow_walk.sampling_rate_hz).round().astype(int)

        for ax, plot_data in zip(
            (ax1, ax2, ax3), (imu_data["gyr_y"], imu_data["acc_x"], mocap_data[("ankle", "pos_z")])
        ):
            # Plot the stride as a vertical span
            ax.axvspan(stride_time.start, stride_time.end, alpha=0.2, edgecolor="black")

            index_values = stride_imu
            if ax == ax3:
                index_values = stride_mocap

            # Plot the individual events
            ax.plot(stride_time.to, plot_data.iloc[index_values.to], "s", color="blue", label="to" if i == 0 else None)
            ax.plot(stride_time.ho, plot_data.iloc[index_values.ho], "o", color="red", label="ho" if i == 0 else None)
            ax.plot(stride_time.tc, plot_data.iloc[index_values.tc], "x", color="green", label="tc" if i == 0 else None)

    ax3.legend()

    return fig, (ax1, ax2, ax3)


plot_with_marker()
plt.tight_layout()
plt.show()

[Figure: Kluge2017(patient=True, participant='115053', repetition=0, speed='slow') - right foot. Panels: Gyro, Acc, Marker position ankle Z]

To better see what is going on, we will zoom into a single walk-through.

_, (_, _, ax3) = plot_with_marker()
ax3.set_xlim(53, 61)
plt.show()

We can see that all the gait events occur in order and roughly line up with the expected signal regions in the IMU signal. We can also see that the z-axis of the ankle marker is drifting over the course of each walk. This is because the mocap system is not calibrated level to the floor. In case parameters like foot-clearance are calculated, this needs to be taken into account.
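One simple way to account for this drift is to fit a straight line through the marker height at samples where the foot is known to be on the ground and subtract it. The sketch below shows this on synthetic data. It is not the dataset authors' method; it assumes the drift is approximately linear in time within one walk-through, and the helper name `remove_linear_floor_drift` is hypothetical.

```python
import numpy as np


def remove_linear_floor_drift(time_s, pos_z, ground_contact_idx):
    """Subtract a line fitted through the marker height at ground-contact samples."""
    time_s = np.asarray(time_s, dtype=float)
    pos_z = np.asarray(pos_z, dtype=float)
    idx = np.asarray(ground_contact_idx, dtype=int)
    slope, intercept = np.polyfit(time_s[idx], pos_z[idx], deg=1)
    return pos_z - (slope * time_s + intercept)


# Synthetic example: three 1 s strides sampled at 100 Hz.
time_s = np.arange(300) / 100
# The foot is lifted in the first half of each stride and on the ground in the second half.
clearance = 0.1 * np.clip(np.sin(2 * np.pi * time_s), 0, None)
# Tilted reference frame: constant offset plus a linear drift.
drifting_z = clearance + 0.02 * time_s + 0.05

# Samples in the middle of each ground-contact phase.
ground_contact_idx = [75, 175, 275]
corrected_z = remove_linear_floor_drift(time_s, drifting_z, ground_contact_idx)
print(corrected_z.max())
```

Note that this also removes the constant height of the marker above the floor, so the result is a height relative to the ground-contact posture rather than an absolute marker height.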

Spatial Parameters#

In case you want to calculate spatial parameters based on the trajectories and events, you can use the marker_position_per_stride_ property. It provides the marker information per stride, making it easy to calculate various parameters.

per_stride = slow_walk.marker_position_per_stride_[foot]
per_stride

	marker	ankle									foot_tip
	metric	pos_x	pos_y	pos_z	vel_x	vel_y	vel_z	acc_x	acc_y	acc_z	pos_x	pos_y	pos_z	vel_x	vel_y	vel_z	acc_x	acc_y	acc_z
s_id	sample
54	0	0.192753	1.958257	0.082491	0.286104	-0.200563	-0.097244	0.721611	8.414624	0.803715	0.200133	1.704741	0.176551	0.080548	-0.418227	-0.721726	-13.600050	5.803189	-11.501573
	1	0.193070	1.950421	0.071967	0.261980	-0.049779	-0.071751	0.374040	9.592503	8.698684	0.201110	1.698432	0.169994	-0.039450	-0.267904	-0.710838	-12.476939	7.288913	-3.356514
	2	0.199773	1.955140	0.077238	0.248103	-0.058738	-0.091036	-4.638317	8.426217	8.246024	0.193884	1.695691	0.153551	-0.339818	-0.270176	-0.831751	-5.776758	8.854914	4.857469
	3	0.203452	1.954273	0.079510	0.274201	-0.068357	0.102675	-6.699009	-0.698306	3.084113	0.190630	1.694360	0.153355	-0.479444	-0.299115	-0.793489	-2.950642	1.387353	4.389224
	4	0.203464	1.954910	0.077449	0.112368	0.150302	0.275149	-6.473012	-5.153564	1.867230	0.189794	1.693608	0.146054	-0.278234	-0.048348	-0.523989	1.823821	-3.238908	3.602642
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
101	137	0.349034	2.297499	0.056475	-0.493256	0.644704	-0.112424	1.574143	-5.545146	-4.041856	0.410542	2.556118	0.106533	0.308624	0.591946	-0.671209	-8.179141	-3.797458	-10.331998
	138	0.349848	2.298396	0.057959	-0.397720	0.497977	-0.291812	-1.411988	-3.820587	-4.530557	0.417965	2.557068	0.098224	0.125974	0.469589	-0.713592	-3.917129	-3.669168	-10.004805
	139	0.344574	2.310484	0.058835	-0.205808	0.457452	-0.254028	5.376377	-6.819976	-5.684798	0.410538	2.570010	0.097141	-0.075496	0.476007	-0.643591	-7.225660	-5.294753	-3.979611
	140	0.339505	2.324432	0.051485	-0.363760	0.522266	-0.151031	10.019334	-7.228755	1.519903	0.418452	2.579842	0.092782	0.004334	0.487045	-0.719070	-11.001638	-3.606945	3.266294
	141	0.340955	2.317538	0.045446	-0.296046	0.402155	-0.327242	5.374675	-6.203757	8.011034	0.408858	2.577185	0.079282	-0.026589	0.370898	-0.776057	-5.104494	-4.194661	4.593462

1963 rows × 18 columns

To extract the information of a single stride (e.g. the stride with s_id 54), we can use the loc method.

stride_54 = per_stride.loc[54]
stride_54

marker	ankle									foot_tip
metric	pos_x	pos_y	pos_z	vel_x	vel_y	vel_z	acc_x	acc_y	acc_z	pos_x	pos_y	pos_z	vel_x	vel_y	vel_z	acc_x	acc_y	acc_z
sample
0	0.192753	1.958257	0.082491	0.286104	-0.200563	-0.097244	0.721611	8.414624	0.803715	0.200133	1.704741	0.176551	0.080548	-0.418227	-0.721726	-13.600050	5.803189	-11.501573
1	0.193070	1.950421	0.071967	0.261980	-0.049779	-0.071751	0.374040	9.592503	8.698684	0.201110	1.698432	0.169994	-0.039450	-0.267904	-0.710838	-12.476939	7.288913	-3.356514
2	0.199773	1.955140	0.077238	0.248103	-0.058738	-0.091036	-4.638317	8.426217	8.246024	0.193884	1.695691	0.153551	-0.339818	-0.270176	-0.831751	-5.776758	8.854914	4.857469
3	0.203452	1.954273	0.079510	0.274201	-0.068357	0.102675	-6.699009	-0.698306	3.084113	0.190630	1.694360	0.153355	-0.479444	-0.299115	-0.793489	-2.950642	1.387353	4.389224
4	0.203464	1.954910	0.077449	0.112368	0.150302	0.275149	-6.473012	-5.153564	1.867230	0.189794	1.693608	0.146054	-0.278234	-0.048348	-0.523989	1.823821	-3.238908	3.602642
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
134	0.241468	0.760751	0.111336	0.218162	-0.831061	-0.031026	2.153917	12.167150	-7.792384	0.228153	0.509273	0.210098	-0.539473	-0.926413	-0.252263	0.766407	8.995751	-18.505880
135	0.244614	0.751074	0.111428	0.156707	-0.754187	-0.101599	0.823188	17.730366	-7.090275	0.218403	0.501140	0.211520	-0.561376	-0.832486	-0.322680	7.949738	13.164203	-22.556181
136	0.243416	0.744900	0.107063	0.244162	-0.764780	-0.175111	0.471186	14.072620	-4.902506	0.232609	0.492156	0.202855	-0.305319	-0.796774	-0.360671	4.651704	12.463881	-15.043660
137	0.244126	0.741234	0.109109	0.306830	-0.430380	-0.218322	4.352034	8.988792	-4.125971	0.221726	0.487249	0.199454	-0.072033	-0.549494	-0.702970	2.152110	10.498327	-9.447768
138	0.243791	0.727777	0.106066	0.235850	-0.244594	-0.201803	4.433455	11.586198	-1.939656	0.223650	0.473770	0.196879	-0.239267	-0.394179	-0.823661	-2.620951	12.868330	-10.005720

139 rows × 18 columns

fig, ax = plt.subplots()
stride_54["ankle"].filter(like="pos_").plot(ax=ax)

fig.show()
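As an example of a spatial parameter, the stride length can be approximated as the horizontal distance between the ankle position at the first and the last sample of a stride. The sketch below uses a small synthetic DataFrame with the same column layout as above. It assumes that x and y span the floor plane and z points upwards; the helper `stride_length` is hypothetical and not part of gaitmap_datasets.

```python
import numpy as np
import pandas as pd


def stride_length(stride: pd.DataFrame, marker: str = "ankle", axes=("pos_x", "pos_y")) -> float:
    """Horizontal distance between the first and last marker position of one stride."""
    positions = stride[marker][list(axes)]
    displacement = positions.iloc[-1] - positions.iloc[0]
    return float(np.linalg.norm(displacement))


# Synthetic stride with the same (marker, metric) column layout as the real data.
columns = pd.MultiIndex.from_product(
    [["ankle", "foot_tip"], ["pos_x", "pos_y", "pos_z"]], names=["marker", "metric"]
)
n_samples = 5
stride = pd.DataFrame(0.0, index=pd.RangeIndex(n_samples, name="sample"), columns=columns)
stride[("ankle", "pos_x")] = np.linspace(0.0, 0.3, n_samples)
stride[("ankle", "pos_y")] = np.linspace(2.0, 1.6, n_samples)
stride[("ankle", "pos_z")] = 0.08

length = stride_length(stride)
print(f"Stride length: {length:.2f} m")
```

With the real data, the same function could be applied to `stride_54`, or to every stride via `per_stride.groupby(level="s_id").apply(stride_length)`.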

Further Notes#

When applying a train-test split to the data, remember that some participants have multiple recordings. Hence, always group by participant before splitting (see tpcp documentation for more details).
When comparing the results of an IMU algorithm to the mocap data, the recommended way is to compare aggregated values over the entire gait test. Aligning individual strides might not always be possible, and a stride-by-stride comparison can hide issues when an algorithm misses multiple strides. However, keep in mind that the mocap system only covers the middle part of each walk. This means that comparing aggregated values (in particular measures of variance) computed from all strides detected in the IMU signal might result in a biased comparison.
When calculating spatial parameters, we recommend the use of the ankle trajectory, as it is the most stable trajectory.
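To illustrate the first note, the following sketch splits recordings so that all rows of a participant end up on the same side of the split. It is plain Python on a hand-written copy of the PD cohort index shown earlier, not the tpcp API; the helper `split_by_participant` is hypothetical.

```python
import random


def split_by_participant(rows, test_fraction=0.25, seed=0):
    """Split rows into train/test so that no participant appears in both sets."""
    participants = sorted({row["participant"] for row in rows})
    rng = random.Random(seed)
    rng.shuffle(participants)
    n_test = max(1, round(len(participants) * test_fraction))
    test_ids = set(participants[:n_test])
    train = [row for row in rows if row["participant"] not in test_ids]
    test = [row for row in rows if row["participant"] in test_ids]
    return train, test


# Hand-written copy of the PD cohort index shown earlier.
rows = [
    {"participant": participant, "speed": speed}
    for participant in ("115053", "214019", "215001", "413050")
    for speed in ("fast", "normal", "slow")
]

train, test = split_by_participant(rows)
train_ids = {row["participant"] for row in train}
test_ids = {row["participant"] for row in test}
print(sorted(train_ids), sorted(test_ids))
```

Splitting on participant IDs rather than on rows keeps repeated sessions and different speeds of the same person from leaking between the train and test sets.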
