EgaitSegmentationValidation2014 - A Stride Segmentation validation dataset

The EgaitSegmentationValidation2014 dataset class provides access to the stride segmentation validation dataset recorded for the EGait system. It contains multiple 4x10 m walks and simulated “free-living” walks recorded by two foot-worn IMU sensors.

Two sets of reference stride borders are provided for the dataset:

1. The original stride borders from the original publication. These stride borders were labeled manually by multiple gait experts looking at the raw IMU signal and a video of the participant. The gait experts labeled only full straight strides.
2. Updated stride borders, which contain the original stride borders plus stride borders for all turn and stair strides in the dataset. These new annotations were made based on the raw data only.

General information

The dataset was recorded with Shimmer 2R sensors. In these IMU nodes, the coordinate systems of the accelerometer and the gyroscope are different.

In the version provided in this dataset, we fix this by transforming the gyroscope data to the accelerometer coordinate system and then transforming the combined data to the gaitmap coordinate system.
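To illustrate what aligning two sensor frames means, here is a minimal sketch that applies a rotation matrix to gyroscope samples. The matrix below is purely hypothetical (a 90° rotation about the z-axis) and is not the actual Shimmer 2R alignment; the dataset class performs the real transformation internally.

```python
# Hypothetical rotation from the gyroscope frame to the accelerometer frame.
# NOTE: illustrative assumption only, not the real Shimmer 2R alignment.
GYR_TO_ACC = (
    (0, -1, 0),
    (1, 0, 0),
    (0, 0, 1),
)


def rotate(sample, rotation):
    """Apply a 3x3 rotation matrix to a single (x, y, z) sample."""
    return tuple(sum(r * v for r, v in zip(row, sample)) for row in rotation)


# Made-up gyroscope samples (x, y, z).
gyr_samples = [(1.0, 2.0, 3.0), (0.0, -1.0, 0.5)]
gyr_in_acc_frame = [rotate(s, GYR_TO_ACC) for s in gyr_samples]
print(gyr_in_acc_frame)
```

Once both sensors share one frame, a second rotation of the same form could map the combined data into the target coordinate system.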

Warning

The calibration files distributed with the dataset are likely of low quality. We recommend using this dataset only for the validation of stride segmentation algorithms. Algorithms for spatial parameters that depend on the exact values of the IMU might not provide good results with this dataset.

Warning

For this example to work, you need to have a global config set containing the path to the dataset. Check the README.md for more information.

from gaitmap_datasets import EgaitSegmentationValidation2014

First we will create a simple instance of the dataset class.

dataset = EgaitSegmentationValidation2014()
dataset

EgaitSegmentationValidation2014 [45 groups/rows]

	cohort	test	participant
0	control	4x10m	GA112030E3
1	control	4x10m	GA213010E5
2	control	4x10m	GA313009E5
3	control	4x10m	GA313011E5
4	control	4x10m	GA313041E5
5	control	4x10m	GA313073E5
6	control	4x10m	GA413003E5
7	control	4x10m	GA413011E5
8	control	4x10m	GA413013E5
9	control	4x10m	GA413048E5
10	control	free_walk	GA214026
11	control	free_walk	GA214030
12	control	free_walk	GA214033
13	control	free_walk	GASTD45JW
14	control	free_walk	GASTD46SAB
15	geriatric	4x10m	P50E6
16	geriatric	4x10m	P51E6
17	geriatric	4x10m	P52E6
18	geriatric	4x10m	P53E6
19	geriatric	4x10m	P54E6
20	geriatric	4x10m	P55E6
21	geriatric	4x10m	P56E6
22	geriatric	4x10m	P57E6
23	geriatric	4x10m	P58E6
24	geriatric	4x10m	P59E6
25	geriatric	free_walk	PAT1WALD
26	geriatric	free_walk	PAT4WALD
27	geriatric	free_walk	PAT5WALD
28	geriatric	free_walk	PAT7WALD
29	geriatric	free_walk	PAT8WALD
30	pd	4x10m	GA313039E5
31	pd	4x10m	GA413002E5
32	pd	4x10m	GA413046E5
33	pd	4x10m	GA413049E5
34	pd	4x10m	GA413051E5
35	pd	4x10m	GA413052E5
36	pd	4x10m	GA413053E5
37	pd	4x10m	GA413054E5
38	pd	4x10m	GA413055E5
39	pd	4x10m	GA413056E5
40	pd	free_walk	GA114059
41	pd	free_walk	GA114063
42	pd	free_walk	GA114065
43	pd	free_walk	GA214020
44	pd	free_walk	GA214021

Based on the index you can select either a specific cohort or test, or a specific participant.

only_free_walk = dataset.get_subset(test="free_walk")
only_free_walk

EgaitSegmentationValidation2014 [15 groups/rows]

	cohort	test	participant
0	control	free_walk	GA214026
1	control	free_walk	GA214030
2	control	free_walk	GA214033
3	control	free_walk	GASTD45JW
4	control	free_walk	GASTD46SAB
5	geriatric	free_walk	PAT1WALD
6	geriatric	free_walk	PAT4WALD
7	geriatric	free_walk	PAT5WALD
8	geriatric	free_walk	PAT7WALD
9	geriatric	free_walk	PAT8WALD
10	pd	free_walk	GA114059
11	pd	free_walk	GA114063
12	pd	free_walk	GA114065
13	pd	free_walk	GA214020
14	pd	free_walk	GA214021
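Conceptually, selecting a subset amounts to filtering the index rows by one or more column values. The following sketch mimics this idea on a few rows copied from the index above. The `select` helper is hypothetical and only illustrates the filtering logic; it is not how the dataset class is implemented.

```python
# A few rows copied from the dataset index: (cohort, test, participant).
index_rows = [
    ("control", "4x10m", "GA112030E3"),
    ("control", "free_walk", "GA214030"),
    ("geriatric", "free_walk", "PAT1WALD"),
    ("pd", "4x10m", "GA313039E5"),
    ("pd", "free_walk", "GA114059"),
]


def select(rows, cohort=None, test=None, participant=None):
    """Keep rows that match every criterion that is not None (hypothetical helper)."""
    wanted = (cohort, test, participant)
    return [
        row
        for row in rows
        if all(w is None or w == value for w, value in zip(wanted, row))
    ]


pd_free_walk = select(index_rows, cohort="pd", test="free_walk")
print(pd_free_walk)
```

Criteria combine with a logical AND, so adding more criteria can only narrow the selection.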

In the following, we will investigate the data of a single participant for both types of tests.

free_walk = only_free_walk.get_subset(participant="GA214030")
free_walk

EgaitSegmentationValidation2014 [1 groups/rows]

	cohort	test	participant
0	control	free_walk	GA214030

Free-walk

During the free-walk tests, participants were asked to perform a series of activities. These mostly consisted of walking around a room and up and down stairs. We will plot the data together with the stride labels. We can see that multiple strides were labeled over the 4-min period of the measurement. The only exception is a small signal region in the center.

import matplotlib.pyplot as plt


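def plot_strides(imu_data, segmented_stride_list):
    # Plot accelerometer (top) and gyroscope (bottom) data of the right foot sensor.
    fig, axs = plt.subplots(2, 1, sharex=True)
    foot = "right_sensor"
    imu_data[foot].filter(like="acc").plot(ax=axs[0])
    imu_data[foot].filter(like="gyr").plot(ax=axs[1])

    # Divide the stride borders by the sampling rate so that they match the x-axis of the data.
    # Note: the sampling rate is read from the global `free_walk` object.
    for _, s in segmented_stride_list[foot].iterrows():
        s = s / free_walk.sampling_rate_hz
        for ax in axs:
            ax.axvline(s["start"], color="k", linestyle="--")
            ax.axvline(s["end"], color="k", linestyle="--")
    return fig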


imu_data = free_walk.data
segmented_stride_list_original = free_walk.segmented_stride_list_original_
segmented_stride_list = free_walk.segmented_stride_list_
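The plotting helper divides the stride borders by the sampling rate, which suggests that the borders are stored in samples. A minimal sketch of this conversion follows. The stride borders and the sampling rate below are made-up placeholder values; in practice, the rate would be read from the `sampling_rate_hz` attribute of the dataset.

```python
SAMPLING_RATE_HZ = 100.0  # placeholder; use the dataset's sampling_rate_hz in practice

# Hypothetical stride borders in samples: (start, end).
strides_in_samples = [(1000, 1110), (1110, 1225), (1225, 1330)]

# Convert the borders from samples to seconds.
strides_in_seconds = [
    (start / SAMPLING_RATE_HZ, end / SAMPLING_RATE_HZ)
    for start, end in strides_in_samples
]

# Duration of each stride in seconds.
stride_durations_s = [
    (end - start) / SAMPLING_RATE_HZ for start, end in strides_in_samples
]
print(strides_in_seconds)
print(stride_durations_s)
```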

If we plot the original stride list (the one without stair strides labeled), we can see that there is a section in the middle without any labels.

fig = plot_strides(imu_data, segmented_stride_list_original)
fig.suptitle("Original stride list")
fig.show()

If we zoom into this region, we can see that the signal looks “gait-like”. However, this corresponds to stair walking, which was explicitly not labeled as a stride by the authors of the dataset, as they wanted to show that the algorithms they developed could differentiate between stair walking and level walking. Further, none of the turning strides are labeled when using the original stride list (segmented_stride_list_original_).

fig = plot_strides(imu_data, segmented_stride_list_original)
fig.axes[0].set_xlim(145, 165)
fig.show()

The new relabeled stride list (segmented_stride_list_) contains all strides, including the stair and turning strides. If we plot this stride list, we can see that the signal region in the middle is now labeled as strides.

fig = plot_strides(imu_data, segmented_stride_list)
fig.suptitle("New stride list")
fig.show()
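Because the updated stride list contains the original stride borders, the strides added during relabeling can be found with a simple set difference. The sketch below uses made-up stride borders and assumes that strides shared by both lists have identical start and end values.

```python
# Hypothetical stride borders in samples: (start, end).
original = [(100, 210), (210, 325), (600, 710)]
updated = [(100, 210), (210, 325), (330, 450), (450, 560), (600, 710)]

# Strides that only exist in the updated list (e.g. stair or turning strides).
original_set = set(original)
added = [stride for stride in updated if stride not in original_set]
print(added)
```

With real data, the same comparison could be performed on the "start" and "end" columns of both stride lists for each sensor.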

4x10 m walk

only_gait_test = dataset.get_subset(test="4x10m")
gait_test = only_gait_test.get_subset(participant="GA112030E3")
gait_test

EgaitSegmentationValidation2014 [1 groups/rows]

	cohort	test	participant
0	control	4x10m	GA112030E3

We will plot the data together with the manually labeled strides. We can clearly see the 4 straight walks during the test.

Like before, if we use the original stride list, the turning strides in between the bouts are not labeled.

imu_data = gait_test.data
segmented_stride_list_original = gait_test.segmented_stride_list_original_
segmented_stride_list = gait_test.segmented_stride_list_

fig = plot_strides(imu_data, segmented_stride_list_original)
fig.suptitle("Original stride list")
fig.show()

If we use the new stride list, we can see that the turning strides are now labeled.

fig = plot_strides(imu_data, segmented_stride_list)
fig.suptitle("New stride list")
fig.show()

Stride List Recommendation

While in many cases we are only interested in analyzing straight strides, we recommend using the new stride list when validating stride segmentation algorithms. It is usually better to have a segmentation algorithm with high sensitivity that is able to identify all stride-like signal portions and then filter out unwanted stride types in later processing steps. For this reason, we made the new stride list the default stride list in the dataset.
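To make the notion of sensitivity concrete, the following sketch matches segmented strides against reference strides and computes sensitivity and precision. The greedy matching with a fixed tolerance is a simplified stand-in chosen for illustration, not the evaluation routine of any particular library, and all stride borders are made up.

```python
def count_matches(predicted, reference, tolerance=5):
    """Count predicted strides whose start and end both lie within
    `tolerance` samples of a not yet matched reference stride."""
    unmatched = list(reference)
    true_positives = 0
    for p_start, p_end in predicted:
        for ref in unmatched:
            if abs(p_start - ref[0]) <= tolerance and abs(p_end - ref[1]) <= tolerance:
                # Each reference stride can be matched only once.
                unmatched.remove(ref)
                true_positives += 1
                break
    return true_positives


# Hypothetical stride borders in samples: (start, end).
reference = [(100, 210), (210, 325), (330, 450), (600, 710), (720, 830)]
predicted = [(102, 208), (212, 327), (331, 449), (800, 900)]

tp = count_matches(predicted, reference)
sensitivity = tp / len(reference)  # share of reference strides that were found
precision = tp / len(predicted)  # share of predicted strides that are correct
print(tp, sensitivity, precision)
```

Evaluating against the new stride list means that stair and turning strides count as reference strides, so an algorithm that detects them is not penalized with false positives.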
