gaitmap_datasets.SensorPositionComparison2019Mocap#

class gaitmap_datasets.SensorPositionComparison2019Mocap(data_folder: Optional[Union[str, Path]] = None, *, include_wrong_recording: bool = False, align_data: bool = True, data_padding_s: float = 0, memory: Memory = Memory(location=None), groupby_cols: Optional[Union[List[str], str]] = None, subset_index: Optional[DataFrame] = None)[source]#

A dataset for trajectory benchmarking.

Data is only loaded once the respective attributes are accessed. This means filtering the dataset should be fast, but accessing attributes like .data can be slow. By default, we do not perform any caching of these values. This means, if you need to use the value multiple times, the best way is to assign it to a variable. Alternatively, you can use the memory parameter to create a disk based cache for the data loading.
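The access pattern described above can be illustrated with a small stand-in. `LazyDataset` below is hypothetical and not part of gaitmap_datasets; it only mimics a property that reloads its value on every access, to show why assigning the value to a variable helps.

```python
class LazyDataset:
    """Hypothetical stand-in that reloads its data on every attribute access."""

    def __init__(self):
        self.load_count = 0

    @property
    def data(self):
        # Simulates an expensive, uncached load from disk.
        self.load_count += 1
        return [0.0, 0.1, 0.2]


dataset = LazyDataset()

# Without a variable: every access to .data triggers a new load.
first = dataset.data[0]
last = dataset.data[-1]
loads_without_variable = dataset.load_count

# With a variable: load once, then reuse the local value.
data = dataset.data
first, last = data[0], data[-1]
loads_with_variable = dataset.load_count - loads_without_variable
```

With the real class, the disk-based alternative would presumably be to pass a `joblib.Memory` instance pointing at a cache folder of your choice as the memory parameter.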

Parameters:

data_folder: The base folder where the dataset can be found.
include_wrong_recording: If True, the first trial of 6dbe is included, which has one missing sensor.
align_data: If True, the coordinate systems of all sensors are roughly aligned based on their known mounting orientation.
data_padding_s: A number of seconds that are added to the start and the end of each IMU recording. This can be used to get a longer static period before each gait test to perform e.g. gravity based alignments. For samples before the start of the gait test, the second index of the pd.DataFrame is set to negative values. This should make it easy to remove the padded values if required.

Warning

The same padding is not applied to the mocap samples (as we do not have any mocap samples outside the gait tests)! However, the time values provided in the index of the pandas DataFrame are still aligned, as we add negative time values to the IMU time index.
memory: Optional joblib memory object to cache the data loading. Note that this can lead to large hard disk usage!
groupby_cols: tpcp internal parameters.
subset_index: tpcp internal parameters.

Attributes:

data: Get the IMU data per gait test.
data_padding_imu_samples: Get the actual padding in samples based on data_padding_s.
group: Get the current group.
grouped_index: Return the index with the groupby columns set as multiindex.
groups: Get all groups based on the set groupby level.
index: Get index.
marker_position_: Get the marker trajectories of a test.
marker_position_per_stride_
metadata: Get the metadata for a participant.
mocap_events_: Get mocap events calculated using the Zeni algorithm.
mocap_sampling_rate_hz_: Get the sampling rate of the motion capture system.
sampling_rate_hz: Get the sampling rate of the IMUs.
segmented_stride_list_: Get the manually segmented stride list per foot.
segmented_stride_list_per_sensor_: Get the segmented stride list per sensor.
shape: Get the shape of the dataset.

Methods

`as_attrs`()	Return a version of the Dataset class that can be subclassed using `attrs` defined classes.
`as_dataclass`()	Return a version of the Dataset class that can be subclassed using dataclasses.
`assert_is_single`(groupby_cols, property_name)	Raise error if index does contain more than one group/row with the given groupby settings.
`assert_is_single_group`(property_name)	Raise error if index does contain more than one group/row.
`clone`()	Create a new instance of the class with all parameters copied over.
`convert_events_with_padding`(events, ...)	Convert the time/sample values of mocap and IMU events/stride lists into other time domains.
`create_group_labels`(label_cols)	Generate a list of labels for each group/row in the dataset.
`create_index`()	Create the full index for the dataset.
`get_params`([deep])	Get parameters for this algorithm.
`get_subset`(*[, groups, index, bool_map])	Get a subset of the dataset.
`groupby`(groupby_cols)	Return a copy of the dataset grouped by the specified columns.
`is_single`(groupby_cols)	Return True if index contains only one row/group with the given groupby settings.
`is_single_group`()	Return True if index contains only one group.
`iter_level`(level)	Return generator object containing a subset for every category from the selected level.
`set_params`(**params)	Set the parameters of this Algorithm.

static as_attrs()[source]#

Return a version of the Dataset class that can be subclassed using attrs defined classes.

Note, this requires attrs to be installed!

static as_dataclass()[source]#: Return a version of the Dataset class that can be subclassed using dataclasses.

assert_is_single(groupby_cols: Optional[Union[List[str], str]], property_name) → None[source]#

Raise error if index does contain more than one group/row with the given groupby settings.

This should be used when implementing access to data values that can only be accessed when a single trial/participant/etc. exists in the dataset.

Parameters:

groupby_cols: None (no grouping) or a valid subset of the columns available in the dataset index.
property_name: Name of the property this check is used in. Used to format the error message.

assert_is_single_group(property_name) → None[source]#

Raise error if index does contain more than one group/row.

Note that this is different from assert_is_single as it is aware of the current grouping. Instead of checking that a certain combination of columns is left in the dataset, it checks that only a single group exists with the already selected grouping as defined by self.groupby_cols.

Parameters:

property_name: Name of the property this check is used in. Used to format the error message.

clone() → Self[source]#

Create a new instance of the class with all parameters copied over.

This will create a new instance of the class itself and of all nested objects.

convert_events_with_padding(events: DataFrame, from_time_axis: Literal['mocap', 'imu'], to_time_axis: Literal['mocap', 'imu', 'time'])[source]#

Convert the time/sample values of mocap and IMU events/stride lists into other time domains.

This method will use the respective sampling rates and the padding of the IMU data to convert the time/sample.

Warning

This method will only work if the provided samples follow the padding conventions used in this class! This means, if the input are events in IMU samples (from_time_axis="imu"), they must respect the padding of the IMU data, i.e. the first sample of the IMU data is sample 0 and the test start is sample self.data_padding_imu_samples. If the input are events in mocap samples (from_time_axis="mocap"), they must not include the padding, i.e. the first sample of the mocap data is sample 0 and the test start is also sample 0.

Note

If the input are events in IMU samples (from_time_axis="imu") and padding is used, the resulting mocap samples can have negative values (as the events occur before the start of the test).
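The conventions above can be made concrete with a small sketch. It is not the library's implementation: `convert_samples` is a hypothetical helper, the sampling rates are made-up values, and it assumes that the "time" axis means seconds relative to the start of the test.

```python
def convert_samples(values, from_axis, to_axis, *, imu_rate_hz, mocap_rate_hz, padding_imu_samples):
    """Convert sample values between the IMU, mocap, and time domains (illustrative only)."""
    # Step 1: express every value in seconds relative to the test start.
    if from_axis == "imu":
        # IMU samples include the padding, so the test starts at padding_imu_samples.
        seconds = [(v - padding_imu_samples) / imu_rate_hz for v in values]
    elif from_axis == "mocap":
        # Mocap samples have no padding, so the test starts at sample 0.
        seconds = [v / mocap_rate_hz for v in values]
    else:
        raise ValueError(f"Unknown from_axis: {from_axis}")

    # Step 2: express the seconds in the requested target domain.
    if to_axis == "time":
        return seconds
    if to_axis == "mocap":
        return [round(s * mocap_rate_hz) for s in seconds]
    if to_axis == "imu":
        return [round(s * imu_rate_hz) + padding_imu_samples for s in seconds]
    raise ValueError(f"Unknown to_axis: {to_axis}")


# Hypothetical rates; 400 padded IMU samples correspond to 2 s at 200 Hz.
rates = {"imu_rate_hz": 200.0, "mocap_rate_hz": 100.0, "padding_imu_samples": 400}

imu_events = [200, 400, 600]
as_mocap = convert_samples(imu_events, "imu", "mocap", **rates)
as_time = convert_samples(imu_events, "imu", "time", **rates)
back_to_imu = convert_samples([0, 50], "mocap", "imu", **rates)
```

The first IMU event lies inside the padded region, so it maps to a negative mocap sample, which is the situation the note above describes.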

create_group_labels(label_cols: Union[str, List[str]]) → List[str][source]#

Generate a list of labels for each group/row in the dataset.

Note

This has a different use case than the dataset-wide groupby. Using groupby reduces the effective size of the dataset to the number of groups. This method produces a group label for each group/row that is already in the dataset, without changing the dataset.

The output of this method can be used in combination with GroupKFold as the group label.

Parameters:

label_cols: The columns that should be included in the label. If the dataset is already grouped, this must be a subset of self.groupby_cols.
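The difference between group labels and groupby can be sketched without the library. The index rows below are hypothetical, the labels are assumed to be plain strings, and the split is a simple leave-one-group-out loop in plain Python rather than scikit-learn's GroupKFold.

```python
# Hypothetical index rows as (participant, test) pairs.
index_rows = [("p1", "fast"), ("p1", "normal"), ("p2", "fast"), ("p2", "normal")]

# One label per row, built from the "participant" column only.
# The dataset itself keeps all four rows.
group_labels = [participant for participant, _ in index_rows]

# Leave-one-group-out style split: rows that share a label never end up
# on both sides of a split.
splits = []
for held_out in sorted(set(group_labels)):
    train = [i for i, label in enumerate(group_labels) if label != held_out]
    test = [i for i, label in enumerate(group_labels) if label == held_out]
    splits.append((train, test))
```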

create_index() → DataFrame[source]#

Create the full index for the dataset.

This needs to be implemented by the subclass.

Warning

Make absolutely sure that the dataframe you return is deterministic and does not change between runs! This can lead to some nasty bugs! We try to catch them internally, but it is not always possible. As a tip, avoid relying on random numbers and make sure that the order does not depend on things like file system order when creating an index by scanning a directory. Particularly nasty are cases that use non-sorted containers like set, which sometimes maintain their order, but sometimes don’t. At the very least, we recommend sorting the final dataframe you return in create_index.
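A minimal sketch of the sorting advice, using only the standard library. The folder layout `<participant>/<test>.csv` and the helper `build_index_rows` are assumptions made for illustration; they do not describe how this dataset is stored.

```python
import tempfile
from pathlib import Path


def build_index_rows(base_dir):
    """Collect (participant, test) rows from a hypothetical <participant>/<test>.csv layout."""
    rows = {
        (path.parent.name, path.stem)
        for path in Path(base_dir).glob("*/*.csv")
    }
    # A set has no guaranteed order and glob follows file system order,
    # so sort explicitly before building the index.
    return sorted(rows)


with tempfile.TemporaryDirectory() as tmp:
    # Create the files in a deliberately unsorted order.
    for participant, test in [("p2", "slow"), ("p1", "normal"), ("p1", "fast")]:
        folder = Path(tmp) / participant
        folder.mkdir(exist_ok=True)
        (folder / f"{test}.csv").touch()
    index_rows = build_index_rows(tmp)
```

In a real create_index implementation, the sorted rows would then be turned into the returned DataFrame.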

property data: DataFrame#

Get the IMU data per gait test.

Get the data per gait test. If self.data_padding_s is set, the extracted data region extends by that number of seconds beyond the actual gait test at both ends. Keep that in mind when aligning data to mocap. The time axis is provided in seconds, with 0 at the actual start of the gait test.
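The resulting time axis can be sketched as follows. The sampling rate is a made-up value, and the relation between data_padding_s and the padding in samples (a simple rounding) is an assumption, not taken from the implementation.

```python
sampling_rate_hz = 100.0  # hypothetical value, not the dataset's actual IMU rate
data_padding_s = 0.05     # padding requested at both ends of the test
n_test_samples = 4        # samples that belong to the actual gait test

# Assumed relation between the padding in seconds and in samples.
padding_samples = round(data_padding_s * sampling_rate_hz)

n_total = n_test_samples + 2 * padding_samples
# Time axis in seconds with 0 at the actual start of the gait test.
time_s = [(i - padding_samples) / sampling_rate_hz for i in range(n_total)]

# Dropping the leading padding is a simple filter on the time index.
without_leading_padding = [t for t in time_s if t >= 0]
```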

property data_padding_imu_samples: int#: Get the actual padding in samples based on data_padding_s.

get_params(deep: bool = True) → Dict[str, Any][source]#

Get parameters for this algorithm.

Parameters:

deep: Only relevant if object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (Note the two “_” at the end)

Returns:

params: Parameter names mapped to their values.

get_subset(*, groups: Optional[List[Union[str, Tuple[str, ...]]]] = None, index: Optional[DataFrame] = None, bool_map: Optional[Sequence[bool]] = None, **kwargs: Union[List[str], str]) → Self[source]#

Get a subset of the dataset.

Note

All arguments are mutually exclusive!

Parameters:

groups: A valid row locator or slice that can be passed to self.grouped_index.loc[locator, :]. This basically needs to be a subset of self.groups. Note that this is the only indexer that works on the grouped index. All other indexers work on the pure index.
index: pd.DataFrame that is a valid subset of the current dataset index.
bool_map: bool-map that is used to index the current index-dataframe. The list must be of same length as the number of rows in the index.
**kwargs: The key must be the name of an index column. The value is a list containing strings that correspond to the categories that should be kept. For examples see above.

Returns:

subset: New dataset object filtered by specified parameters.
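The selection logic for bool_map and the keyword filters can be sketched on plain rows. `subset_rows` is a hypothetical helper that works on a list of dicts instead of the index DataFrame; it is meant to illustrate the parameter descriptions, not to reproduce the implementation.

```python
def subset_rows(rows, *, bool_map=None, **column_filters):
    """Filter index-like rows by a boolean mask or by allowed values per column."""
    if bool_map is not None and column_filters:
        raise ValueError("Only one way of selecting a subset may be used at a time.")
    if bool_map is not None:
        if len(bool_map) != len(rows):
            raise ValueError("bool_map must have one entry per index row.")
        return [row for row, keep in zip(rows, bool_map) if keep]
    # Keep a row only if every filtered column holds one of the allowed values.
    return [
        row
        for row in rows
        if all(row[col] in allowed for col, allowed in column_filters.items())
    ]


# Hypothetical index with the columns "participant" and "test".
rows = [
    {"participant": "p1", "test": "fast"},
    {"participant": "p1", "test": "normal"},
    {"participant": "p2", "test": "fast"},
]

by_mask = subset_rows(rows, bool_map=[True, False, True])
by_column = subset_rows(rows, participant=["p1"])
```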

property group: Union[str, Tuple[str, ...]]#

Get the current group.

Note, this attribute can only be used, if there is just a single group. If there is only a single groupby column or column in the index, this will return a string. Otherwise, this will return a named tuple.

groupby(groupby_cols: Optional[Union[List[str], str]]) → Self[source]#

Return a copy of the dataset grouped by the specified columns.

Each unique group represents a single data point in the resulting dataset.

Parameters:

groupby_cols: None (no grouping) or a valid subset of the columns available in the dataset index.

property grouped_index: DataFrame#: Return the index with the groupby columns set as multiindex.

property groups: List[Union[str, Tuple[str, ...]]]#

Get all groups based on the set groupby level.

This will return a list of strings/integers if there is only a single groupby level or index column. If there are multiple groupby levels/index columns, it will return a list of named tuples.

Note that if one of the groupby levels/index columns is not a valid Python attribute name (e.g. it contains spaces or starts with a number), the named tuple will not contain the correct column name! For more information see the documentation of the rename parameter of collections.namedtuple.
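The renaming behaviour of collections.namedtuple can be checked directly. The column names below are hypothetical, and the sketch assumes the named tuples are created with rename=True, as the reference to that parameter suggests.

```python
from collections import namedtuple

# Hypothetical index columns; the last two are not valid Python identifiers.
columns = ["participant", "test name", "1st_sensor"]

# With rename=True, invalid field names are replaced by positional names.
Group = namedtuple("Group", columns, rename=True)
group = Group("p1", "normal", "heel")

fields = Group._fields
```

Here `fields` is `('participant', '_1', '_2')`, so the two invalid columns can only be reached by position or through the generated names.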

property index: DataFrame#: Get index.

is_single(groupby_cols: Optional[Union[List[str], str]]) → bool[source]#

Return True if index contains only one row/group with the given groupby settings.

If groupby_cols=None this checks if there is only a single row left. If you want to check if there is only a single group within the current grouping, use is_single_group instead.

Parameters:

groupby_cols: None (no grouping) or a valid subset of the columns available in the dataset index.

is_single_group() → bool[source]#: Return True if index contains only one group.

iter_level(level: str) → Iterator[Self][source]#

Return generator object containing a subset for every category from the selected level.

Parameters:

level: Optional str that sets the level which shall be used for iterating. This must be one of the column names of the index.

Returns:

subset: New dataset object containing only one category in the specified level.

property marker_position_: DataFrame#

Get the marker trajectories of a test.

Note, the index is provided in seconds after the start of the test and self.data_padding_s is ignored! However, as long as the time domain index is used, the two data streams are aligned.

All values are provided in mm in the global coordinate system of the motion capture system.

NaN values are provided if one of the markers was not visible in the mocap system and its trajectory could not be restored.

property metadata: Dict[str, Any]#: Get the metadata for a participant.

property mocap_events_: Dict[str, DataFrame]#

Get mocap events calculated using the Zeni algorithm.

Note that the events are provided in mocap samples after the start of the test. This means self.data_padding_s is ignored here. Use self.convert_events_with_padding to convert the events to IMU samples/seconds while respecting the padding.

property mocap_sampling_rate_hz_: float#: Get the sampling rate of the motion capture system.

property sampling_rate_hz: float#: Get the sampling rate of the IMUs.

property segmented_stride_list_: Dict[str, DataFrame]#: Get the manually segmented stride list per foot.

property segmented_stride_list_per_sensor_: Dict[str, DataFrame]#

Get the segmented stride list per sensor.

Instead of providing the stride list per foot, this output has all the sensors as keys and the correct stride list (either left or right foot) as value. This can be helpful if you want to iterate over all sensors and get the correct stride list.

set_params(**params: Any) → Self[source]#

Set the parameters of this Algorithm.

To set parameters of nested objects use nested_object_name__para_name=.

property shape: Tuple[int]#

Get the shape of the dataset.

This only reports a single dimension. This is equal to the number of rows in the index, if self.groupby_cols=None. Otherwise, it is equal to the number of unique groups.
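A minimal sketch of the two cases, using hypothetical index rows in place of the real index:

```python
# Hypothetical index rows as (participant, test) pairs.
index_rows = [
    ("p1", "fast"),
    ("p1", "normal"),
    ("p2", "fast"),
    ("p2", "normal"),
    ("p2", "slow"),
]

# Without grouping, every index row is one data point.
shape_ungrouped = (len(index_rows),)

# Grouped by "participant", every unique participant is one data point.
unique_participants = sorted({participant for participant, _ in index_rows})
shape_grouped = (len(unique_participants),)
```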