gaitmap_datasets.PyShoe2019Stairs#

class gaitmap_datasets.PyShoe2019Stairs(data_folder: Optional[Union[str, Path]] = None, *, groupby_cols: Optional[Union[List[str], str]] = None, subset_index: Optional[DataFrame] = None)[source]#

Dataset helper for the staircase portion of the PyShoe dataset.

Note that this only contains the data of the “test” subfolder, as only this part of the data contains a ground truth reference derived from the stair geometries.
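A minimal usage sketch (the folder path is a placeholder for your local download; the exact index columns depend on the data):

>>> from pathlib import Path
>>> from gaitmap_datasets import PyShoe2019Stairs
>>> # Point to the extracted PyShoe download (the folder containing the
>>> # "data" sub-folder), not the "data" sub-folder itself.
>>> dataset = PyShoe2019Stairs(data_folder=Path("/path/to/pyshoe"))
>>> dataset.index  # overview of all available trials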

Parameters:
data_folder

The base folder where the dataset can be found. Note that this should be the folder that was created when downloading the PyShoe dataset and not just the “data” sub-folder.

groupby_cols

tpcp internal parameters.

subset_index

tpcp internal parameters.

Attributes:
data

Get the IMU data.

group

Get the current group.

grouped_index

Return the index with the groupby columns set as multiindex.

groups

Get all groups based on the set groupby level.

index

Get index.

position_reference_

Get the position reference along the trial in mm from the starting position.

position_reference_index_

Get the indices when the position reference was sampled.

sampling_rate_hz

Get the sampling rate of the IMUs.

shape

Get the shape of the dataset.

Methods

as_attrs()

Return a version of the Dataset class that can be subclassed using attrs defined classes.

as_dataclass()

Return a version of the Dataset class that can be subclassed using dataclasses.

assert_is_single(groupby_cols, property_name)

Raise an error if the index contains more than one group/row with the given groupby settings.

assert_is_single_group(property_name)

Raise an error if the index contains more than one group/row.

clone()

Create a new instance of the class with all parameters copied over.

create_group_labels(label_cols)

Generate a list of labels for each group/row in the dataset.

create_index()

Create the index for the dataset.

get_params([deep])

Get parameters for this algorithm.

get_subset(*[, groups, index, bool_map])

Get a subset of the dataset.

groupby(groupby_cols)

Return a copy of the dataset grouped by the specified columns.

is_single(groupby_cols)

Return True if index contains only one row/group with the given groupby settings.

is_single_group()

Return True if index contains only one group.

iter_level(level)

Return generator object containing a subset for every category from the selected level.

set_params(**params)

Set the parameters of this Algorithm.

static as_attrs()[source]#

Return a version of the Dataset class that can be subclassed using attrs defined classes.

Note, this requires attrs to be installed!

static as_dataclass()[source]#

Return a version of the Dataset class that can be subclassed using dataclasses.

assert_is_single(groupby_cols: Optional[Union[List[str], str]], property_name) None[source]#

Raise an error if the index contains more than one group/row with the given groupby settings.

This should be used when implementing access to data values that can only be accessed when only a single trial/participant/etc. exists in the dataset (see the sketch below).

Parameters:
groupby_cols

None (no grouping) or a valid subset of the columns available in the dataset index.

property_name

Name of the property this check is used in. Used to format the error message.
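As a sketch of this pattern (the subclass and property name are hypothetical), a data-access property would typically start with this check:

>>> import pandas as pd
>>> from gaitmap_datasets import PyShoe2019Stairs
>>> class MyStairsDataset(PyShoe2019Stairs):
...     @property
...     def my_custom_data(self) -> pd.DataFrame:
...         # Hypothetical property: only valid once a single trial is left,
...         # so guard the access before loading anything.
...         self.assert_is_single(None, "my_custom_data")
...         return self.data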

assert_is_single_group(property_name) None[source]#

Raise an error if the index contains more than one group/row.

Note that this is different from assert_is_single as it is aware of the current grouping. Instead of checking that a certain combination of columns is left in the dataset, it checks that only a single group exists with the already selected grouping as defined by self.groupby_cols.

Parameters:
property_name

Name of the property this check is used in. Used to format the error message.

clone() Self[source]#

Create a new instance of the class with all parameters copied over.

This will create a new instance of the class itself and all nested objects.

create_group_labels(label_cols: Union[str, List[str]]) List[str][source]#

Generate a list of labels for each group/row in the dataset.

Note

This has a different use case than the dataset-wide groupby. Using groupby reduces the effective size of the dataset to the number of groups. This method produces a group label for each group/row that is already in the dataset, without changing the dataset.

The output of this method can be used in combination with GroupKFold as the group label.

Parameters:
label_cols

The columns that should be included in the label. If the dataset is already grouped, this must be a subset of self.groupby_cols.
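A sketch of the intended use together with scikit-learn's GroupKFold (“trial” is a placeholder column name; use a column from dataset.index):

>>> from sklearn.model_selection import GroupKFold
>>> from gaitmap_datasets import PyShoe2019Stairs
>>> dataset = PyShoe2019Stairs(data_folder="/path/to/pyshoe")
>>> group_labels = dataset.create_group_labels("trial")
>>> cv = GroupKFold(n_splits=3)
>>> for train_rows, test_rows in cv.split(dataset, groups=group_labels):
...     train_set = dataset.get_subset(
...         bool_map=[i in set(train_rows) for i in range(dataset.shape[0])]
...     )
...     # ... build the test subset the same way and run the evaluation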

create_index() DataFrame[source]#

Create the index for the dataset.

property data: DataFrame#

Get the IMU data.

The index is provided as seconds since the start of the trial.
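For illustration (the path is a placeholder), the data of a single trial could be accessed like this:

>>> from gaitmap_datasets import PyShoe2019Stairs
>>> dataset = PyShoe2019Stairs(data_folder="/path/to/pyshoe")
>>> # Reduce the dataset to a single row before accessing per-trial values.
>>> single_trial = dataset.get_subset(index=dataset.index.iloc[[0]])
>>> imu_df = single_trial.data          # pd.DataFrame, index in seconds
>>> fs = single_trial.sampling_rate_hz  # IMU sampling rate in Hz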

get_params(deep: bool = True) Dict[str, Any][source]#

Get parameters for this algorithm.

Parameters:
deep

Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two “_” at the end).

Returns:
params

Parameter names mapped to their values.

get_subset(*, groups: Optional[List[Union[str, Tuple[str, ...]]]] = None, index: Optional[DataFrame] = None, bool_map: Optional[Sequence[bool]] = None, **kwargs: Union[List[str], str]) Self[source]#

Get a subset of the dataset.

Note

All arguments are mutually exclusive!

Parameters:
groups

A valid row locator or slice that can be passed to self.grouped_index.loc[locator, :]. This basically needs to be a subset of self.groups. Note that this is the only indexer that works on the grouped index. All other indexers work on the pure index.

index

pd.DataFrame that is a valid subset of the current dataset index.

bool_map

bool-map that is used to index the current index-dataframe. The list must be of the same length as the number of rows in the index.

**kwargs

The key must be the name of an index column. The value is a list containing strings that correspond to the categories that should be kept. See the example below.

Returns:
subset

New dataset object filtered by specified parameters.
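A sketch of the different subsetting options (the column name and value in the keyword example are placeholders; use entries from dataset.index):

>>> from gaitmap_datasets import PyShoe2019Stairs
>>> dataset = PyShoe2019Stairs(data_folder="/path/to/pyshoe")
>>> # Keyword based: keep only the listed categories of an index column.
>>> subset = dataset.get_subset(trial=["trial_1"])
>>> # Index based: keep only the rows of an explicit index dataframe.
>>> subset = dataset.get_subset(index=dataset.index.iloc[:5])
>>> # Bool-map based: one bool per row of the current index.
>>> subset = dataset.get_subset(bool_map=[True] * dataset.shape[0])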

property group: Union[str, Tuple[str, ...]]#

Get the current group.

Note that this attribute can only be used if there is just a single group. If there is only a single groupby column or index column, this will return a string. Otherwise, it will return a named tuple.

groupby(groupby_cols: Optional[Union[List[str], str]]) Self[source]#

Return a copy of the dataset grouped by the specified columns.

Each unique group represents a single data point in the resulting dataset.

Parameters:
groupby_cols

None (no grouping) or a valid subset of the columns available in the dataset index.
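For example (“trial” is a placeholder column name), grouping reduces the number of data points to the number of unique groups:

>>> from gaitmap_datasets import PyShoe2019Stairs
>>> dataset = PyShoe2019Stairs(data_folder="/path/to/pyshoe")
>>> grouped = dataset.groupby("trial")
>>> grouped.shape   # number of unique groups instead of number of index rows
>>> grouped.groups  # labels of all groups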

property grouped_index: DataFrame#

Return the index with the groupby columns set as multiindex.

property groups: List[Union[str, Tuple[str, ...]]]#

Get all groups based on the set groupby level.

This will return a list of strings/integers if there is only a single group level or index column. If there are multiple groupby levels/index columns, it will return a list of named tuples.

Note that if one of the groupby levels/index columns is not a valid Python attribute name (e.g. it contains spaces or starts with a number), the named tuple will not contain the correct column name! For more information see the documentation of the rename parameter of collections.namedtuple.

property index: DataFrame#

Get index.

is_single(groupby_cols: Optional[Union[List[str], str]]) bool[source]#

Return True if index contains only one row/group with the given groupby settings.

If groupby_cols=None this checks if there is only a single row left. If you want to check if there is only a single group within the current grouping, use is_single_group instead.

Parameters:
groupby_cols

None (no grouping) or a valid subset of the columns available in the dataset index.

is_single_group() bool[source]#

Return True if index contains only one group.

iter_level(level: str) Iterator[Self][source]#

Return generator object containing a subset for every category from the selected level.

Parameters:
level

Optional str that sets the level which shall be used for iterating. This must be one of the column names of the index.

Returns:
subset

New dataset object containing only one category in the specified level.
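A minimal iteration sketch (again, “trial” is a placeholder level name):

>>> from gaitmap_datasets import PyShoe2019Stairs
>>> dataset = PyShoe2019Stairs(data_folder="/path/to/pyshoe")
>>> for subset in dataset.iter_level("trial"):
...     # Each subset contains only the rows belonging to one category.
...     print(subset.index)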

property position_reference_: DataFrame#

Get the position reference along the trial in mm from the starting position.

The returned dataframe provides the expected position of the sensor at specific time points during the trial. The index is provided as seconds since the start of the trial and should line up with the IMU data.

If the sampling point of the reference is required as indices of the imu data, use the position_reference_index_ property.

Note that for this dataset only a reference for the z-level is provided, as the ground truth was derived from the stair geometries.

property position_reference_index_: Series#

Get the indices when the position reference was sampled.
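As a sketch (the path is a placeholder, and it is assumed here that the returned indices are integer row positions of the IMU dataframe), both properties can be combined to look up the IMU samples at the reference time points:

>>> from gaitmap_datasets import PyShoe2019Stairs
>>> dataset = PyShoe2019Stairs(data_folder="/path/to/pyshoe")
>>> single_trial = dataset.get_subset(index=dataset.index.iloc[[0]])
>>> imu_df = single_trial.data
>>> z_ref = single_trial.position_reference_          # expected z-position in mm
>>> ref_idx = single_trial.position_reference_index_  # sampling points of the reference
>>> imu_at_ref = imu_df.iloc[ref_idx]                 # assumes integer row positions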

property sampling_rate_hz: float#

Get the sampling rate of the IMUs.

set_params(**params: Any) Self[source]#

Set the parameters of this Algorithm.

To set parameters of nested objects, use the nested_object_name__para_name= syntax.
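For example (the path is a placeholder), top-level parameters such as data_folder can also be set after construction:

>>> from gaitmap_datasets import PyShoe2019Stairs
>>> dataset = PyShoe2019Stairs().set_params(data_folder="/path/to/pyshoe")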

property shape: Tuple[int]#

Get the shape of the dataset.

This only reports a single dimension. This is equal to the number of rows in the index, if self.groupby_cols=None. Otherwise, it is equal to the number of unique groups.