gaitmap_datasets.Kluge2017

class gaitmap_datasets.Kluge2017(data_folder: Optional[Union[str, Path]] = None, *, memory: Memory = Memory(location=None), groupby_cols: Optional[Union[List[str], str]] = None, subset_index: Optional[DataFrame] = None)

A dataset to validate spatial-temporal parameters in healthy participants and patients with Parkinson's disease (PD).

Parameters:

data_folder (Optional[Union[str, Path]], optional): The base folder where the dataset can be found.
memory (Memory, optional): A memory object that can optionally enable disk caching to speed up loading.
groupby_cols: tpcp internal parameters.
subset_index: tpcp internal parameters.

Attributes:

data
group: Get the current group.
grouped_index: Return the index with the groupby columns set as multiindex.
groups: Get all groups based on the set groupby level.
index: Get index.
marker_position_
marker_position_per_stride_
mocap_events_
mocap_sampling_rate_hz_: Get the sampling rate of the motion capture system.
sampling_rate_hz: Get the sampling rate of the IMUs.
shape: Get the shape of the dataset.

Methods

`as_attrs`()	Return a version of the Dataset class that can be subclassed using `attrs` defined classes.
`as_dataclass`()	Return a version of the Dataset class that can be subclassed using dataclasses.
`assert_is_single`(groupby_cols, property_name)	Raise an error if the index contains more than one group/row with the given groupby settings.
`assert_is_single_group`(property_name)	Raise an error if the index contains more than one group/row.
`clone`()	Create a new instance of the class with all parameters copied over.
`create_group_labels`(label_cols)	Generate a list of labels for each group/row in the dataset.
`create_index`()	Create the index for the dataset.
`get_params`([deep])	Get parameters for this algorithm.
`get_subset`(*[, groups, index, bool_map])	Get a subset of the dataset.
`groupby`(groupby_cols)	Return a copy of the dataset grouped by the specified columns.
`is_single`(groupby_cols)	Return True if index contains only one row/group with the given groupby settings.
`is_single_group`()	Return True if index contains only one group.
`iter_level`(level)	Return generator object containing a subset for every category from the selected level.
`set_params`(**params)	Set the parameters of this Algorithm.

static as_attrs()

Return a version of the Dataset class that can be subclassed using attrs defined classes.

Note, this requires attrs to be installed!

static as_dataclass(): Return a version of the Dataset class that can be subclassed using dataclasses.

assert_is_single(groupby_cols: Optional[Union[List[str], str]], property_name) → None

Raise an error if the index contains more than one group/row with the given groupby settings.

This should be used when implementing access to data values that can only be accessed when a single trial/participant/etc. exists in the dataset.

Parameters:

groupby_cols: None (no grouping) or a valid subset of the columns available in the dataset index.
property_name: Name of the property this check is used in. Used to format the error message.

assert_is_single_group(property_name) → None

Raise an error if the index contains more than one group/row.

Note that this is different from assert_is_single as it is aware of the current grouping. Instead of checking that a certain combination of columns is left in the dataset, it checks that only a single group exists with the already selected grouping as defined by self.groupby_cols.

Parameters:

property_name: Name of the property this check is used in. Used to format the error message.

clone() → Self

Create a new instance of the class with all parameters copied over.

This will create a new instance of the class itself and of all nested objects.

create_group_labels(label_cols: Union[str, List[str]]) → List[str]

Generate a list of labels for each group/row in the dataset.

Note

This has a different use case than the dataset-wide groupby. Using groupby reduces the effective size of the dataset to the number of groups. This method produces a group label for each group/row that is already in the dataset, without changing the dataset.

The output of this method can be used in combination with GroupKFold as the group label.

Parameters:

label_cols: The columns that should be included in the label. If the dataset is already grouped, this must be a subset of self.groupby_cols.

create_index() → DataFrame: Create the index for the dataset.

get_params(deep: bool = True) → Dict[str, Any]

Get parameters for this algorithm.

Parameters:

deep: Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two "_" at the end).

Returns:

params: Parameter names mapped to their values.
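
The double-underscore prefix can be illustrated with a small stand-alone sketch. The `Inner` and `Outer` classes below are hypothetical and are not part of tpcp or gaitmap_datasets; they only mimic how nested parameter names could be composed when `deep` is True.

```python
from typing import Any, Dict


class Inner:
    """Hypothetical nested object with a single parameter."""

    def __init__(self, window: int = 5) -> None:
        self.window = window

    def get_params(self, deep: bool = True) -> Dict[str, Any]:
        return {"window": self.window}


class Outer:
    """Hypothetical object that holds a nested object as a parameter."""

    def __init__(self, inner: Inner, threshold: float = 0.5) -> None:
        self.inner = inner
        self.threshold = threshold

    def get_params(self, deep: bool = True) -> Dict[str, Any]:
        params: Dict[str, Any] = {"inner": self.inner, "threshold": self.threshold}
        if deep:
            # Nested parameters are exposed as "<nested_object_name>__<param>".
            for name, value in self.inner.get_params(deep=True).items():
                params[f"inner__{name}"] = value
        return params


outer = Outer(Inner(window=10))
shallow = outer.get_params(deep=False)  # only the top-level parameters
deep = outer.get_params(deep=True)      # additionally contains "inner__window"
```

With `deep=False` only the top-level names are returned; with `deep=True` the nested parameter additionally appears under the prefixed key.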

get_subset(*, groups: Optional[List[Union[str, Tuple[str, ...]]]] = None, index: Optional[DataFrame] = None, bool_map: Optional[Sequence[bool]] = None, **kwargs: Union[List[str], str]) → Self

Get a subset of the dataset.

Note

All arguments are mutually exclusive!

Parameters:

groups: A valid row locator or slice that can be passed to self.grouped_index.loc[locator, :]. This basically needs to be a subset of self.groups. Note that this is the only indexer that works on the grouped index. All other indexers work on the pure index.
index: pd.DataFrame that is a valid subset of the current dataset index.
bool_map: bool-map that is used to index the current index-dataframe. The list must be of the same length as the number of rows in the index.
**kwargs: The key must be the name of an index column. The value is a string or a list of strings corresponding to the categories that should be kept.

Returns:

subset: New dataset object filtered by specified parameters.
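
The semantics of `bool_map` and `**kwargs` can be sketched without the dataset itself. The example below is a plain-Python stand-in, not the tpcp implementation: the index is modelled as a list of dicts, the column names `participant` and `test` are invented, both helper functions are hypothetical, and it is an assumption that multiple keyword filters are combined with a logical AND.

```python
from typing import Dict, List, Sequence, Union

Row = Dict[str, str]

# Hypothetical index with invented column names, one dict per row.
index: List[Row] = [
    {"participant": "p1", "test": "slow"},
    {"participant": "p1", "test": "fast"},
    {"participant": "p2", "test": "slow"},
    {"participant": "p2", "test": "fast"},
]


def subset_by_bool_map(rows: List[Row], bool_map: Sequence[bool]) -> List[Row]:
    """Keep the rows whose position is marked True in bool_map."""
    if len(bool_map) != len(rows):
        raise ValueError("bool_map must have one entry per index row.")
    return [row for row, keep in zip(rows, bool_map) if keep]


def subset_by_categories(rows: List[Row], **kwargs: Union[List[str], str]) -> List[Row]:
    """Keep the rows that match the allowed categories of every named column."""
    # Accept a single string or a list of strings per column.
    allowed = {col: [v] if isinstance(v, str) else list(v) for col, v in kwargs.items()}
    return [row for row in rows if all(row[col] in values for col, values in allowed.items())]


by_mask = subset_by_bool_map(index, [True, False, False, True])
by_cat = subset_by_categories(index, participant=["p1"], test="slow")
```

Here `by_mask` keeps the first and last row, while `by_cat` keeps only the rows where both column filters match.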

property group: Union[str, Tuple[str, ...]]

Get the current group.

Note that this attribute can only be used if there is just a single group. If there is only a single groupby column or column in the index, this will return a string. Otherwise, this will return a named tuple.

groupby(groupby_cols: Optional[Union[List[str], str]]) → Self

Return a copy of the dataset grouped by the specified columns.

Each unique group represents a single data point in the resulting dataset.

Parameters:

groupby_cols: None (no grouping) or a valid subset of the columns available in the dataset index.

property grouped_index: DataFrame: Return the index with the groupby columns set as multiindex.

property groups: List[Union[str, Tuple[str, ...]]]

Get all groups based on the set groupby level.

This returns a list of strings/integers if there is only a single groupby level or index column. If there are multiple groupby levels/index columns, it returns a list of named tuples.

Note that if one of the groupby levels/index columns is not a valid Python attribute name (e.g. it contains spaces or starts with a number), the named tuple will not contain the correct column name! For more information, see the documentation of the rename parameter of collections.namedtuple.
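
The effect of invalid column names can be reproduced with the standard library alone. The sketch below assumes the named tuples are created with `rename=True`, as the reference to that parameter suggests; the column names are invented for illustration.

```python
from collections import namedtuple

# Hypothetical column names; the first and last are not valid Python identifiers.
columns = ["participant id", "test", "2nd_session"]

# With rename=True, invalid field names are replaced by positional names ("_<index>").
Group = namedtuple("Group", columns, rename=True)
group = Group("p1", "slow", "yes")

fields = Group._fields
```

The valid name `test` is kept, while the two invalid names are replaced by `_0` and `_2`, so attribute access by the original column name is not possible for them. Positional access (`group[0]`) still works.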

property index: DataFrame: Get index.

is_single(groupby_cols: Optional[Union[List[str], str]]) → bool

Return True if index contains only one row/group with the given groupby settings.

If groupby_cols=None this checks if there is only a single row left. If you want to check if there is only a single group within the current grouping, use is_single_group instead.

Parameters:

groupby_cols: None (no grouping) or a valid subset of the columns available in the dataset index.

is_single_group() → bool: Return True if index contains only one group.

iter_level(level: str) → Iterator[Self]

Return generator object containing a subset for every category from the selected level.

Parameters:

level: String that sets the level used for iterating. This must be one of the column names of the index.

Returns:

subset: New dataset object containing only one category in the specified level.
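
As a rough mental model, iterating over a level yields one subset per category of that column. The following sketch is not the tpcp implementation: it uses a list of dicts as a stand-in for the index, invented column names, a hypothetical free function, and it assumes categories are visited in order of first appearance.

```python
from typing import Dict, Iterator, List

Row = Dict[str, str]

# Hypothetical index with invented column names, one dict per row.
index: List[Row] = [
    {"participant": "p1", "test": "slow"},
    {"participant": "p1", "test": "fast"},
    {"participant": "p2", "test": "slow"},
]


def iter_level(rows: List[Row], level: str) -> Iterator[List[Row]]:
    """Yield one subset of rows per category found in the given column."""
    if any(level not in row for row in rows):
        raise ValueError(f"'{level}' is not a column of the index.")
    # dict.fromkeys removes duplicates while keeping first-seen order.
    categories = list(dict.fromkeys(row[level] for row in rows))
    for category in categories:
        yield [row for row in rows if row[level] == category]


subsets = list(iter_level(index, "participant"))
```

Each yielded subset contains only rows that share a single value in the chosen column.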

property mocap_sampling_rate_hz_: float: Get the sampling rate of the motion capture system.

property sampling_rate_hz: float: Get the sampling rate of the IMUs.

set_params(**params: Any) → Self

Set the parameters of this Algorithm.

To set parameters of nested objects, use nested_object_name__para_name=value.

property shape: Tuple[int]

Get the shape of the dataset.

This only reports a single dimension. This is equal to the number of rows in the index, if self.groupby_cols=None. Otherwise, it is equal to the number of unique groups.
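
The two cases can be sketched with a plain-Python stand-in for the index. The column names and the `shape` helper below are hypothetical; the sketch only mirrors the rule stated above (row count without grouping, number of unique groups otherwise).

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]

# Hypothetical index with invented column names, one dict per row.
index: List[Row] = [
    {"participant": "p1", "test": "slow"},
    {"participant": "p1", "test": "fast"},
    {"participant": "p2", "test": "slow"},
    {"participant": "p2", "test": "fast"},
]


def shape(rows: List[Row], groupby_cols: Optional[List[str]] = None) -> Tuple[int]:
    """Return a one-element tuple: row count, or unique group count when grouped."""
    if groupby_cols is None:
        return (len(rows),)
    # A group is identified by the combination of values in the groupby columns.
    groups = {tuple(row[col] for col in groupby_cols) for row in rows}
    return (len(groups),)


ungrouped = shape(index)
grouped = shape(index, ["participant"])
```

In this toy index, the ungrouped shape counts all four rows, while grouping by `participant` reduces it to the two unique participants.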