gaitmap_datasets.Kluge2017#

class gaitmap_datasets.Kluge2017(data_folder: Optional[Union[str, Path]] = None, *, memory: Memory = Memory(location=None), groupby_cols: Optional[Union[List[str], str]] = None, subset_index: Optional[DataFrame] = None)[source]#

A dataset to validate spatial-temporal parameters in healthy and PD.

Parameters:
data_folderOptional[Union[str, Path]], optional

The base folder where the dataset can be found.

memoryMemory, optional

A memory object to optioanlly use disk caching to speed up loading.

groupby_cols

tpcp internal parameters.

subset_index

tpcp internal parameters.

Attributes:
data
group

Get the current group.

grouped_index

Return the index with the groupby columns set as multiindex.

groups

Get all groups based on the set groupby level.

index

Get index.

marker_position_
marker_position_per_stride_
mocap_events_
mocap_sampling_rate_hz_

Get the sampling rate of the IMUs.

sampling_rate_hz

Get the sampling rate of the IMUs.

shape

Get the shape of the dataset.

Methods

as_attrs()

Return a version of the Dataset class that can be subclassed using attrs defined classes.

as_dataclass()

Return a version of the Dataset class that can be subclassed using dataclasses.

assert_is_single(groupby_cols, property_name)

Raise error if index does contain more than one group/row with the given groupby settings.

assert_is_single_group(property_name)

Raise error if index does contain more than one group/row.

clone()

Create a new instance of the class with all parameters copied over.

create_group_labels(label_cols)

Generate a list of labels for each group/row in the dataset.

create_index()

Create the index for the dataset.

get_params([deep])

Get parameters for this algorithm.

get_subset(*[, groups, index, bool_map])

Get a subset of the dataset.

groupby(groupby_cols)

Return a copy of the dataset grouped by the specified columns.

is_single(groupby_cols)

Return True if index contains only one row/group with the given groupby settings.

is_single_group()

Return True if index contains only one group.

iter_level(level)

Return generator object containing a subset for every category from the selected level.

set_params(**params)

Set the parameters of this Algorithm.

static as_attrs()[source]#

Return a version of the Dataset class that can be subclassed using attrs defined classes.

Note, this requires attrs to be installed!

static as_dataclass()[source]#

Return a version of the Dataset class that can be subclassed using dataclasses.

assert_is_single(groupby_cols: Optional[Union[List[str], str]], property_name) None[source]#

Raise error if index does contain more than one group/row with the given groupby settings.

This should be used when implementing access to data values, which can only be accessed when only a single trail/participant/etc. exist in the dataset.

Parameters:
groupby_cols

None (no grouping) or a valid subset of the columns available in the dataset index.

property_name

Name of the property this check is used in. Used to format the error message.

assert_is_single_group(property_name) None[source]#

Raise error if index does contain more than one group/row.

Note that this is different from assert_is_single as it is aware of the current grouping. Instead of checking that a certain combination of columns is left in the dataset, it checks that only a single group exists with the already selected grouping as defined by self.groupby_cols.

Parameters:
property_name

Name of the property this check is used in. Used to format the error message.

clone() Self[source]#

Create a new instance of the class with all parameters copied over.

This will create a new instance of the class itself and all nested objects

create_group_labels(label_cols: Union[str, List[str]]) List[str][source]#

Generate a list of labels for each group/row in the dataset.

Note

This has a different use case than the dataset-wide groupby. Using groupby reduces the effective size of the dataset to the number of groups. This method produces a group label for each group/row that is already in the dataset, without changing the dataset.

The output of this method can be used in combination with GroupKFold as the group label.

Parameters:
label_cols

The columns that should be included in the label. If the dataset is already grouped, this must be a subset of self.groupby_cols.

create_index() DataFrame[source]#

Create the index for the dataset.

get_params(deep: bool = True) Dict[str, Any][source]#

Get parameters for this algorithm.

Parameters:
deep

Only relevant if object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (Note the two “_” at the end)

Returns:
params

Parameter names mapped to their values.

get_subset(*, groups: Optional[List[Union[str, Tuple[str, ...]]]] = None, index: Optional[DataFrame] = None, bool_map: Optional[Sequence[bool]] = None, **kwargs: Union[List[str], str]) Self[source]#

Get a subset of the dataset.

Note

All arguments are mutable exclusive!

Parameters:
groups

A valid row locator or slice that can be passed to self.grouped_index.loc[locator, :]. This basically needs to be a subset of self.groups. Note that this is the only indexer that works on the grouped index. All other indexers work on the pure index.

index

pd.DataFrame that is a valid subset of the current dataset index.

bool_map

bool-map that is used to index the current index-dataframe. The list must be of same length as the number of rows in the index.

**kwargs

The key must be the name of an index column. The value is a list containing strings that correspond to the categories that should be kept. For examples see above.

Returns:
subset

New dataset object filtered by specified parameters.

property group: Union[str, Tuple[str, ...]]#

Get the current group.

Note, this attribute can only be used, if there is just a single group. If there is only a single groupby column or column in the index, this will return a string. Otherwise, this will return a named tuple.

groupby(groupby_cols: Optional[Union[List[str], str]]) Self[source]#

Return a copy of the dataset grouped by the specified columns.

Each unique group represents a single data point in the resulting dataset.

Parameters:
groupby_cols

None (no grouping) or a valid subset of the columns available in the dataset index.

property grouped_index: DataFrame#

Return the index with the groupby columns set as multiindex.

property groups: List[Union[str, Tuple[str, ...]]]#

Get all groups based on the set groupby level.

This will either return a list of strings/integers, if there is only a single group level or index column. If there are multiple groupy levels/index columns, it will return a list of named tuples.

Note, that if one of the groupby levels/index columns is not a valid Python attribute name (e.g. in contains spaces or starts with a number), the named tuple will not contain the correct column name! For more information see the documentation of the rename parameter of collections.namedtuple.

property index: DataFrame#

Get index.

is_single(groupby_cols: Optional[Union[List[str], str]]) bool[source]#

Return True if index contains only one row/group with the given groupby settings.

If groupby_cols=None this checks if there is only a single row left. If you want to check if there is only a single group within the current grouping, use is_single_group instead.

Parameters:
groupby_cols

None (no grouping) or a valid subset of the columns available in the dataset index.

is_single_group() bool[source]#

Return True if index contains only one group.

iter_level(level: str) Iterator[Self][source]#

Return generator object containing a subset for every category from the selected level.

Parameters:
level

Optional str that sets the level which shall be used for iterating. This must be one of the columns names of the index.

Returns:
subset

New dataset object containing only one category in the specified level.

property mocap_sampling_rate_hz_: float#

Get the sampling rate of the IMUs.

property sampling_rate_hz: float#

Get the sampling rate of the IMUs.

set_params(**params: Any) Self[source]#

Set the parameters of this Algorithm.

To set parameters of nested objects use nested_object_name__para_name=.

property shape: Tuple[int]#

Get the shape of the dataset.

This only reports a single dimension. This is equal to the number of rows in the index, if self.groupby_cols=None. Otherwise, it is equal to the number of unique groups.