gaitmap_datasets.Kluge2017
- class gaitmap_datasets.Kluge2017(data_folder: Optional[Union[str, Path]] = None, *, memory: Memory = Memory(location=None), groupby_cols: Optional[Union[List[str], str]] = None, subset_index: Optional[DataFrame] = None)
A dataset to validate spatial-temporal gait parameters in healthy participants and participants with Parkinson's disease (PD).
- Parameters:
- data_folder: Optional[Union[str, Path]], optional
The base folder where the dataset can be found.
- memory: Memory, optional
A memory object to optionally use disk caching to speed up loading.
- groupby_cols
tpcp internal parameters.
- subset_index
tpcp internal parameters.
- Attributes:
- data
group
Get the current group.
grouped_index
Return the index with the groupby columns set as multiindex.
groups
Get all groups based on the set groupby level.
index
Get index.
- marker_position_
- marker_position_per_stride_
- mocap_events_
mocap_sampling_rate_hz_
Get the sampling rate of the motion capture system.
sampling_rate_hz
Get the sampling rate of the IMUs.
shape
Get the shape of the dataset.
Methods
as_attrs
()Return a version of the Dataset class that can be subclassed using attrs defined classes.
as_dataclass
()Return a version of the Dataset class that can be subclassed using dataclasses.
assert_is_single
(groupby_cols, property_name)Raise error if index does contain more than one group/row with the given groupby settings.
assert_is_single_group
(property_name)Raise error if index does contain more than one group/row.
clone
()Create a new instance of the class with all parameters copied over.
create_group_labels
(label_cols)Generate a list of labels for each group/row in the dataset.
create_index
()Create the index for the dataset.
get_params
([deep])Get parameters for this algorithm.
get_subset
(*[, groups, index, bool_map])Get a subset of the dataset.
groupby
(groupby_cols)Return a copy of the dataset grouped by the specified columns.
is_single
(groupby_cols)Return True if index contains only one row/group with the given groupby settings.
is_single_group
()Return True if index contains only one group.
iter_level
(level)Return generator object containing a subset for every category from the selected level.
set_params
(**params)Set the parameters of this Algorithm.
- static as_attrs()
Return a version of the Dataset class that can be subclassed using attrs defined classes.
Note, this requires attrs to be installed!
- static as_dataclass()
Return a version of the Dataset class that can be subclassed using dataclasses.
- assert_is_single(groupby_cols: Optional[Union[List[str], str]], property_name) -> None
Raise error if index does contain more than one group/row with the given groupby settings.
This should be used when implementing access to data values that can only be accessed when only a single trial/participant/etc. exists in the dataset.
- Parameters:
- groupby_cols
None (no grouping) or a valid subset of the columns available in the dataset index.
- property_name
Name of the property this check is used in. Used to format the error message.
- assert_is_single_group(property_name) -> None
Raise error if index does contain more than one group/row.
Note that this is different from assert_is_single as it is aware of the current grouping. Instead of checking that a certain combination of columns is left in the dataset, it checks that only a single group exists with the already selected grouping as defined by self.groupby_cols.
- Parameters:
- property_name
Name of the property this check is used in. Used to format the error message.
- clone() -> Self
Create a new instance of the class with all parameters copied over.
This will create a new instance of the class itself and all nested objects.
- create_group_labels(label_cols: Union[str, List[str]]) -> List[str]
Generate a list of labels for each group/row in the dataset.
Note
This has a different use case than the dataset-wide groupby. Using groupby reduces the effective size of the dataset to the number of groups. This method produces a group label for each group/row that is already in the dataset, without changing the dataset.
The output of this method can be used in combination with GroupKFold as the group label.
- Parameters:
- label_cols
The columns that should be included in the label. If the dataset is already grouped, this must be a subset of self.groupby_cols.
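The difference between per-row labels and a dataset-wide groupby can be sketched on a plain list of dictionaries instead of the real dataset. The index columns `participant` and `test`, their values, and the helper `create_group_labels_sketch` are hypothetical, and the exact label format produced by the library may differ from the joined strings used here.

```python
from typing import Dict, List, Union

# Hypothetical dataset index: one dictionary per index row.
index_rows: List[Dict[str, str]] = [
    {"participant": "p01", "test": "slow"},
    {"participant": "p01", "test": "fast"},
    {"participant": "p02", "test": "slow"},
    {"participant": "p02", "test": "fast"},
]


def create_group_labels_sketch(
    rows: List[Dict[str, str]], label_cols: Union[str, List[str]]
) -> List[str]:
    """Return one label per row, built from the selected columns (assumed format)."""
    cols = [label_cols] if isinstance(label_cols, str) else list(label_cols)
    return ["_".join(row[col] for col in cols) for row in rows]


# The number of rows is unchanged; rows of the same participant share a label.
labels = create_group_labels_sketch(index_rows, "participant")
print(labels)
```

Because rows of the same participant receive the same label, a grouped cross-validation splitter such as GroupKFold would keep them in the same fold.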
- get_params(deep: bool = True) -> Dict[str, Any]
Get parameters for this algorithm.
- Parameters:
- deep
Only relevant if object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two “_” at the end).
- Returns:
- params
Parameter names mapped to their values.
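The double-underscore prefix described for `deep=True` can be illustrated with a small stand-in class. This is only a sketch of the naming convention, not the library's implementation; `SketchParams` and the parameter names `window` and `threshold` are hypothetical.

```python
from typing import Any, Dict


class SketchParams:
    """Hypothetical stand-in for an object that exposes ``get_params``."""

    def __init__(self, **params: Any) -> None:
        self._params = dict(params)

    def get_params(self, deep: bool = True) -> Dict[str, Any]:
        out = dict(self._params)
        if deep:
            for name, value in self._params.items():
                # Nested objects add their own params as "<name>__<param>".
                if hasattr(value, "get_params"):
                    for sub_name, sub_value in value.get_params(deep=True).items():
                        out[f"{name}__{sub_name}"] = sub_value
        return out


inner = SketchParams(threshold=0.5)
outer = SketchParams(window=10, nested_object_name=inner)

shallow = outer.get_params(deep=False)
deep = outer.get_params(deep=True)
print(sorted(deep))
```

With `deep=False` only the top-level names are returned; with `deep=True` the nested value additionally appears under `nested_object_name__threshold`.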
- get_subset(*, groups: Optional[List[Union[str, Tuple[str, ...]]]] = None, index: Optional[DataFrame] = None, bool_map: Optional[Sequence[bool]] = None, **kwargs: Union[List[str], str]) -> Self
Get a subset of the dataset.
Note
All arguments are mutually exclusive!
- Parameters:
- groups
A valid row locator or slice that can be passed to self.grouped_index.loc[locator, :]. This basically needs to be a subset of self.groups. Note that this is the only indexer that works on the grouped index. All other indexers work on the pure index.
- index
pd.DataFrame that is a valid subset of the current dataset index.
- bool_map
bool-map that is used to index the current index-dataframe. The list must be of same length as the number of rows in the index.
- **kwargs
The key must be the name of an index column. The value is a list containing strings that correspond to the categories that should be kept. For examples see above.
- Returns:
- subset
New dataset object filtered by specified parameters.
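A rough sketch of two of the selection styles (`bool_map` and column keyword arguments) and of the mutual-exclusivity rule, using a plain list of dictionaries in place of the index DataFrame. The helper `get_subset_sketch`, the columns `participant` and `test`, and the error messages are hypothetical; the real method returns a new dataset object rather than a list.

```python
from typing import Dict, List, Optional, Sequence, Union

Row = Dict[str, str]

# Hypothetical dataset index: one dictionary per index row.
index_rows: List[Row] = [
    {"participant": "p01", "test": "slow"},
    {"participant": "p01", "test": "fast"},
    {"participant": "p02", "test": "slow"},
    {"participant": "p02", "test": "fast"},
]


def get_subset_sketch(
    rows: List[Row],
    *,
    bool_map: Optional[Sequence[bool]] = None,
    **kwargs: Union[List[str], str],
) -> List[Row]:
    """Filter index rows by a bool-map or by allowed categories per column."""
    if bool_map is not None and kwargs:
        raise ValueError("Only one selection method can be used at a time.")
    if bool_map is not None:
        if len(bool_map) != len(rows):
            raise ValueError("bool_map needs one entry per index row.")
        return [row for row, keep in zip(rows, bool_map) if keep]
    # Wrap single strings so that "in" tests membership, not substrings.
    allowed = {col: [val] if isinstance(val, str) else list(val) for col, val in kwargs.items()}
    return [row for row in rows if all(row[col] in vals for col, vals in allowed.items())]


by_mask = get_subset_sketch(index_rows, bool_map=[True, False, False, True])
by_column = get_subset_sketch(index_rows, participant=["p02"], test="slow")
print(by_mask, by_column)
```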
- property group: Union[str, Tuple[str, ...]]
Get the current group.
Note that this attribute can only be used if there is just a single group. If there is only a single groupby column or column in the index, this will return a string. Otherwise, this will return a named tuple.
- groupby(groupby_cols: Optional[Union[List[str], str]]) -> Self
Return a copy of the dataset grouped by the specified columns.
Each unique group represents a single data point in the resulting dataset.
- Parameters:
- groupby_cols
None (no grouping) or a valid subset of the columns available in the dataset index.
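How grouping changes the number of data points can be sketched with a plain list of dictionaries. The helper `groups_sketch` and the index columns `participant` and `test` are hypothetical, and the ordering of groups in the real dataset may differ from the first-appearance order used here.

```python
from typing import Dict, List, Optional, Tuple, Union

# Hypothetical dataset index: one dictionary per index row.
index_rows: List[Dict[str, str]] = [
    {"participant": "p01", "test": "slow"},
    {"participant": "p01", "test": "fast"},
    {"participant": "p02", "test": "slow"},
    {"participant": "p02", "test": "fast"},
]


def groups_sketch(
    rows: List[Dict[str, str]], groupby_cols: Optional[Union[List[str], str]]
) -> List[Tuple[str, ...]]:
    """Return the unique groups for the given grouping columns."""
    if groupby_cols is None:
        # No grouping: every index row is its own data point.
        return [tuple(row.values()) for row in rows]
    cols = [groupby_cols] if isinstance(groupby_cols, str) else list(groupby_cols)
    # dict.fromkeys drops duplicates while keeping first-appearance order.
    return list(dict.fromkeys(tuple(row[col] for col in cols) for row in rows))


ungrouped = groups_sketch(index_rows, None)
by_participant = groups_sketch(index_rows, "participant")
print(len(ungrouped), len(by_participant))
```

Without grouping the sketch yields four data points; grouped by participant it yields two, each covering both tests of that participant.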
- property groups: List[Union[str, Tuple[str, ...]]]
Get all groups based on the set groupby level.
This will return a list of strings/integers if there is only a single groupby level or index column. If there are multiple groupby levels/index columns, it will return a list of named tuples.
Note that if one of the groupby levels/index columns is not a valid Python attribute name (e.g. it contains spaces or starts with a number), the named tuple will not contain the correct column name! For more information see the documentation of the rename parameter of collections.namedtuple.
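The renaming behaviour can be reproduced with the standard library alone. This sketch assumes the named tuples are created with `rename=True`, as the pointer to that parameter suggests; the column names and values are hypothetical.

```python
from collections import namedtuple

# Hypothetical index columns: the last two are not valid Python identifiers.
index_columns = ["participant", "test name", "2nd_repetition"]

# With rename=True, invalid field names are replaced by positional names.
Group = namedtuple("Group", index_columns, rename=True)
group = Group("p01", "4x10m", "r1")

print(Group._fields)  # ('participant', '_1', '_2')
print(group.participant, group._1, group._2)
```

The values are still present, but the columns with invalid names can only be reached through their positional names (`_1`, `_2`) or by index.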
- is_single(groupby_cols: Optional[Union[List[str], str]]) -> bool
Return True if index contains only one row/group with the given groupby settings.
If groupby_cols=None this checks if there is only a single row left. If you want to check if there is only a single group within the current grouping, use is_single_group instead.
- Parameters:
- groupby_cols
None (no grouping) or a valid subset of the columns available in the dataset index.
- iter_level(level: str) -> Iterator[Self]
Return generator object containing a subset for every category from the selected level.
- Parameters:
- level
Optional str that sets the level which shall be used for iterating. This must be one of the column names of the index.
- Returns:
- subset
New dataset object containing only one category in the specified level.
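Iterating over a level can be sketched as a generator that yields one subset per category. The helper `iter_level_sketch` and the index columns `participant` and `test` are hypothetical, and plain lists of rows stand in for the dataset objects that the real method yields.

```python
from typing import Dict, Iterator, List

Row = Dict[str, str]

# Hypothetical dataset index: one dictionary per index row.
index_rows: List[Row] = [
    {"participant": "p01", "test": "slow"},
    {"participant": "p01", "test": "fast"},
    {"participant": "p02", "test": "slow"},
    {"participant": "p02", "test": "fast"},
]


def iter_level_sketch(rows: List[Row], level: str) -> Iterator[List[Row]]:
    """Yield the rows belonging to each category of the selected index column."""
    if not rows or level not in rows[0]:
        raise ValueError(f"`level` must be one of the index columns, got {level!r}.")
    # dict.fromkeys keeps each category once, in first-appearance order.
    for category in dict.fromkeys(row[level] for row in rows):
        yield [row for row in rows if row[level] == category]


subsets = list(iter_level_sketch(index_rows, "participant"))
for subset in subsets:
    print(subset[0]["participant"], len(subset))
```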