subgroups.core package
Submodules
subgroups.core.operator module
This file contains the available operators which can be used by the selectors.
- class subgroups.core.operator.Operator(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases: Enum
This enumerator provides the available operators which can be used by the selectors.
- EQUAL = 1
- GREATER = 4
- GREATER_OR_EQUAL = 6
- LESS = 3
- LESS_OR_EQUAL = 5
- NOT_EQUAL = 2
- evaluate(left_element, right_element)[source]
Method to evaluate whether the expression (left_element self right_element) is True. IMPORTANT: if the operator is not supported between both elements, a TypeError exception is raised.
- Parameters:
left_element (typing.Union[str, int, float, pandas.core.series.Series]) – the left element of the expression. It can also be of type ‘pandas.Series’ in order to allow comparisons with whole arrays.
right_element (typing.Union[str, int, float, pandas.core.series.Series]) – the right element of the expression. It can also be of type ‘pandas.Series’ in order to allow comparisons with whole arrays.
- Return type:
typing.Union[bool, pandas.core.series.Series]
- Returns:
whether the expression (left_element self right_element) is True. If both elements are of type Series, a Series in which each element is of type bool is returned.
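The behaviour described for evaluate can be pictured with a small stand-in enum. This is a minimal sketch, not the library's source: the class name SketchOperator and the mapping onto Python's standard operator module are assumptions made for illustration; only the member names and values are taken from the listing above.

```python
import operator
from enum import Enum


class SketchOperator(Enum):
    """Hypothetical stand-in with the same member names and values as Operator."""
    EQUAL = 1
    NOT_EQUAL = 2
    LESS = 3
    GREATER = 4
    LESS_OR_EQUAL = 5
    GREATER_OR_EQUAL = 6

    def evaluate(self, left_element, right_element):
        # Assumed mapping from each member to a standard comparison function.
        functions = {
            SketchOperator.EQUAL: operator.eq,
            SketchOperator.NOT_EQUAL: operator.ne,
            SketchOperator.LESS: operator.lt,
            SketchOperator.GREATER: operator.gt,
            SketchOperator.LESS_OR_EQUAL: operator.le,
            SketchOperator.GREATER_OR_EQUAL: operator.ge,
        }
        # Python raises TypeError by itself for unsupported pairs such as "a" < 1.
        return functions[self](left_element, right_element)


print(SketchOperator.LESS.evaluate(2, 5))      # True
print(SketchOperator.EQUAL.evaluate("a", 1))   # False (equality never raises here)
```

In this sketch an ordering comparison between a str and an int raises TypeError, which mirrors the note above. The same comparison functions also work element-wise when both arguments are pandas Series.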
subgroups.core.pattern module
This file contains the implementation of a ‘Pattern’. A ‘Pattern’ is a sorted list of non-repeated selectors.
- class subgroups.core.pattern.Pattern(list_of_selectors)[source]
Bases: object
This class represents a ‘Pattern’. A ‘Pattern’ is a sorted list of non-repeated selectors.
- Parameters:
list_of_selectors (list[subgroups.core.selector.Selector]) – a list of selectors. IMPORTANT: we assume that the list only contains selectors.
- add_selector(selector)[source]
Method to add a selector to the pattern. If the selector already exists, this method does nothing.
- Parameters:
selector (subgroups.core.selector.Selector) – the selector which is added.
- Return type:
None
- copy()[source]
Method to copy the Pattern.
- Return type:
subgroups.core.pattern.Pattern
- Returns:
the copy of the Pattern WITH THE SAME SELECTORS (THE SAME OBJECTS). This method does not copy the selectors of the list (it is not needed because the selectors are immutable).
- static generate_from_str(input_str)[source]
Static method to generate a Pattern from a str.
- Parameters:
input_str (str) – the str from which to generate the Pattern. We assume the format defined by one of the following regular expressions: (1) ‘\[\]’ (empty Pattern), (2) ‘\[selector\]’ (Pattern with only one selector), (3) ‘\[selector(, selector)+\]’ (Pattern with more than one selector).
- Return type:
subgroups.core.pattern.Pattern
- Returns:
the Pattern generated from the str.
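The three accepted shapes of a Pattern string can be illustrated by splitting such a string into its selector substrings. This is only a sketch under stated assumptions: split_pattern_str is a hypothetical helper, not part of the package, and it assumes that no selector text itself contains the sequence ", ". The selector texts used below ("a = 1", "b = 2") are placeholders; their exact syntax is defined by the generate_from_str method of Selector.

```python
def split_pattern_str(input_str):
    """Split '[sel, sel, ...]' into a list of selector strings (hypothetical helper)."""
    if not (input_str.startswith("[") and input_str.endswith("]")):
        raise ValueError("a pattern string must be enclosed in square brackets")
    body = input_str[1:-1]
    if body == "":
        return []                 # case (1): empty Pattern
    return body.split(", ")       # cases (2) and (3): one or more selectors


print(split_pattern_str("[]"))               # []
print(split_pattern_str("[a = 1]"))          # ['a = 1']
print(split_pattern_str("[a = 1, b = 2]"))   # ['a = 1', 'b = 2']
```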
- get_selector(index)[source]
Method to get a selector from the pattern by index. If the index is out of range, an ‘IndexError’ exception is raised.
- Parameters:
index (int) – the index which is used.
- Return type:
subgroups.core.selector.Selector
- is_contained(pandas_dataframe)[source]
Method to check whether the pattern is contained in each row of the pandas.DataFrame passed as a parameter. IMPORTANT: if an attribute name of a selector of the pattern is not in the pandas.DataFrame, a KeyError exception is raised.
- Parameters:
pandas_dataframe (pandas.core.frame.DataFrame) – the pandas.DataFrame with which the pattern is checked.
- Return type:
pandas.core.series.Series
- Returns:
whether the pattern is contained in each row of the pandas.DataFrame passed by parameter.
- is_refinement(refinement_candidate, refinement_of_itself)[source]
Method to check whether ‘refinement_candidate’ is a refinement of this (i.e., of ‘self’).
- Parameters:
refinement_candidate (subgroups.core.pattern.Pattern) – pattern candidate to be a refinement of this (i.e., of ‘self’).
refinement_of_itself (bool) – is a pattern a refinement of itself? Sometimes it may be better to assume yes and sometimes no. Therefore, if both patterns are equal, then this method returns the value of ‘refinement_of_itself’.
- Return type:
bool
- Returns:
whether ‘refinement_candidate’ is a refinement of this (i.e., ‘self’).
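The role of ‘refinement_of_itself’ can be shown with a small sketch. It assumes the usual subgroup-discovery meaning of refinement, namely that a candidate refines a pattern when it contains all of that pattern's selectors; this definition is an assumption and is not stated in this reference. The function name and the use of plain tuples as stand-ins for selectors are likewise hypothetical.

```python
def is_refinement_sketch(base, candidate, refinement_of_itself):
    """base and candidate are iterables of hashable stand-ins for selectors."""
    base_set, candidate_set = set(base), set(candidate)
    if base_set == candidate_set:
        # Equal patterns: the caller decides the answer.
        return refinement_of_itself
    # Assumed definition: the candidate keeps every selector of the base.
    return base_set.issubset(candidate_set)


base = [("age", ">=", 30)]
longer = [("age", ">=", 30), ("smoker", "=", "yes")]
print(is_refinement_sketch(base, longer, False))   # True
print(is_refinement_sketch(longer, base, False))   # False
print(is_refinement_sketch(base, base, True))      # True
print(is_refinement_sketch(base, base, False))     # False
```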
- remove_selector(selector)[source]
Method to remove a selector from the pattern. If the selector does not exist, this method does nothing.
- Parameters:
selector (subgroups.core.selector.Selector) – the selector which is removed.
- Return type:
None
subgroups.core.selector module
This file contains the implementation of a ‘Selector’. A ‘Selector’ is an IMMUTABLE structure which contains an attribute name, an operator and a value.
- class subgroups.core.selector.Selector(attribute_name: str, operator: Operator, value: str | int | float)[source]
Bases: object
This class represents a ‘Selector’. A ‘Selector’ is an IMMUTABLE structure which contains an attribute name, an operator and a value.
- Parameters:
attribute_name – the attribute name. It must be a non-empty str.
operator – the operator between the attribute name and the value. If the value is of type str, only the EQUAL and NOT_EQUAL operators are available.
value – the value.
- property attribute_name: str
The attribute name.
- static generate_from_str(input_str)[source]
Static method to generate a Selector from a str.
- Parameters:
input_str (str) – the str from which to generate the Selector. We assume the following format: <attribute_name><whitespace><operator><whitespace><value>. Be careful with the whitespaces: (1) each part of the selector must be separated by only one whitespace and (2) whitespaces at the left side of the str or at the right side of the str are not allowed.
- Return type:
subgroups.core.selector.Selector
- Returns:
the Selector generated from the str.
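The two whitespace rules can be checked mechanically. The sketch below only splits the string into its three parts; split_selector_str is a hypothetical helper, the operator spelling ">=" is an illustrative assumption, and the sketch further assumes that none of the three parts contains whitespace itself. How the library converts the value part to str, int or float is not covered here.

```python
def split_selector_str(input_str):
    """Split '<attribute_name> <operator> <value>' into three strings (hypothetical helper)."""
    parts = input_str.split(" ")
    # A doubled, leading or trailing whitespace produces an empty part or a wrong count.
    if len(parts) != 3 or any(part == "" for part in parts):
        raise ValueError("expected exactly three parts separated by single whitespaces")
    return tuple(parts)


print(split_selector_str("age >= 30"))   # ('age', '>=', '30')
```

With this helper, inputs such as "age  >= 30" (two whitespaces) or " age >= 30" (leading whitespace) raise ValueError.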
- match(attribute_name, value)[source]
Method to check whether the parameters ‘attribute_name’ and ‘value’ match with the selector. In this case, “match” means that the expression ((attribute_name == self.attribute_name) and (value self.operator self.value)) is True. IMPORTANT: if the selector operator is not supported between value and self.value, a TypeError exception is raised.
- Parameters:
attribute_name (str) – the attribute name which is compared with self.attribute_name.
value (typing.Union[str, int, float, pandas.core.series.Series]) – the value which is compared with self.value. The value can also be of type ‘pandas.Series’ in order to allow comparisons with whole arrays.
- Return type:
typing.Union[bool, pandas.core.series.Series]
- Returns:
whether the parameters ‘attribute_name’ and ‘value’ match with the selector.
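The matching rule and the immutability of a selector can be sketched with a frozen dataclass. This is an illustration under stated assumptions, not the library's implementation: SketchSelector and its compare field (a plain comparison function from Python's operator module, standing in for an Operator member) are hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)  # frozen: assigning to a field after creation raises an error
class SketchSelector:
    attribute_name: str
    compare: Callable[[object, object], object]   # stand-in for an Operator member
    value: Union[str, int, float]

    def match(self, attribute_name, value):
        # (attribute_name == self.attribute_name) and (value <operator> self.value)
        return attribute_name == self.attribute_name and self.compare(value, self.value)


selector = SketchSelector("age", operator.ge, 30)
print(selector.match("age", 41))      # True
print(selector.match("age", 18))      # False
print(selector.match("height", 41))   # False
```

In this sketch, selector.match("age", "forty") raises TypeError because ">=" is not supported between str and int. Because of short-circuit evaluation, the sketch does not perform the comparison at all when the attribute names differ; whether the library behaves the same way in that case is not stated here.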
- property value: str | int | float
The value.
subgroups.core.subgroup module
This file contains the implementation of a ‘Subgroup’. A ‘Subgroup’ has a description (Pattern) and a target variable of interest (Selector).
- class subgroups.core.subgroup.Subgroup(description, target)[source]
Bases: object
This class represents a ‘Subgroup’. A ‘Subgroup’ has a description (Pattern) and a target variable of interest (Selector).
- Parameters:
description (subgroups.core.pattern.Pattern) – a Pattern.
target (subgroups.core.selector.Selector) – a Selector.
- filter(pandas_dataframe)[source]
Method to filter a pandas DataFrame, retrieving only certain information related to this subgroup.
- Parameters:
pandas_dataframe (pandas.core.frame.DataFrame) – the DataFrame which is filtered. IMPORTANT: if an attribute name of a selector of the subgroup is not in the pandas.DataFrame passed as a parameter, a KeyError exception is raised.
- Return type:
tuple[pandas.core.series.Series, pandas.core.series.Series, pandas.core.series.Series]
- Returns:
a tuple of the form: (Series, Series, Series). It is formed by the following elements: (1) a pandas Series of booleans of the same size as pandas_dataframe indicating whether rows are covered by the description and the target, (2) a pandas Series of booleans of the same size as pandas_dataframe indicating whether rows are covered by the description but not by the target, and (3) a pandas Series of booleans of the same size as pandas_dataframe indicating whether rows are covered by the target.
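The relationship between the three returned masks can be sketched without pandas. In the following illustration, rows are plain dicts, the masks are plain lists of booleans, and the description and the target are ordinary predicate functions; filter_sketch, the column names and the sample data are all hypothetical and only mirror the structure of the result described above.

```python
def filter_sketch(rows, description, target):
    """Return (description and target, description and not target, target) masks."""
    covered_by_description = [description(row) for row in rows]
    covered_by_target = [target(row) for row in rows]
    pairs = list(zip(covered_by_description, covered_by_target))
    both = [d and t for d, t in pairs]
    description_only = [d and not t for d, t in pairs]
    return both, description_only, covered_by_target


def description(row):
    # Stand-in for a Pattern with two selectors.
    return row["age"] >= 40 and row["smoker"] == "yes"


def target(row):
    # Stand-in for the target Selector.
    return row["ill"] == "yes"


rows = [
    {"age": 45, "smoker": "yes", "ill": "yes"},
    {"age": 50, "smoker": "yes", "ill": "no"},
    {"age": 20, "smoker": "no", "ill": "yes"},
]
both, description_only, target_mask = filter_sketch(rows, description, target)
print(both)               # [True, False, False]
print(description_only)   # [False, True, False]
print(target_mask)        # [True, False, True]
```

As in the note on pandas_dataframe, a row that lacks one of the referenced keys makes the sketch raise KeyError.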
- static generate_from_str(input_str)[source]
Static method to generate a Subgroup from a str.
- Parameters:
input_str (str) – the str from which to generate the Subgroup. We assume the following format: ‘Description: <pattern>, Target: <selector>’. The format of <selector> and <pattern> is defined by their corresponding ‘generate_from_str’ methods.
- Return type:
subgroups.core.subgroup.Subgroup
- Returns:
the Subgroup generated from the str.
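The Subgroup string format can be illustrated by separating the pattern part from the selector part. The sketch below is hypothetical: split_subgroup_str is not part of the package, the selector texts are placeholders, and the split on the last occurrence of ", Target: " is an assumption that holds only if the target selector does not contain that sequence.

```python
def split_subgroup_str(input_str):
    """Split 'Description: <pattern>, Target: <selector>' into its two parts (hypothetical helper)."""
    prefix, separator = "Description: ", ", Target: "
    if not input_str.startswith(prefix) or separator not in input_str:
        raise ValueError("expected 'Description: <pattern>, Target: <selector>'")
    # Split on the last separator so that commas inside the pattern are preserved.
    description_str, target_str = input_str[len(prefix):].rsplit(separator, 1)
    return description_str, target_str


print(split_subgroup_str("Description: [a = 1, b = 2], Target: c = 3"))
# ('[a = 1, b = 2]', 'c = 3')
```

Each part would then be handed to the corresponding ‘generate_from_str’ method of Pattern and Selector.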
- is_refinement(refinement_candidate, refinement_of_itself)[source]
Method to check whether ‘refinement_candidate’ is a refinement of this (i.e., of ‘self’). A subgroup Y is a refinement of another subgroup X if the description of Y is a refinement of the description of X and the targets are equal.
- Parameters:
refinement_candidate (subgroups.core.subgroup.Subgroup) – subgroup candidate to be a refinement of this (i.e., of ‘self’).
refinement_of_itself (bool) – is a pattern a refinement of itself (in this case, the description of a subgroup)? Sometimes it may be better to assume yes and sometimes no. Therefore, if both subgroups are equal (i.e., the descriptions of both subgroups and the targets), then this method returns the value of ‘refinement_of_itself’.
- Return type:
bool
- Returns:
whether ‘refinement_candidate’ is a refinement of this (i.e., ‘self’).