subgroups.core package

Submodules

subgroups.core.operator module

This file contains the available operators which can be used by the selectors.

class subgroups.core.operator.Operator(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

This enumerator provides the available operators which can be used by the selectors.

EQUAL = 1
GREATER = 4
GREATER_OR_EQUAL = 6
LESS = 3
LESS_OR_EQUAL = 5
NOT_EQUAL = 2
evaluate(left_element, right_element)[source]

Method to evaluate whether the expression (left_element self right_element) is True. IMPORTANT: if the operator is not supported between both elements, a TypeError exception is raised.

Parameters:
  • left_element (typing.Union[str, int, float, pandas.core.series.Series]) – the left element of the expression. It can also be of type ‘pandas.Series’ in order to allow comparisons with whole arrays.

  • right_element (typing.Union[str, int, float, pandas.core.series.Series]) – the right element of the expression. It can also be of type ‘pandas.Series’ in order to allow comparisons with whole arrays.

Return type:

typing.Union[bool, pandas.core.series.Series]

Returns:

whether the expression (left_element self right_element) is True. If both elements are of type Series, a Series in which each element is of type bool is returned.
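As a rough illustration of the evaluation rule described above, the following sketch maps each enumeration member to a comparison function from Python's standard operator module. SketchOperator and the _FUNCTIONS table are hypothetical names written for this example and are not the library's implementation; only the member names and values are taken from the listing above.

```python
import operator
from enum import Enum


class SketchOperator(Enum):
    """Hypothetical stand-in mirroring the member names and values listed above."""
    EQUAL = 1
    NOT_EQUAL = 2
    LESS = 3
    GREATER = 4
    LESS_OR_EQUAL = 5
    GREATER_OR_EQUAL = 6

    def evaluate(self, left_element, right_element):
        # Evaluate (left_element <self> right_element) with the matching function.
        return _FUNCTIONS[self](left_element, right_element)


# Assumed mapping from each member to a standard comparison function.
_FUNCTIONS = {
    SketchOperator.EQUAL: operator.eq,
    SketchOperator.NOT_EQUAL: operator.ne,
    SketchOperator.LESS: operator.lt,
    SketchOperator.GREATER: operator.gt,
    SketchOperator.LESS_OR_EQUAL: operator.le,
    SketchOperator.GREATER_OR_EQUAL: operator.ge,
}

print(SketchOperator.GREATER.evaluate(5, 3))    # True
print(SketchOperator.EQUAL.evaluate("a", "b"))  # False

try:
    SketchOperator.LESS.evaluate("a", 1)        # str < int is not supported
except TypeError as error:
    print("unsupported comparison:", error)
```

Because these comparison functions also work element-wise on pandas Series, a mapping of this kind would return a Series of booleans when Series are passed in, which is consistent with the return type documented above.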

static generate_from_str(input_str)[source]

Static method to generate an Operator from a str.

Parameters:

input_str (str) – the str from which to generate the Operator.

Return type:

subgroups.core.operator.Operator

Returns:

the Operator generated from the str.

subgroups.core.pattern module

This file contains the implementation of a ‘Pattern’. A ‘Pattern’ is a sorted list of non-repeated selectors.

class subgroups.core.pattern.Pattern(list_of_selectors)[source]

Bases: object

This class represents a ‘Pattern’. A ‘Pattern’ is a sorted list of non-repeated selectors.

Parameters:

list_of_selectors (list[subgroups.core.selector.Selector]) – a list of selectors. IMPORTANT: we assume that the list only contains selectors.

add_selector(selector)[source]

Method to add a selector to the pattern. If the selector already exists, this method does nothing.

Parameters:

selector (subgroups.core.selector.Selector) – the selector which is added.

Return type:

None

copy()[source]

Method to copy the Pattern.

Return type:

subgroups.core.pattern.Pattern

Returns:

the copy of the Pattern WITH THE SAME SELECTORS (THE SAME OBJECTS). This method does not copy the selectors of the list (it is not needed because the selectors are immutable).
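The sharing behaviour described above can be pictured with a plain list copy: the container is new, but its elements are the same objects. SketchSelector is a hypothetical immutable class used only for this illustration, and the plain list stands in for the pattern's internal storage, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSelector:
    """Hypothetical immutable selector used only for this illustration."""
    attribute_name: str
    operator: str
    value: object


original = [SketchSelector("age", ">=", 18), SketchSelector("city", "=", "Rome")]
shallow = list(original)  # new list, same selector objects

# Growing the copy leaves the original untouched.
shallow.append(SketchSelector("height", "<", 180))

print(len(original), len(shallow))  # 2 3
print(original[0] is shallow[0])    # True
```

Sharing is safe in this sketch because a frozen dataclass rejects attribute assignment, so neither container can observe a change made through the other.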

static generate_from_str(input_str)[source]

Static method to generate a Pattern from a str.

Parameters:

input_str (str) – the str from which to generate the Pattern. We assume the format defined by one of the following regular expressions: (1) ‘\[\]’ (empty Pattern), (2) ‘\[selector\]’ (Pattern with only one selector), (3) ‘\[selector(, selector)+\]’ (Pattern with more than one selector).

Return type:

subgroups.core.pattern.Pattern

Returns:

the Pattern generated from the str.
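The three accepted shapes can be handled by stripping the brackets and splitting on the ", " separator. The helper below is a hypothetical sketch, not the library's parser: it assumes that no selector string contains ", " itself, it returns the selector strings without building Selector objects, and it leaves out sorting and de-duplication. The selector strings in the usage lines, including the operator symbols, are made up for the example.

```python
def split_pattern_str(input_str: str) -> list[str]:
    """Split '[s1, s2, ...]' into its selector strings (hypothetical helper)."""
    if not (input_str.startswith("[") and input_str.endswith("]")):
        raise ValueError(f"not a pattern string: {input_str!r}")
    inner = input_str[1:-1]
    if inner == "":
        return []              # case (1): empty Pattern
    return inner.split(", ")   # cases (2) and (3): one or more selectors


print(split_pattern_str("[]"))                        # []
print(split_pattern_str("[age >= 18]"))               # ['age >= 18']
print(split_pattern_str("[age >= 18, city = Rome]"))  # ['age >= 18', 'city = Rome']
```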

get_selector(index)[source]

Method to get a selector from the pattern by index. If the index is out of range, an ‘IndexError’ exception is raised.

Parameters:

index (int) – the index which is used.

Return type:

subgroups.core.selector.Selector

is_contained(pandas_dataframe)[source]

Method to check, for each row of the given pandas.DataFrame, whether the pattern is contained in that row. IMPORTANT: If an attribute name of a selector of the pattern is not in the given pandas.DataFrame, a KeyError exception is raised.

Parameters:

pandas_dataframe (pandas.core.frame.DataFrame) – the pandas.DataFrame with which the pattern is checked.

Return type:

pandas.core.series.Series

Returns:

a Series of booleans indicating, for each row of the given pandas.DataFrame, whether the pattern is contained in that row.
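The row-wise check can be sketched without pandas as a conjunction over all selectors, computed once per row. In this sketch plain dicts stand in for DataFrame rows, each selector is a hypothetical (attribute_name, comparison, value) triple, and the function name merely echoes the method above; none of this is the library's code.

```python
import operator

# Each selector is a hypothetical (attribute_name, comparison, value) triple.
pattern = [("age", operator.ge, 18), ("city", operator.eq, "Rome")]

# Plain dicts stand in for the rows of a pandas.DataFrame.
rows = [
    {"age": 25, "city": "Rome"},
    {"age": 15, "city": "Rome"},
    {"age": 40, "city": "Oslo"},
]


def is_contained(pattern, rows):
    """Return one bool per row: True when the row satisfies every selector."""
    return [
        all(compare(row[name], value) for name, compare, value in pattern)
        for row in rows
    ]


print(is_contained(pattern, rows))  # [True, False, False]

try:
    is_contained([("height", operator.lt, 180)], rows)  # attribute not present
except KeyError as error:
    print("missing attribute:", error)
```

With a real DataFrame the same conjunction would typically be computed column-wise, which is why the documented return type is a Series rather than a list.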

is_refinement(refinement_candidate, refinement_of_itself)[source]

Method to check whether ‘refinement_candidate’ is a refinement of this (i.e., of ‘self’).

Parameters:
  • refinement_candidate (subgroups.core.pattern.Pattern) – pattern candidate to be a refinement of this (i.e., of ‘self’).

  • refinement_of_itself (bool) – is a pattern a refinement of itself? Sometimes it may be better to assume yes and sometimes no. Therefore, if both patterns are equal, then this method returns the value of ‘refinement_of_itself’.

Return type:

bool

Returns:

whether ‘refinement_candidate’ is a refinement of this (i.e., ‘self’).
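The role of ‘refinement_of_itself’ is easiest to see in a small sketch. The text above does not define "refinement", so the code below assumes the usual subgroup discovery meaning: a refinement contains every selector of the original pattern. Frozen sets of selector strings stand in for patterns, and the function is hypothetical.

```python
def is_refinement(base: frozenset, candidate: frozenset, refinement_of_itself: bool) -> bool:
    """Assumed rule: 'candidate' refines 'base' when it contains all of base's selectors."""
    if candidate == base:
        # Equal patterns: the caller decides the answer.
        return refinement_of_itself
    return base <= candidate  # subset test


base = frozenset({"age >= 18"})
candidate = frozenset({"age >= 18", "city = Rome"})

print(is_refinement(base, candidate, False))  # True
print(is_refinement(base, base, True))        # True
print(is_refinement(base, base, False))       # False
print(is_refinement(candidate, base, True))   # False
```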

remove_selector(selector)[source]

Method to remove a selector from the pattern. If the selector does not exist, this method does nothing.

Parameters:

selector (subgroups.core.selector.Selector) – the selector which is removed.

Return type:

None

remove_selector_by_index(index)[source]

Method to remove a selector from the pattern by index. If the index is out of range, an ‘IndexError’ exception is raised.

Parameters:

index (int) – the index which is used.

Return type:

None

subgroups.core.selector module

This file contains the implementation of a ‘Selector’. A ‘Selector’ is an IMMUTABLE structure which contains an attribute name, an operator and a value.

class subgroups.core.selector.Selector(attribute_name: str, operator: Operator, value: str | int | float)[source]

Bases: object

This class represents a ‘Selector’. A ‘Selector’ is an IMMUTABLE structure which contains an attribute name, an operator and a value.

Parameters:
  • attribute_name – the attribute name. It must be a non-empty str.

  • operator – the operator between the attribute name and the value. If the value is of type str, only the EQUAL and NOT_EQUAL operators are available.

  • value – the value.

property attribute_name: str

The attribute name.

static generate_from_str(input_str)[source]

Static method to generate a Selector from a str.

Parameters:

input_str (str) – the str from which to generate the Selector. We assume the following format: <attribute_name><whitespace><operator><whitespace><value>. Be careful with the whitespaces: (1) each part of the selector must be separated by only one whitespace and (2) whitespaces at the left side of the str or at the right side of the str are not allowed.

Return type:

subgroups.core.selector.Selector

Returns:

the Selector generated from the str.
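The whitespace rules above can be checked by splitting on single spaces and requiring exactly three non-empty parts. The helper below is a hypothetical sketch under a simplifying assumption: attribute names and values contain no spaces. It returns raw strings and does not convert the value to int or float. The operator symbols in the usage lines are assumed, since the text above does not list the accepted ones.

```python
def split_selector_str(input_str: str) -> tuple[str, str, str]:
    """Split '<attribute_name> <operator> <value>' into its parts (hypothetical helper)."""
    parts = input_str.split(" ")
    # Leading, trailing or doubled spaces all produce an empty or extra part.
    if len(parts) != 3 or any(part == "" for part in parts):
        raise ValueError(f"not a selector string: {input_str!r}")
    attribute_name, operator_str, value_str = parts
    return attribute_name, operator_str, value_str


print(split_selector_str("age >= 18"))  # ('age', '>=', '18')

for bad in (" age >= 18", "age >= 18 ", "age  >= 18"):
    try:
        split_selector_str(bad)
    except ValueError as error:
        print("rejected:", error)
```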

match(attribute_name, value)[source]

Method to check whether the parameters ‘attribute_name’ and ‘value’ match with the selector. In this case, “match” means that the expression ((attribute_name == self.attribute_name) and (value self.operator self.value)) is True. IMPORTANT: if the selector operator is not supported between value and self.value, a TypeError exception is raised.

Parameters:
  • attribute_name (str) – the attribute name which is compared with self.attribute_name.

  • value (typing.Union[str, int, float, pandas.core.series.Series]) – the value which is compared with self.value. The value can also be of type ‘pandas.Series’ in order to allow comparisons with whole arrays.

Return type:

typing.Union[bool, pandas.core.series.Series]

Returns:

whether the parameters ‘attribute_name’ and ‘value’ match with the selector.
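The matching expression above translates almost literally into code. In this sketch SketchSelector is a hypothetical frozen dataclass, and a comparison function from the standard operator module stands in for the Operator enumeration; it illustrates the documented expression and is not the library's class.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class SketchSelector:
    """Hypothetical immutable selector: attribute name, comparison, value."""
    attribute_name: str
    compare: Callable  # e.g. operator.ge; stands in for the Operator enum
    value: Union[str, int, float]

    def match(self, attribute_name, value):
        # (attribute_name == self.attribute_name) and (value <operator> self.value)
        return attribute_name == self.attribute_name and self.compare(value, self.value)


adult = SketchSelector("age", operator.ge, 18)

print(adult.match("age", 30))     # True
print(adult.match("age", 10))     # False
print(adult.match("height", 30))  # False

try:
    adult.match("age", "thirty")  # str >= int is not supported
except TypeError as error:
    print("unsupported comparison:", error)
```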

property operator: Operator

The operator between the attribute name and the value.

property value: str | int | float

The value.

subgroups.core.subgroup module

This file contains the implementation of a ‘Subgroup’. A ‘Subgroup’ has a description (Pattern) and a target variable of interest (Selector).

class subgroups.core.subgroup.Subgroup(description, target)[source]

Bases: object

This class represents a ‘Subgroup’. A ‘Subgroup’ has a description (Pattern) and a target variable of interest (Selector).

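Parameters:
  • description (subgroups.core.pattern.Pattern) – the description of the subgroup.

  • target (subgroups.core.selector.Selector) – the target variable of interest.
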
copy()[source]

Method to copy the Subgroup.

Return type:

subgroups.core.subgroup.Subgroup

Returns:

the copy of the Subgroup.

property description: Pattern

The description.

filter(pandas_dataframe)[source]

Method to filter a pandas DataFrame, retrieving only certain information related to this subgroup.

Parameters:

pandas_dataframe (pandas.core.frame.DataFrame) – the DataFrame which is filtered. IMPORTANT: If an attribute name of a selector of the subgroup is not in the given pandas.DataFrame, a KeyError exception is raised.

Return type:

tuple[pandas.core.series.Series, pandas.core.series.Series, pandas.core.series.Series]

Returns:

a tuple of the form: (Series, Series, Series). It is formed by the following elements: (1) a pandas Series of booleans of the same size as pandas_dataframe indicating whether rows are covered by the description and the target, (2) a pandas Series of booleans of the same size as pandas_dataframe indicating whether rows are covered by the description but not by the target, and (3) a pandas Series of booleans of the same size as pandas_dataframe indicating whether rows are covered by the target.
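The three masks in the returned tuple can be sketched with plain Python lists standing in for pandas Series. The data, the example subgroup and all variable names below are hypothetical; the sketch only shows how the three elements relate to each other.

```python
# Plain dicts stand in for the rows of a pandas.DataFrame.
rows = [
    {"age": 25, "city": "Rome", "buys": "yes"},
    {"age": 30, "city": "Rome", "buys": "no"},
    {"age": 15, "city": "Rome", "buys": "yes"},
    {"age": 40, "city": "Oslo", "buys": "yes"},
]

# Hypothetical subgroup: description "age >= 18 and city = Rome", target "buys = yes".
covered_by_description = [row["age"] >= 18 and row["city"] == "Rome" for row in rows]
covered_by_target = [row["buys"] == "yes" for row in rows]

# (1) rows covered by the description and by the target
description_and_target = [d and t for d, t in zip(covered_by_description, covered_by_target)]
# (2) rows covered by the description but not by the target
description_not_target = [d and not t for d, t in zip(covered_by_description, covered_by_target)]
# (3) rows covered by the target
result = (description_and_target, description_not_target, covered_by_target)

print(result[0])  # [True, False, False, False]
print(result[1])  # [False, True, False, False]
print(result[2])  # [True, False, True, True]
```

Summing each mask gives the row counts that subgroup quality measures are commonly built from.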

static generate_from_str(input_str)[source]

Static method to generate a Subgroup from a str.

Parameters:

input_str (str) – the str from which to generate the Subgroup. We assume the following format: ‘Description: <pattern>, Target: <selector>’. The formats of <pattern> and <selector> are defined by their corresponding ‘generate_from_str’ methods.

Return type:

subgroups.core.subgroup.Subgroup

Returns:

the Subgroup generated from the str.
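The format above can be split into its two parts with one regular expression. The helper below is a hypothetical sketch that only separates the pattern string from the selector string; parsing each part is left to the corresponding ‘generate_from_str’ method. The selector syntax in the usage lines is made up for the example.

```python
import re

# 'Description: <pattern>, Target: <selector>', where <pattern> is bracketed.
_SUBGROUP_RE = re.compile(r"Description: (?P<pattern>\[.*\]), Target: (?P<selector>.+)")


def split_subgroup_str(input_str: str) -> tuple[str, str]:
    """Return (pattern_str, selector_str) from a subgroup string (hypothetical helper)."""
    match = _SUBGROUP_RE.fullmatch(input_str)
    if match is None:
        raise ValueError(f"not a subgroup string: {input_str!r}")
    return match.group("pattern"), match.group("selector")


print(split_subgroup_str("Description: [age >= 18, city = Rome], Target: buys = yes"))
# ('[age >= 18, city = Rome]', 'buys = yes')
print(split_subgroup_str("Description: [], Target: buys = yes"))
# ('[]', 'buys = yes')
```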

is_refinement(refinement_candidate, refinement_of_itself)[source]

Method to check whether ‘refinement_candidate’ is a refinement of this (i.e., of ‘self’). A subgroup Y is a refinement of another subgroup X if the description of Y is a refinement of the description of X and the targets are equal.

Parameters:
  • refinement_candidate (subgroups.core.subgroup.Subgroup) – subgroup candidate to be a refinement of this (i.e., of ‘self’).

  • refinement_of_itself (bool) – is a pattern a refinement of itself (in this case, the description of a subgroup)? Sometimes it may be better to assume yes and sometimes no. Therefore, if both subgroups are equal (i.e., the descriptions of both subgroups and the targets), then this method returns the value of ‘refinement_of_itself’.

Return type:

bool

Returns:

whether ‘refinement_candidate’ is a refinement of this (i.e., ‘self’).
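The subgroup-level rule adds a target check in front of the description check. In this sketch each subgroup is a hypothetical (description, target) pair, with a frozen set of selector strings as the description, and "refinement of a description" is assumed to mean "contains all of its selectors"; that reading is an assumption, not something stated above.

```python
def subgroup_is_refinement(base, candidate, refinement_of_itself: bool) -> bool:
    """Assumed rule: equal targets, and the candidate description contains the base one."""
    base_description, base_target = base
    candidate_description, candidate_target = candidate
    if base_target != candidate_target:
        return False  # different targets: never a refinement
    if candidate_description == base_description:
        return refinement_of_itself  # equal subgroups: the caller decides
    return base_description <= candidate_description


x = (frozenset({"age >= 18"}), "buys = yes")
y = (frozenset({"age >= 18", "city = Rome"}), "buys = yes")
z = (frozenset({"age >= 18", "city = Rome"}), "buys = no")

print(subgroup_is_refinement(x, y, False))  # True
print(subgroup_is_refinement(x, z, False))  # False
print(subgroup_is_refinement(x, x, True))   # True
```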

property target: Selector

The target variable of interest.