ntqr.statistics¶
Responses to finite tests with finite choices can be described by integer counts of decision events. This module contains classes to construct these finite sample statistical variables.
In unsupervised settings, we get to observe the counts of decision events. These are the ‘response’ variables. We do not know the counts of true labels in the answer key to the test (the ‘Q’ variables) nor the counts of decision events given true label (the ‘label response’ variables).
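To make the distinction concrete, here is a minimal sketch of what is and is not observable. The data and the names `aligned_decisions`, `response_counts`, and `test_size` are hypothetical and are not part of NTQR.

```python
from collections import Counter

# Hypothetical data: each row holds the decisions of three classifiers
# on one test question, aligned by question.
aligned_decisions = [
    ("a", "a", "b"),
    ("a", "b", "b"),
    ("a", "a", "b"),
    ("b", "b", "b"),
]

# The observable 'response' counts: how often each joint decision occurred.
response_counts = Counter(aligned_decisions)

# The answer key is hidden, so the counts of true labels (the Q variables)
# cannot be read off this data. What is known is that they sum to the
# size of the test.
test_size = sum(response_counts.values())
```

Everything the module calls a 'Q' or 'label response' variable stays symbolic precisely because only `response_counts` can be computed from the test.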
NTQR uses the SymPy package to carry out its symbolic computations. The classes in this module build variables of the following forms:
Q_label: the Q variable associated with a label.
R_{label_i, label_j, …}: the response variables of an ensemble.
R_{label_i, label_j, …, label_true}: the label response variables.
The Q variables are constructed by AnswerKeyVariables. The R variables are exposed as properties of ResponseVariables: .responses for the response variables and .label_responses for the label response variables.
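As a rough guide to how many variables of each kind exist, the sketch below enumerates the index sets only. It assumes one response variable per joint decision of the ensemble and one label response variable per (joint decision, true label) pair; the names `q_index`, `r_index`, and `lr_index` are illustrative and do not come from NTQR.

```python
from itertools import product

labels = ("a", "b", "c")   # hypothetical labels
classifiers = ("i", "j")   # hypothetical classifier names

# One Q variable per label.
q_index = list(labels)

# One response variable per joint decision of the ensemble (assumption).
r_index = list(product(labels, repeat=len(classifiers)))

# One label response variable per (joint decision, true label) pair.
lr_index = [(event, true_label) for event in r_index for true_label in labels]
```

With three labels and two classifiers this gives 3 Q indices, 9 response indices, and 27 label response indices.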
Classes¶
AnswerKeyVariables | Variables associated with the count of labels in the answer key.
ResponseVariables | Variables for the counts of decision events given true label.
MClassifiersVariables | Statistical response variables for M classifiers.
SingleClassifierVariables | Statistical variables associated with a single classifier.
Module Contents¶
- class ntqr.statistics.AnswerKeyVariables(labels: ntqr.Labels)¶
Variables associated with the count of labels in the answer key.
- qs¶
The Q_{l_i} variables. There are R of them, one for each label.
- Type:
Mapping[Label, sympy.Symbol]
- _labels¶
- _qs¶
- __eq__(other)¶
- property labels¶
- property qs¶
- __repr__()¶
- __str__()¶
- class ntqr.statistics.ResponseVariables(labels: ntqr.Labels, classifiers: Sequence[str])¶
Variables for the counts of decision events given true label.
- responses¶
Variables for the observed counts of decision events.
- Type:
Mapping[Sequence[Label], sympy.Symbol]
- responses_by_label: Mapping[Label, Mapping[Sequence[Label], Mapping[…]]]
Variables for the counts of a decision event given true label.
- correct: Mapping[Label, sympy.Symbol]
The all-correct variable for each true label.
- errors: Mapping[Label, Mapping[Sequence[Label], sympy.Symbol]]
- labels¶
- classifiers¶
- _responses¶
- _label_responses¶
- _correct¶
- _errors¶
- __eq__(other)¶
- property responses¶
- property label_responses¶
- property correct¶
- property errors¶
- observables_dict(counts: Mapping[Sequence, int]) Mapping[sympy.Symbol, int]¶
- Parameters:
counts (Mapping[Sequence, int]) – Mapping from observable question-aligned decision events by the classifiers to their observed counts.
- Returns:
Mapping from observable response variable to its observed count.
- Return type:
dict
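The idea behind this method can be sketched as follows. This is not NTQR's implementation: plain strings stand in for the sympy.Symbol objects, the helper `observables_sketch` is hypothetical, and treating unseen events as a count of zero is an assumption.

```python
# Stand-in for ResponseVariables.responses: decision tuple -> variable.
# Plain strings replace the sympy.Symbol objects NTQR would use.
responses = {
    ("a", "a"): "R_{a_i, a_j}",
    ("a", "b"): "R_{a_i, b_j}",
    ("b", "a"): "R_{b_i, a_j}",
    ("b", "b"): "R_{b_i, b_j}",
}


def observables_sketch(counts):
    """Map each response variable to its observed count.

    Events missing from `counts` are given a count of 0 (an assumption).
    """
    return {var: counts.get(event, 0) for event, var in responses.items()}


observed = observables_sketch({("a", "a"): 5, ("a", "b"): 2, ("b", "b"): 3})
```

A dictionary of this shape, keyed by real symbols, is the kind of argument SymPy's `subs` accepts when replacing variables with numbers.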
- label_response_to_observable() Mapping[sympy.Symbol, sympy.Symbol]¶
Constructs a mapping from a label response variable to its corresponding observable response variable. That is,
R_{event, true_label} -> R_{event}.
- Returns:
Mapping[sympy.Symbol, sympy.Symbol].
- Return type:
dict
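A minimal sketch of the shape of this mapping, using (event, true_label) tuples in place of symbols. The names `lr_to_observable` and `parts` are hypothetical.

```python
from itertools import product

labels = ("a", "b")
events = list(product(labels, repeat=2))  # joint decisions of two classifiers

# Key each label response variable by (event, true_label) and point it at
# the event alone, dropping the true label: R_{event, true_label} -> R_{event}.
lr_to_observable = {
    (event, true_label): event
    for event in events
    for true_label in labels
}

# Inverting the mapping groups the label response variables that
# share the same observable.
parts = {}
for lr_key, event in lr_to_observable.items():
    parts.setdefault(event, []).append(lr_key)
```

Each observable ends up with one label response variable per true label, since every observed decision event must have occurred under exactly one true label.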
- label_response_to_q() Mapping[sympy.Symbol, sympy.Symbol]¶
Creates mapping from label response variable to Q_label.
- Returns:
lr_to_q – Mapping from a label response variable to its corresponding Q_label variable.
- Return type:
Mapping[sympy.Symbol, sympy.Symbol]
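The two mappings reflect two ways of summing the same label response counts: over true labels to recover an observable count, and over decision events to recover a Q count. The sketch below checks this on made-up numbers. The counts are hypothetical, and in a real unsupervised test they would be unknown.

```python
# Hypothetical label response counts for two classifiers, keyed by
# (joint decision, true label). A real unsupervised test hides these.
label_response_counts = {
    (("a", "a"), "a"): 4, (("a", "a"), "b"): 1,
    (("a", "b"), "a"): 2, (("a", "b"), "b"): 0,
    (("b", "a"), "a"): 1, (("b", "a"), "b"): 2,
    (("b", "b"), "a"): 0, (("b", "b"), "b"): 5,
}

q_counts = {}  # summing over events for each true label gives Q_label
r_counts = {}  # summing over true labels for each event gives R_event
for (event, true_label), n in label_response_counts.items():
    q_counts[true_label] = q_counts.get(true_label, 0) + n
    r_counts[event] = r_counts.get(event, 0) + n
```

Both groupings add up to the same test size, here 15 questions.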
- _response_variables(labels, classifiers)¶
Constructs observable response variables given ‘labels’ and ‘classifiers’.
- Parameters:
labels (List) – Labels to use.
classifiers (Sequence[str]) – Labels to use for the classifiers. The label should support being stringified.
- Returns:
Dictionary of response count variables, indexed by decision tuples.
- _label_response_variables(labels, classifiers)¶
Constructs variables associated with correct and wrong response counts given true label.
- Parameters:
labels (Sequence[Label]) – Labels to use.
classifiers (Sequence[str]) – Labels to use for the classifiers.
- Returns:
Dictionary of by-label response counts, indexed by true label. In addition, each label contains a ‘correct’ key that points to the variable associated with correct responses. An ‘errors’ dictionary is indexed by possible wrong label assignments.
- _label_correct()¶
- _label_errors()¶
- seq_str(decisions, classifiers) str¶
- Parameters:
decisions (Sequence[labels]) – A sequence of N labels.
classifiers (Sequence[str]) – The N classifier labels.
- Returns:
Comma-separated string of terms of the form l_c.
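A sketch of how such a string could be produced. The function name `seq_str_sketch` is hypothetical, and the exact separator and spacing NTQR uses are assumptions here.

```python
def seq_str_sketch(decisions, classifiers):
    """Join each decision with its classifier as 'l_c', comma separated."""
    if len(decisions) != len(classifiers):
        raise ValueError("need exactly one decision per classifier")
    return ", ".join(f"{label}_{clsfr}" for label, clsfr in zip(decisions, classifiers))


# The string can then serve as the subscript of a variable name.
name = "R_{" + seq_str_sketch(("a", "b"), ("i", "j")) + "}"
```

Pairing each decision with its classifier in the subscript keeps variable names unambiguous when two classifiers give the same label.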
- label_r_var_symbol(decisions, classifiers, label)¶
- __repr__()¶
- class ntqr.statistics.MClassifiersVariables(labels: ntqr.Labels, classifiers: Sequence[str])¶
Statistical response variables for M classifiers.
- qs¶
The Q_{l_i} variables. There are R of them.
- Type:
List[sympy.Symbol]
- responses_by_label: Mapping[m_subset, Mapping[Label, Mapping[…]]]
Deprecated since version 0.7.6: Use ResponseVariables instead.
- labels¶
- classifiers¶
- qs¶
- _responses¶
- responses¶
- _label_responses¶
- _label_correct¶
- _label_errors¶
- label_responses¶
- label_correct¶
- label_errors¶
- _response_variables(labels, classifiers)¶
Constructs observable response variables given ‘labels’ and ‘classifiers’.
- Parameters:
labels (List) – Labels to use.
classifiers (Sequence[str]) – Labels to use for the classifiers. The label should support being stringified.
- Return type:
Dictionary of by-label response counts, one per label.
- _label_response_variables(labels, classifiers)¶
Constructs variables associated with correct and wrong response counts given true label.
- Parameters:
labels (Sequence[Label]) – Labels to use.
classifiers (Sequence[str]) – Labels to use for the classifiers.
- Returns:
Dictionary of by-label response counts, indexed by true label. In addition, each label contains a ‘correct’ key that points to the variable associated with correct responses. An ‘errors’ dictionary is indexed by possible wrong label assignments.
- _label_correct_variables(labels, classifiers)¶
- _label_error_variables(labels, classifiers)¶
- seq_str(decisions, classifiers) str¶
- Parameters:
decisions (Sequence[labels]) – A sequence of N labels.
classifiers (Sequence[str]) – The N classifier labels.
- Returns:
Comma-separated string of terms of the form l_c.
- label_r_var_symbol(decisions, classifiers, label)¶
- r_var_symbol(decisions, classifiers)¶
- pair_correlations(pair: tuple[str, str], decisions: tuple[str, str], l_true: str) sympy.UnevaluatedExpr¶
- Parameters:
pair (tuple[str,str]) – Pair of classifiers.
decisions (tuple[str,str]) – The decisions by the pair.
- Returns:
Mapping from label to an expression for the pair correlation for the decisions given true label.
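The exact expression NTQR builds is not shown on this page. One plausible reading, assumed here, is the covariance of the two classifiers' decision indicators given the true label: the joint rate of the pair's decisions minus the product of each classifier's own rate. The sketch below evaluates that quantity numerically on made-up counts; the function `pair_correlation_sketch` is hypothetical.

```python
from fractions import Fraction


def pair_correlation_sketch(joint_counts, decisions):
    """Covariance-style pair correlation for one true label (assumed form).

    joint_counts maps (decision_i, decision_j) to a count, restricted to
    questions that share a single true label.
    """
    q = sum(joint_counts.values())  # number of questions with this true label
    d_i, d_j = decisions
    p_joint = Fraction(joint_counts.get(decisions, 0), q)
    p_i = Fraction(sum(n for (x, _), n in joint_counts.items() if x == d_i), q)
    p_j = Fraction(sum(n for (_, y), n in joint_counts.items() if y == d_j), q)
    return p_joint - p_i * p_j


# Hypothetical counts for a pair of classifiers given true label 'a'.
counts_given_a = {("a", "a"): 4, ("a", "b"): 2, ("b", "a"): 1, ("b", "b"): 0}
corr = pair_correlation_sketch(counts_given_a, ("a", "a"))
```

Under this definition a value of zero would mean the pair's decisions are uncorrelated on questions with that true label.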
- all_agree_subs_dict() Mapping[sympy.Symbol, sympy.UnevaluatedExpr]¶
- __repr__()¶
- class ntqr.statistics.SingleClassifierVariables(labels, classifier)¶
Statistical variables associated with a single classifier.
- question_numbers¶
The variables for the count of correct questions for each true label.
- Type:
List[sympy.Symbol]
- responses¶
The variables for the observed integer count of labels in the test. Generally, of the form
R_{l_i}: Number of ‘l’ label responses by classifier ‘i’
- Type:
Mapping[label, sympy.Symbol]
- responses_by_label¶
The variables associated with correct and wrong responses given the true label.
Variables are of the form R_{l_i, l_true}: the number of ‘l’ responses by classifier ‘i’ given true label l_true.
Given a true label, responses_by_label[label] returns a dictionary with keys:
‘correct’: the variable for correct responses given the true label.
‘errors’: dictionary, indexed by wrong label, of the variables for incorrect responses.
‘l_1’, ‘l_2’: entries indexed by response label, given the true label.
- Type:
Mapping[label, sympy.Symbol]
Deprecated since version 0.7.6: Use ResponseVariables instead.
- questions_number¶
- responses¶
- responses_by_label¶
- response_variables(labels, classifier)¶
Constructs observable response variables given ‘labels’ and ‘classifier’.
- Parameters:
labels (List) – Labels to use.
classifier (int) – Index of classifier.
- Return type:
Dictionary of by-label response counts, one per label.
- label_response_variables(labels, classifier)¶
Constructs variables associated with correct and wrong response counts given true label.
- Parameters:
labels (List) – Labels to use.
classifier (int) – Index of classifier.
- Returns:
Dictionary of by-label response counts, three per label. In addition, each label contains a ‘correct’ key that points to the variable associated with correct responses. An ‘errors’ dictionary is indexed by possible wrong label assignments.
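The nested structure described above can be pictured with the following sketch. It is an illustration under assumptions, not NTQR's code: plain strings stand in for the sympy.Symbol objects, the variable name format is a guess, and `label_response_sketch` is a hypothetical helper.

```python
def label_response_sketch(labels, classifier):
    """Build the nested by-label structure with plain-string variables."""
    by_label = {}
    for true_label in labels:
        # One variable per possible response, given this true label.
        per_response = {
            label: f"R_{{{label}_{classifier}, {true_label}}}" for label in labels
        }
        by_label[true_label] = {
            **per_response,
            # 'correct' aliases the response that matches the true label.
            "correct": per_response[true_label],
            # 'errors' keeps only the responses that differ from it.
            "errors": {l: v for l, v in per_response.items() if l != true_label},
        }
    return by_label


by_label = label_response_sketch(("a", "b"), 1)
```

In this sketch a label literally named 'correct' or 'errors' would collide with the extra keys, which is one reason to keep label names short and distinct.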