ntqr.statistics

Responses to finite tests with finite choices can be described by integer counts of decision events. This module contains classes to construct these finite sample statistical variables.

In unsupervised settings, we get to observe the counts of decision events. These are the ‘response’ variables. We do not know the counts of true labels in the answer key to the test (the ‘Q’ variables) nor the counts of decision events given true label (the ‘label response’ variables).

NTQR uses the SymPy package to carry out its symbolic computations. The classes in this module build variables of the following forms:

  1. Q_label: the Q variable associated with a label.

  2. R_{label_i, label_j, …}: the response variables of an ensemble.

  3. R_{label_i, label_j, …, label_true}: the label response variables.

The Q variables are constructed by AnswerKeyVariables. The R variables are properties of ResponseVariables, .responses and .label_responses respectively.
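
To make the three variable families concrete, the following sketch builds them directly with SymPy for two labels and two classifiers. It does not call NTQR: the label and classifier names, the helper `event_str`, and the exact symbol-name format are assumptions chosen for illustration and may differ from what the classes in this module produce.

```python
from itertools import product

import sympy

labels = ("a", "b")        # hypothetical label set
classifiers = ("i", "j")   # hypothetical classifier names

# 1. One Q variable per label: the count of that label in the answer key.
qs = {label: sympy.Symbol(f"Q_{label}") for label in labels}


def event_str(decisions):
    """Render a decision tuple such as ('a', 'b') as 'a_i, b_j'."""
    return ", ".join(f"{d}_{c}" for d, c in zip(decisions, classifiers))


# 2. One response variable per observable decision event of the ensemble.
responses = {
    decisions: sympy.Symbol("R_{" + event_str(decisions) + "}")
    for decisions in product(labels, repeat=len(classifiers))
}

# 3. One label response variable per (true label, decision event) pair.
label_responses = {
    true_label: {
        decisions: sympy.Symbol(
            "R_{" + event_str(decisions) + ", " + true_label + "}"
        )
        for decisions in responses
    }
    for true_label in labels
}

n_label_responses = sum(len(v) for v in label_responses.values())
print(len(qs), len(responses), n_label_responses)  # 2 4 8
```

With R labels and M classifiers there are R Q variables, R^M response variables, and R^(M+1) label response variables, which is why the counts above are 2, 4, and 8.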

Classes

AnswerKeyVariables

Variables associated with the count of labels in the answer key.

ResponseVariables

Variables for the counts of decision events given true label.

MClassifiersVariables

Statistical response variables for M classifiers.

SingleClassifierVariables

Statistical variables associated with a single classifier.

Module Contents

class ntqr.statistics.AnswerKeyVariables(labels: ntqr.Labels)

Variables associated with the count of labels in the answer key.

qs

The Q_{l_i} variables. There are R of them, one for each label.

Type:

Mapping[Label, sympy.Symbol]

_labels
_qs
__eq__(other)
property labels
property qs
__repr__()
__str__()
class ntqr.statistics.ResponseVariables(labels: ntqr.Labels, classifiers: Sequence[str])

Variables for the counts of decision events given true label.

responses

Variables for the observed counts of decision events.

Type:

Mapping[Sequence[Label], sympy.Symbol]

label_responses

Variables for the counts of a decision event given true label.

Type:

Mapping[Label, Mapping[Sequence[Label], Mapping[…]]]

correct

The all-correct variable for each true label.

Type:

Mapping[Label, sympy.Symbol]

errors

Type:

Mapping[Label, Mapping[Sequence[Label], sympy.Symbol]]

labels
classifiers
_responses
_label_responses
_correct
_errors
__eq__(other)
property responses
property label_responses
property correct
property errors
observables_dict(counts: Mapping[Sequence, int]) → Mapping[sympy.Symbol, int]
Parameters:

counts (Mapping[Sequence, int]) – Mapping from observable question-aligned decision events by the classifiers to their observed counts.

Returns:

Mapping from observable response variable to its observed count.

Return type:

dict
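
The idea behind this method can be sketched without NTQR: each observed decision event is looked up in a table of response symbols, and the result is a dictionary that SymPy's `subs` accepts. The function name `observables_dict_sketch`, the symbol names, and the counts are hypothetical, and the real method may treat missing events differently.

```python
import sympy

# Hypothetical observable response variables for two classifiers i and j.
responses = {
    ("a", "a"): sympy.Symbol("R_{a_i, a_j}"),
    ("a", "b"): sympy.Symbol("R_{a_i, b_j}"),
    ("b", "a"): sympy.Symbol("R_{b_i, a_j}"),
    ("b", "b"): sympy.Symbol("R_{b_i, b_j}"),
}


def observables_dict_sketch(responses, counts):
    """Map each response symbol to the observed count of its event."""
    return {responses[event]: count for event, count in counts.items()}


# Made-up observed counts of question-aligned decision events.
counts = {("a", "a"): 5, ("a", "b"): 2, ("b", "a"): 1, ("b", "b"): 4}
subs = observables_dict_sketch(responses, counts)

# The mapping can be substituted into any expression over the R variables.
total = sum(responses.values())
print(total.subs(subs))  # 12
```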

label_response_to_observable() → Mapping[sympy.Symbol, sympy.Symbol]

Constructs a mapping from a label response variable to its corresponding observable response variable. That is,

R_{event, true_label} -> R_{event}.

Returns:

Mapping[sympy.Symbol, sympy.Symbol].

Return type:

dict
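
A mapping of this kind is useful because each observed count is the sum, over the possible true labels, of the unobserved label response counts for the same event. The sketch below builds such a mapping by hand and derives those sum constraints. It does not call NTQR, and the symbol names are simplified (classifier subscripts are omitted) and hypothetical.

```python
from collections import defaultdict
from itertools import product

import sympy

labels = ("a", "b")                        # hypothetical labels
events = list(product(labels, repeat=2))   # decision events, two classifiers


def name(event):
    """Join the decisions of an event, e.g. ('a', 'b') -> 'a, b'."""
    return ", ".join(event)


# Observable variables R_{event} and label response variables
# R_{event, true_label}.
observable = {e: sympy.Symbol("R_{" + name(e) + "}") for e in events}
label_response = {
    (e, t): sympy.Symbol("R_{" + name(e) + ", " + t + "}")
    for e in events
    for t in labels
}

# R_{event, true_label} -> R_{event}
lr_to_obs = {lr: observable[e] for (e, _), lr in label_response.items()}

# Group the label response variables by the observable they map to.
grouped = defaultdict(list)
for lr, obs in lr_to_obs.items():
    grouped[obs].append(lr)

# Each observed count equals the sum of its label response counts.
constraints = [sympy.Eq(obs, sympy.Add(*lrs)) for obs, lrs in grouped.items()]
for constraint in constraints:
    print(constraint)
```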

label_response_to_q() → Mapping[sympy.Symbol, sympy.Symbol]

Creates mapping from label response variable to Q_label.

Returns:

lr_to_q – Mapping from a label response variable to its corresponding Q_label variable.

Return type:

Mapping[sympy.Symbol, sympy.Symbol]
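
The companion constraint is that all label response counts sharing a true label add up to that label's Q variable. The following sketch, which does not call NTQR and uses simplified, hypothetical symbol names, builds a label-response-to-Q mapping and reads the constraint off it.

```python
from itertools import product

import sympy

labels = ("a", "b")                        # hypothetical labels
events = list(product(labels, repeat=2))   # decision events, two classifiers

qs = {t: sympy.Symbol(f"Q_{t}") for t in labels}
label_response = {
    (e, t): sympy.Symbol("R_{" + ", ".join(e) + ", " + t + "}")
    for e in events
    for t in labels
}

# R_{event, true_label} -> Q_{true_label}
lr_to_q = {lr: qs[t] for (_, t), lr in label_response.items()}

# Summing the label responses that map to the same Q recovers that Q.
q_constraints = {
    q: sympy.Eq(
        q, sympy.Add(*[lr for lr, target in lr_to_q.items() if target == q])
    )
    for q in qs.values()
}
print(q_constraints[qs["a"]])
```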

_response_variables(labels, classifiers)

Constructs observable response variables given ‘labels’ and ‘classifiers’.

Parameters:
  • labels (List) – Labels to use.

  • classifiers (Sequence[str]) – Labels to use for the classifiers. The label should support being stringified.

Returns:

Dictionary of response count variables, indexed by decision tuples.

_label_response_variables(labels, classifiers)

Constructs variables associated with correct and wrong response counts given true label.

Parameters:
  • labels (Sequence[Label]) – Labels to use.

  • classifiers (Sequence[str]) – Labels to use for the classifiers.

Returns:

Dictionary of by-label response counts, indexed by true label. In addition, each label contains a ‘correct’ key that points to the variable associated with correct responses. An ‘errors’ dictionary is indexed by possible wrong label assignments.

_label_correct()
_label_errors()
seq_str(decisions, classifiers) → str
Parameters:
  • decisions (Sequence[Label]) – A sequence of N labels.

  • classifiers (Sequence[str]) – The N classifier labels.

Returns:

Comma-separated string of terms of the form l_c.

Return type:

str
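
As a rough illustration of the kind of string described here, the sketch below pairs each decision with its classifier and joins the pairs with commas. The function name `seq_str_sketch`, the separator, and the absence of extra bracing are assumptions; the string NTQR actually emits may be formatted differently.

```python
def seq_str_sketch(decisions, classifiers):
    """Join (label, classifier) pairs as 'l_c' terms separated by commas."""
    if len(decisions) != len(classifiers):
        raise ValueError("need exactly one decision per classifier")
    return ", ".join(f"{d}_{c}" for d, c in zip(decisions, classifiers))


print(seq_str_sketch(("a", "b"), ("i", "j")))  # a_i, b_j
```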

label_r_var_symbol(decisions, classifiers, label)
__repr__()
class ntqr.statistics.MClassifiersVariables(labels: ntqr.Labels, classifiers: Sequence[str])

Statistical response variables for M classifiers.

qs

The Q_{l_i} variables. There are R of them, one for each label.

Type:

List[sympy.Symbol]

responses
Type:

Mapping[m_subset, Mapping[Sequence[Label], sympy.Symbol]]

responses_by_label

Type:

Mapping[m_subset, Mapping[Label, Mapping[…]]]

Deprecated since version 0.7.6: Use ResponseVariables instead.

labels
classifiers
qs
_responses
responses
_label_responses
_label_correct
_label_errors
label_responses
label_correct
label_errors
_response_variables(labels, classifiers)

Constructs observable response variables given ‘labels’ and ‘classifiers’.

Parameters:
  • labels (List) – Labels to use.

  • classifiers (Sequence[str]) – Labels to use for the classifiers. The label should support being stringified.

Returns:

Dictionary of by-label response counts, one per label.

_label_response_variables(labels, classifiers)

Constructs variables associated with correct and wrong response counts given true label.

Parameters:
  • labels (Sequence[Label]) – Labels to use.

  • classifiers (Sequence[str]) – Labels to use for the classifiers.

Returns:

Dictionary of by-label response counts, indexed by true label. In addition, each label contains a ‘correct’ key that points to the variable associated with correct responses. An ‘errors’ dictionary is indexed by possible wrong label assignments.

_label_correct_variables(labels, classifiers)
_label_error_variables(labels, classifiers)
seq_str(decisions, classifiers) → str
Parameters:
  • decisions (Sequence[Label]) – A sequence of N labels.

  • classifiers (Sequence[str]) – The N classifier labels.

Returns:

Comma-separated string of terms of the form l_c.

Return type:

str

label_r_var_symbol(decisions, classifiers, label)
r_var_symbol(decisions, classifiers)
pair_correlations(pair: tuple[str, str], decisions: tuple[str, str], l_true: str) → sympy.UnevaluatedExpr
Parameters:
  • pair (tuple[str,str]) – Pair of classifiers.

  • decisions (tuple[str,str]) – The decisions by the pair.

  • l_true (str) – The true label.

Returns:

Mapping from label to an expression for the pair correlation for the decisions given true label.

all_agree_subs_dict() → Mapping[sympy.Symbol, sympy.UnevaluatedExpr]
__repr__()
class ntqr.statistics.SingleClassifierVariables(labels, classifier)

Statistical variables associated with a single classifier.

questions_number

The variables for the count of correct questions for each true label.

Type:

List[sympy.Symbol]

responses

The variables for the observed integer count of labels in the test. Generally, of the form

R_{l_i} : Number of ‘l’ label responses by classifier ‘i’

Type:

Mapping[label, sympy.Symbol]

responses_by_label

The variables associated with correct and wrong responses given the true label.

Variables are of the form, R_{l_i, l_true} : Number of l responses by classifier ‘i’ given true label, l_true.

Given true label, responses_by_label[label] returns a dictionary with keys:

  • ‘correct’: the variable for correct responses given true label.

  • ‘errors’: dictionary, indexed by wrong label, to incorrect responses.

  • ‘l_1’, ‘l_2’: dictionary, indexed by response label, given true_label.

Type:

Mapping[label, sympy.Symbol]
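
The nested layout described above can be pictured with a small hand-built example. The sketch below models only the ‘correct’ and ‘errors’ keys and leaves out the per-response-label entries. It does not call NTQR, and the labels, classifier name, helper `by_label_entry`, and symbol-name format are all hypothetical.

```python
import sympy

labels = ("a", "b", "c")   # hypothetical labels
classifier = "i"           # hypothetical classifier name


def by_label_entry(true_label):
    """Build the 'correct' and 'errors' entries for one true label."""
    variables = {
        response: sympy.Symbol(
            "R_{" + response + "_" + classifier + ", " + true_label + "}"
        )
        for response in labels
    }
    return {
        "correct": variables[true_label],
        "errors": {
            response: var
            for response, var in variables.items()
            if response != true_label
        },
    }


responses_by_label = {label: by_label_entry(label) for label in labels}

# For a single classifier, the correct and wrong responses given a true
# label account for every question carrying that label.
q_a = sympy.Symbol("Q_a")
entry = responses_by_label["a"]
constraint = sympy.Eq(q_a, entry["correct"] + sum(entry["errors"].values()))
print(constraint)
```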

Deprecated since version 0.7.6: Use ResponseVariables instead.

questions_number
responses
responses_by_label
response_variables(labels, classifier)

Constructs observable response variables given ‘labels’ and ‘classifier’.

Parameters:
  • labels (List) – Labels to use.

  • classifier (int) – Index of classifier.

Returns:

Dictionary of by-label response counts, one per label.

label_response_variables(labels, classifier)

Constructs variables associated with correct and wrong response counts given true label.

Parameters:
  • labels (List) – Labels to use.

  • classifier (int) – Index of classifier.

Returns:

Dictionary of by-label response counts, three per label. In addition, each label contains a ‘correct’ key that points to the variable associated with correct responses. An ‘errors’ dictionary is indexed by possible wrong label assignments.