ntqr.testsketches

Module for classes and data structures related to test sketches.

The logic of unsupervised evaluation can be viewed as similar to that of data streaming algorithms. We use summary statistics of a test (the observable agreements and disagreements between classifiers) to deduce other statistics of the same test (the set of possible group evaluations for those classifiers).

Classes

QuestionAlignedDecisions

The question aligned test sketch.

Module Contents

class ntqr.testsketches.QuestionAlignedDecisions(observed_responses: Mapping[Sequence[ntqr.Label], int], labels: ntqr.Labels)

The question aligned test sketch.

It can be shown algebraically that the counts of question aligned responses by the test takers are “generated” by statistics of their correctness on the test. These sample statistics of a given test include each test taker's individual average performance as well as sample statistics tracking their error correlations with each other. This coupling between the observed test sketch and the polynomial system of unknown sample statistics of the test takers is the core idea of the NTQR package.

This class represents the observable side of that coupling - the counts of the R^N ways that N test takers can agree and disagree when they are taking a test with R possible responses per question.
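As a rough illustration of what such a sketch contains, the snippet below tallies question aligned responses in plain Python. It does not use the ntqr package; the labels, the responses, and the variable names are all hypothetical and chosen only to show the R^N counting.

```python
from collections import Counter
from itertools import product

# Hypothetical setup: R = 2 labels, N = 3 test takers.
labels = ("a", "b")
n_classifiers = 3

# Hypothetical aligned responses: one tuple per question,
# one entry per test taker (in a fixed order).
responses = [
    ("a", "a", "b"),
    ("a", "a", "a"),
    ("b", "a", "b"),
    ("a", "a", "b"),
]

# Count how often each response pattern was observed.
observed = Counter(responses)

# Enumerate all R^N possible patterns; unobserved ones get a count of 0.
all_patterns = list(product(labels, repeat=n_classifiers))
sketch = {pattern: observed.get(pattern, 0) for pattern in all_patterns}
```

The resulting dictionary has R^N entries whose values sum to the number of questions, which is the kind of mapping the `observed_responses` argument in the signature above describes.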

N

counts

labels

marginalize(indices: Sequence[int]) → Self

Marginalize the counts down to the classifiers listed in ‘indices’.

Parameters: indices (Sequence[int]) – Subset of classifiers to keep.
Returns: Question aligned counts for the subset of classifiers specified by ‘indices’.
Return type: Self
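The marginalization described here can be sketched on a plain dictionary of counts. This is a minimal stand-alone illustration, not the library's implementation: `marginalize_counts` is a hypothetical helper, and it returns a dict rather than a `QuestionAlignedDecisions` instance.

```python
from collections import Counter
from typing import Hashable, Mapping, Sequence, Tuple


def marginalize_counts(
    counts: Mapping[Tuple[Hashable, ...], int],
    indices: Sequence[int],
) -> dict:
    """Sum counts over every classifier not listed in `indices`."""
    marginal = Counter()
    for pattern, count in counts.items():
        # Keep only the responses of the selected classifiers.
        reduced = tuple(pattern[i] for i in indices)
        marginal[reduced] += count
    return dict(marginal)


# Hypothetical counts for three classifiers and labels "a"/"b".
counts = {
    ("a", "a", "b"): 2,
    ("a", "a", "a"): 1,
    ("b", "a", "b"): 1,
}

# Keep classifiers 0 and 2, summing over classifier 1.
pair_counts = marginalize_counts(counts, [0, 2])
```

Patterns that differ only in the dropped classifiers collapse onto the same reduced pattern, so the total count is preserved.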

m_subset_indices_to_val(m: int) → Mapping[tuple, int]

Response counts for all possible classifier subsets of size ‘m’.

Parameters: m (int) – Size of the subsets of the aligned decisions to use.
Returns: Mapping from m-sized subsets of classifier indices to observed response counts.
Return type: Mapping[tuple, int]
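One way to picture the result is to marginalize the counts once for every m-sized combination of classifier indices. The sketch below does this in plain Python under assumptions: `subset_counts` is a hypothetical helper, and the shape of its values (a dict of reduced patterns to counts per subset) is a guess at the idea, not the exact structure the method returns.

```python
from collections import Counter
from itertools import combinations
from typing import Hashable, Mapping, Tuple


def subset_counts(
    counts: Mapping[Tuple[Hashable, ...], int],
    n_classifiers: int,
    m: int,
) -> dict:
    """Map each m-sized tuple of classifier indices to its marginal counts."""
    result = {}
    for indices in combinations(range(n_classifiers), m):
        marginal = Counter()
        for pattern, count in counts.items():
            # Project the full pattern onto the chosen classifiers.
            marginal[tuple(pattern[i] for i in indices)] += count
        result[indices] = dict(marginal)
    return result


# Hypothetical counts for three classifiers and labels "a"/"b".
counts = {
    ("a", "a", "b"): 2,
    ("a", "a", "a"): 1,
    ("b", "a", "b"): 1,
}

# All pairs (m = 2) out of N = 3 classifiers.
pairs = subset_counts(counts, n_classifiers=3, m=2)
```

For N classifiers there are C(N, m) such subsets, so with N = 3 and m = 2 the mapping has three keys: (0, 1), (0, 2), and (1, 2).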