ntqr.alarms
===========

.. py:module:: ntqr.alarms

.. autoapi-nested-parse::

   Algorithms for logical alarms based on the axioms.

   Formal verification of unsupervised evaluations is carried out by
   using the agreements and disagreements between classifiers to detect
   if they are misaligned given a safety specification.

   The 'atomic' logical test for the alarms is a look at the group
   evaluations that are logically consistent with how the classifiers
   aligned in their decisions and an assumed number of corrects for
   each label in the true, but unknown, answer key for the exam.

   For example, in a test with three possible responses or classes
   for each question, we need to specify,

       qs = (q_label_1, q_label_2, q_label_3)

   where

       sum(qs) = Q,

   with Q the size of the test. So a test with Q=10 could have a qs setting of
   (5,3,2), since 5 + 3 + 2 = 10.
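
   As a rough illustration of the constraint above, the following sketch
   enumerates every qs tuple for a given test size. The helper name
   ``all_qs_settings`` is hypothetical and is not part of ``ntqr``.

   .. code-block:: python

      import itertools


      def all_qs_settings(Q, num_labels):
          """Yield every tuple of non-negative label counts that sums to Q."""
          # Choose counts for all labels but the last; the last one is forced.
          for head in itertools.product(range(Q + 1), repeat=num_labels - 1):
              last = Q - sum(head)
              if last >= 0:
                  yield head + (last,)


      settings = list(all_qs_settings(10, 3))
      print((5, 3, 2) in settings)  # (5, 3, 2) is one of the valid settings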

   This atomic misalignment test at fixed qs value then allows you
   to create custom alarms depending on your application domain.
   Some examples,

   1. The prevalence of classes in your tests is biased toward
   small amounts of one label or the other. In that case, you can construct
   an alarm as,

           all([alarm.misaligned_at_qs(qs, responses) for qs in my_range()])

   2. The method `ntqr.alarms.SingleClassifierAxiomsAlarm.are_misaligned`
   is a test for fully unsupervised settings and is equivalent to,

           all([alarm.misaligned_at_qs((qa,Q-qa), rs) for qa in range(0,Q+1)])

   That is, the only thing you have are the classifiers' responses and the
   size of the test, Q.

   3. You believe that your classifiers are high performing and therefore
   will only accept (Q_label_1, Q_label_2, ...) settings for which
   all your classifiers are better than x% at detecting all the labels.
   This turns the atomic logical test into a measuring instrument for
   the prevalence of the labels in the tested dataset.

   The method

       SingleClassifierAxiomsAlarm.are_misaligned( responses )

   is the fully unsupervised version of what logical alarms can do.
   It detects (imperfectly!) if at least one member in an ensemble is
   violating a user provided safety specification when doing classification
   with R classes.
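
   To make the logic concrete, here is a self-contained toy for the binary
   case. It is a sketch written for this page, not the ``ntqr``
   implementation: the function names are hypothetical, the safety
   specification is assumed to be ``factor*correct - q > 0`` for each label,
   and it relies only on the counting identity that a classifier's number of
   'a' responses equals its correct 'a' answers plus its wrong 'b' answers.

   .. code-block:: python

      def satisfies_spec(qs, rs, factor=2):
          """Can one classifier meet the spec on both labels at this qs?"""
          q_a, q_b = qs
          r_a, _ = rs
          for c_a in range(q_a + 1):
              # Counting identity: r_a = c_a + (q_b - c_b), solved for c_b.
              c_b = c_a + q_b - r_a
              if 0 <= c_b <= q_b and factor * c_a > q_a and factor * c_b > q_b:
                  return True
          return False


      def misaligned_at_qs(qs, responses):
          """True if at least one classifier cannot meet the spec at qs."""
          return any(not satisfies_spec(qs, rs) for rs in responses)


      def are_misaligned(Q, responses):
          """AND of the atomic test over every binary qs setting."""
          return all(
              misaligned_at_qs((q_a, Q - q_a), responses) for q_a in range(Q + 1)
          )


      print(are_misaligned(10, [(8, 2), (2, 8)]))
      print(are_misaligned(10, [(5, 5), (6, 4)]))

   With ten questions, the first pair disagrees so strongly that no answer
   key lets both classifiers be better than 50% on each label, so the toy
   alarm fires. The second pair admits such an answer key (for example
   qs = (5, 5)), so it does not.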

   The name 'are_misaligned' should make clear that this detects when
   classifiers are misaligned, which is **not** the same thing as detecting
   that they are incorrect. If a pair of classifiers is being tested and both
   are wrong in the same way, `.are_misaligned` will return False.

   The user is encouraged to think of these alarms as building blocks for
   algorithms that use the philosophy of error-detecting codes. For example,
   by having three classifiers, as long as one of them is behaving correctly,
   `.are_misaligned` will return True.


Classes
-------

.. autoapisummary::

   ntqr.alarms.SingleClassifierAxiomsAlarm
   ntqr.alarms.LabelsSafetySpecification
   ntqr.alarms.GradeSafetySpecification


Module Contents
---------------

.. py:class:: SingleClassifierAxiomsAlarm(Q: int, classifiers_axioms: collections.abc.Sequence[ntqr.r2.raxioms.SingleClassifierAxioms | ntqr.r3.raxioms.SingleClassifierAxioms], cls_single_evals: ntqr.evaluations.SingleClassifierEvaluations)

   Alarm based on the single classifier axioms for the ensemble members.

   Although this alarm considers only single classifier axioms, they all
   share the variables related to the number of different question types
   in a test. For example, a binary test has two question types. This allows
   us to consider what evaluations are possible for a group of classifiers
   at **fixed** number of questions.

   Said another way, when we only consider the individual number of
   responses for each classifier, we are aligning the group responses on
   the whole test, not individual questions in it. Future classes will
   consider what happens when we count how pairs of them are aligned at
   the question level.


   .. py:attribute:: Q


   .. py:attribute:: classifiers_axioms


   .. py:attribute:: labels


   .. py:attribute:: evals


   .. py:method:: set_safety_specification(factors: collections.abc.Sequence[int]) -> None

      Set alarm's safetySpecification given factors.

      Currently defaults to LabelsSafetySpecification.

      :param factors: Sequence of per-label factors used in the safety condition factor*q_l_correct - q_l > 0
      :type factors: Sequence[int]

      :rtype: None


   .. py:method:: misaligned_at_qs(qs: collections.abc.Sequence[int], responses: collections.abc.Sequence[collections.abc.Sequence[int]]) -> bool

      Tests if responses are misaligned at qs.

      :param qs: Count of each label in the answer key.
      :type qs: Sequence[int]
      :param responses: Label responses by each classifier
      :type responses: Sequence[Sequence[int]]

      :returns: Whether one or more classifiers violated the safety specification.
      :rtype: bool


   .. py:method:: misalignment_trace(responses: collections.abc.Sequence[collections.abc.Sequence[int]]) -> set[tuple[collections.abc.Sequence[int], bool]]

      Test classifiers misalignment at all label question numbers.

      :param responses: The number of label responses by each classifier
      :type responses: Sequence[Sequence[int]]

      :returns: The set of (qs, misalignment_test_result) for all possible
                qs settings in a test of size Q.
      :rtype: set[tuple[Sequence[int], bool]]
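
      A sketch of how such a trace could be assembled for the binary case,
      assuming a caller-supplied predicate ``test_at_qs`` (hypothetical)
      that plays the role of ``misaligned_at_qs``. This is an illustration,
      not the library's code.

      .. code-block:: python

         def binary_trace(Q, responses, test_at_qs):
             """Return {(qs, result)} for every binary qs with sum(qs) == Q."""
             return {
                 ((q_a, Q - q_a), test_at_qs((q_a, Q - q_a), responses))
                 for q_a in range(Q + 1)
             }


         # Toy predicate, for illustration only: flag settings with an empty label.
         trace = binary_trace(4, [(2, 2)], lambda qs, rs: 0 in qs)
         print(all(result for _, result in trace))  # Boolean AND over the trace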


   .. py:method:: are_misaligned(responses: collections.abc.Sequence[collections.abc.Sequence[int]]) -> bool

      Boolean AND of the misalignment trace given responses.

      :param responses: The number of label responses by each classifier
      :type responses: Sequence[Sequence[int]]

      :returns: True if the classifiers have no qs setting at which all
                classifiers satisfy the safety specification, False otherwise.
      :rtype: bool


   .. py:method:: check_responses(qs: collections.abc.Sequence[int], responses: collections.abc.Sequence[collections.abc.Sequence[int]]) -> bool

      Check logical constraints on responses.

      1. The label counts in the answer key sum to the size of the test.

          sum(qs) = Q

      2. Each classifier's label responses also sum to the test size.

          all( (sum(classifier_rsps) == Q) for classifier_rsps in responses)

      :param qs: Count of each label in the answer key.
      :type qs: Sequence[int]
      :param responses: The number of label responses by each classifier
      :type responses: Sequence[Sequence[int]]

      :returns: True if requirements 1 and 2 are satisfied, False otherwise.
      :rtype: bool
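
      The two requirements can be sketched as a standalone function. The
      name ``responses_are_consistent`` and the explicit ``Q`` argument are
      assumptions made for this example; the method itself presumably reads
      ``Q`` from the alarm instance.

      .. code-block:: python

         def responses_are_consistent(Q, qs, responses):
             """Check that qs and every classifier's responses sum to Q."""
             # Requirement 1: the answer key label counts sum to the test size.
             if sum(qs) != Q:
                 return False
             # Requirement 2: each classifier answered exactly Q questions.
             return all(sum(classifier_rsps) == Q for classifier_rsps in responses)


         print(responses_are_consistent(10, (5, 3, 2), [(4, 4, 2), (6, 2, 2)]))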


.. py:class:: LabelsSafetySpecification(factors: collections.abc.Sequence[int])

   Simple example of safety specification for each label.


   .. py:attribute:: factors


   .. py:method:: is_satisfied(qs: collections.abc.Sequence[int], correct_responses: collections.abc.Sequence[int])

      Check that correct_responses at the qs setting satisfy the safety specification.

      :param qs: Count of each label in the answer key.
      :type qs: Sequence[int]
      :param correct_responses: Number of label correct responses, one per label.
      :type correct_responses: Sequence[int]

      :returns: True if the classifier's assumed numbers of correct responses
                satisfy the safety specification, False otherwise.
                Each assumed number of label correct responses must satisfy
                factor*q_label_correct - q_label > 0.
      :rtype: bool
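
      A minimal sketch of the per-label condition stated above, written as a
      free function. The name ``label_spec_satisfied`` is hypothetical, and
      passing ``factors`` as an argument is an assumption for this example
      (the class stores them as an attribute).

      .. code-block:: python

         def label_spec_satisfied(qs, correct_responses, factors):
             """True if factor*correct - q > 0 holds for every label."""
             return all(
                 factor * correct - q > 0
                 for factor, correct, q in zip(factors, correct_responses, qs)
             )


         # With factor 2, each label needs more than half of its questions correct.
         print(label_spec_satisfied((5, 3, 2), (3, 2, 2), (2, 2, 2)))
         print(label_spec_satisfied((5, 3, 2), (2, 2, 2), (2, 2, 2)))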


   .. py:method:: pair_safe_evaluations_at_qs(qs: collections.abc.Sequence[int]) -> list[collections.abc.Iterator[tuple[int, int]]]

      All pair evaluations satisfying safety spec at given qs.

      :param qs: Number of questions for each label.
      :type qs: Sequence[int]

      :returns: List of iterators, one per label, for the pair evaluations
                that satisfy the safety specification.
      :rtype: list[Iterator[tuple[int,int]]]


.. py:class:: GradeSafetySpecification(factors)

   Simple example of a grade safety specification.


   .. py:attribute:: factors


   .. py:method:: is_satisfied(qs: list[int], correct_responses: collections.abc.Sequence[int])

      Check that the list of label accuracies satisfies the
      safety specification.

      :param qs: Number of label questions in the test.
      :type qs: list[int]
      :param correct_responses: Number of label correct responses, one per label.
      :type correct_responses: Sequence[int]

      :rtype: bool


   .. py:method:: pair_safe_evaluations_at_qs(qs)

      All pair evaluations satisfying safety spec at given qs.