Metric API¶
Methods For Metric¶
Source code: sail_on_client/evaluate/metrics.py
Program Metric Functions.
- sail_on_client.evaluate.metrics.m_acc(gt_novel, p_class, gt_class, round_size, asymptotic_start_round)[source]¶
Compute top1 and top3 accuracy.
- Parameters
  - gt_novel (ndarray) – Nx1 vector with each element 0 (not novel) or 1 (novel)
  - p_class (ndarray) – Nx(K+1) matrix with each row corresponding to K+1 class probabilities for each sample
  - gt_class (ndarray) – Nx1 vector with ground-truth class for each sample
  - round_size (int) – Number of samples in a single round of the test
  - asymptotic_start_round (int) – Round id where metric computation starts
- Returns
Dictionary with results
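For intuition, the top-1/top-3 computation can be sketched in plain NumPy; `topk_accuracy` below is an illustrative stand-in, not the library's implementation:

```python
import numpy as np

def topk_accuracy(p_class: np.ndarray, gt_class: np.ndarray, k: int = 1) -> float:
    """Fraction of samples whose ground-truth class is among the k highest scores."""
    # Indices of the k highest-probability classes for each row
    topk = np.argsort(p_class, axis=1)[:, -k:]
    hits = np.any(topk == gt_class.reshape(-1, 1), axis=1)
    return float(hits.mean())

# Three samples, three classes
p_class = np.array([[0.7, 0.2, 0.1],
                    [0.1, 0.6, 0.3],
                    [0.3, 0.3, 0.4]])
gt_class = np.array([0, 2, 2])
top1 = topk_accuracy(p_class, gt_class, k=1)  # 2 of 3 samples correct
top3 = topk_accuracy(p_class, gt_class, k=3)  # all classes in top-3, so 1.0
```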
- sail_on_client.evaluate.metrics.m_accuracy_on_novel(p_class, gt_class, gt_novel)[source]¶
Additional Metric: Novelty robustness.
The method computes top-K accuracy for only the novel samples
- Parameters
  - p_class (ndarray) – Nx(K+1) matrix with each row corresponding to K+1 class probabilities for each sample
  - gt_class (ndarray) – Nx1 vector with ground-truth class for each sample
  - gt_novel (ndarray) – Nx1 binary vector corresponding to the ground truth novel{1}/seen{0} labels
  - k – K value to compute accuracy at
- Returns
  Accuracy at rank-k
- sail_on_client.evaluate.metrics.m_ndp(p_novel, gt_novel, mode='full_test')[source]¶
Program Metric: Novelty detection performance.
The method computes per-sample novelty detection performance
- Parameters
  - p_novel (ndarray) – Nx1 vector with each element corresponding to probability of novelty
  - gt_novel (ndarray) – Nx1 vector with each element 0 (not novel) or 1 (novel)
  - mode (str) – if ‘full_test’ computes on all test samples, if ‘post_novelty’ computes from the first GT novel sample
- Returns
Accuracy, Precision, Recall, F1_score and Confusion matrix
- Return type
Dictionary of various metrics
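A minimal sketch of such per-sample detection metrics, assuming predictions are thresholded at 0.5 (the helper name and threshold are illustrative, not the library's API):

```python
import numpy as np

def novelty_detection_stats(p_novel, gt_novel, thresh=0.5):
    """Per-sample novelty detection performance from probabilities and 0/1 labels."""
    pred = (np.asarray(p_novel) >= thresh).astype(int)
    gt = np.asarray(gt_novel).astype(int)
    tp = int(np.sum((pred == 1) & (gt == 1)))
    fp = int(np.sum((pred == 1) & (gt == 0)))
    tn = int(np.sum((pred == 0) & (gt == 0)))
    fn = int(np.sum((pred == 0) & (gt == 1)))
    accuracy = (tp + tn) / max(len(gt), 1)
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1_score": f1, "confusion": [[tn, fp], [fn, tp]]}

# One TP, one TN, one FP, one FN
stats = novelty_detection_stats([0.9, 0.2, 0.8, 0.4], [1, 0, 0, 1])
```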
- sail_on_client.evaluate.metrics.m_ndp_failed_reaction(p_novel, gt_novel, p_class, gt_class, mode='full_test')[source]¶
Additional Metric: Novelty detection when reaction fails.
The method computes novelty detection performance only on samples with incorrect k-class predictions.
- Parameters
  - p_novel (ndarray) – Nx1 vector with each element corresponding to probability of novelty
  - gt_novel (ndarray) – Nx1 vector with each element 0 (not novel) or 1 (novel)
  - p_class (ndarray) – Nx(K+1) matrix with each row corresponding to K+1 class probabilities for each sample
  - gt_class (ndarray) – Nx1 vector with ground-truth class for each sample
  - mode (str) – if ‘full_test’ computes on all test samples, if ‘post_novelty’ computes from the first GT novel sample
  - k – ‘k’ used in top-K accuracy
- Returns
Accuracy, Precision, Recall, F1_score and Confusion matrix
- Return type
Dictionary of various metrics
- sail_on_client.evaluate.metrics.m_ndp_post(p_novel, gt_novel)[source]¶
Additional Metric: Novelty detection performance after novelty is introduced.
- sail_on_client.evaluate.metrics.m_ndp_pre(p_novel, gt_novel)[source]¶
Additional Metric: Novelty detection performance before novelty is introduced.
- sail_on_client.evaluate.metrics.m_num(p_novel, gt_novel)[source]¶
Program Metric: Number of samples needed for detecting novelty.
The method computes the number of GT novel samples needed to predict the first true positive.
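This count can be sketched as follows, assuming a 0.5 decision threshold (both the helper name and threshold are illustrative):

```python
import numpy as np

def samples_to_first_true_positive(p_novel, gt_novel, thresh=0.5):
    """Number of ground-truth novel samples seen up to the first true positive."""
    pred = np.asarray(p_novel) >= thresh
    gt = np.asarray(gt_novel).astype(bool)
    count = 0
    for is_pred, is_novel in zip(pred, gt):
        if is_novel:
            count += 1
            if is_pred:  # first true positive reached
                return count
    return count  # no true positive: every novel sample was missed

# Novel samples at indices 1, 2, 3; first detected one is index 3
n = samples_to_first_true_positive([0.1, 0.2, 0.3, 0.9], [0, 1, 1, 1])  # 3
```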
Base Class For Metric¶
Source code: sail_on_client/evaluate/program_metrics.py
Abstract Class for metrics for sail-on.
- class sail_on_client.evaluate.program_metrics.ProgramMetrics[source]¶
Abstract program metric class.
- abstract m_acc(gt_novel, p_class, gt_class, round_size, asymptotic_start_round)[source]¶
m_acc abstract function.
- Parameters
- Returns
Dictionary containing top1, top3 accuracy over the test, pre and post novelty.
- abstract m_accuracy_on_novel(p_novel, gt_class, gt_novel)[source]¶
m_accuracy_on_novel abstract function.
- abstract m_is_cdt_and_is_early(gt_idx, ta2_idx, test_len)[source]¶
m_is_cdt_and_is_early abstract function.
- Parameters
- Return type
Dict
- Returns
Dictionary containing booleans showing if change was detected and if it was detected early
- abstract m_ndp_failed_reaction(p_novel, gt_novel, p_class, gt_class)[source]¶
m_ndp_failed_reaction abstract function.
Activity Recognition Metric¶
Source code: sail_on_client/evaluate/activity_recognition.py
Activity Recognition Class for metrics for sail-on.
- class sail_on_client.evaluate.activity_recognition.ActivityRecognitionMetrics[source]¶
Activity Recognition program metric class.
- __init__(protocol, video_id, novel, detection, classification, spatial, temporal)[source]¶
Initialize.
- Parameters
  - protocol (str) – Name of the protocol
  - video_id (int) – Column id for video
  - novel (int) – Column id for predicting if change was detected
  - detection (int) – Column id for predicting sample wise novelty
  - classification (int) – Column id for predicting sample wise classes
  - spatial (int) – Column id for predicting spatial attribute
  - temporal (int) – Column id for predicting temporal attribute
- Returns
  None
- m_acc(gt_novel, p_class, gt_class, round_size, asymptotic_start_round)[source]¶
m_acc function.
- Parameters
  - gt_novel (DataFrame) – ground truth detections for N videos (Dimension: N X 1)
  - p_class (DataFrame) – class predictions with video id for N videos (Dimension: N X 90 [vid, novel_class, 88 known class])
  - gt_class (DataFrame) – ground truth classes for N videos (Dimension: N X 1)
  - round_size (int) – size of the round
  - asymptotic_start_round (int) – asymptotic samples considered for computing metrics
- Returns
Dictionary containing top1, top3 accuracy over the test, pre and post novelty.
- m_accuracy_on_novel(p_class, gt_class, gt_novel)[source]¶
m_accuracy_on_novel function.
- Parameters
- Returns
Accuracy on novel samples
- m_is_cdt_and_is_early(gt_idx, ta2_idx, test_len)[source]¶
m_is_cdt_and_is_early function.
- Parameters
- Return type
Dict
- Returns
Dictionary containing booleans showing if change was detected and if it was detected early
- m_ndp_failed_reaction(p_novel, gt_novel, p_class, gt_class)[source]¶
m_ndp_failed_reaction function.
- Parameters
  - p_novel (DataFrame) – detection predictions for N videos (Dimension: N X 1)
  - gt_novel (DataFrame) – ground truth detections for N videos (Dimension: N X 1)
  - p_class (DataFrame) – class predictions with video id for N videos (Dimension: N X 90 [vid, novel_class, 88 known class])
  - gt_class (DataFrame) – ground truth classes for N videos (Dimension: N X 1)
- Returns
Dictionary containing TP, FP, TN, FN, top1, top3 accuracy over the test.
Document Transcription Metric¶
Source code: sail_on_client/evaluate/document_transcription.py
Document Transcription Class for metrics for sail-on.
- class sail_on_client.evaluate.document_transcription.DocumentTranscriptionMetrics[source]¶
Document transcription program metric class.
- __init__(protocol, image_id, text, novel, representation, detection, classification, pen_pressure, letter_size, word_spacing, slant_angle, attribute)[source]¶
Initialize.
- Parameters
  - protocol (str) – Name of the protocol
  - image_id (int) – Column id for image
  - text (int) – Transcription associated with the image
  - novel (int) – Column id for predicting if change was detected
  - representation (int) – Column id with representation novelty label
  - detection (int) – Column id with sample wise novelty
  - classification (int) – Column id with writer id
  - pen_pressure (int) – Column id with pen pressure values
  - letter_size (int) – Column id with letter size values
  - word_spacing (int) – Column id with word spacing values
  - slant_angle (int) – Column id with slant angle values
  - attribute (int) – Column id with attribute level novelty label
- Returns
  None
- m_acc(gt_novel, p_class, gt_class, round_size, asymptotic_start_round)[source]¶
m_acc helper function used for computing novelty reaction performance.
- Parameters
  - gt_novel (DataFrame) – ground truth detections (Dimension: [img X detection])
  - p_class (DataFrame) – class predictions (Dimension: [img X prob that sample is novel, prob of known classes])
  - gt_class (DataFrame) – ground truth classes (Dimension: [img X class idx])
  - round_size (int) – size of the round
  - asymptotic_start_round (int) – asymptotic samples considered for computing metrics
- Returns
Dictionary containing top1, top3 accuracy over the test, pre and post novelty.
- m_accuracy_on_novel(p_class, gt_class, gt_novel)[source]¶
Additional Metric: Novelty robustness.
Not implemented, since there is no gt_class info for novel samples. The method computes top-K accuracy for only the novel samples.
- Parameters
- Returns
  Accuracy on novel samples
- m_is_cdt_and_is_early(gt_idx, ta2_idx, test_len)[source]¶
Is change detection and is change detection early (m_is_cdt_and_is_early) function.
- Parameters
- Return type
Dict
- Returns
Dictionary containing booleans showing if change was detected and if it was detected early
- m_ndp(p_novel, gt_novel)[source]¶
m_ndp function.
Novelty detection performance. The method computes per-sample novelty detection performance over the entire test.
- m_ndp_failed_reaction(p_novel, gt_novel, p_class, gt_class)[source]¶
m_ndp_failed_reaction function.
Not implemented, since there is no gt_class info for novel samples. The method computes novelty detection performance only on samples with incorrect k-class predictions.
- Parameters
  - p_novel (DataFrame) – detection predictions (Dimension: [img X novel])
  - gt_novel (DataFrame) – ground truth detections (Dimension: [img X detection])
  - p_class (DataFrame) – class predictions (Dimension: [img X prob that sample is novel, prob of known classes])
  - gt_class (DataFrame) – ground truth classes (Dimension: [img X class idx])
- Returns
Dictionary containing TP, FP, TN, FN, top1, top3 accuracy over the test.
- m_ndp_post(p_novel, gt_novel)[source]¶
m_ndp_post function.
See m_ndp() with post_novelty. This computes from the first GT novel sample.
- Parameters
  - p_novel (ndarray) – detection predictions (Dimension: [img X novel])
  - gt_novel (ndarray) – ground truth detections (Dimension: [img X detection])
- Returns
  Dictionary containing detection performance post novelty.
- m_ndp_pre(p_novel, gt_novel)[source]¶
m_ndp_pre function.
See m_ndp() with pre_novelty. This computes up to the first GT novel sample. It is mainly included for completeness; the result should always be 0 since no true positives are possible.
- m_num(p_novel, gt_novel)[source]¶
m_num function.
Program Metric: number of samples needed for detecting novelty. The method computes the number of GT novel samples needed to predict the first true positive.
Image Classification Metric¶
Source code: sail_on_client/evaluate/image_classification.py
Image Classification Class for metrics for sail-on.
- class sail_on_client.evaluate.image_classification.ImageClassificationMetrics[source]¶
Image Classification program metric class.
- m_acc(gt_novel, p_class, gt_class, round_size, asymptotic_start_round)[source]¶
m_acc function.
- Parameters
  - gt_novel (DataFrame) – ground truth detections (Dimension: [img X detection])
  - p_class (DataFrame) – class predictions (Dimension: [img X prob that sample is novel, prob of 88 known classes])
  - gt_class (DataFrame) – ground truth classes (Dimension: [img X detection, classification])
  - round_size (int) – size of the round
  - asymptotic_start_round (int) – asymptotic samples considered for computing metrics
- Returns
Dictionary containing top1, top3 accuracy over the test, pre and post novelty.
- m_accuracy_on_novel(p_class, gt_class, gt_novel)[source]¶
Additional Metric: Novelty robustness.
The method computes top-K accuracy for only the novel samples
- Parameters
  - p_class (DataFrame) – class predictions (Dimension: [img X prob that sample is novel, prob of known classes]); Nx(K+1) matrix with each row corresponding to K+1 class probabilities for each sample
  - gt_class (DataFrame) – ground truth classes (Dimension: [img X classification]); Nx1 vector with ground-truth class for each sample
  - gt_novel (DataFrame) – ground truth detections (Dimension: N X [img, classification]); Nx1 binary vector corresponding to the ground truth novel{1}/seen{0} labels
- Returns
Accuracy on novel samples
- m_is_cdt_and_is_early(gt_idx, ta2_idx, test_len)[source]¶
Is change detection and is change detection early (m_is_cdt_and_is_early) function.
- Parameters
- Return type
Dict
- Returns
Dictionary containing booleans showing if change was detected and if it was detected early
- m_ndp(p_novel, gt_novel, mode='full_test')[source]¶
Novelty Detection Performance: Program Metric.
Novelty detection performance. The method computes per-sample novelty detection performance.
- Parameters
  - p_novel (ndarray) – detection predictions (Dimension: [img X novel]); Nx1 vector with each element corresponding to the probability of it being novel
  - gt_novel (ndarray) – ground truth detections (Dimension: [img X detection]); Nx1 vector with each element 0 (not novel) or 1 (novel)
  - mode (str) – the mode to compute the test: if ‘full_test’, computes on all test samples; if ‘post_novelty’, computes from the first GT novel sample; if ‘pre_novelty’, only calculates before the first novel sample
- Returns
Dictionary containing novelty detection performance over the test.
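The three modes can be understood as restricting the evaluation window relative to the first ground-truth novel sample; the following sketch (with an illustrative helper name) shows one plausible slicing:

```python
import numpy as np

def slice_by_mode(p_novel, gt_novel, mode="full_test"):
    """Restrict (p_novel, gt_novel) to the window implied by the evaluation mode."""
    p = np.asarray(p_novel)
    gt = np.asarray(gt_novel).astype(int)
    novel_idx = np.flatnonzero(gt == 1)
    first = novel_idx[0] if novel_idx.size else len(gt)
    if mode == "post_novelty":   # from the first GT novel sample onward
        return p[first:], gt[first:]
    if mode == "pre_novelty":    # strictly before the first GT novel sample
        return p[:first], gt[:first]
    return p, gt                 # full_test: everything

p_post, gt_post = slice_by_mode([0.1, 0.2, 0.9, 0.8], [0, 0, 1, 1], "post_novelty")
```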
- m_ndp_failed_reaction(p_novel, gt_novel, p_class, gt_class, mode='full_test')[source]¶
Additional Metric: Novelty detection when reaction fails.
Not implemented, since there is no gt_class info for novel samples.
The method computes novelty detection performance only on samples with incorrect k-class predictions.
- Parameters
  - p_novel (DataFrame) – detection predictions (Dimension: [img X novel]); Nx1 vector with each element corresponding to probability of novelty
  - gt_novel (DataFrame) – ground truth detections (Dimension: [img X detection]); Nx1 vector with each element 0 (not novel) or 1 (novel)
  - p_class (DataFrame) – class predictions (Dimension: [img X prob that sample is novel, prob of 88 known classes]); Nx(K+1) matrix with each row corresponding to K+1 class probabilities for each sample
  - gt_class (DataFrame) – ground truth classes (Dimension: [img X classification]); Nx1 vector with ground-truth class for each sample
  - mode (str) – if ‘full_test’, computes on all test samples; if ‘post_novelty’, computes from the first GT novel sample; if ‘pre_novelty’, computes on everything before novelty is introduced
- Returns
Dictionary containing TP, FP, TN, FN, top1, top3 accuracy over the test.
- m_ndp_post(p_novel, gt_novel)[source]¶
Novelty Detection Performance Post Red Light.
See m_ndp() with post_novelty. This computes from the first GT novel sample.
- m_ndp_pre(p_novel, gt_novel)[source]¶
Novelty Detection Performance Pre Red Light.
See m_ndp() with pre_novelty. This computes up to the first GT novel sample. It is mainly included for completeness; the result should always be 0 since no true positives are possible.
- m_num(p_novel, gt_novel)[source]¶
m_num function.
Program Metric: number of samples needed for detecting novelty. The method computes the number of GT novel samples needed to predict the first true positive.
- Parameters
- Returns
Difference between the novelty introduction and predicting change in world.
- m_num_stats(p_novel, gt_novel)[source]¶
Program Metric.
Number of samples needed for detecting novelty. The method computes the number of GT novel samples needed to predict the first true positive.
- Parameters
- Returns
Dictionary containing indices for novelty introduction and change in world prediction.
Utility Methods¶
Source code: sail_on_client/evaluate/utils.py
Helper functions for metrics.
- sail_on_client.evaluate.utils.check_class_validity(p_class, gt_class)[source]¶
Check the validity of the inputs for image classification.
- Parameters
  - p_class (ndarray) – Nx(K+1) matrix with each row corresponding to K+1 class probabilities for each sample
  - gt_class (ndarray) – Nx1 vector with ground-truth class for each sample
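The kind of checks such a validator might perform can be sketched as follows; the exact checks in the library may differ:

```python
import numpy as np

def check_class_validity(p_class: np.ndarray, gt_class: np.ndarray) -> None:
    """Raise ValueError if the prediction matrix and label vector are inconsistent."""
    if p_class.ndim != 2:
        raise ValueError("p_class must be an Nx(K+1) matrix")
    if gt_class.shape[0] != p_class.shape[0]:
        raise ValueError("p_class and gt_class must have the same number of rows")
    if np.any(p_class < 0) or np.any(p_class > 1):
        raise ValueError("class probabilities must lie in [0, 1]")

p = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
gt = np.array([0, 1])
check_class_validity(p, gt)  # consistent inputs pass silently
```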
- sail_on_client.evaluate.utils.check_novel_validity(p_novel, gt_novel)[source]¶
Check the validity of the inputs for per-sample novelty detection.
- sail_on_client.evaluate.utils.get_first_detect_novelty(p_novel, thresh)[source]¶
Find the first index where novelty is detected.
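A minimal sketch of this lookup, returning a sentinel one past the end when novelty is never detected (the library's convention may differ):

```python
import numpy as np

def get_first_detect_novelty(p_novel, thresh):
    """0-based index of the first sample whose novelty probability crosses thresh."""
    above = np.asarray(p_novel) >= thresh
    if not above.any():
        return len(above)  # never detected: sentinel one past the end
    return int(np.argmax(above))  # argmax of a boolean array is the first True

idx = get_first_detect_novelty([0.1, 0.4, 0.7, 0.9], thresh=0.5)  # 2
```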
- sail_on_client.evaluate.utils.get_rolling_stats(p_class, gt_class, k=1, window_size=50)[source]¶
Compute rolling statistics which are used for robustness measures.
- Parameters
- Returns
List with mean and standard deviation
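One plausible reading, shown as a sketch: split the per-sample top-k hit sequence into consecutive windows and report the mean and standard deviation of the windowed accuracies (the helper name and exact windowing are assumptions):

```python
import numpy as np

def rolling_accuracy_stats(p_class, gt_class, k=1, window_size=50):
    """Mean and std of windowed top-k accuracy, a simple robustness measure."""
    # Per-sample top-k hit indicator
    topk = np.argsort(p_class, axis=1)[:, -k:]
    hits = np.any(topk == np.asarray(gt_class).reshape(-1, 1), axis=1).astype(float)
    # Consecutive, non-overlapping windows of window_size samples
    windows = [hits[i:i + window_size] for i in range(0, len(hits), window_size)]
    accs = np.array([w.mean() for w in windows])
    return [float(accs.mean()), float(accs.std())]

rng = np.random.default_rng(0)
p = rng.random((100, 4))
gt = rng.integers(0, 4, size=100)
mean_acc, std_acc = rolling_accuracy_stats(p, gt, k=1, window_size=50)
```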
- sail_on_client.evaluate.utils.top1_accuracy(p_class, gt_class, txt='')[source]¶
Compute top-1 accuracy (see topk_accuracy() for details).
- sail_on_client.evaluate.utils.top3_accuracy(p_class, gt_class, txt='')[source]¶
Compute top-3 accuracy (see topk_accuracy() for details).