Metric API¶
Methods For Metric¶
Source code: sail_on_client/evaluate/metrics.py
Program Metric Functions.
- sail_on_client.evaluate.metrics.m_acc(gt_novel, p_class, gt_class, round_size, asymptotic_start_round)[source]¶
Compute top1 and top3 accuracy.
- Parameters
  - gt_novel (ndarray) – Nx1 vector with each element 0 (not novel) or 1 (novel)
  - p_class (ndarray) – Nx(K+1) matrix with each row corresponding to K+1 class probabilities for each sample
  - gt_class (ndarray) – Nx1 vector with ground-truth class for each sample
  - round_size (int) – Number of samples in a single round of the test
  - asymptotic_start_round (int) – Round id where metric computation starts
- Returns
Dictionary with results
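For intuition, the top-1/top-3 computation can be sketched in plain NumPy; `topk_accuracy` below is an illustrative stand-in, not the library's implementation:

```python
import numpy as np

def topk_accuracy(p_class: np.ndarray, gt_class: np.ndarray, k: int = 1) -> float:
    """Fraction of samples whose ground-truth class is among the k highest scores."""
    # Indices of the k highest-probability classes for each row
    topk = np.argsort(p_class, axis=1)[:, -k:]
    hits = np.any(topk == gt_class.reshape(-1, 1), axis=1)
    return float(hits.mean())

# Three samples, three classes
p_class = np.array([[0.7, 0.2, 0.1],
                    [0.1, 0.6, 0.3],
                    [0.3, 0.3, 0.4]])
gt_class = np.array([0, 2, 2])
top1 = topk_accuracy(p_class, gt_class, k=1)  # 2 of 3 samples correct
top3 = topk_accuracy(p_class, gt_class, k=3)  # all classes in top-3, so 1.0
```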
- sail_on_client.evaluate.metrics.m_accuracy_on_novel(p_class, gt_class, gt_novel)[source]¶
Additional Metric: Novelty robustness.
The method computes top-K accuracy for only the novel samples
- Parameters
  - p_class (ndarray) – Nx(K+1) matrix with each row corresponding to K+1 class probabilities for each sample
  - gt_class (ndarray) – Nx1 vector with ground-truth class for each sample
  - gt_novel (ndarray) – Nx1 binary vector corresponding to the ground truth novel{1}/seen{0} labels
  - k – K value to compute accuracy at
- Returns
  Accuracy at rank-k
- sail_on_client.evaluate.metrics.m_ndp(p_novel, gt_novel, mode='full_test')[source]¶
Program Metric: Novelty detection performance.
The method computes per-sample novelty detection performance
- Parameters
  - p_novel (ndarray) – Nx1 vector with each element corresponding to probability of novelty
  - gt_novel (ndarray) – Nx1 vector with each element 0 (not novel) or 1 (novel)
  - mode (str) – if ‘full_test’ computes on all test samples, if ‘post_novelty’ computes from the first GT novel sample
- Returns
Accuracy, Precision, Recall, F1_score and Confusion matrix
- Return type
Dictionary of various metrics
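A minimal sketch of such per-sample detection metrics, assuming predictions are thresholded at 0.5 (the helper name and threshold are illustrative, not the library's API):

```python
import numpy as np

def novelty_detection_stats(p_novel, gt_novel, thresh=0.5):
    """Per-sample novelty detection performance from probabilities and 0/1 labels."""
    pred = (np.asarray(p_novel) >= thresh).astype(int)
    gt = np.asarray(gt_novel).astype(int)
    tp = int(np.sum((pred == 1) & (gt == 1)))
    fp = int(np.sum((pred == 1) & (gt == 0)))
    tn = int(np.sum((pred == 0) & (gt == 0)))
    fn = int(np.sum((pred == 0) & (gt == 1)))
    accuracy = (tp + tn) / max(len(gt), 1)
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1_score": f1, "confusion": [[tn, fp], [fn, tp]]}

# One TP, one TN, one FP, one FN
stats = novelty_detection_stats([0.9, 0.2, 0.8, 0.4], [1, 0, 0, 1])
```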
- sail_on_client.evaluate.metrics.m_ndp_failed_reaction(p_novel, gt_novel, p_class, gt_class, mode='full_test')[source]¶
Additional Metric: Novelty detection when reaction fails.
The method computes novelty detection performance only on samples with incorrect k-class predictions.
- Parameters
  - p_novel (ndarray) – Nx1 vector with each element corresponding to probability of novelty
  - gt_novel (ndarray) – Nx1 vector with each element 0 (not novel) or 1 (novel)
  - p_class (ndarray) – Nx(K+1) matrix with each row corresponding to K+1 class probabilities for each sample
  - gt_class (ndarray) – Nx1 vector with ground-truth class for each sample
  - mode (str) – if ‘full_test’ computes on all test samples, if ‘post_novelty’ computes from the first GT novel sample
  - k – ‘k’ used in top-K accuracy
- Returns
Accuracy, Precision, Recall, F1_score and Confusion matrix
- Return type
Dictionary of various metrics
- sail_on_client.evaluate.metrics.m_ndp_post(p_novel, gt_novel)[source]¶
Additional Metric: Novelty detection performance after novelty is introduced.
- sail_on_client.evaluate.metrics.m_ndp_pre(p_novel, gt_novel)[source]¶
Additional Metric: Novelty detection performance before novelty is introduced.
- sail_on_client.evaluate.metrics.m_num(p_novel, gt_novel)[source]¶
Program Metric: Number of samples needed for detecting novelty.
The method computes the number of GT novel samples needed to predict the first true positive.
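This count can be sketched as follows, assuming a 0.5 decision threshold (both the helper name and threshold are illustrative):

```python
import numpy as np

def samples_to_first_true_positive(p_novel, gt_novel, thresh=0.5):
    """Number of ground-truth novel samples seen up to the first true positive."""
    pred = np.asarray(p_novel) >= thresh
    gt = np.asarray(gt_novel).astype(bool)
    count = 0
    for is_pred, is_novel in zip(pred, gt):
        if is_novel:
            count += 1
            if is_pred:  # first true positive reached
                return count
    return count  # no true positive: every novel sample was missed

# Novel samples at indices 1, 2, 3; first detected one is index 3
n = samples_to_first_true_positive([0.1, 0.2, 0.3, 0.9], [0, 1, 1, 1])  # 3
```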
Base Class For Metric¶
Source code: sail_on_client/evaluate/program_metrics.py
Abstract Class for metrics for sail-on.
- class sail_on_client.evaluate.program_metrics.ProgramMetrics[source]¶
Abstract program metric class.
- abstract m_acc(gt_novel, p_class, gt_class, round_size, asymptotic_start_round)[source]¶
m_acc abstract function.
- Parameters
- Returns
Dictionary containing top1, top3 accuracy over the test, pre and post novelty.
- abstract m_accuracy_on_novel(p_novel, gt_class, gt_novel)[source]¶
m_accuracy_on_novel abstract function.
- abstract m_is_cdt_and_is_early(gt_idx, ta2_idx, test_len)[source]¶
m_is_cdt_and_is_early abstract function.
- Parameters
- Return type
Dict
- Returns
Dictionary containing booleans showing if change was detected and if it was detected early
- abstract m_ndp_failed_reaction(p_novel, gt_novel, p_class, gt_class)[source]¶
m_ndp_failed_reaction abstract function.
Activity Recognition Metric¶
Source code: sail_on_client/evaluate/activity_recognition.py
Activity Recognition Class for metrics for sail-on.
- class sail_on_client.evaluate.activity_recognition.ActivityRecognitionMetrics[source]¶
Activity Recognition program metric class.
- __init__(protocol, video_id, novel, detection, classification, spatial, temporal)[source]¶
Initialize.
- Parameters
  - protocol (str) – Name of the protocol
  - video_id (int) – Column id for video
  - novel (int) – Column id for predicting if change was detected
  - detection (int) – Column id for predicting sample wise novelty
  - classification (int) – Column id for predicting sample wise classes
  - spatial (int) – Column id for predicting spatial attribute
  - temporal (int) – Column id for predicting temporal attribute
- Returns
  None
- m_acc(gt_novel, p_class, gt_class, round_size, asymptotic_start_round)[source]¶
m_acc function.
- Parameters
  - gt_novel (DataFrame) – ground truth detections for N videos (Dimension: N X 1)
  - p_class (DataFrame) – class predictions with video id for N videos (Dimension: N X 90 [vid, novel_class, 88 known class])
  - gt_class (DataFrame) – ground truth classes for N videos (Dimension: N X 1)
  - round_size (int) – size of the round
  - asymptotic_start_round (int) – asymptotic samples considered for computing metrics
- Returns
Dictionary containing top1, top3 accuracy over the test, pre and post novelty.
- m_accuracy_on_novel(p_class, gt_class, gt_novel)[source]¶
m_accuracy_on_novel function.
- Parameters
- Returns
Accuracy on novel samples
- m_is_cdt_and_is_early(gt_idx, ta2_idx, test_len)[source]¶
m_is_cdt_and_is_early function.
- Parameters
- Return type
Dict
- Returns
Dictionary containing booleans showing if change was detected and if it was detected early
- m_ndp_failed_reaction(p_novel, gt_novel, p_class, gt_class)[source]¶
m_ndp_failed_reaction function.
- Parameters
  - p_novel (DataFrame) – detection predictions for N videos (Dimension: N X 1)
  - gt_novel (DataFrame) – ground truth detections for N videos (Dimension: N X 1)
  - p_class (DataFrame) – class predictions with video id for N videos (Dimension: N X 90 [vid, novel_class, 88 known class])
  - gt_class (DataFrame) – ground truth classes for N videos (Dimension: N X 1)
- Returns
Dictionary containing TP, FP, TN, FN, top1, top3 accuracy over the test.
Document Transcription Metric¶
Source code: sail_on_client/evaluate/document_transcription.py
Document Transcription Class for metrics for sail-on.
- class sail_on_client.evaluate.document_transcription.DocumentTranscriptionMetrics[source]¶
Document transcription program metric class.
- __init__(protocol, image_id, text, novel, representation, detection, classification, pen_pressure, letter_size, word_spacing, slant_angle, attribute)[source]¶
Initialize.
- Parameters
  - protocol (str) – Name of the protocol
  - image_id (int) – Column id for image
  - text (int) – Transcription associated with the image
  - novel (int) – Column id for predicting if change was detected
  - representation (int) – Column id with representation novelty label
  - detection (int) – Column id with sample wise novelty
  - classification (int) – Column id with writer id
  - pen_pressure (int) – Column id with pen pressure values
  - letter_size (int) – Column id with letter size values
  - word_spacing (int) – Column id with word spacing values
  - slant_angle (int) – Column id with slant angle values
  - attribute (int) – Column id with attribute level novelty label
- Returns
  None
- m_acc(gt_novel, p_class, gt_class, round_size, asymptotic_start_round)[source]¶
m_acc helper function used for computing novelty reaction performance.
- Parameters
  - gt_novel (DataFrame) – ground truth detections (Dimension: [img X detection])
  - p_class (DataFrame) – class predictions (Dimension: [img X prob that sample is novel, prob of known classes])
  - gt_class (DataFrame) – ground truth classes (Dimension: [img X class idx])
  - round_size (int) – size of the round
  - asymptotic_start_round (int) – asymptotic samples considered for computing metrics
- Returns
Dictionary containing top1, top3 accuracy over the test, pre and post novelty.
- m_accuracy_on_novel(p_class, gt_class, gt_novel)[source]¶
Additional Metric: Novelty robustness.
Not implemented, since there is no gt_class info for novel samples. The method computes top-K accuracy for only the novel samples.
- Parameters
- Returns
  Accuracy on novel samples
- m_is_cdt_and_is_early(gt_idx, ta2_idx, test_len)[source]¶
Is change detection and is change detection early (m_is_cdt_and_is_early) function.
- Parameters
- Return type
Dict
- Returns
Dictionary containing booleans showing if change was detected and if it was detected early
- m_ndp(p_novel, gt_novel)[source]¶
m_ndp function.
Novelty detection performance. The method computes per-sample novelty detection performance over the entire test.
- m_ndp_failed_reaction(p_novel, gt_novel, p_class, gt_class)[source]¶
m_ndp_failed_reaction function.
Not implemented, since there is no gt_class info for novel samples. The method computes novelty detection performance only on samples with incorrect k-class predictions.
- Parameters
  - p_novel (DataFrame) – detection predictions (Dimension: [img X novel])
  - gt_novel (DataFrame) – ground truth detections (Dimension: [img X detection])
  - p_class (DataFrame) – class predictions (Dimension: [img X prob that sample is novel, prob of known classes])
  - gt_class (DataFrame) – ground truth classes (Dimension: [img X class idx])
- Returns
Dictionary containing TP, FP, TN, FN, top1, top3 accuracy over the test.
- m_ndp_post(p_novel, gt_novel)[source]¶
m_ndp_post function.
See m_ndp() with post_novelty. This computes from the first GT novel sample.
- Parameters
  - p_novel (ndarray) – detection predictions (Dimension: [img X novel])
  - gt_novel (ndarray) – ground truth detections (Dimension: [img X detection])
- Returns
  Dictionary containing detection performance post novelty.
- m_ndp_pre(p_novel, gt_novel)[source]¶
m_ndp_pre function.
See m_ndp() with pre_novelty. This computes up to the first GT novel sample. It is mainly included for completeness; the result should always be 0 since no true positives are possible.
- m_num(p_novel, gt_novel)[source]¶
m_num function.
Program Metric: number of samples needed for detecting novelty. The method computes the number of GT novel samples needed to predict the first true positive.
Image Classification Metric¶
Source code: sail_on_client/evaluate/image_classification.py
Image Classification Class for metrics for sail-on.
- class sail_on_client.evaluate.image_classification.ImageClassificationMetrics[source]¶
Image Classification program metric class.
- m_acc(gt_novel, p_class, gt_class, round_size, asymptotic_start_round)[source]¶
m_acc function.
- Parameters
  - gt_novel (DataFrame) – ground truth detections (Dimension: [img X detection])
  - p_class (DataFrame) – class predictions (Dimension: [img X prob that sample is novel, prob of 88 known classes])
  - gt_class (DataFrame) – ground truth classes (Dimension: [img X detection, classification])
  - round_size (int) – size of the round
  - asymptotic_start_round (int) – asymptotic samples considered for computing metrics
- Returns
Dictionary containing top1, top3 accuracy over the test, pre and post novelty.
- m_accuracy_on_novel(p_class, gt_class, gt_novel)[source]¶
Additional Metric: Novelty robustness.
The method computes top-K accuracy for only the novel samples
- Parameters
  - p_class (DataFrame) – class predictions (Dimension: [img X prob that sample is novel, prob of known classes]); Nx(K+1) matrix with each row corresponding to K+1 class probabilities for each sample
  - gt_class (DataFrame) – ground truth classes (Dimension: [img X classification]); Nx1 vector with ground-truth class for each sample
  - gt_novel (DataFrame) – ground truth detections (Dimension: N X [img, classification]); Nx1 binary vector corresponding to the ground truth novel{1}/seen{0} labels
- Returns
Accuracy on novel samples
- m_is_cdt_and_is_early(gt_idx, ta2_idx, test_len)[source]¶
Is change detection and is change detection early (m_is_cdt_and_is_early) function.
- Parameters
- Return type
Dict
- Returns
Dictionary containing booleans showing if change was detected and if it was detected early
- m_ndp(p_novel, gt_novel, mode='full_test')[source]¶
Novelty Detection Performance: Program Metric.
Novelty detection performance. The method computes per-sample novelty detection performance.
- Parameters
  - p_novel (ndarray) – detection predictions (Dimension: [img X novel]); Nx1 vector with each element corresponding to the probability of it being novel
  - gt_novel (ndarray) – ground truth detections (Dimension: [img X detection]); Nx1 vector with each element 0 (not novel) or 1 (novel)
  - mode (str) – the mode to compute the test: if ‘full_test’, computes on all test samples; if ‘post_novelty’, computes from the first GT novel sample; if ‘pre_novelty’, only calculates before the first novel sample
- Returns
Dictionary containing novelty detection performance over the test.
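The three modes can be understood as restricting the evaluation window relative to the first ground-truth novel sample; the following sketch (with an illustrative helper name) shows one plausible slicing:

```python
import numpy as np

def slice_by_mode(p_novel, gt_novel, mode="full_test"):
    """Restrict (p_novel, gt_novel) to the window implied by the evaluation mode."""
    p = np.asarray(p_novel)
    gt = np.asarray(gt_novel).astype(int)
    novel_idx = np.flatnonzero(gt == 1)
    first = novel_idx[0] if novel_idx.size else len(gt)
    if mode == "post_novelty":   # from the first GT novel sample onward
        return p[first:], gt[first:]
    if mode == "pre_novelty":    # strictly before the first GT novel sample
        return p[:first], gt[:first]
    return p, gt                 # full_test: everything

p_post, gt_post = slice_by_mode([0.1, 0.2, 0.9, 0.8], [0, 0, 1, 1], "post_novelty")
```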
- m_ndp_failed_reaction(p_novel, gt_novel, p_class, gt_class, mode='full_test')[source]¶
Additional Metric: Novelty detection when reaction fails.
Not implemented, since there is no gt_class info for novel samples.
The method computes novelty detection performance only on samples with incorrect k-class predictions.
- Parameters
  - p_novel (DataFrame) – detection predictions (Dimension: [img X novel]); Nx1 vector with each element corresponding to probability of novelty
  - gt_novel (DataFrame) – ground truth detections (Dimension: [img X detection]); Nx1 vector with each element 0 (not novel) or 1 (novel)
  - p_class (DataFrame) – class predictions (Dimension: [img X prob that sample is novel, prob of 88 known classes]); Nx(K+1) matrix with each row corresponding to K+1 class probabilities for each sample
  - gt_class (DataFrame) – ground truth classes (Dimension: [img X classification]); Nx1 vector with ground-truth class for each sample
  - mode (str) – if ‘full_test’, computes on all test samples; if ‘post_novelty’, computes from the first GT novel sample; if ‘pre_novelty’, computes on everything before novelty is introduced
- Returns
Dictionary containing TP, FP, TN, FN, top1, top3 accuracy over the test.
- m_ndp_post(p_novel, gt_novel)[source]¶
Novelty Detection Performance Post Red Light.
See m_ndp() with post_novelty. This computes from the first GT novel sample.
- m_ndp_pre(p_novel, gt_novel)[source]¶
Novelty Detection Performance Pre Red Light.
See m_ndp() with pre_novelty. This computes up to the first GT novel sample. It is mainly included for completeness; the result should always be 0 since no true positives are possible.
- m_num(p_novel, gt_novel)[source]¶
m_num function.
Program Metric: number of samples needed for detecting novelty. The method computes the number of GT novel samples needed to predict the first true positive.
- Parameters
- Returns
Difference between the novelty introduction and predicting change in world.
- m_num_stats(p_novel, gt_novel)[source]¶
Program Metric.
Number of samples needed for detecting novelty. The method computes the number of GT novel samples needed to predict the first true positive.
- Parameters
- Returns
Dictionary containing indices for novelty introduction and change in world prediction.
Utility Methods¶
Source code: sail_on_client/evaluate/utils.py
Helper functions for metrics.
- sail_on_client.evaluate.utils.check_class_validity(p_class, gt_class)[source]¶
Check the validity of the inputs for image classification.
- Parameters
  - p_class (ndarray) – Nx(K+1) matrix with each row corresponding to K+1 class probabilities for each sample
  - gt_class (ndarray) – Nx1 vector with ground-truth class for each sample
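The kind of checks such a validator might perform can be sketched as follows; the exact checks in the library may differ:

```python
import numpy as np

def check_class_validity(p_class: np.ndarray, gt_class: np.ndarray) -> None:
    """Raise ValueError if the prediction matrix and label vector are inconsistent."""
    if p_class.ndim != 2:
        raise ValueError("p_class must be an Nx(K+1) matrix")
    if gt_class.shape[0] != p_class.shape[0]:
        raise ValueError("p_class and gt_class must have the same number of rows")
    if np.any(p_class < 0) or np.any(p_class > 1):
        raise ValueError("class probabilities must lie in [0, 1]")

p = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
gt = np.array([0, 1])
check_class_validity(p, gt)  # consistent inputs pass silently
```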
- sail_on_client.evaluate.utils.check_novel_validity(p_novel, gt_novel)[source]¶
Check the validity of the inputs for per-sample novelty detection.
- sail_on_client.evaluate.utils.get_first_detect_novelty(p_novel, thresh)[source]¶
Find the first index where novelty is detected.
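A minimal sketch of this lookup, returning a sentinel one past the end when novelty is never detected (the library's convention may differ):

```python
import numpy as np

def get_first_detect_novelty(p_novel, thresh):
    """0-based index of the first sample whose novelty probability crosses thresh."""
    above = np.asarray(p_novel) >= thresh
    if not above.any():
        return len(above)  # never detected: sentinel one past the end
    return int(np.argmax(above))  # argmax of a boolean array is the first True

idx = get_first_detect_novelty([0.1, 0.4, 0.7, 0.9], thresh=0.5)  # 2
```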
- sail_on_client.evaluate.utils.get_rolling_stats(p_class, gt_class, k=1, window_size=50)[source]¶
Compute rolling statistics which are used for robustness measures.
- Parameters
- Returns
List with mean and standard deviation
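One plausible reading, shown as a sketch: split the per-sample top-k hit sequence into consecutive windows and report the mean and standard deviation of the windowed accuracies (the helper name and exact windowing are assumptions):

```python
import numpy as np

def rolling_accuracy_stats(p_class, gt_class, k=1, window_size=50):
    """Mean and std of windowed top-k accuracy, a simple robustness measure."""
    # Per-sample top-k hit indicator
    topk = np.argsort(p_class, axis=1)[:, -k:]
    hits = np.any(topk == np.asarray(gt_class).reshape(-1, 1), axis=1).astype(float)
    # Consecutive, non-overlapping windows of window_size samples
    windows = [hits[i:i + window_size] for i in range(0, len(hits), window_size)]
    accs = np.array([w.mean() for w in windows])
    return [float(accs.mean()), float(accs.std())]

rng = np.random.default_rng(0)
p = rng.random((100, 4))
gt = rng.integers(0, 4, size=100)
mean_acc, std_acc = rolling_accuracy_stats(p, gt, k=1, window_size=50)
```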
- sail_on_client.evaluate.utils.top1_accuracy(p_class, gt_class, txt='')[source]¶
Compute top-1 accuracy (see topk_accuracy() for details).
- sail_on_client.evaluate.utils.top3_accuracy(p_class, gt_class, txt='')[source]¶
Compute top-3 accuracy (see topk_accuracy() for details).