Harness¶
Harnesses are used for testing and evaluating TA2 agents. The
evaluation can be conducted either against a server set up by TA1 or by
providing the files containing the ground truth and metadata
associated with the tests. The abstraction exists primarily to communicate with the
evaluation server, or to replicate the same functionality without a server by using files provided
by TA1s. Harnesses work in conjunction with the protocol classes to fulfill the input
and output requirements of an agent. The harnesses are subclasses of TestAndEvaluationHarness.
sail-on-client supports two harnesses: LocalHarness and ParHarness.
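To make the idea of a shared harness abstraction concrete, here is a minimal sketch of how one agent loop could run unchanged against different backends. Every name in it (`HarnessSketch`, `InMemoryHarness`, `run_agent`, and the method names) is hypothetical and chosen for illustration; it is not the actual interface of TestAndEvaluationHarness.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List, Tuple


class HarnessSketch(ABC):
    """Hypothetical stand-in for a harness base class (not the library's API)."""

    @abstractmethod
    def test_ids(self) -> List[str]:
        """Return the identifiers of the tests to run."""

    @abstractmethod
    def round_count(self, test_id: str) -> int:
        """Return how many rounds a test contains."""

    @abstractmethod
    def dataset(self, test_id: str, round_id: int) -> List[str]:
        """Return the items for one round of a test."""

    @abstractmethod
    def post_results(self, test_id: str, round_id: int, results: Dict[str, Any]) -> None:
        """Submit the agent's predictions for one round."""


class InMemoryHarness(HarnessSketch):
    """Toy backend that keeps everything in memory, so no server or files are needed."""

    def __init__(self, rounds: Dict[str, List[List[str]]]) -> None:
        self._rounds = rounds
        self.submitted: Dict[Tuple[str, int], Dict[str, Any]] = {}

    def test_ids(self) -> List[str]:
        return sorted(self._rounds)

    def round_count(self, test_id: str) -> int:
        return len(self._rounds[test_id])

    def dataset(self, test_id: str, round_id: int) -> List[str]:
        return list(self._rounds[test_id][round_id])

    def post_results(self, test_id: str, round_id: int, results: Dict[str, Any]) -> None:
        self.submitted[(test_id, round_id)] = results


def run_agent(harness: HarnessSketch, predict: Callable[[List[str]], Dict[str, Any]]) -> None:
    """Drive an agent through every round of every test using only the abstract interface."""
    for test_id in harness.test_ids():
        for round_id in range(harness.round_count(test_id)):
            batch = harness.dataset(test_id, round_id)
            harness.post_results(test_id, round_id, predict(batch))


# Usage: a trivial "agent" that scores every item 0.0.
harness = InMemoryHarness({"toy.test": [["a", "b"], ["c"]]})
run_agent(harness, lambda items: {item: 0.0 for item in items})
```

Because `run_agent` only touches the abstract methods, swapping the in-memory backend for a file-based or server-based one would not require changing the agent loop, which is the point of the abstraction described above.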
Local Harness¶
LocalHarness is primarily used for replicating the capabilities of
ParHarness without using the server. This allows an agent to be tested locally
without setting up a server instance locally or reaching one via a URL. Since LocalHarness reads
from files, it requires 3 parameters:
data_dir: Root directory where the data for tests is stored
gt_dir: Root directory where ground truth is stored
gt_config: A JSON file with column mapping for ground truth
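As a rough illustration of what these three parameters point at, the sketch below checks that the two directories exist and loads the JSON column mapping. The helper `load_local_harness_inputs`, the `LocalHarnessInputs` tuple, and the keys in the example mapping are all assumptions made for this example; the real LocalHarness constructor and the real layout of gt_config may differ.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, NamedTuple


class LocalHarnessInputs(NamedTuple):
    """Hypothetical container for the three file-based inputs."""

    data_dir: Path
    gt_dir: Path
    gt_config: Dict[str, Any]


def load_local_harness_inputs(data_dir: str, gt_dir: str, gt_config: str) -> LocalHarnessInputs:
    """Validate both root directories and parse the ground-truth column mapping."""
    data_path, gt_path = Path(data_dir), Path(gt_dir)
    for name, path in (("data_dir", data_path), ("gt_dir", gt_path)):
        if not path.is_dir():
            raise NotADirectoryError(f"{name} is not a directory: {path}")
    with Path(gt_config).open(encoding="utf-8") as handle:
        config = json.load(handle)
    return LocalHarnessInputs(data_path, gt_path, config)


# Usage with throwaway directories; the mapping keys below are made up.
with tempfile.TemporaryDirectory() as root:
    data_root = Path(root, "data")
    gt_root = Path(root, "gt")
    data_root.mkdir()
    gt_root.mkdir()
    config_file = Path(root, "gt_config.json")
    config_file.write_text(json.dumps({"detection": 1, "classification": 2}), encoding="utf-8")
    inputs = load_local_harness_inputs(str(data_root), str(gt_root), str(config_file))
```

Failing early on a missing directory keeps path mistakes from surfacing later as confusing errors in the middle of a test run.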
PAR Harness¶
ParHarness is primarily responsible for communicating with the evaluation
server set up by the TA1 team. The interface relies on a RESTful API (detailed in the next section) to provide the
following features:
Support batch inquiry (also called rounds) with full data set response.
Support batch response for evaluation.
Answer requests for multiple dataset types.
Include an option for ‘hints’ accompanying datasets.
Accept additional metadata along with annotations, labels and localization data (e.g., time intervals for video), as well as class and certainty scores.
Provide feedback as requested by the algorithm after the results for a batch have been submitted.
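The batch inquiry feature can be pictured as splitting a test's items into consecutive rounds. The sketch below shows that idea only; the function `iter_rounds`, the fixed `round_size`, and the file names are assumptions for illustration, since this page does not state how the server actually sizes or orders its rounds.

```python
from typing import Iterator, List, Sequence


def iter_rounds(items: Sequence[str], round_size: int) -> Iterator[List[str]]:
    """Yield consecutive batches (rounds) of at most round_size items."""
    if round_size <= 0:
        raise ValueError("round_size must be positive")
    for start in range(0, len(items), round_size):
        yield list(items[start:start + round_size])


# Usage: five made-up items split into rounds of two; the last round is shorter.
rounds = list(iter_rounds(["a.png", "b.png", "c.png", "d.png", "e.png"], 2))
```

In this picture, an agent would receive one round, submit results for it, optionally ask for feedback, and then move on to the next round.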
REST API¶
This section provides a detailed description of the RESTful API used for communication.
| Request Name | Request Type | Definition | Request Data | Response Data |
|---|---|---|---|---|
| Test Request | GET | TA2 Requests for Test Identifiers as part of a series of individual tests. | | 1. CSV file containing: Test ID(s) with the following naming convention: Protocol.Group.Run.Seed |
| New Session | POST | Create a new session to evaluate the detector using an empirical protocol. | | |
| Dataset Request | GET | Request data for evaluation. | | |
| Get Feedback | GET | | | |
| Get Metadata | GET | Get metadata for a test | | |
| Post Results | POST | Post client detector predictions for the dataset. | | |
| Evaluation | GET | Get results for test(s) | | |
| Terminate Session | DELETE | | | |
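Two pieces of the table above lend themselves to a short sketch: the HTTP method used by each request, and the Protocol.Group.Run.Seed naming convention for test IDs. The names `REQUEST_METHODS`, `ParsedTestId`, and `parse_test_id` are hypothetical, and the example ID is made up; endpoint paths and payload formats are not given on this page, so none are assumed here.

```python
from typing import Dict, NamedTuple

# HTTP method per request name, copied from the table above.
REQUEST_METHODS: Dict[str, str] = {
    "Test Request": "GET",
    "New Session": "POST",
    "Dataset Request": "GET",
    "Get Feedback": "GET",
    "Get Metadata": "GET",
    "Post Results": "POST",
    "Evaluation": "GET",
    "Terminate Session": "DELETE",
}


class ParsedTestId(NamedTuple):
    """The four dot-separated fields of a test ID, kept as strings."""

    protocol: str
    group: str
    run: str
    seed: str


def parse_test_id(test_id: str) -> ParsedTestId:
    """Split a test ID of the form Protocol.Group.Run.Seed into its fields."""
    parts = test_id.strip().split(".")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected Protocol.Group.Run.Seed, got: {test_id!r}")
    return ParsedTestId(*parts)


# Usage with a made-up test ID.
parsed = parse_test_id("OND.1.1.1234")
```

The fields are left as strings because the table does not say whether group, run, or seed are always numeric.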