5.1 EVALUATE: Test System for Robustness, Resilience, and Reliability
Are all parts of the stack subject to testing (sensor, datasets, model, environment, etc.)?
For swarm and distributed effector technologies, have the devices been tested in cooperation with one another?
Have there been unit tests of each component in isolation? Have there been integration tests to understand how the components interact with one another within the overall system?
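For illustration, a minimal sketch of the unit-vs-integration distinction in Python with pytest follows; the components (a sensor normalizer, a fusion model, and a pipeline) are hypothetical placeholders, not parts of any specific system.

```python
# Minimal sketch: unit tests of components in isolation vs. an
# integration test of the composed system. All names are hypothetical.
import pytest

def normalize_reading(raw: float, lo: float = 0.0, hi: float = 100.0) -> float:
    """Hypothetical sensor-preprocessing component: scale a raw reading to [0, 1]."""
    return min(max((raw - lo) / (hi - lo), 0.0), 1.0)

class FusionModel:
    """Hypothetical model component: averages normalized sensor readings."""
    def predict(self, readings: list[float]) -> float:
        return sum(readings) / len(readings)

class Pipeline:
    """Composes the two components; this is what an integration test exercises."""
    def __init__(self, model: FusionModel):
        self.model = model
    def run(self, raw_readings: list[float]) -> float:
        return self.model.predict([normalize_reading(r) for r in raw_readings])

# --- Unit tests: each component in isolation ---
def test_normalize_clamps_out_of_range_input():
    assert normalize_reading(150.0) == 1.0
    assert normalize_reading(-5.0) == 0.0

def test_model_averages_inputs():
    assert FusionModel().predict([0.2, 0.4]) == pytest.approx(0.3)

# --- Integration test: components interacting within the overall system ---
def test_pipeline_end_to_end():
    out = Pipeline(FusionModel()).run([50.0, 100.0])
    assert out == pytest.approx(0.75)
```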
Have the system and its components been tested for robustness against the following (two of these checks are sketched in code after this list):
Perturbations (natural and adversarial)
Adversarial attacks
Data/concept/model drift or data poisoning
Human error and unintended or malicious use
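The sketch below covers two of these checks: prediction stability under small input perturbations, and a per-feature drift test. It assumes a model with a scikit-learn-style predict() interface; the thresholds in the usage lines are illustrative assumptions, not prescribed values.

```python
# Minimal sketch of perturbation-robustness and data-drift checks.
import numpy as np
from scipy.stats import ks_2samp

def perturbation_consistency(model, X: np.ndarray, noise_scale: float = 0.01,
                             trials: int = 10, seed: int = 0) -> float:
    """Fraction of predictions unchanged under small Gaussian input noise."""
    rng = np.random.default_rng(seed)
    base = model.predict(X)
    stable = 0.0
    for _ in range(trials):
        noisy = X + rng.normal(0.0, noise_scale, size=X.shape)
        stable += np.mean(model.predict(noisy) == base)
    return stable / trials

def feature_drift_pvalues(X_train: np.ndarray, X_live: np.ndarray) -> np.ndarray:
    """Per-feature two-sample KS test: low p-values flag possible data drift."""
    return np.array([ks_2samp(X_train[:, j], X_live[:, j]).pvalue
                     for j in range(X_train.shape[1])])

# Usage sketch (hypothetical thresholds): flag the system for review if
# stability drops or any feature appears to have drifted.
# assert perturbation_consistency(model, X_test) > 0.95
# assert feature_drift_pvalues(X_train, X_live).min() > 0.01
```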
How has the system been tested for each of the following (a minimal evaluation harness is sketched after this list):
performance
safety
maintainability
suitability
security
understandability of outputs
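The harness below touches two of these dimensions: performance (accuracy and latency) and a simple safety proxy (deferring low-confidence outputs to human review rather than acting on them). It assumes a hypothetical model exposing a scikit-learn-style predict_proba() and integer class labels; the confidence floor is an illustrative assumption.

```python
# Minimal sketch of an evaluation harness for performance and a
# low-confidence-abstention safety check. Thresholds are hypothetical.
import time
import numpy as np

def evaluate(model, X: np.ndarray, y: np.ndarray, conf_floor: float = 0.8) -> dict:
    t0 = time.perf_counter()
    proba = model.predict_proba(X)            # class probabilities, shape (n, k)
    latency_s = time.perf_counter() - t0      # performance: inference latency
    preds = proba.argmax(axis=1)
    confident = proba.max(axis=1) >= conf_floor
    return {
        "accuracy": float(np.mean(preds == y)),       # performance
        "latency_s": latency_s,                       # performance
        "abstain_rate": float(np.mean(~confident)),   # safety: defer to a human
        "confident_accuracy": (float(np.mean(preds[confident] == y[confident]))
                               if confident.any() else float("nan")),
    }
```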
If applicable, leverage red-teaming techniques or bug bounties to help identify or anticipate vulnerabilities and gaps in system robustness. Leverage software tools, through approaches such as fuzzing, to find additional vulnerabilities (a minimal fuzzing loop is sketched below).
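In the sketch, parse_message() is a hypothetical stand-in for a real input-handling component; production efforts would typically use a dedicated fuzzer such as AFL, libFuzzer, or Atheris rather than this hand-rolled loop.

```python
# Minimal sketch of fuzzing: throw random byte strings at a target and
# record any input that raises an unhandled exception.
import random

def parse_message(data: bytes) -> dict:
    """Hypothetical component under test; replace with the real parser."""
    text = data.decode("utf-8")               # may raise UnicodeDecodeError
    key, _, value = text.partition("=")
    return {key: value}

def fuzz(target, trials: int = 10_000, max_len: int = 64, seed: int = 0) -> list:
    """Run `trials` random inputs through `target`; return crashing inputs."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(trials):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(max_len)))
        try:
            target(data)
        except Exception as exc:               # any unhandled exception is a finding
            crashes.append((data, type(exc).__name__))
    return crashes

if __name__ == "__main__":
    findings = fuzz(parse_message)
    print(f"{len(findings)} crashing inputs found")
```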
If applicable, are end users able to appropriately understand how outputs are produced and what they mean?
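One way to support output understandability is to pair each prediction with its largest per-feature contributions. The sketch below assumes a simple linear model with hypothetical feature names; for more complex models, attribution tools such as SHAP or LIME would typically fill this role.

```python
# Minimal sketch: explain a linear model's score via per-feature contributions.
import numpy as np

def explain_linear(weights: np.ndarray, bias: float, x: np.ndarray,
                   names: list[str], top_k: int = 3) -> str:
    contribs = weights * x                        # each feature's share of the score
    score = float(contribs.sum() + bias)
    order = np.argsort(-np.abs(contribs))[:top_k]  # largest contributions first
    parts = [f"{names[i]} ({contribs[i]:+.2f})" for i in order]
    return f"score={score:.2f}; driven mainly by: " + ", ".join(parts)

# Example with made-up numbers:
print(explain_linear(np.array([0.8, -0.5, 0.1]), 0.2,
                     np.array([1.0, 2.0, 0.5]),
                     ["sensor_a", "sensor_b", "sensor_c"]))
# -> score=0.05; driven mainly by: sensor_b (-1.00), sensor_a (+0.80), sensor_c (+0.05)
```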
5.2 Revisit Documentation and Security
Confirm that the way the concepts and constructs have been operationalized makes sense given the use case, context, potential impacts, and DoD AI Ethical Principles.
Confirm all relevant elements of the ontology are included for measurement and assessment.
Describe the security review process and the authorization received after its completion.
5.3 Update Documentation
Update SOCs, impact and risk assessments, CONOPS, data/model cards, and DAGR as needed.