5.1 EVALUATE: Test System for Robustness, Resilience, and Reliability
Are all parts of the stack subject to testing (sensor, datasets, model, environment, etc.)?
For swarm and distributed effector technologies, have the devices been tested in cooperation with one another?
Have there been unit tests of each component in isolation? Have there been integration tests to understand how the components interact with one another within the overall system?
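For illustration, a minimal sketch of the unit-vs-integration distinction in Python with pytest follows; the components (a sensor normalizer, a fusion model, and a pipeline) are hypothetical placeholders, not parts of any specific system.

```python
# Minimal sketch: unit tests of components in isolation vs. an
# integration test of the composed system. All names are hypothetical.
import pytest

def normalize_reading(raw: float, lo: float = 0.0, hi: float = 100.0) -> float:
    """Hypothetical sensor-preprocessing component: scale a raw reading to [0, 1]."""
    return min(max((raw - lo) / (hi - lo), 0.0), 1.0)

class FusionModel:
    """Hypothetical model component: averages normalized sensor readings."""
    def predict(self, readings: list[float]) -> float:
        return sum(readings) / len(readings)

class Pipeline:
    """Composes the two components; this is what an integration test exercises."""
    def __init__(self, model: FusionModel):
        self.model = model
    def run(self, raw_readings: list[float]) -> float:
        return self.model.predict([normalize_reading(r) for r in raw_readings])

# --- Unit tests: each component in isolation ---
def test_normalize_clamps_out_of_range_input():
    assert normalize_reading(150.0) == 1.0
    assert normalize_reading(-5.0) == 0.0

def test_model_averages_inputs():
    assert FusionModel().predict([0.2, 0.4]) == pytest.approx(0.3)

# --- Integration test: components interacting within the overall system ---
def test_pipeline_end_to_end():
    out = Pipeline(FusionModel()).run([50.0, 100.0])
    assert out == pytest.approx(0.75)
```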
Have the system and its components been tested for robustness against the following (two of these checks are sketched in code after this list):
Perturbations (natural and adversarial)
Adversarial attacks
Data/concept/model drift or data poisoning
Human error and unintended or malicious use
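The sketch below covers two of these checks: prediction stability under small input perturbations, and a per-feature drift test. It assumes a model with a scikit-learn-style predict() interface; the thresholds in the usage lines are illustrative assumptions, not prescribed values.

```python
# Minimal sketch of perturbation-robustness and data-drift checks.
import numpy as np
from scipy.stats import ks_2samp

def perturbation_consistency(model, X: np.ndarray, noise_scale: float = 0.01,
                             trials: int = 10, seed: int = 0) -> float:
    """Fraction of predictions unchanged under small Gaussian input noise."""
    rng = np.random.default_rng(seed)
    base = model.predict(X)
    stable = 0.0
    for _ in range(trials):
        noisy = X + rng.normal(0.0, noise_scale, size=X.shape)
        stable += np.mean(model.predict(noisy) == base)
    return stable / trials

def feature_drift_pvalues(X_train: np.ndarray, X_live: np.ndarray) -> np.ndarray:
    """Per-feature two-sample KS test: low p-values flag possible data drift."""
    return np.array([ks_2samp(X_train[:, j], X_live[:, j]).pvalue
                     for j in range(X_train.shape[1])])

# Usage sketch (hypothetical thresholds): flag the system for review if
# stability drops or any feature appears to have drifted.
# assert perturbation_consistency(model, X_test) > 0.95
# assert feature_drift_pvalues(X_train, X_live).min() > 0.01
```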
How has the system been tested for each of the following (a minimal evaluation harness is sketched after this list):
performance
safety
maintainability
suitability
security
understandability of outputs
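The harness below touches two of these dimensions: performance (accuracy and latency) and a simple safety proxy (deferring low-confidence outputs to human review rather than acting on them). It assumes a hypothetical model exposing a scikit-learn-style predict_proba() and integer class labels; the confidence floor is an illustrative assumption.

```python
# Minimal sketch of an evaluation harness for performance and a
# low-confidence-abstention safety check. Thresholds are hypothetical.
import time
import numpy as np

def evaluate(model, X: np.ndarray, y: np.ndarray, conf_floor: float = 0.8) -> dict:
    t0 = time.perf_counter()
    proba = model.predict_proba(X)            # class probabilities, shape (n, k)
    latency_s = time.perf_counter() - t0      # performance: inference latency
    preds = proba.argmax(axis=1)
    confident = proba.max(axis=1) >= conf_floor
    return {
        "accuracy": float(np.mean(preds == y)),       # performance
        "latency_s": latency_s,                       # performance
        "abstain_rate": float(np.mean(~confident)),   # safety: defer to a human
        "confident_accuracy": (float(np.mean(preds[confident] == y[confident]))
                               if confident.any() else float("nan")),
    }
```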
If applicable, leverage red-teaming techniques or bug bounties to help identify or anticipate vulnerabilities and gaps in system robustness. Leverage software tools, through approaches such as fuzzing, to find additional vulnerabilities (a minimal fuzzing loop is sketched below).
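In the sketch, parse_message() is a hypothetical stand-in for a real input-handling component; production efforts would typically use a dedicated fuzzer such as AFL, libFuzzer, or Atheris rather than this hand-rolled loop.

```python
# Minimal sketch of fuzzing: throw random byte strings at a target and
# record any input that raises an unhandled exception.
import random

def parse_message(data: bytes) -> dict:
    """Hypothetical component under test; replace with the real parser."""
    text = data.decode("utf-8")               # may raise UnicodeDecodeError
    key, _, value = text.partition("=")
    return {key: value}

def fuzz(target, trials: int = 10_000, max_len: int = 64, seed: int = 0) -> list:
    """Run `trials` random inputs through `target`; return crashing inputs."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(trials):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(max_len)))
        try:
            target(data)
        except Exception as exc:               # any unhandled exception is a finding
            crashes.append((data, type(exc).__name__))
    return crashes

if __name__ == "__main__":
    findings = fuzz(parse_message)
    print(f"{len(findings)} crashing inputs found")
```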
If applicable, are end users able to appropriately understand how outputs are produced and what they mean?
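One way to support output understandability is to pair each prediction with its largest per-feature contributions. The sketch below assumes a simple linear model with hypothetical feature names; for more complex models, attribution tools such as SHAP or LIME would typically fill this role.

```python
# Minimal sketch: explain a linear model's score via per-feature contributions.
import numpy as np

def explain_linear(weights: np.ndarray, bias: float, x: np.ndarray,
                   names: list[str], top_k: int = 3) -> str:
    contribs = weights * x                        # each feature's share of the score
    score = float(contribs.sum() + bias)
    order = np.argsort(-np.abs(contribs))[:top_k]  # largest contributions first
    parts = [f"{names[i]} ({contribs[i]:+.2f})" for i in order]
    return f"score={score:.2f}; driven mainly by: " + ", ".join(parts)

# Example with made-up numbers:
print(explain_linear(np.array([0.8, -0.5, 0.1]), 0.2,
                     np.array([1.0, 2.0, 0.5]),
                     ["sensor_a", "sensor_b", "sensor_c"]))
# -> score=0.05; driven mainly by: sensor_b (-1.00), sensor_a (+0.80), sensor_c (+0.05)
```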
5.2 Revisit Documentation and Security
Confirm that the way the concepts and constructs have been operationalized makes sense given the use case, context, potential impacts, and DoD AI Ethical Principles.
Confirm all relevant elements of the ontology are included for measurement and assessment.
Describe the security review process and the authorization received after its completion.
5.3 Update Documentation
Update SOCs, impact and risk assessments, CONOPS, data/model cards, and DAGR as needed.