Overview of RAI Activities Throughout the Product Life Cycle

4. SHIELD Development/Acquisition

4.1 Improve & Innovate: Instrument AI to Promote Assurance

  1. Have the acquisition requirements been developed in accordance with the DoD AI Ethical Principles and (if applicable) in ways that avoid vendor lock-in? Consider issues such as:
    1. Documentation requirements
    2. Usage rights
    3. Permissions
    4. Data and data pipeline access (e.g., avoiding proprietary formats)
    5. Distribution protocols
    6. Access for model auditing
  2. If something goes wrong with an externally procured system while it is in use, have you and the AI supplier established and agreed upon who is responsible and who is accountable under each scenario?
  3. Have you and the vendor budgeted for RAI activities, including:
    1. Data/model/system card and traceability matrix creation and updating
    2. Continuous monitoring
    3. Model retraining and system updating
    4. Continuous harms and impact modeling
    5. Stakeholder engagement
    6. Human systems integration/human-machine teaming testing
    7. User training
    8. Assurance and trust metrics testing
    9. Routine system (and component) auditing
    10. Sunset procedures
    11. Uploading lessons learned into use case and incident repositories
  4. Ensure appropriate documentation procedures are in place:
    1. Have you updated the data/model/system cards?
    2. Are these understandable by and accessible to various personas/roles and stakeholders, and at various levels of technical expertise?
    3. Will the documentation be regularly monitored and updated at each stage of product development and deployment?
    4. Are there plans to document data provenance, including where the data was sourced, why it was collected, who collected it, who labeled it, what transformations were applied, and how the data was modified?
    5. Are there plans to use a traceability matrix for tracking model versions and verification and validation (V&V) results? (An illustrative documentation sketch follows this list.)
    6. Is the explanation for the system's decision/behavior automatically included in the decision report/output?
  5. Establish procedure and scope for user testing:
    1. What are the possible sources of human error?
    2. How will operator performance be evaluated and how can it be improved?
  6. Establish procedures through which trust and assurance will be measured and supported:
    1. Has justified confidence/trust of the operational users been measured? Is it at acceptable levels? Can it be increased?
    2. Has justified confidence of other stakeholders been measured? Is it at acceptable levels? Can it be increased?
    3. Have other tools been integrated to promote assurance and justified confidence in the system?
  7. Have tools for explainability, uncertainty quantification, or competence estimation been used to increase assurance and reduce human error? How are you verifying that these metrics are interpreted correctly? (A minimal sketch follows this list.)
  8. Have you established a cadence and procedure through which new data will be collected, models will be retrained, and the system will be updated? (An example cadence check follows this list.)
  9. Revisit 2.4. How are you designing your system or leveraging other data/AI-enabled capabilities to reduce the ethical/risk burden on operational users, decision makers, senior leaders, developers, and impacted stakeholders that would otherwise be present due to the employment or existence of the system?
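
The documentation questions in item 4 can be made concrete with structured records. Below is a minimal, hypothetical sketch of a data provenance record and one row of a traceability matrix; all field names are illustrative assumptions, not a mandated schema.

```python
# Illustrative sketch only: structured records for data provenance and a
# model-version traceability matrix. Every field name is a hypothetical
# placeholder, not a required format.
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ProvenanceRecord:
    """Captures the data-provenance questions from item 4 as fields."""
    source: str                 # where the data was sourced
    collection_rationale: str   # why it was collected
    collected_by: str           # who collected the data
    labeled_by: str             # who labeled the data
    transformations: list[str] = field(default_factory=list)
    modifications: list[str] = field(default_factory=list)


@dataclass
class TraceabilityRow:
    """One row of a traceability matrix linking a model version to V&V results."""
    model_version: str
    requirement_id: str         # acquisition/RAI requirement being traced
    vv_test_id: str             # verification & validation test identifier
    vv_result: str              # e.g., "pass", "fail", "waived"
    evidence_uri: str           # pointer to the supporting test artifact
    reviewed_on: date


# Example usage: one provenance entry and one traceability row.
record = ProvenanceRecord(
    source="Sensor feed X (hypothetical)",
    collection_rationale="Training data for a detection model",
    collected_by="Program data team",
    labeled_by="Contracted labeling vendor",
    transformations=["deduplication", "normalization"],
)
row = TraceabilityRow(
    model_version="1.2.0",
    requirement_id="RAI-DOC-04",
    vv_test_id="VV-017",
    vv_result="pass",
    evidence_uri="artifacts/vv-017-report.pdf",
    reviewed_on=date(2024, 1, 15),
)
```

Keeping these records in version control alongside the model makes the "regularly monitored and updated" question in item 4.3 auditable rather than aspirational.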
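Items 4.6 and 7 both ask that explanations and uncertainty signals travel with the system's output. The sketch below shows one common approach, softmax entropy as an uncertainty flag bundled into a decision report; the function names, threshold, and feature list are hypothetical placeholders for whatever explainability and uncertainty-quantification tooling the program actually adopts.

```python
# Minimal sketch: attach an explanation and an uncertainty estimate to every
# decision report. Thresholds and feature names are hypothetical.
import numpy as np


def predictive_entropy(probs: np.ndarray) -> float:
    """Shannon entropy of the class probabilities; higher means less certain."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())


def decision_report(probs: np.ndarray, class_names: list[str],
                    top_features: list[str], entropy_threshold: float = 0.5) -> dict:
    """Bundle the decision with its explanation and an uncertainty flag."""
    entropy = predictive_entropy(probs)
    decision = class_names[int(np.argmax(probs))]
    return {
        "decision": decision,
        "confidence": float(probs.max()),
        "uncertainty_entropy": entropy,
        "review_recommended": entropy > entropy_threshold,  # route to a human
        "explanation": f"Top contributing features: {', '.join(top_features)}",
    }


# Example: a confident prediction vs. one flagged for human review.
print(decision_report(np.array([0.92, 0.05, 0.03]), ["A", "B", "C"], ["f1", "f2"]))
print(decision_report(np.array([0.40, 0.35, 0.25]), ["A", "B", "C"], ["f1", "f2"]))
```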
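For item 8, the retraining cadence can be encoded as an auditable check that fires on either a calendar interval or a monitored drift signal, whichever comes first. The configuration values below are hypothetical assumptions for illustration, not recommended settings.

```python
# Minimal sketch: decide whether retraining is due, and record why.
from datetime import date, timedelta

RETRAIN_CONFIG = {
    "max_days_between_retrains": 90,   # calendar cadence (hypothetical)
    "drift_score_threshold": 0.15,     # from continuous monitoring (hypothetical)
}


def retraining_due(last_retrained: date, drift_score: float,
                   today: date | None = None) -> tuple[bool, str]:
    """Return (due, reason) so the retraining decision is auditable."""
    today = today or date.today()
    if drift_score > RETRAIN_CONFIG["drift_score_threshold"]:
        return True, f"drift score {drift_score:.2f} exceeded threshold"
    if today - last_retrained > timedelta(days=RETRAIN_CONFIG["max_days_between_retrains"]):
        return True, "calendar cadence elapsed"
    return False, "within cadence and drift tolerance"


# Example usage:
print(retraining_due(date(2024, 1, 1), drift_score=0.05, today=date(2024, 2, 1)))
print(retraining_due(date(2024, 1, 1), drift_score=0.22, today=date(2024, 2, 1)))
```

Returning a reason string alongside the boolean supports the lessons-learned and auditing items in the budget list above.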

4.2 Update Documentation

Update SOCs and data/model cards as necessary. Have the team consult and update the DAGR to support continuous risk identification as new risks (or opportunities) are identified.