On July 29, 2025, the National Institute of Standards & Technology (“NIST”) unveiled an outline for preliminary, stakeholder-driven standards, known as a “zero draft”, for AI testing, evaluation, verification and validation (“TEVV”). This outline is part of NIST’s AI Standards Zero Drafts pilot project, which was announced on March 25, 2025, as we previously reported. The goal is to create a flexible, high-level framework for companies to design their own AI testing and validation procedures. Of note, NIST is not prescribing exact methods for testing and validation. Instead, it offers a structure around key terms, lifecycle stages, and guiding principles that align with future international standards. NIST has asked for stakeholder input on the topics, scope, and priorities of the Zero Drafts process, and feedback is open until September 12, 2025.
The NIST outline breaks AI TEVV into several foundational elements, a non-exhaustive list of which includes:
- Clear Definitions – Establishes consistent terms for AI testing, evaluation, verification, and validation so organizations speak the same language when assessing features, qualities, performance, and other characteristics.
- Specific Considerations – NIST seeks input on the considerations that should inform testing, evaluation, verification, and validation, including practical feasibility, sampling of data and cases, selection of approaches and methods, and reliability.
- Limitations of TEVV for AI – For many AI systems, certain characteristics are difficult or impracticable to ascertain with complete certainty. For example, it can be impracticable to explain or trace the individual outputs of complex large language models (LLMs), and the training data for LLMs may be too large to review in full. Evaluators may also be resource-constrained in the amount of testing they can perform and may have to rely on probabilistic findings for complex systems. The outline proposes addressing these hurdles with concepts that are not fully operationalized, such as assessing user satisfaction through user surveys rather than automated, metrics-driven processes. Well-planned TEVV procedures with defined steps can help mitigate these challenges.
- Documentation Requirements – Documentation should provide readers with the information necessary to interpret the report and understand the limitations of the testing.
- Governance, Process, and Organizational Requirements – Calls for organizations conducting TEVV to define clear system objectives and characteristics, translate them into practical and measurable processes, and ensure those processes are consistent, repeatable, and capable of producing reliable results across multiple assessments. It also calls for aligning evaluation goals with organizational needs while accounting for constraints such as technical limits and budget, and for incorporating sound methodological considerations, such as validity, reliability, and proper scope definition, to ensure that testing captures the full range of relevant system factors.
- Risk-Based Approach – Encourages tailoring TEVV processes to the system’s intended use, potential impact, and associated risks.
NIST announced that a second outline for a proposed “zero draft” on “documentation of model and data characteristics for transparency among AI actors” will be released soon.
* * *
Please contact Micaela McMurrough or Jennifer Johnson if you are interested in submitting comments by the September 12 deadline.