
Editor’s Note: As artificial intelligence (AI) systems increasingly permeate various aspects of our daily lives, the National Institute of Standards and Technology (NIST) has launched a groundbreaking initiative, Assessing Risks and Impacts of AI (ARIA). This program aims to critically evaluate the societal effects and risks of AI, with an initial focus on large language models (LLMs). The ARIA program seeks to bridge gaps in current AI assessments by examining real-world interactions and applications. As AI’s role in our world expands, initiatives like ARIA are vital for cybersecurity, information governance, and eDiscovery professionals committed to fostering responsible and ethical AI development.

Industry News – Artificial Intelligence Beat

NIST Launches ARIA Program to Assess AI Risks and Impacts in Real-World Scenarios

ComplexDiscovery Staff

The National Institute of Standards and Technology (NIST) has embarked on a pivotal initiative named Assessing Risks and Impacts of AI (ARIA) to evaluate the societal effects and risks of artificial intelligence (AI) systems. Initially focused on large language models (LLMs), the program aims to enhance understanding of AI’s capabilities and implications in real-world settings. Supported by the U.S. AI Safety Institute, ARIA seeks to fill critical gaps in current AI assessments by examining AI’s interactions with people and its performance in everyday contexts.

“In order to fully understand the impacts AI is having and will have on our society, we need to test how AI functions in realistic scenarios — and that’s exactly what we’re doing with this program,” said Commerce Secretary Gina Raimondo.

The ARIA program’s first iteration, ARIA 0.1, launched on May 28, 2024, will focus on three distinct evaluation scenarios. The first scenario, “TV Spoilers,” will test controlled access to privileged information, ensuring that models developed to showcase TV series expertise do not reveal details of upcoming episodes. The second scenario, “Meal Planner,” will assess an LLM’s ability to personalize content based on dietary requirements, food preferences, and sensitivities. Finally, the “Pathfinder” scenario will examine an LLM’s capability to generate factual travel recommendations, synthesizing accurate geographic and landmark information.

These evaluation scenarios are designed to assess AI systems’ robustness and societal impact through three testing layers: general model testing, red teaming, and large-scale field testing. This multi-faceted approach aims to provide a comprehensive understanding of AI’s real-world applications and potential risks.

“Measuring impacts is about more than how well a model functions in a laboratory setting,” said Reva Schwartz, ARIA program lead at NIST’s Information Technology Lab. “ARIA will consider AI beyond the model and assess systems in context, including what happens when people interact with AI technology in realistic settings under regular use. This gives a broader, more holistic view of the net effects of these technologies.”

Supporters of the ARIA program highlight its focus on developing guidelines, tools, and metrics that AI developers can use to ensure their systems are safe, reliable, and trustworthy. Laurie E. Locascio, Under Secretary of Commerce for Standards and Technology and NIST Director, emphasized the significance of this initiative. “The ARIA program is designed to meet real-world needs as the use of AI technology grows,” she said. “This new effort will support the U.S. AI Safety Institute, expand NIST’s already broad engagement with the research community, and help establish reliable methods for testing and evaluating AI’s functionality in the real world.”

Additionally, the ARIA program will aid in advancing the AI Risk Management Framework (AI RMF), a pivotal resource released by NIST in January 2023. The AI RMF provides strategies for assessing and managing AI-related risks and has become central to discussions about AI’s development in both the public and private sectors. ARIA’s evaluations will help operationalize the risk measurement functions outlined in the AI RMF.

Michiel Prins, co-founder of HackerOne, praised the ARIA program for its guidance-focused approach. “NIST’s new ARIA program focuses on assessing the societal risks and impacts of AI systems in order to offer guidance to the broader industry. Guidance, over regulation, is a great approach to managing the safety of a new technology,” he said. Prins compared NIST’s methodologies to existing industry practices, emphasizing their collective aim to improve AI safety and security.

Brian Levine, a managing director at Ernst & Young, echoed Prins’s sentiments, noting NIST’s historical expertise in technology testing. Levine called the ARIA program a promising effort to address complex AI challenges. “Given NIST’s long and illustrious history of accurately conducting a wide range of technology testing, the ARIA efforts are indeed promising,” he said. Levine also mentioned the importance of identifying suitable AI code for testing and the challenges of evaluating LLMs outside their application contexts.

The ARIA program represents a significant step in developing safe and effective AI systems. As AI technology continues to evolve rapidly, initiatives like ARIA are essential to ensuring that the deployment of AI systems considers societal impacts, thereby fostering responsible and ethical AI development.


Assisted by GAI and LLM Technologies


Source: ComplexDiscovery OÜ

