• Home
  • Motorcycles
  • Electric Motorcycles
  • 3 wheelers
  • FUV Electric 3 wheeler
  • Shop
  • Listings

Subscribe to Updates

Get the latest creative news from CycleNews about two, three wheelers and Electric vehicles.

What's Hot

2026 BMW R 1300 R First Look [13 Fast Facts]

The Middle East Has Entered the AI Group Chat

EA Tried to Stop an ‘Anti-DEI Mod’ for ‘The Sims 4’—but More Keep Surfacing

Facebook Twitter Instagram
  • Home
  • Motorcycles
  • Electric Motorcycles
  • 3 wheelers
  • FUV Electric 3 wheeler
  • Shop
  • Listings
Facebook Twitter Instagram Pinterest
Cycle News
Submit Your Ad
Cycle News
You are at:Home » A New Benchmark for the Risks of AI
Electric Motorcycles

A New Benchmark for the Risks of AI

cycleBy cycleDecember 4, 202402 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


MLCommons, a nonprofit that helps companies measure the performance of their artificial intelligence systems, is launching a new benchmark to gauge AI’s bad side too.

The new benchmark, called AILuminate, assesses the responses of large language models to more than 12,000 test prompts in 12 categories including inciting violent crime, child sexual exploitation, hate speech, promoting self-harm, and intellectual property infringement.

Models are given a score of “poor,” “fair,” “good,” “very good,” or “excellent,” depending on how they perform. The prompts used to test the models are kept secret to prevent them from ending up as training data that would allow a model to ace the test.

Peter Mattson, founder and president of MLCommons and a senior staff engineer at Google, says that measuring the potential harms of AI models is technically difficult, leading to inconsistencies across the industry. “AI is a really young technology, and AI testing is a really young discipline,” he says. “Improving safety benefits society; it also benefits the market.”

Reliable, independent ways of measuring AI risks may become more relevant under the next US administration. Donald Trump has promised to get rid of President Biden’s AI Executive Order, which introduced measures aimed at ensuring AI is used responsibly by companies as well as a new AI Safety Institute to test powerful models.

The effort could also provide more of an international perspective on AI harms. MLCommons counts a number of international firms, including the Chinese companies Huawei and Alibaba, among its member organizations. If these companies all used the new benchmark, it would provide a way to compare AI safety in the US, China, and elsewhere.

Some large US AI providers have already used AILuminate to test their models. Anthropic’s Claude model, Google’s smaller model Gemma, and a model from Microsoft called Phi all scored “very good” in testing. OpenAI’s GPT-4o and Meta’s largest Llama model both scored “good.” The only model to score “poor” was OLMo from the Allen Institute for AI, although Mattson notes that this is a research offering not designed with safety in mind.

“Overall, it’s good to see scientific rigor in the AI evaluation processes,” says Rumman Chowdhury, CEO of Humane Intelligence, a nonprofit that specializes in testing or red-teaming AI models for misbehaviors. “We need best practices and inclusive methods of measurement to determine whether AI models are performing the way we expect them to.”



Source link

Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
Previous ArticleSenators Warn the Pentagon: Get a Handle on China’s Telecom Hacking
Next Article Big Red’s Breaking Battery Barriers and BMW’s R 12 S is Finally Here!
cycle
  • Website

Related Posts

The Middle East Has Entered the AI Group Chat

May 15, 2025

EA Tried to Stop an ‘Anti-DEI Mod’ for ‘The Sims 4’—but More Keep Surfacing

May 15, 2025

US Tech Visa Applications Are Being Put Through the Wringer

May 15, 2025
Add A Comment

Leave A Reply Cancel Reply

You must be logged in to post a comment.

Demo
Top Posts

2026 BMW R 1300 R First Look [13 Fast Facts]

May 15, 2025

The urban electric commuter FUELL Fllow designed by Erik Buell is now opening orders | thepack.news | THE PACK

July 29, 2023

2024 Yamaha Ténéré 700 First Look [6 Fast Facts For ADV Riding]

July 29, 2023
Stay In Touch
  • Facebook
  • YouTube
  • TikTok
  • WhatsApp
  • Twitter
  • Instagram
Latest Reviews

Subscribe to Updates

Get the latest tech news from FooBar about tech, design and biz.

Demo
Most Popular

2026 BMW R 1300 R First Look [13 Fast Facts]

May 15, 2025

The urban electric commuter FUELL Fllow designed by Erik Buell is now opening orders | thepack.news | THE PACK

July 29, 2023

2024 Yamaha Ténéré 700 First Look [6 Fast Facts For ADV Riding]

July 29, 2023
Our Picks

The Simple Math Behind Public Key Cryptography

2024 Triumph Scrambler 1200 XE Review [A Dozen Fast Facts]

Apple MacBook Pro (M3 Max, 16 Inch) Review: Untouchable Performance and Battery Life

Subscribe to Updates

Get the latest news from CycleNews about two, three wheelers and Electric vehicles.

© 2025 cyclenews.blog
  • Home
  • About us
  • Get In Touch
  • Shop
  • Listings
  • My Account
  • Submit Your Ad
  • Terms & Conditions
  • Stock Ticker

Type above and press Enter to search. Press Esc to cancel.