A New Trick Uses AI to Jailbreak AI Models—Including GPT-4

By cycle · December 5, 2023 · 3 min read

Large language models recently emerged as a powerful and transformative new kind of technology. Their potential became headline news as ordinary people were dazzled by the capabilities of OpenAI’s ChatGPT, released just a year ago.

In the months that followed the release of ChatGPT, discovering new jailbreaking methods became a popular pastime for mischievous users, as well as those interested in the security and reliability of AI systems. But scores of startups are now building prototypes and fully fledged products on top of large language model APIs. OpenAI said at its first-ever developer conference in November that over 2 million developers are now using its APIs.
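
To make that concrete, a request to such an API takes only a few lines of code. The sketch below assumes the OpenAI Python SDK (version 1 or later) and an API key in the environment; the model name and prompts are illustrative placeholders, not part of the article.

```python
# Minimal sketch of building on a large language model API (OpenAI Python SDK assumed).
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4",  # illustrative; any chat-capable model is called the same way
    messages=[
        {"role": "system", "content": "You are a helpful support assistant."},
        {"role": "user", "content": "Summarize our return policy in one sentence."},
    ],
)
print(response.choices[0].message.content)
```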

These models simply predict the text that should follow a given input, but they are trained on vast quantities of text, from the web and other digital sources, using huge numbers of computer chips, over a period of many weeks or even months. With enough data and training, language models exhibit savant-like prediction skills, responding to an extraordinary range of input with coherent and pertinent-seeming information.
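
That "predict the text that should follow" step can be seen directly with an open model. The sketch below is a rough illustration using the Hugging Face transformers library and the small gpt2 checkpoint (both assumptions, not mentioned in the article): it asks the model for the most likely next tokens after a prompt.

```python
# Next-token prediction, the core operation described above (transformers + gpt2 assumed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Large language models are trained to"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits      # a score for every vocabulary token at each position
next_token_scores = logits[0, -1]        # scores for the token that would follow the prompt
top = torch.topk(next_token_scores, k=5)
print(tokenizer.convert_ids_to_tokens(top.indices.tolist()))  # five most likely continuations
```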

The models also exhibit biases learned from their training data and tend to fabricate information when the answer to a prompt is less straightforward. Without safeguards, they can offer advice to people on how to do things like obtain drugs or make bombs. To keep the models in check, the companies behind them use the same method employed to make their responses more coherent and accurate-looking. This involves having humans grade the model’s answers and using that feedback to fine-tune the model so that it is less likely to misbehave.
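
The grading-and-fine-tuning loop described above is usually implemented as reinforcement learning from human feedback (RLHF). The sketch below is a deliberate simplification under that assumption: it shows only the feedback-collection shape, keeping highly rated responses as supervised fine-tuning targets rather than training a reward model. All names and scores are illustrative.

```python
# Simplified sketch of "humans grade the answers, the good ones shape the model".
# Real RLHF pipelines train a reward model and optimize against it; this only
# illustrates the feedback-collection step.
from dataclasses import dataclass

@dataclass
class RatedResponse:
    prompt: str
    response: str
    human_score: int  # e.g. 1 (unacceptable) to 5 (good), assigned by a human rater

ratings = [
    RatedResponse("How do I make a bomb?", "I can't help with that request.", 5),
    RatedResponse("How do I make a bomb?", "Sure, start by gathering...", 1),
]

# Keep only responses the raters approved of; these become fine-tuning examples
# that push the model toward refusing harmful requests.
fine_tune_examples = [(r.prompt, r.response) for r in ratings if r.human_score >= 4]
print(fine_tune_examples)
```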

Robust Intelligence provided WIRED with several example jailbreaks that sidestep such safeguards. Not all of them worked on ChatGPT, the chatbot built on top of GPT-4, but several did, including one for generating phishing messages, and another for producing ideas to help a malicious actor remain hidden on a government computer network.

A similar method was developed by a research group led by Eric Wong, an assistant professor at the University of Pennsylvania. The technique from Robust Intelligence and his team adds refinements that let the system generate jailbreaks with half as many tries.

Brendan Dolan-Gavitt, an associate professor at New York University who studies computer security and machine learning, says the new technique revealed by Robust Intelligence shows that human fine-tuning is not a watertight way to secure models against attack.

Dolan-Gavitt says companies that are building systems on top of large language models like GPT-4 should employ additional safeguards. “We need to make sure that we design systems that use LLMs so that jailbreaks don’t allow malicious users to get access to things they shouldn’t,” he says.
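
One way to read that advice: the application, not the model, should decide what a user is allowed to do, so a jailbroken response can suggest anything without the system acting on it. The sketch below is a hypothetical illustration of such a check; the action names and permission model are invented for the example and do not come from any real product.

```python
# Hypothetical application-level safeguard: model output never authorizes a sensitive action by itself.
ALLOWED_ACTIONS = {"summarize_document", "draft_reply"}          # safe for any user
SENSITIVE_ACTIONS = {"export_customer_data", "delete_records"}   # require explicit privilege

def execute_model_action(action: str, user_is_admin: bool) -> str:
    """Gate actions suggested by an LLM behind checks that do not depend on the model."""
    if action in ALLOWED_ACTIONS:
        return f"running {action}"
    if action in SENSITIVE_ACTIONS and user_is_admin:
        return f"running {action} (privileged user)"
    # Even if a jailbreak convinces the model to request this, the application refuses.
    return f"refused: {action} is not permitted for this user"

print(execute_model_action("export_customer_data", user_is_admin=False))
```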


