Saturday, May 10, 2025
News PouroverAI

Meet VonGoom: A Novel AI Approach for Data Poisoning in Large Language Models

December 17, 2023
in Data Science & ML
Reading Time: 3 mins read



Data poisoning attacks manipulate machine learning models by injecting false or misleading data into the training dataset. A model trained on such data may then make incorrect predictions or decisions when it encounters real-world inputs. LLMs are vulnerable to these attacks, which can distort their responses to targeted prompts and related concepts. A research study by Del Complex demonstrates this vulnerability with an attack method called VonGoom, which requires only a few hundred to several thousand strategically placed poison inputs to achieve its objective.

VonGoom challenges the notion that millions of poison samples are necessary, showing that a few hundred to several thousand strategically placed inputs can suffice. It crafts seemingly benign text inputs with subtle manipulations that mislead LLMs during training, introducing a spectrum of distortions. According to Del Complex, poison examples have already been placed on dozens of commonly scraped websites.

The research explores the susceptibility of LLMs to data poisoning and introduces VonGoom as a novel method for prompt-specific poisoning attacks. Unlike broad-spectrum attacks, VonGoom targets specific prompts or topics. The resulting distortions range from subtle biases to overt biases, misinformation, and concept corruption.

The poisoned inputs are designed to look benign while disturbing the weights the model learns for the targeted concept. To construct them, the approach uses optimization techniques, such as building clean-neighbor poison data and applying guided perturbations, and the study reports that these are effective in various scenarios.

Injecting a modest number of poisoned samples, approximately 500-1000, significantly altered the output of models trained from scratch. When pre-trained models were updated, introducing 750-1000 poisoned samples effectively disrupted the model’s response to targeted concepts. The attacks showed that semantically altered text samples can influence the output of LLMs. The impact also extended beyond the targeted prompts, creating a bleed-through effect in which the influence of poison samples reached semantically related concepts. That VonGoom succeeds with a relatively small number of poisoned inputs highlights the vulnerability of LLMs to sophisticated data poisoning attacks.

In conclusion, the research can be summarized as follows:

– VonGoom is a method for manipulating data to deceive LLMs during training.
– It works by making subtle changes to text inputs that mislead the model.
– Targeted attacks with a small number of poisoned inputs can be feasible and effective.
– VonGoom introduces a range of distortions, including biases, misinformation, and concept corruption.
– The study analyzes the density of training data for specific concepts in common LLM datasets, identifying opportunities for manipulation.
– The research highlights the vulnerability of LLMs to data poisoning.
– VonGoom could significantly impact various models and have broader implications for the field.
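
The point above about the density of training data for specific concepts can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the study's method: the function name `concept_density`, the toy corpus, and the choice to define density as the share of documents mentioning a term are all hypothetical. It shows the kind of measurement a dataset maintainer might run to see how sparsely a concept is covered, since a sparsely covered concept is presumably one where a small number of added samples makes up a larger share of what the model sees.

```python
import re
from typing import Iterable, Sequence


def concept_density(documents: Iterable[str], terms: Sequence[str]) -> float:
    """Return the fraction of documents that mention at least one term.

    Hypothetical helper for illustration; "density" here simply means
    document share, which is an assumption rather than the study's definition.
    """
    if not terms:
        raise ValueError("terms must contain at least one entry")
    # Match whole words or phrases only, ignoring case.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    total = 0
    hits = 0
    for doc in documents:
        total += 1
        if pattern.search(doc):
            hits += 1
    # An empty corpus has no measurable density; report 0.0.
    return hits / total if total else 0.0


# Toy corpus, invented purely for this example.
corpus = [
    "The city council approved the new tram line.",
    "Tram service resumes on Monday.",
    "Local bakery wins a regional award.",
    "Weather: rain expected through Friday.",
]

print(f"{concept_density(corpus, ['tram']):.2f}")
```

On a real corpus the same count would be run per concept over a streamed dataset, and the concepts with the lowest density would be the ones to audit most closely.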


“Introducing, VonGoom: A method for data poisoning large language models to introduce bias, requiring as few as 100 poisoned examples within training data. Deployed in January, we have penetrated dozens of commonly scraped websites with poison examples. https://t.co/HVLysX3gNl pic.twitter.com/KVkdb1jIR7”
Del Complex (@DelComplex), December 14, 2023

Hello, my name is Adnan Hassan. I am a consulting intern at Marktechpost and will soon be a management trainee at American Express. I am currently pursuing a dual degree at the Indian Institute of Technology, Kharagpur. I am passionate about technology and want to create new products that make a difference.

Copyright © 2023 PouroverAI News.