Large language models (LLMs) are widely used in search engines to provide natural language responses to users' queries. Traditional search engines are good at retrieving relevant pages but cannot synthesize the retrieved information into a single coherent answer. LLMs fill this gap by compiling search results into natural language responses that directly address a user's specific query. Google Search and Microsoft Bing have started integrating LLM-driven chat interfaces alongside their traditional search boxes.
However, it is difficult to keep traditional LLMs updated with new information because their knowledge is fixed at training time, and they are prone to factual errors when generating text purely from their trained weights. Retrieval-augmented generation (RAG) addresses these limitations by integrating an external knowledge source, such as a database or search engine, with the LLM so that text generation is grounded in retrieved context. A separate weakness of LLMs is their vulnerability to adversarial attacks, in which attackers insert carefully crafted token sequences into the input prompt to bypass the model's safety mechanisms and elicit harmful responses.
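To make the retrieval-augmented setup concrete, here is a minimal RAG-style sketch: a toy retriever pulls the most relevant product descriptions for a query, and the prompt handed to the LLM is built from that retrieved context. The catalog, the word-overlap scoring, and the stubbed LLM call are illustrative assumptions, not anything from the paper.

```python
# Minimal RAG-style sketch (illustrative only): retrieve context, then generate.
from typing import List

CATALOG = {
    "ColdBrew Master": "Affordable cold-brew coffee machine with a 1 L carafe.",
    "EspressoPro X": "High-end espresso machine with a built-in grinder.",
}

def retrieve(query: str, k: int = 2) -> List[str]:
    """Toy retriever: rank product descriptions by word overlap with the query."""
    def score(text: str) -> int:
        return len(set(query.lower().split()) & set(text.lower().split()))
    ranked = sorted(CATALOG.items(), key=lambda kv: score(kv[1]), reverse=True)
    return [f"{name}: {desc}" for name, desc in ranked[:k]]

def build_prompt(query: str) -> str:
    """Prepend retrieved context so the LLM answers from fresh, external data."""
    context = "\n".join(retrieve(query))
    return (
        f"Context:\n{context}\n\n"
        f"Answer the question using only the context.\nQuestion: {query}\n"
    )

# In a real system this prompt would be sent to the search engine's LLM.
print(build_prompt("Which coffee machine is best for cold brew on a budget?"))
```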
Researchers from Harvard University proposed the Strategic Text Sequence (STS), a carefully crafted message that can influence LLM-driven search tools in the context of e-commerce. By inserting an optimized sequence of tokens into a product's information page, a vendor can improve that product's ranking in the LLM's recommendations. The researchers built a catalog of fictitious coffee machines and analyzed the effect of the STS on two target products: one that rarely appears in the LLM's recommendations and another that typically ranks second. They found that the STS enhances the visibility of both products by increasing their chances of appearing as the top recommendation.
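The sketch below illustrates the basic setup under stated assumptions: a product catalog is placed in the LLM's prompt, and the STS is hidden inside the target product's entry. The product names, prompt format, and the placeholder sequence are made up for illustration; the paper's actual catalog and prompt may differ.

```python
# Illustrative sketch: appending an STS to one product's entry before the
# catalog is placed in the LLM's recommendation prompt.
CATALOG = [
    {"name": "EspressoPro X", "price": "$499", "rating": 4.8},
    {"name": "ColdBrew Master", "price": "$89", "rating": 4.1},   # target product
    {"name": "BaristaLite 2000", "price": "$199", "rating": 4.5},
]

STS = "<optimized token sequence found by the attack>"  # placeholder, not a real STS

def product_entry(p, target_name="ColdBrew Master"):
    entry = f"{p['name']} | {p['price']} | rating {p['rating']}"
    # The STS is embedded only in the target product's information.
    return entry + " " + STS if p["name"] == target_name else entry

prompt = (
    "You are a shopping assistant. Recommend the best coffee machine.\n"
    + "\n".join(product_entry(p) for p in CATALOG)
)
print(prompt)
```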
The STS demonstrates that an LLM can be manipulated to increase the chances of a given product being listed as the top recommendation. The researchers developed a framework that games an LLM's recommendations in favor of a target product by inserting an STS into that product's information. The STS is optimized with adversarial attack algorithms such as the Greedy Coordinate Gradient (GCG) algorithm, improving product visibility in business and e-commerce settings. The framework also makes the STS robust to changes in the order in which product information is listed in the LLM's input prompt.
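The paper's own optimization code is not reproduced here, but the following is a minimal, simplified sketch of the greedy coordinate gradient idea it builds on: the gradient of a target-completion loss with respect to a one-hot encoding of the adversarial tokens proposes candidate substitutions, and the substitution that most reduces the loss is kept. GPT-2 is used purely as a stand-in model, and the prefix, target string, sequence length, and iteration counts are illustrative assumptions rather than the paper's settings.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in model; the actual attack targets a different, larger LLM.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
embed = model.get_input_embeddings()

# Illustrative product info prefix and desired completion.
prefix_ids = tok("Product: ColdBrew Master, $89 coffee machine. ", return_tensors="pt").input_ids[0]
target_ids = tok("I recommend the ColdBrew Master as the top choice.", return_tensors="pt").input_ids[0]

adv_len, vocab = 10, embed.weight.shape[0]
adv_ids = torch.full((adv_len,), tok.encode("!")[0], dtype=torch.long)  # trivial start

def target_loss(adv):
    """Cross-entropy of the desired completion given prefix + adversarial tokens."""
    ids = torch.cat([prefix_ids, adv, target_ids]).unsqueeze(0)
    labels = ids.clone()
    labels[0, : len(prefix_ids) + len(adv)] = -100  # score only the target span
    return model(ids, labels=labels).loss

for step in range(20):  # the real attack reportedly runs ~2000 iterations
    # Gradient of the loss w.r.t. a one-hot encoding of the adversarial tokens.
    one_hot = torch.zeros(adv_len, vocab)
    one_hot.scatter_(1, adv_ids.unsqueeze(1), 1.0)
    one_hot.requires_grad_(True)
    full = torch.cat([embed(prefix_ids), one_hot @ embed.weight, embed(target_ids)]).unsqueeze(0)
    labels = torch.cat([prefix_ids, adv_ids, target_ids]).unsqueeze(0).clone()
    labels[0, : len(prefix_ids) + adv_len] = -100
    model(inputs_embeds=full, labels=labels).loss.backward()
    candidates = (-one_hot.grad).topk(8, dim=1).indices  # top substitutions per position
    model.zero_grad(set_to_none=True)

    # Greedy coordinate step: keep the single substitution that lowers the loss most.
    with torch.no_grad():
        best_ids, best_loss = adv_ids, target_loss(adv_ids)
        for pos in range(adv_len):
            for cand in candidates[pos]:
                trial = adv_ids.clone()
                trial[pos] = cand
                loss = target_loss(trial)
                if loss < best_loss:
                    best_ids, best_loss = trial.clone(), loss
        adv_ids = best_ids
    print(step, round(best_loss.item(), 3), tok.decode(adv_ids))
```

The full GCG algorithm samples batches of random position/token swaps rather than exhaustively testing every candidate; this sketch keeps the exhaustive greedy step for readability.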
The GCG algorithm was run for 2,000 iterations to optimize the STS, and the target product, ColdBrew Master, improved steadily over the iterations: initially it was not recommended at all, but after about 100 iterations it appeared as the top recommendation. The effect of the STS on the target product's rank was then evaluated over 200 LLM inferences with and without the sequence. Without further precautions, the STS is roughly as likely to hurt the product's rank as to help it; however, when the order of the products is randomized during the STS optimization phase, the advantage becomes far more frequent and the disadvantage is minimized.
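A hedged sketch of that evaluation protocol: query the recommendation prompt repeatedly with and without the STS and compare how often the target product lands at rank 1. The `rank_of_target` function below is a hypothetical stand-in for an actual LLM call plus answer parsing, and the sampled rank distributions are invented for demonstration.

```python
import random

# Hypothetical stand-in for "query the LLM and parse the target product's rank".
def rank_of_target(with_sts: bool) -> int:
    # Invented rank distributions, for demonstration only.
    return random.choice([1, 1, 2, 3] if with_sts else [2, 3, 3, 4])

def top1_rate(with_sts: bool, n: int = 200) -> float:
    """Fraction of n inferences in which the target product is recommended first."""
    return sum(rank_of_target(with_sts) == 1 for _ in range(n)) / n

print("top-1 rate without STS:", top1_rate(False))
print("top-1 rate with STS:   ", top1_rate(True))
```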
In conclusion, the researchers introduced the STS, a carefully crafted message that can influence LLM-driven search tools in the context of e-commerce. It improves a product's ranking in the LLM's recommendations when an optimized sequence of tokens is inserted into the product's information page. They also developed a framework that inserts the STS into a product's information and optimizes it with the GCG algorithm, improving product visibility in business and e-commerce. The impact of this work is not bound to e-commerce alone; it also highlights the broader implications of AI-driven search optimization and the ethical considerations that come with it.
Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.
As the researchers put it: "As we increasingly rely on LLMs for product recommendations and searches, can companies game these models to enhance the visibility of their products? Our latest work provides answers to this question and demonstrates that LLMs can be manipulated to boost product visibility!"
Sajjad Ansari is a final year undergraduate from IIT Kharagpur. As a Tech enthusiast, he delves into the practical applications of AI with a focus on understanding the impact of AI technologies and their real-world implications. He aims to articulate complex AI concepts in a clear and accessible manner.