Friday, May 9, 2025
News PouroverAI
Visit PourOver.AI
No Result
View All Result
  • Home
  • AI Tech
  • Business
  • Blockchain
  • Data Science & ML
  • Cloud & Programming
  • Automation
  • Front-Tech
  • Marketing
  • Home
  • AI Tech
  • Business
  • Blockchain
  • Data Science & ML
  • Cloud & Programming
  • Automation
  • Front-Tech
  • Marketing
News PouroverAI
No Result
View All Result

Protecting LLM applications with Azure AI Content Safety

May 9, 2024
in Cloud & Programming
Reading Time: 5 mins read
0 0
A A
0
Share on FacebookShare on Twitter


Both extremely promising and extremely risky, generative AI has distinct failure modes that we need to defend against to protect our users and our code. We’ve all seen the news, where chatbots are encouraged to be insulting or racist, or large language models (LLMs) are exploited for malicious purposes, and where outputs are at best fanciful and at worst dangerous.None of this is particularly surprising. It’s possible to craft complex prompts that force undesired outputs, pushing the input window past the guidelines and guardrails we’re using. At the same time, we can see outputs that go beyond the data in the foundation model, generating text that’s no longer grounded in reality, producing plausible, semantically correct nonsense.While we can use techniques like retrieval-augmented generation (RAG) and tools like Semantic Kernel and LangChain to keep our applications grounded in our data, there are still prompt attacks that can produce bad outputs and cause reputational risks. What’s needed is a way to test our AI applications in advance to, if not ensure their safety, at least mitigate the risk of these attacks—as well as making sure that our own prompts don’t force bias or allow inappropriate queries.

Introducing Azure AI Content SafetyMicrosoft has long been aware of these risks. You don’t have a PR disaster like the Tay chatbot without learning lessons. As a result the company has been investing heavily in a cross-organizational responsible AI program. Part of that team, Azure AI Responsible AI, has been focused on protecting applications built using Azure AI Studio, and has been developing a set of tools that are bundled as Azure AI Content Safety.Dealing with prompt injection attacks is increasingly important, as a malicious prompt not only could deliver unsavory content, but could be used to extract the data used to ground a model, delivering proprietary information in an easy to exfiltrate format. While it’s obviously important to ensure RAG data doesn’t contain personally identifiable information or commercially sensitive data, private API connections to line-of-business systems are ripe for manipulation by bad actors.

We need a set of tools that allow us to test AI applications before they’re delivered to users, and that allow us to apply advanced filters to inputs to reduce the risk of prompt injection, blocking known attack types before they can be used on our models. While you could build your own filters, logging all inputs and outputs and using them to build a set of detectors, your application may not have the necessary scale to trap all attacks before they’re used on you. There aren’t many bigger AI platforms than Microsoft’s ever-growing family of models, and its Azure AI Studio development environment. With Microsoft’s own Copilot services building on its investment in OpenAI, it’s able to track prompts and outputs across a wide range of different scenarios, with various levels of grounding and with many different data sources. That allows Microsoft’s AI safety team to understand quickly what types of prompt cause problems and to fine-tune their service guardrails accordingly.

Using Prompt Shields to control AI inputsPrompt Shields are a set of real-time input filters that sit in front of a large language model. You construct prompts as normal, either directly or via RAG, and the Prompt Shield analyses them and blocks malicious prompts before they are submitted to your LLM. Currently there are two kinds of Prompt Shields. Prompt Shields for User Prompts is designed to protect your application from user prompts that redirect the model away from your grounding data and towards inappropriate outputs. These can clearly be a significant reputational risk, and by blocking prompts that elicit these outputs, your LLM application should remain focused on your specific use cases. While the attack surface for your LLM application may be small, Copilot’s is large. By enabling Prompt Shields you can leverage the scale of Microsoft’s security engineering.

Prompt Shields for Documents helps reduce the risk of compromise via indirect attacks. These use alternative data sources, for example poisoned documents or malicious websites, that hide additional prompt content from existing protections. Prompt Shields for Documents analyses the contents of these files and blocks those that match patterns associated with attacks. With attackers increasingly taking advantage of techniques like this, there’s a significant risk associated with them, as they’re hard to detect using conventional security tooling. It’s important to use protections like Prompt Shields with AI applications that, for example, summarize documents or automatically reply to emails.

Using Prompt Shields involves making an API call with the user prompt and any supporting documents. These are analyzed for vulnerabilities, with the response simply showing that an attack has been detected. You can then add code to your LLM orchestration to trap this response, then block that user’s access, check the prompt they’ve used, and develop additional filters to keep those attacks from being used in the future.

Checking for ungrounded outputsAlong with these prompt defenses, Azure AI Content Safety includes tools to help detect when a model becomes ungrounded, generating random (if plausible) outputs. This feature works only with applications that use grounding data sources, for example a RAG application or a document summarizer. The Groundedness Detection tool is itself a language model, one that’s used to provide a feedback loop for LLM output. It compares the output of the LLM with the data that’s used to ground it, evaluating it to see if it is based on the source data, and if not, generating an error. This process, Natural Language Inference, is still in its early days, and the underlying model is intended to be updated as Microsoft’s responsible AI teams continue to develop ways to keep AI models from losing context.

Keeping users safe with warningsOne important aspect of the Azure AI Content Safety services is informing users when they’re doing something unsafe with an LLM. Perhaps they’ve been socially engineered to deliver a prompt that exfiltrates data: “Try this, it’ll do something really cool!” Or maybe they’ve simply made an error. Providing guidance for writing safe prompts for a LLM is as much a part of securing a service as providing shields for your prompts.Microsoft is adding system message templates to Azure AI Studio that can be used in conjunction with Prompt Shields and with other AI security tools. These are shown automatically in the Azure AI Studio development playground, allowing you to understand what systems messages are displayed when, helping you create your own custom messages that fit your application design and content strategy.

Testing and monitoring your modelsAzure AI Studio remains the best place to build applications that work with Azure-hosted LLMs, whether they’re from the Azure OpenAI service or imported from Hugging Face. The studio includes automated evaluations for your applications, which now include ways of assessing the safety of your application, using prebuilt attacks to test how your model responds to jailbreaks and indirect attacks, and whether it might output harmful content. You can use your own prompts or Microsoft’s adversarial prompt templates as the basis of your test inputs. Once you have an AI application up and running, you will need to monitor it to ensure that new adversarial prompts don’t succeed in jailbreaking it. Azure OpenAI now includes risk monitoring, tied to the various filters used by the service, including Prompt Shields. You can see the types of attacks used, both inputs and outputs, as well as the volume of the attacks. There’s the option of understanding which users are using your application maliciously, allowing you to identify the patterns behind attacks and to tune block lists appropriately.

Ensuring that malicious users can’t jailbreak a LLM is only one part of delivering trustworthy, responsible AI applications. Output is as important as input. By checking output data against source documents, we can add a feedback loop that lets us refine prompts to avoid losing groundedness. All we need to remember is that these tools will need to evolve alongside our AI services, getting better and stronger as generative AI models improve.

Copyright © 2024 IDG Communications, Inc.



Source link

Tags: applicationsAzureContentLLMprotectingSafety
Previous Post

How Secure Is iCloud? Our Expert Explains for 2024

Next Post

Best Web3 Tokens List for 2024

Related Posts

Top 20 Javascript Libraries You Should Know in 2024
Cloud & Programming

Top 20 Javascript Libraries You Should Know in 2024

June 10, 2024
Simplify risk and compliance assessments with the new common control library in AWS Audit Manager
Cloud & Programming

Simplify risk and compliance assessments with the new common control library in AWS Audit Manager

June 6, 2024
Simplify Regular Expressions with RegExpBuilderJS
Cloud & Programming

Simplify Regular Expressions with RegExpBuilderJS

June 6, 2024
How to learn data visualization to accelerate your career
Cloud & Programming

How to learn data visualization to accelerate your career

June 6, 2024
BitTitan Announces Seasoned Tech Leader Aaron Wadsworth as General Manager
Cloud & Programming

BitTitan Announces Seasoned Tech Leader Aaron Wadsworth as General Manager

June 6, 2024
Copilot Studio turns to AI-powered workflows
Cloud & Programming

Copilot Studio turns to AI-powered workflows

June 6, 2024
Next Post
Best Web3 Tokens List for 2024

Best Web3 Tokens List for 2024

6 Compelling Reasons To Invest In A Hospital Management System (HMS)

6 Compelling Reasons To Invest In A Hospital Management System (HMS)

Byju’s cuts annual course fees by 30-40%, hikes sales incentive by 50-100%

Byju's cuts annual course fees by 30-40%, hikes sales incentive by 50-100%

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

  • Trending
  • Comments
  • Latest
Is C.AI Down? Here Is What To Do Now

Is C.AI Down? Here Is What To Do Now

January 10, 2024
Porfo: Revolutionizing the Crypto Wallet Landscape

Porfo: Revolutionizing the Crypto Wallet Landscape

October 9, 2023
A Complete Guide to BERT with Code | by Bradney Smith | May, 2024

A Complete Guide to BERT with Code | by Bradney Smith | May, 2024

May 19, 2024
A faster, better way to prevent an AI chatbot from giving toxic responses | MIT News

A faster, better way to prevent an AI chatbot from giving toxic responses | MIT News

April 10, 2024
Part 1: ABAP RESTful Application Programming Model (RAP) – Introduction

Part 1: ABAP RESTful Application Programming Model (RAP) – Introduction

November 20, 2023
Saginaw HMI Enclosures and Suspension Arm Systems from AutomationDirect – Library.Automationdirect.com

Saginaw HMI Enclosures and Suspension Arm Systems from AutomationDirect – Library.Automationdirect.com

December 6, 2023
Can You Guess What Percentage Of Their Wealth The Rich Keep In Cash?

Can You Guess What Percentage Of Their Wealth The Rich Keep In Cash?

June 10, 2024
AI Compared: Which Assistant Is the Best?

AI Compared: Which Assistant Is the Best?

June 10, 2024
How insurance companies can use synthetic data to fight bias

How insurance companies can use synthetic data to fight bias

June 10, 2024
5 SLA metrics you should be monitoring

5 SLA metrics you should be monitoring

June 10, 2024
From Low-Level to High-Level Tasks: Scaling Fine-Tuning with the ANDROIDCONTROL Dataset

From Low-Level to High-Level Tasks: Scaling Fine-Tuning with the ANDROIDCONTROL Dataset

June 10, 2024
UGRO Capital: Targeting to hit milestone of Rs 20,000 cr loan book in 8-10 quarters: Shachindra Nath

UGRO Capital: Targeting to hit milestone of Rs 20,000 cr loan book in 8-10 quarters: Shachindra Nath

June 10, 2024
Facebook Twitter LinkedIn Pinterest RSS
News PouroverAI

The latest news and updates about the AI Technology and Latest Tech Updates around the world... PouroverAI keeps you in the loop.

CATEGORIES

  • AI Technology
  • Automation
  • Blockchain
  • Business
  • Cloud & Programming
  • Data Science & ML
  • Digital Marketing
  • Front-Tech
  • Uncategorized

SITEMAP

  • Disclaimer
  • Privacy Policy
  • DMCA
  • Cookie Privacy Policy
  • Terms and Conditions
  • Contact us

Copyright © 2023 PouroverAI News.
PouroverAI News

No Result
View All Result
  • Home
  • AI Tech
  • Business
  • Blockchain
  • Data Science & ML
  • Cloud & Programming
  • Automation
  • Front-Tech
  • Marketing

Copyright © 2023 PouroverAI News.
PouroverAI News

Welcome Back!

Login to your account below

Forgotten Password? Sign Up

Create New Account!

Fill the forms bellow to register

All fields are required. Log In

Retrieve your password

Please enter your username or email address to reset your password.

Log In