Merge pull request #120 from MicrosoftDocs/main
9/4/2024 AM Publish
Taojunshen authored Sep 4, 2024
2 parents 7d4d00a + ca73c4f commit 0a7bb05
Showing 63 changed files with 453 additions and 1,213 deletions.
12 changes: 6 additions & 6 deletions articles/ai-services/content-safety/overview.md
@@ -47,9 +47,9 @@ There are different types of analysis available from this service. The following

| Feature | Functionality | Concepts guide | Get started |
| :-------------------------- | :---------------------- | --| --|
| [Prompt Shields](/rest/api/contentsafety/text-operations/detect-text-jailbreak) (preview) | Scans text for the risk of a User input attack on a Large Language Model. | [Prompt Shields concepts](/azure/ai-services/content-safety/concepts/jailbreak-detection)|[Quickstart](./quickstart-jailbreak.md) |
| [Prompt Shields](/rest/api/contentsafety/text-operations/detect-text-jailbreak) | Scans text for the risk of a User input attack on a Large Language Model. | [Prompt Shields concepts](/azure/ai-services/content-safety/concepts/jailbreak-detection)|[Quickstart](./quickstart-jailbreak.md) |
| [Groundedness detection](/rest/api/contentsafety/text-groundedness-detection-operations/detect-groundedness-options) (preview) | Detects whether the text responses of large language models (LLMs) are grounded in the source materials provided by the users. | [Groundedness detection concepts](/azure/ai-services/content-safety/concepts/groundedness)|[Quickstart](./quickstart-groundedness.md) |
| [Protected material text detection](/rest/api/contentsafety/text-operations/detect-text-protected-material) (preview) | Scans AI-generated text for known text content (for example, song lyrics, articles, recipes, selected web content). | [Protected material concepts](/azure/ai-services/content-safety/concepts/protected-material)|[Quickstart](./quickstart-protected-material.md)|
| [Protected material text detection](/rest/api/contentsafety/text-operations/detect-text-protected-material) | Scans AI-generated text for known text content (for example, song lyrics, articles, recipes, selected web content). | [Protected material concepts](/azure/ai-services/content-safety/concepts/protected-material)|[Quickstart](./quickstart-protected-material.md)|
| Custom categories API (preview) | Lets you create and train your own custom content categories and scan text for matches. | [Custom categories concepts](/azure/ai-services/content-safety/concepts/custom-categories)|[Quickstart](./quickstart-custom-categories.md) |
| Custom categories (rapid) API (preview) | Lets you define emerging harmful content patterns and scan text and images for matches. | [Custom categories concepts](/azure/ai-services/content-safety/concepts/custom-categories)| [How-to guide](./how-to/custom-categories-rapid.md) |
| [Analyze text](/rest/api/contentsafety/text-operations/analyze-text) API | Scans text for sexual content, violence, hate, and self harm with multi-severity levels. | [Harm categories](/azure/ai-services/content-safety/concepts/harm-categories)| [Quickstart](/azure/ai-services/content-safety/quickstart-text) |
@@ -105,7 +105,7 @@ Currently, Azure AI Content Safety has an **F0 and S0** pricing tier. See the Az
See the following list for the input requirements for each feature.

<!--
| | Analyze text API | Analyze image API | Prompt Shields<br>(preview) | Groundedness<br>detection (preview) | Protected material<br>detection (preview) |
| | Analyze text API | Analyze image API | Prompt Shields<br> | Groundedness<br>detection (preview) | Protected material<br>detection |
|-------|---|----------|----------|-----|-----|
| Input requirements: | Default maximum length: 10K characters (split longer texts as needed). | Maximum image file size: 4 MB<br>Dimensions between 50x50 and 2048x2048 pixels.<br>Images can be in JPEG, PNG, GIF, BMP, TIFF, or WEBP formats. | Maximum prompt length: 10K characters.<br>Up to five documents with a total of 10K characters. | Maximum 55,000 characters for grounding sources per API call.<br>Maximum text and query length: 7,500 characters. | Default maximum: 1K characters.<br>Minimum: 110 characters (for scanning LLM completions, not user prompts). | -->

@@ -115,13 +115,13 @@ See the following list for the input requirements for each feature.
- Maximum image file size: 4 MB
- Dimensions between 50 x 50 and 2048 x 2048 pixels.
- Images can be in JPEG, PNG, GIF, BMP, TIFF, or WEBP formats.
- **Prompt Shields (preview)**:
- **Prompt Shields**:
- Maximum prompt length: 10K characters.
- Up to five documents with a total of 10K characters.
- **Groundedness detection (preview)**:
- Maximum length for grounding sources: 55,000 characters (per API call).
- Maximum text and query length: 7,500 characters.
- **Protected material detection (preview)**:
- **Protected material detection**:
- Default maximum length: 1K characters.
- Default minimum length: 110 characters (for scanning LLM completions, not user prompts).
- **Custom categories (standard)**:
@@ -172,7 +172,7 @@ Feel free to [contact us](mailto:[email protected]) if you need

Content Safety features have query rate limits in requests-per-second (RPS) or requests-per-10-seconds (RP10S). See the following table for the rate limits for each feature.

|Pricing tier | Moderation APIs<br>(text and image) | Prompt Shields<br>(preview) | Protected material<br>detection (preview) | Groundedness<br>detection (preview) | Custom categories<br>(rapid) (preview) | Custom categories<br>(standard) (preview)|
|Pricing tier | Moderation APIs<br>(text and image) | Prompt Shields | Protected material<br>detection | Groundedness<br>detection (preview) | Custom categories<br>(rapid) (preview) | Custom categories<br>(standard) (preview)|
|--------|---------|-------------|---------|---------|---------|--|
| F0 | 1000 RP10S | 1000 RP10S | 1000 RP10S | 50 RP10S | 1000 RP10S | 5 RPS|
| S0 | 1000 RP10S | 1000 RP10S | 1000 RP10S | 50 RP10S | 1000 RP10S | 5 RPS|
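
When a request exceeds these limits, the service replies with HTTP 429 (Too Many Requests), so clients should back off and retry. The shell sketch below illustrates the pattern; the `text:analyze` route, API version, and placeholders mirror the cURL examples elsewhere in these docs and are assumptions for illustration, not part of the rate-limit contract.

```shell
# Hypothetical backoff loop: retry an Analyze Text call while the service throttles (HTTP 429).
for attempt in 1 2 3 4 5; do
  status=$(curl --silent --output response.json --write-out '%{http_code}' \
    --request POST '<endpoint>/contentsafety/text:analyze?api-version=2024-09-01' \
    --header 'Ocp-Apim-Subscription-Key: <your_subscription_key>' \
    --header 'Content-Type: application/json' \
    --data-raw '{"text": "Sample text to analyze."}')
  if [ "$status" != "429" ]; then
    break                     # success or a non-throttling error; stop retrying
  fi
  sleep $(( 2 ** attempt ))   # wait 2, 4, 8, ... seconds before the next attempt
done
```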
44 changes: 40 additions & 4 deletions articles/ai-services/content-safety/quickstart-jailbreak.md
@@ -1,5 +1,5 @@
---
title: "Quickstart: Prompt Shields (preview)"
title: "Quickstart: Prompt Shields "
titleSuffix: Azure AI services
description: Learn how to detect large language model input attack risks and mitigate risk with Azure AI Content Safety.
services: ai-services
@@ -13,10 +13,46 @@ ms.author: pafarley

# Quickstart: Prompt Shields (preview)

Follow this guide to use Azure AI Content Safety Prompt Shields to check your large language model (LLM) inputs for both User Prompt and Document attacks.
"Prompt Shields" in Azure AI Content Safety are specifically designed to safeguard generative AI systems from generating harmful or inappropriate content. These shields detect and mitigate risks associated with both User Prompt Attacks (malicious or harmful user-generated inputs) and Document Attacks (inputs containing harmful content embedded within documents). The use of "Prompt Shields" is crucial in environments where GenAI is employed, ensuring that AI outputs remain safe, compliant, and trustworthy.

The primary objectives of the "Prompt Shields" feature for GenAI applications are:

- To detect and block harmful or policy-violating user prompts that could lead to unsafe AI outputs.
- To identify and mitigate document attacks where harmful content is embedded within user-provided documents.
- To maintain the integrity, safety, and compliance of AI-generated content, thereby preventing misuse of GenAI systems.

For more information on Prompt Shields, see the [Prompt Shields concept page](./concepts/jailbreak-detection.md). For API input limits, see the [Input requirements](./overview.md#input-requirements) section of the Overview.




## User scenarios
### 1. AI content creation platforms: Detecting harmful prompts
- Scenario: An AI content creation platform uses generative AI models to produce marketing copy, social media posts, and articles based on user-provided prompts. To prevent the generation of harmful or inappropriate content, the platform integrates "Prompt Shields."
- User: Content creators, platform administrators, and compliance officers.
- Action: The platform uses Azure AI Content Safety's "Prompt Shields" to analyze user prompts before generating content. If a prompt is detected as potentially harmful or likely to lead to policy-violating outputs (e.g., prompts asking for defamatory content or hate speech), the shield blocks the prompt and alerts the user to modify their input.
- Outcome: The platform ensures all AI-generated content is safe, ethical, and compliant with community guidelines, enhancing user trust and protecting the platform's reputation.
### 2. AI-powered chatbots: Mitigating risk from user prompt attacks
- Scenario: A customer service provider uses AI-powered chatbots for automated support. To safeguard against user prompts that could lead the AI to generate inappropriate or unsafe responses, the provider uses "Prompt Shields."
- User: Customer service agents, chatbot developers, and compliance teams.
- Action: The chatbot system integrates "Prompt Shields" to monitor and evaluate user inputs in real time. If a user prompt is identified as potentially harmful or designed to exploit the AI (e.g., attempting to provoke inappropriate responses or extract sensitive information), the shield intervenes by blocking the response or redirecting the query to a human agent.
- Outcome: The customer service provider maintains high standards of interaction safety and compliance, preventing the chatbot from generating responses that could harm users or breach policies.
### 3. E-learning platforms: Preventing inappropriate AI-generated educational content
- Scenario: An e-learning platform employs GenAI to generate personalized educational content based on student inputs and reference documents. To avoid generating inappropriate or misleading educational content, the platform utilizes "Prompt Shields."
- User: Educators, content developers, and compliance officers.
- Action: The platform uses "Prompt Shields" to analyze both user prompts and uploaded documents for content that could lead to unsafe or policy-violating AI outputs. If a prompt or document is detected as likely to generate inappropriate educational content, the shield blocks it and suggests alternative, safe inputs.
- Outcome: The platform ensures that all AI-generated educational materials are appropriate and compliant with academic standards, fostering a safe and effective learning environment.
### 4. Healthcare AI assistants: Blocking unsafe prompts and document inputs
- Scenario: A healthcare provider uses AI assistants to offer preliminary medical advice based on user inputs and uploaded medical documents. To ensure the AI does not generate unsafe or misleading medical advice, the provider implements "Prompt Shields."
- User: Healthcare providers, AI developers, and compliance teams.
- Action: The AI assistant employs "Prompt Shields" to analyze patient prompts and uploaded medical documents for harmful or misleading content. If a prompt or document is identified as potentially leading to unsafe medical advice, the shield prevents the AI from generating a response and redirects the patient to a human healthcare professional.
- Outcome: The healthcare provider ensures that AI-generated medical advice remains safe and accurate, protecting patient safety and maintaining compliance with healthcare regulations.
### 5. Generative AI for creative writing: Protecting against prompt manipulation
- Scenario: A creative writing platform uses GenAI to assist writers in generating stories, poetry, and scripts based on user inputs. To prevent the generation of inappropriate or offensive content, the platform incorporates "Prompt Shields."
- User: Writers, platform moderators, and content reviewers.
- Action: The platform integrates "Prompt Shields" to evaluate user prompts for creative writing. If a prompt is detected as likely to produce offensive, defamatory, or otherwise inappropriate content, the shield blocks the AI from generating such content and suggests revisions to the user.
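
Across these scenarios, the integration pattern is the same: call Prompt Shields before handing input to the model, then branch on the `attackDetected` flag in the response. The sketch below is illustrative only; it assumes the response shape shown later in this quickstart has been saved to `response.json` and that the `jq` CLI is available.

```shell
# Hypothetical gate: forward the user prompt to the LLM only when no attack is detected.
if [ "$(jq -r '.userPromptAnalysis.attackDetected' response.json)" = "false" ]; then
  echo "Prompt is clear; forward it to the model."
else
  echo "Potential prompt attack detected; block the request or route it to human review."
fi
```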


## Prerequisites

* An Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services/)
@@ -33,7 +33,7 @@ This section walks through a sample request with cURL. Paste the command below i
1. Optionally, replace the `"userPrompt"` or `"documents"` fields in the body with the text you'd like to analyze.

```shell
curl --location --request POST '<endpoint>/contentsafety/text:shieldPrompt?api-version=2024-02-15-preview' \
curl --location --request POST '<endpoint>/contentsafety/text:shieldPrompt?api-version=2024-09-01' \
--header 'Ocp-Apim-Subscription-Key: <your_subscription_key>' \
--header 'Content-Type: application/json' \
--data-raw '{
  "userPrompt": "<sample user prompt to analyze>",
  "documents": ["<sample document text to analyze>"]
}'
```
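
If the request succeeds, the API returns `200 OK` with a JSON body that flags whether an attack was detected. The sketch below is representative; the field names assume the Prompt Shields response schema documented in the concepts guide, so check the live response for the authoritative shape.

```json
{
  "userPromptAnalysis": {
    "attackDetected": false
  },
  "documentsAnalysis": [
    {
      "attackDetected": false
    }
  ]
}
```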
@@ -48,7 +48,7 @@ The following fields must be included in the URL:

| Name | Required? | Description | Type |
| :-- | :-- | :----- | :----- |
| **API Version** | Required | This is the API version to be used. The current version is: api-version=2024-02-15-preview. Example: `<endpoint>/contentsafety/text:shieldPrompt?api-version=2024-02-15-preview` | String |
| **API Version** | Required | This is the API version to be used. The current version is: api-version=2024-09-01. Example: `<endpoint>/contentsafety/text:shieldPrompt?api-version=2024-09-01` | String |

The parameters in the request body are defined in this table:

