Multi-Modal-Content-Safety-Evaluators (Azure#38002)
* Initial-Commit-multimodal

* Fix

* Sync eng/common directory with azure-sdk-tools for PR 9092 (Azure#37713)

* Export the subscription data from the service connection

* Update deploy-test-resources.yml

---------

Co-authored-by: Wes Haggard <[email protected]>
Co-authored-by: Wes Haggard <[email protected]>

* Removing private parameter from __call__ of AdversarialSimulator (Azure#37709)

* Update task_query_response.prompty

remove required keys

* Update task_simulate.prompty

* Update task_query_response.prompty

* Update task_simulate.prompty

* Remove private variable and use kwargs

* Add experimental tag to adv sim

---------

Co-authored-by: Nagkumar Arkalgud <[email protected]>

* Enabling option to disable response payload on writes (Azure#37365)

* Initial draft

* Adding tests

* Renaming parameter

* Update container.py

* Renaming test file

* Fixing LINT issues

* Update container.py

* Update _base.py

* Update _base.py

* Fixing tests

* Fixing tests

* Adding support to disable response payload on write for AIO

* Update CHANGELOG.md

* Update _cosmos_client.py

* Reacting to code review comments

* Addressing code review feedback

* Addressed CR feedback

* Fixing pyLint errors

* Fixing pylint errors

* Update test_crud.py

* Fixing svc regression

* Update sdk/cosmos/azure-cosmos/azure/cosmos/aio/_container.py

Co-authored-by: Anna Tisch <[email protected]>

* Reacting to code review feedback.

* Update container.py

* Update test_query_vector_similarity.py

---------

Co-authored-by: Anna Tisch <[email protected]>

* deprecate azure_germany (Azure#37654)

* deprecate azure_germany

* update

* update

* Update sdk/identity/azure-identity/azure/identity/_constants.py

Co-authored-by: Paul Van Eck <[email protected]>

* update

---------

Co-authored-by: Paul Van Eck <[email protected]>

* Add default impl to handle token challenges (Azure#37652)

* Add default impl to handle token challenges

* update version

* update

* update

* update

* update

* Update sdk/core/azure-core/azure/core/pipeline/policies/_utils.py

Co-authored-by: Paul Van Eck <[email protected]>

* Update sdk/core/azure-core/azure/core/pipeline/policies/_utils.py

Co-authored-by: Paul Van Eck <[email protected]>

* update

* Update sdk/core/azure-core/tests/test_utils.py

Co-authored-by: Paul Van Eck <[email protected]>

* Update sdk/core/azure-core/azure/core/pipeline/policies/_utils.py

Co-authored-by: Paul Van Eck <[email protected]>

* update

---------

Co-authored-by: Paul Van Eck <[email protected]>

* Make Credentials Required for Content Safety and Protected Materials Evaluators (Azure#37707)

* Make Credentials Required for Content Safety Evaluators

* fix a typo

* lint, fix content safety evaluator

* revert test change

* remove credential from rai_service

* addFeedRangesAndUseFeedRangeInQueryChangeFeed (Azure#37687)

* Add getFeedRanges API 
* Add feedRange support in query changeFeed


Co-authored-by: annie-mac <[email protected]>

* Update release date for core (Azure#37723)

* Improvements to mindependency dev_requirement conflict resolution (Azure#37669)

* during mindependency runs, dev_requirements on local relative paths are now checked for conflict with the targeted set of minimum dependencies
* multiple type clarifications within azure-sdk-tools
* added tests for new conflict resolution logic

---------

Co-authored-by: McCoy Patiño <[email protected]>

* Need to add environment to subscription configuration (Azure#37726)

Co-authored-by: Wes Haggard <[email protected]>

* Enable samples for formrecognizer (Azure#37676)

* multi-modal-changes

* fixes

* Fix with latest

* dict-fix

* adding-protected-material

* adding-protected-material

* adding-protected-material

* bumping-version

* adding assets

* Added image in simulator

* Added image in simulator

* bumping-version

* push-asset

* assets

* pushing asset

* remove-containt-on-key

* asset

* asset2

* asset3

* asset4

* adding conftest

* conftest

* cred fix

* asset-new

* fix

* asset

* adding multi-modal-without-tests

* asset-from-main

* asset-from-main

* fix

* adding one test only

* new asset

* tests,fix: Sanitizer should replace with enum value not enum name

* test-asset

* [AutoRelease] t2-containerservicefleet-2024-09-24-42036(can only be merged by SDK owner) (Azure#37538)

* code and test

* Update CHANGELOG.md

* update-testcase

---------

Co-authored-by: azure-sdk <PythonSdkPipelines>
Co-authored-by: ChenxiJiang333 <[email protected]>
Co-authored-by: ChenxiJiang333 <[email protected]>

* [AutoRelease] t2-dns-2024-09-25-81486(can only be merged by SDK owner) (Azure#37560)

* code and test

* update-testcase

* Update CHANGELOG.md

* Update test_mgmt_dns_test.py

---------

Co-authored-by: azure-sdk <PythonSdkPipelines>
Co-authored-by: ChenxiJiang333 <[email protected]>
Co-authored-by: ChenxiJiang333 <[email protected]>

* [AutoRelease] t2-appconfiguration-2024-10-09-68726(can only be merged by SDK owner) (Azure#37800)

* code and test

* update-testcase

* Update pyproject.toml

---------

Co-authored-by: azure-sdk <PythonSdkPipelines>
Co-authored-by: ChenxiJiang333 <[email protected]>
Co-authored-by: Yuchao Yan <[email protected]>

* code and test (Azure#37855)

Co-authored-by: azure-sdk <PythonSdkPipelines>

* [AutoRelease] t2-servicefabricmanagedclusters-2024-10-08-57405(can only be merged by SDK owner) (Azure#37768)

* code and test

* update-testcase

* update-testcases

---------

Co-authored-by: azure-sdk <PythonSdkPipelines>
Co-authored-by: ChenxiJiang333 <[email protected]>

* [AutoRelease] t2-containerinstance-2024-10-21-66631(can only be merged by SDK owner) (Azure#38005)

* code and test

* update-testcase

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

---------

Co-authored-by: azure-sdk <PythonSdkPipelines>
Co-authored-by: ChenxiJiang333 <[email protected]>
Co-authored-by: ChenxiJiang333 <[email protected]>

* [sdk generation pipeline] bump typespec-python 0.36.1 (Azure#38008)

* update version

* update package.json

* [AutoRelease] t2-dnsresolver-2024-10-12-16936(can only be merged by SDK owner) (Azure#37864)

* code and test

* update-testcase

* Update CHANGELOG.md

* Update CHANGELOG.md

---------

Co-authored-by: azure-sdk <PythonSdkPipelines>
Co-authored-by: ChenxiJiang333 <[email protected]>
Co-authored-by: ChenxiJiang333 <[email protected]>
Co-authored-by: Yuchao Yan <[email protected]>

* new asset after fix in conftest

* asset

* chore: Update assets.json

* Move perf pipelines to TME subscription (Azure#38020)

Co-authored-by: Wes Haggard <[email protected]>

* fix

* after-comments

* fix

* asset

* new asset with 1 test recording only

* chore: Update assets.json

* conftest fix

* assets change

* new test

* few changes

* removing proxy start

* added all tests

* asset

* fixes

* fixes with asset

* asset-after-tax

* enabling 2 more tests

* unit test fix

* asset

* new asset

* fixes per comments

* changes by black

* merge fix

* pylint fix

* pylint fix

* ground test fix

* fixes - pylint, black, mypy

* more tests

* docstring fixes

* doc string fix

* asset

* few updates after Nagkumar review

---------

Co-authored-by: Azure SDK Bot <[email protected]>
Co-authored-by: Wes Haggard <[email protected]>
Co-authored-by: Wes Haggard <[email protected]>
Co-authored-by: Nagkumar Arkalgud <[email protected]>
Co-authored-by: Nagkumar Arkalgud <[email protected]>
Co-authored-by: Fabian Meiswinkel <[email protected]>
Co-authored-by: Anna Tisch <[email protected]>
Co-authored-by: Xiang Yan <[email protected]>
Co-authored-by: Paul Van Eck <[email protected]>
Co-authored-by: Neehar Duvvuri <[email protected]>
Co-authored-by: Annie Liang <[email protected]>
Co-authored-by: annie-mac <[email protected]>
Co-authored-by: Scott Beddall <[email protected]>
Co-authored-by: McCoy Patiño <[email protected]>
Co-authored-by: kdestin <[email protected]>
Co-authored-by: ChenxiJiang333 <[email protected]>
Co-authored-by: ChenxiJiang333 <[email protected]>
Co-authored-by: Yuchao Yan <[email protected]>
19 people authored Oct 28, 2024
1 parent 558336a commit 5b78782
Showing 28 changed files with 1,680 additions and 31 deletions.
2 changes: 1 addition & 1 deletion sdk/evaluation/azure-ai-evaluation/CHANGELOG.md
@@ -1,6 +1,5 @@
# Release History


## 1.0.0b5 (Unreleased)

### Features Added
@@ -23,6 +22,7 @@ outputs = asyncio.run(custom_simulator(
max_conversation_turns=1,
))
```
- Adding evaluator for multimodal use cases

### Breaking Changes
- Renamed environment variable `PF_EVALS_BATCH_USE_ASYNC` to `AI_EVALS_BATCH_USE_ASYNC`.
2 changes: 1 addition & 1 deletion sdk/evaluation/azure-ai-evaluation/assets.json
@@ -2,5 +2,5 @@
"AssetsRepo": "Azure/azure-sdk-assets",
"AssetsRepoPrefixPath": "python",
"TagPrefix": "python/evaluation/azure-ai-evaluation",
"Tag": "python/evaluation/azure-ai-evaluation_f0444ef220"
"Tag": "python/evaluation/azure-ai-evaluation_eb4989f81d"
}
14 changes: 14 additions & 0 deletions sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/__init__.py
@@ -12,6 +12,14 @@
SexualEvaluator,
ViolenceEvaluator,
)
from ._evaluators._multimodal._content_safety_multimodal import (
ContentSafetyMultimodalEvaluator,
HateUnfairnessMultimodalEvaluator,
SelfHarmMultimodalEvaluator,
SexualMultimodalEvaluator,
ViolenceMultimodalEvaluator,
)
from ._evaluators._multimodal._protected_material import ProtectedMaterialMultimodalEvaluator
from ._evaluators._f1_score import F1ScoreEvaluator
from ._evaluators._fluency import FluencyEvaluator
from ._evaluators._gleu import GleuScoreEvaluator
@@ -65,4 +73,10 @@
"Conversation",
"Message",
"EvaluationResult",
"ContentSafetyMultimodalEvaluator",
"HateUnfairnessMultimodalEvaluator",
"SelfHarmMultimodalEvaluator",
"SexualMultimodalEvaluator",
"ViolenceMultimodalEvaluator",
"ProtectedMaterialMultimodalEvaluator",
]
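The newly exported multimodal evaluators are used like the existing content safety evaluators: construct with a credential and an Azure AI project scope, then call with a conversation that contains image content. A minimal usage sketch follows; the project values and image URL are placeholders, and the keyword-argument call form and project scope field names are assumptions based on the existing evaluators.

```python
# Hypothetical usage sketch for the newly exported multimodal evaluators.
from azure.identity import DefaultAzureCredential
from azure.ai.evaluation import ContentSafetyMultimodalEvaluator

# Placeholder project scope; field names assumed from the AzureAIProject configuration.
azure_ai_project = {
    "subscription_id": "<subscription-id>",
    "resource_group_name": "<resource-group>",
    "project_name": "<project-name>",
}

evaluator = ContentSafetyMultimodalEvaluator(
    credential=DefaultAzureCredential(),
    azure_ai_project=azure_ai_project,
)

# Conversation shape matching the validate_conversation checks added later in this diff:
# a "messages" list with user/assistant/system roles and at least one image_url item.
conversation = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        },
        {"role": "assistant", "content": "The image shows a sunny beach with palm trees."},
    ]
}

result = evaluator(conversation=conversation)
```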
Additional changed file:
@@ -13,10 +13,12 @@

import jwt

from promptflow.core._errors import MissingRequiredPackage
from azure.ai.evaluation._exceptions import ErrorBlame, ErrorCategory, ErrorTarget, EvaluationException
from azure.ai.evaluation._http_utils import AsyncHttpPipeline, get_async_http_client
from azure.ai.evaluation._model_configurations import AzureAIProject
from azure.core.credentials import TokenCredential
from azure.core.exceptions import HttpResponseError
from azure.core.pipeline.policies import AsyncRetryPolicy

from .constants import (
@@ -26,7 +28,7 @@
Tasks,
_InternalEvaluationMetrics,
)
from .utils import get_harm_severity_level
from .utils import get_harm_severity_level, retrieve_content_type

try:
version = importlib.metadata.version("azure-ai-evaluation")
@@ -281,7 +283,11 @@ def _parse_content_harm_response(
if key == EvaluationMetrics.HATE_FAIRNESS:
key = EvaluationMetrics.HATE_UNFAIRNESS

result: Dict[str, Union[str, float]] = {key: math.nan, key + "_score": math.nan, key + "_reason": ""}
result: Dict[str, Union[str, float]] = {
(key.value if hasattr(key, "value") else key): math.nan,
f"{(key.value if hasattr(key, 'value') else key)}_score": math.nan,
f"{(key.value if hasattr(key, 'value') else key)}_reason": math.nan,
}

response = batch_response[0]
if metric_name not in response:
@@ -330,6 +336,8 @@ def _parse_content_harm_response(
reason = ""

harm_score = metric_value
if metric_value == "n/a":
return result
if not math.isnan(metric_value):
# int(math.nan) causes a value error, and math.nan is already handled
# by get_harm_severity_level
@@ -465,3 +473,109 @@ async def evaluate_with_rai_service(
result = parse_response(annotation_response, metric_name, metric_display_name)

return result


def generate_payload_multimodal(content_type: str, messages, metric: str) -> Dict:
"""Generate the payload for the annotation request
:param content_type: The type of the content representing multimodal or images.
:type content_type: str
:param messages: The normalized list of messages to be entered as the "Contents" in the payload.
    :type messages: list
:param metric: The evaluation metric to use. This determines the task type, and whether a "MetricList" is needed
in the payload.
:type metric: str
:return: The payload for the annotation request.
:rtype: Dict
"""
include_metric = True
task = Tasks.CONTENT_HARM
if metric == EvaluationMetrics.PROTECTED_MATERIAL:
task = Tasks.PROTECTED_MATERIAL
include_metric = False

if include_metric:
return {
"ContentType": content_type,
"Contents": [{"messages": messages}],
"AnnotationTask": task,
"MetricList": [metric],
}
return {
"ContentType": content_type,
"Contents": [{"messages": messages}],
"AnnotationTask": task,
}


async def submit_multimodal_request(messages, metric: str, rai_svc_url: str, token: str) -> str:
"""Submit request to Responsible AI service for evaluation and return operation ID
:param messages: The normalized list of messages to be entered as the "Contents" in the payload.
    :type messages: list
:param metric: The evaluation metric to use.
:type metric: str
:param rai_svc_url: The Responsible AI service URL.
:type rai_svc_url: str
:param token: The Azure authentication token.
:type token: str
:return: The operation ID.
:rtype: str
"""
    ## Handle both plain dict messages and strongly typed messages from the azure-ai-inference SDK
if len(messages) > 0 and not isinstance(messages[0], dict):
try:
from azure.ai.inference.models import ChatRequestMessage
except ImportError as ex:
error_message = (
"Please install 'azure-ai-inference' package to use SystemMessage, UserMessage, AssistantMessage"
)
raise MissingRequiredPackage(message=error_message) from ex
if len(messages) > 0 and isinstance(messages[0], ChatRequestMessage):
messages = [message.as_dict() for message in messages]

filtered_messages = [message for message in messages if message["role"] != "system"]
assistant_messages = [message for message in messages if message["role"] == "assistant"]
content_type = retrieve_content_type(assistant_messages, metric)
payload = generate_payload_multimodal(content_type, filtered_messages, metric)

## calling rai service for annotation
url = rai_svc_url + "/submitannotation"
headers = get_common_headers(token)
async with get_async_http_client() as client:
response = await client.post( # pylint: disable=too-many-function-args,unexpected-keyword-arg
url, json=payload, headers=headers
)
if response.status_code != 202:
raise HttpResponseError(
message=f"Received unexpected HTTP status: {response.status_code} {response.text()}", response=response
)
result = response.json()
operation_id = result["location"].split("/")[-1]
return operation_id


async def evaluate_with_rai_service_multimodal(
messages, metric_name: str, project_scope: AzureAIProject, credential: TokenCredential
):
""" "Evaluate the content safety of the response using Responsible AI service
:param messages: The normalized list of messages.
    :type messages: list
:param metric_name: The evaluation metric to use.
:type metric_name: str
:param project_scope: The Azure AI project scope details.
:type project_scope: Dict
:param credential: The Azure authentication credential.
:type credential:
~azure.core.credentials.TokenCredential
:return: The parsed annotation result.
:rtype: List[List[Dict]]
"""

# Get RAI service URL from discovery service and check service availability
token = await fetch_or_reuse_token(credential)
rai_svc_url = await get_rai_svc_url(project_scope, token)
await ensure_service_availability(rai_svc_url, token, Tasks.CONTENT_HARM)
# Submit annotation request and fetch result
operation_id = await submit_multimodal_request(messages, metric_name, rai_svc_url, token)
annotation_response = cast(List[Dict], await fetch_result(operation_id, rai_svc_url, credential, token))
result = parse_response(annotation_response, metric_name)
return result
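For reference, the two payload shapes produced by generate_payload_multimodal above differ only in the annotation task and the presence of MetricList. An illustrative sketch with placeholder message content; the literal task strings are assumptions for the Tasks constants.

```python
# Illustrative payload shapes from generate_payload_multimodal (placeholder message content).
messages = [
    {"role": "user", "content": [{"type": "image_url", "image_url": {"url": "images/example.jpg"}}]},
    {"role": "assistant", "content": "A description of the image."},
]

# Content harm metrics (e.g. "violence") keep the metric in MetricList:
harm_payload = {
    "ContentType": "image",
    "Contents": [{"messages": messages}],
    "AnnotationTask": "content harm",        # Tasks.CONTENT_HARM; literal value assumed
    "MetricList": ["violence"],
}

# The protected_material metric switches the task and drops MetricList:
protected_material_payload = {
    "ContentType": "image",
    "Contents": [{"messages": messages}],
    "AnnotationTask": "protected material",  # Tasks.PROTECTED_MATERIAL; literal value assumed
}
```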
Additional changed file:
@@ -9,9 +9,9 @@

import nltk
from typing_extensions import NotRequired, Required, TypeGuard

from promptflow.core._errors import MissingRequiredPackage
from azure.ai.evaluation._constants import AZURE_OPENAI_TYPE, OPENAI_TYPE
from azure.ai.evaluation._exceptions import ErrorBlame, ErrorCategory, EvaluationException
from azure.ai.evaluation._exceptions import ErrorBlame, ErrorCategory, ErrorTarget, EvaluationException
from azure.ai.evaluation._model_configurations import (
AzureAIProject,
AzureOpenAIModelConfiguration,
@@ -312,3 +312,100 @@ def remove_optional_singletons(eval_class, singletons):
if param in singletons:
del required_singletons[param]
return required_singletons


def retrieve_content_type(assistant_messages: List, metric: str) -> str:
"""Get the content type for service payload.
:param assistant_messages: The list of messages to be annotated by evaluation service
:type assistant_messages: list
:param metric: A string representing the metric type
:type metric: str
    :return: A string representing the content type, for example 'text' or 'image'
:rtype: str
"""
# Check if metric is "protected_material"
if metric == "protected_material":
return "image"

# Iterate through each message
for item in assistant_messages:
# Ensure "content" exists in the message and is iterable
content = item.get("content", [])
for message in content:
if message.get("type", "") == "image_url":
return "image"
# Default return if no image was found
return "text"


def validate_conversation(conversation):
def raise_exception(msg, target):
raise EvaluationException(
message=msg,
internal_message=msg,
target=target,
category=ErrorCategory.INVALID_VALUE,
blame=ErrorBlame.USER_ERROR,
)

if not conversation or "messages" not in conversation:
raise_exception(
"Attribute 'messages' is missing in the request",
ErrorTarget.CONTENT_SAFETY_CHAT_EVALUATOR,
)
messages = conversation["messages"]
if not isinstance(messages, list):
raise_exception(
"'messages' parameter must be a JSON-compatible list of chat messages",
ErrorTarget.CONTENT_SAFETY_MULTIMODAL_EVALUATOR,
)
expected_roles = {"user", "assistant", "system"}
image_found = False
for num, message in enumerate(messages, 1):
if not isinstance(message, dict):
try:
from azure.ai.inference.models import (
ChatRequestMessage,
UserMessage,
AssistantMessage,
SystemMessage,
ImageContentItem,
)
except ImportError as ex:
raise MissingRequiredPackage(
message="Please install 'azure-ai-inference' package to use SystemMessage, AssistantMessage"
) from ex

if isinstance(messages[0], ChatRequestMessage) and not isinstance(
message, (UserMessage, AssistantMessage, SystemMessage)
):
raise_exception(
f"Messages must be a strongly typed class of ChatRequestMessage. Message number: {num}",
ErrorTarget.CONTENT_SAFETY_MULTIMODAL_EVALUATOR,
)

if isinstance(message.content, list) and any(
isinstance(item, ImageContentItem) for item in message.content
):
image_found = True
continue
if message.get("role") not in expected_roles:
raise_exception(
f"Invalid role provided: {message.get('role')}. Message number: {num}",
ErrorTarget.CONTENT_SAFETY_MULTIMODAL_EVALUATOR,
)
content = message.get("content")
if not isinstance(content, (str, list)):
raise_exception(
f"Content in each turn must be a string or array. Message number: {num}",
ErrorTarget.CONTENT_SAFETY_MULTIMODAL_EVALUATOR,
)
if isinstance(content, list):
if any(item.get("type") == "image_url" and "url" in item.get("image_url", {}) for item in content):
image_found = True
if not image_found:
raise_exception(
"Message needs to have multi-modal input like images.",
ErrorTarget.CONTENT_SAFETY_MULTIMODAL_EVALUATOR,
)
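To illustrate the helpers above: retrieve_content_type returns "image" whenever the metric is protected_material or an assistant message carries an image_url item, and "text" otherwise, while validate_conversation rejects conversations without multi-modal content. A small sketch with hypothetical message data, assuming the helpers defined above are in scope.

```python
# Hypothetical inputs for the helpers defined above (assumed to be in scope).
assistant_with_image = [
    {
        "role": "assistant",
        "content": [
            {"type": "text", "text": "Here is the generated picture."},
            {"type": "image_url", "image_url": {"url": "images/output.jpg"}},
        ],
    }
]
assistant_text_only = [{"role": "assistant", "content": [{"type": "text", "text": "Plain text answer."}]}]

retrieve_content_type(assistant_with_image, metric="violence")   # "image": image_url item found
retrieve_content_type(assistant_text_only, metric="violence")    # "text": no image content
retrieve_content_type([], metric="protected_material")           # "image": forced for this metric

# validate_conversation requires a "messages" list with known roles and at least one image item.
validate_conversation({
    "messages": [
        {"role": "user", "content": [{"type": "image_url", "image_url": {"url": "images/input.jpg"}}]},
        {"role": "assistant", "content": [{"type": "text", "text": "Looks like a landscape photo."}]},
    ]
})  # passes; a text-only conversation would raise EvaluationException
```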
Additional changed file:
@@ -8,6 +8,8 @@
import tempfile
from pathlib import Path
from typing import Any, Dict, NamedTuple, Optional, Tuple, Union
import uuid
import base64

import pandas as pd
from promptflow.client import PFClient
@@ -81,6 +83,33 @@ def _azure_pf_client_and_triad(trace_destination) -> Tuple[PFClient, AzureMLWork
return azure_pf_client, ws_triad


def _store_multimodal_content(messages, tmpdir: str):
# verify if images folder exists
images_folder_path = os.path.join(tmpdir, "images")
os.makedirs(images_folder_path, exist_ok=True)

# traverse all messages and replace base64 image data with new file name.
for message in messages:
for content in message.get("content", []):
if content.get("type") == "image_url":
image_url = content.get("image_url")
if image_url and "url" in image_url and image_url["url"].startswith("data:image/jpg;base64,"):
# Extract the base64 string
base64image = image_url["url"].replace("data:image/jpg;base64,", "")

# Generate a unique filename
image_file_name = f"{str(uuid.uuid4())}.jpg"
image_url["url"] = f"images/{image_file_name}" # Replace the base64 URL with the file path

# Decode the base64 string to binary image data
image_data_binary = base64.b64decode(base64image)

# Write the binary image data to the file
image_file_path = os.path.join(images_folder_path, image_file_name)
with open(image_file_path, "wb") as f:
f.write(image_data_binary)


def _log_metrics_and_instance_results(
metrics: Dict[str, Any],
instance_results: pd.DataFrame,
@@ -110,6 +139,15 @@ def _log_metrics_and_instance_results(
artifact_name = EvalRun.EVALUATION_ARTIFACT if run else EvalRun.EVALUATION_ARTIFACT_DUMMY_RUN

with tempfile.TemporaryDirectory() as tmpdir:
# storing multi_modal images if exists
col_name = "inputs.conversation"
if col_name in instance_results.columns:
for item in instance_results[col_name].items():
value = item[1]
if "messages" in value:
_store_multimodal_content(value["messages"], tmpdir)

# storing artifact result
tmp_path = os.path.join(tmpdir, artifact_name)

with open(tmp_path, "w", encoding=DefaultOpenEncoding.WRITE) as f:
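The helper above rewrites inline base64 images into files under an images/ subfolder of the artifact directory before results are logged. A before/after sketch with placeholder data:

```python
# Before: the logged conversation carries the image inline as a base64 data URL (placeholder data).
message = {
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": "data:image/jpg;base64,/9j/4AAQSkZJRg..."}},
    ],
}

# After _store_multimodal_content([message], tmpdir), the base64 payload has been decoded and
# written to <tmpdir>/images/<uuid>.jpg, and the message references that file instead:
#   {"type": "image_url", "image_url": {"url": "images/3f2b8c1e-5a7d-4c2e-9b1f-0d6e8a4c7b21.jpg"}}
```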
Additional changed file:
@@ -99,10 +99,10 @@ def __init__(
self._eval_last_turn = eval_last_turn
self._parallel = parallel
self._evaluators: List[Callable[..., Dict[str, Union[str, float]]]] = [
ViolenceEvaluator(azure_ai_project, credential),
SexualEvaluator(azure_ai_project, credential),
SelfHarmEvaluator(azure_ai_project, credential),
HateUnfairnessEvaluator(azure_ai_project, credential),
ViolenceEvaluator(credential, azure_ai_project),
SexualEvaluator(credential, azure_ai_project),
SelfHarmEvaluator(credential, azure_ai_project),
HateUnfairnessEvaluator(credential, azure_ai_project),
]

def __call__(self, *, conversation: list, **kwargs):
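The reordered constructor calls above follow the change, noted in the commit message, that makes the credential a required first argument for the content safety evaluators. A minimal sketch of the updated call pattern; the project values are placeholders and the query/response call form is an assumption.

```python
# Credential now comes first when constructing the content safety evaluators.
from azure.identity import DefaultAzureCredential
from azure.ai.evaluation import ViolenceEvaluator

azure_ai_project = {
    "subscription_id": "<subscription-id>",
    "resource_group_name": "<resource-group>",
    "project_name": "<project-name>",
}

violence_eval = ViolenceEvaluator(DefaultAzureCredential(), azure_ai_project)
result = violence_eval(
    query="What is the capital of France?",
    response="Paris is the capital of France.",
)
```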