Update action to support output rail #895

Open
wants to merge 2 commits into
base: develop
4 changes: 2 additions & 2 deletions docs/user-guides/community/active-fence.md
@@ -1,6 +1,6 @@
# ActiveFence Integration

-NeMo Guardrails supports using the [ActiveFence ActiveScore API](https://docs.activefence.com/index.html) as an input rail out-of-the-box (you need to have the `ACTIVEFENCE_API_KEY` environment variable set).
+NeMo Guardrails supports using the [ActiveFence ActiveScore API](https://docs.activefence.com/index.html) as an input and output rail out-of-the-box (you need to have the `ACTIVEFENCE_API_KEY` environment variable set).

```yaml
rails:
@@ -13,7 +13,7 @@ rails:
# - activefence moderation detailed
```

-The `activefence moderation` flow uses the maximum risk score with a 0.85 threshold to decide if the input should be allowed or not (i.e., if the risk score is above the threshold, it is considered a violation). The `activefence moderation detailed` flow has individual scores per category of violation.
+The `activefence moderation` flow uses the maximum risk score with a 0.85 threshold to decide if the text should be allowed or not (i.e., if the risk score is above the threshold, it is considered a violation). The `activefence moderation detailed` flow has individual scores per category of violation.

To customize the scores, you have to overwrite the [default flows](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/library/activefence/flows.co) in your config. For example, to change the threshold for `activefence moderation` you can add the following flow to your config:

5 changes: 4 additions & 1 deletion docs/user-guides/guardrails-library.md
@@ -611,7 +611,7 @@ This category of rails relies on 3rd party APIs for various guardrailing tasks.

### ActiveFence

-NeMo Guardrails supports using the [ActiveFence ActiveScore API](https://docs.activefence.com/index.html) as an input rail out-of-the-box (you need to have the `ACTIVEFENCE_API_KEY` environment variable set).
+NeMo Guardrails supports using the [ActiveFence ActiveScore API](https://docs.activefence.com/index.html) as an input and output rail out-of-the-box (you need to have the `ACTIVEFENCE_API_KEY` environment variable set).

#### Example usage

@@ -620,6 +620,9 @@ rails:
   input:
     flows:
       - activefence moderation
+  output:
+    flows:
+      - activefence moderation
```

For more details, check out the [ActiveFence Integration](./community/active-fence.md) page.
7 changes: 5 additions & 2 deletions nemoguardrails/library/activefence/actions.py
@@ -32,12 +32,15 @@ async def call_activefence_api(context: Optional[dict] = None):
     if api_key is None:
         raise ValueError("ACTIVEFENCE_API_KEY environment variable not set.")

-    user_message = context.get("user_message")
+    if context.get("triggered_input_rail"):
+        text = context["user_message"]
+    else:
+        text = context["bot_message"]

     url = "https://apis.activefence.com/sync/v3/content/text"
     headers = {"af-api-key": api_key, "af-source": "nemo-guardrails"}
     data = {
-        "text": user_message,
+        "text": text,
         "content_id": "ng-" + new_uuid(),
     }
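
The branch in this hunk decides which message is sent to the API. As a rough illustration only, the same selection can be written as a standalone helper. The name `select_text_to_moderate` is hypothetical and not part of the PR; the context keys are the ones used in the diff, and the sketch assumes `triggered_input_rail` is set only while an input rail is running, as the diff implies. Unlike the diff, it tolerates a missing context or missing keys by returning `None`.

```python
from typing import Optional


def select_text_to_moderate(context: Optional[dict]) -> Optional[str]:
    """Pick the message a moderation rail should check (hypothetical helper)."""
    # Treat a missing context as empty instead of failing on None.
    context = context or {}

    # Assumption: a truthy "triggered_input_rail" means an input rail is running,
    # so the user's message is the text to check.
    if context.get("triggered_input_rail"):
        return context.get("user_message")

    # Otherwise assume an output rail is running and check the bot's message.
    return context.get("bot_message")
```

Returning `None` rather than raising `KeyError` is a design choice of this sketch; the caller would still need to decide what to do when there is no text to check.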

@@ -19,7 +19,7 @@
 from tests.utils import TestChat


-def test_1(monkeypatch):
+def test_input(monkeypatch):
     monkeypatch.setenv("ACTIVEFENCE_API_KEY", "xxx")

     config = RailsConfig.from_content(
@@ -88,3 +88,47 @@ def test_1(monkeypatch):

         chat >> "you are stupid!"
         chat << "I'm sorry, I can't respond to that."


+def test_output(monkeypatch):
+    monkeypatch.setenv("ACTIVEFENCE_API_KEY", "xxx")
+
+    config = RailsConfig.from_content(
+        yaml_content="""
+            models:
+              - type: main
+                engine: openai
+                model: gpt-3.5-turbo-instruct
+
+            rails:
+              output:
+                flows:
+                  - activefence moderation
+        """,
+    )
+    chat = TestChat(
+        config,
+        llm_completions=[
+            " You are stupid!",
+        ],
+    )
+
+    with aioresponses() as m:
+        m.post(
+            "https://apis.activefence.com/sync/v3/content/text",
+            payload={
+                "response_id": "36f76a43-ddbe-4308-bc86-1a2b068a00ea",
+                "entity_id": "59fe8fe0-5036-494f-970c-8e28305a3716",
+                "entity_type": "content",
+                "violations": [
+                    {
+                        "violation_type": "abusive_or_harmful.profanity",
+                        "risk_score": 0.95,
+                    }
+                ],
+                "errors": [],
+            },
+        )
+
+        chat >> "Hello!"
+        chat << "I'm sorry, I can't respond to that."
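
The mocked response in `test_output` carries a single violation with a `risk_score` of 0.95, which is above the 0.85 threshold described in the docs, so the bot message is expected to be blocked. The sketch below illustrates that decision in plain Python. The names `max_risk_score` and `is_violation` are hypothetical and are not taken from the library; the response shape is assumed from the mocked payload, and the real check lives in the library's flows, which this page does not show.

```python
# Threshold quoted in the ActiveFence integration docs.
THRESHOLD = 0.85


def max_risk_score(response: dict) -> float:
    """Return the highest risk score among the reported violations (0.0 if none)."""
    violations = response.get("violations", [])
    return max((v["risk_score"] for v in violations), default=0.0)


def is_violation(response: dict, threshold: float = THRESHOLD) -> bool:
    """A text is treated as a violation when the max score is above the threshold."""
    return max_risk_score(response) > threshold


# Same shape as the payload mocked in the test.
mocked = {
    "violations": [
        {"violation_type": "abusive_or_harmful.profanity", "risk_score": 0.95}
    ],
    "errors": [],
}
blocked = is_violation(mocked)  # True for this payload, since 0.95 > 0.85
```

The comparison is strict because the docs say a score "above the threshold" counts as a violation, so a score of exactly 0.85 would pass in this sketch.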