Prompt_security #920

lior-ps · 2025-01-04T13:40:53Z

Description

Prompt Security is a startup specializing in security services for LLMs and generative AI. By adding guardrails, we can protect prompts and responses against a wide variety of risks, such as prompt injection, jailbreaks, sensitive data disclosure, and inappropriate content.

Related Issue(s)

None

Checklist

[v] I've read the CONTRIBUTING guidelines.
[v] I've updated the documentation if applicable.
[v] I've added tests if applicable.
[v] @mentions of the person or team responsible for reviewing proposed changes.

.gitignore

docs/user-guides/community/prompt-security.md

docs/user-guides/guardrails-library.md

nemoguardrails/library/prompt_security/actions.py

Pouyanpi

Thank you @lior-ps for the PR. My review is not completed yet, but please feel free to have a look at the comments.

I think my recent comment here also applies to this PR. It'd be great to have the possibility to run live tests.

nemoguardrails/library/prompt_security/actions.py

nemoguardrails/library/prompt_security/flows.co

Pouyanpi · 2025-01-13T10:33:05Z

One more piece of feedback:

As long as we support Colang 1.0, one should not use a single flow name for both input and output, which is what you are doing (e.g., protect prompt and protect response).

Currently, when both input and output rails are activated and the interaction is multi-round, both the user and bot messages might be available in a context variable in subsequent rounds (one can argue that this is a bug). So passing them explicitly in the action definition is the appropriate way to do it.

I will highlight the code lines that need this change.
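To illustrate the point about multi-round context, here is a minimal, library-free sketch. The function names, the parameter names `user_prompt` and `bot_response`, and the context keys are assumptions made for illustration; they are not taken from the PR's actual `actions.py`.

```python
import asyncio
from typing import Optional


async def protect_text_from_context(context: dict) -> str:
    """Fragile variant: guess which message to check from the shared context."""
    # In a later round both keys can be populated, so an input rail may
    # end up checking the previous bot message instead of the new user turn.
    return context.get("bot_message") or context.get("user_message") or ""


async def protect_text_explicit(
    user_prompt: Optional[str] = None,
    bot_response: Optional[str] = None,
) -> str:
    """Explicit variant: the calling flow states which text it wants checked."""
    if (user_prompt is None) == (bot_response is None):
        raise ValueError("pass exactly one of user_prompt or bot_response")
    return user_prompt if user_prompt is not None else bot_response


# Hypothetical second round: the context still holds the first bot reply.
context = {"user_message": "second user turn", "bot_message": "first bot reply"}

checked_implicit = asyncio.run(protect_text_from_context(context))
checked_explicit = asyncio.run(
    protect_text_explicit(user_prompt=context["user_message"])
)
```

With the explicit signature, the input flow and the output flow each pass their own message, so stale values left in the context cannot change which text is checked.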

nemoguardrails/library/prompt_security/actions.py

nemoguardrails/library/prompt_security/flows.v1.co

Pouyanpi

Applied suggestion in this comment.

The Colang 2.0 flows need changes, but I will provide the code.

docs/user-guides/community/prompt-security.md

Pouyanpi · 2025-01-20T10:01:59Z

Thank you @lior-ps, it looks great. I just tried to run the test without the mocks and am facing some issues.

Would you please have a look?

For example, once I comment out the relevant lines of test_prompt_security_protection_input:

import pytest

from nemoguardrails import RailsConfig
from tests.utils import TestChat


@pytest.mark.unit
def test_prompt_security_protection_input():
    config = RailsConfig.from_content(
        yaml_content="""
            models: []
            rails:
              input:
                flows:
                  - protect prompt
        """,
        colang_content="""
            define user express greeting
              "hi"

            define flow
              user express greeting
              bot express greeting

            define bot inform answer unknown
              "I can't answer that."
        """,
    )

    chat = TestChat(
        config,
        llm_completions=[
            "  express greeting",
            '  "Hi! My name is John as well."',
        ],
    )

    # chat.app.register_action(retrieve_relevant_chunks, "retrieve_relevant_chunks")
    # chat.app.register_action(mock_protect_text(True), "protect_text")
    chat >> "Hi! I am Mr. John! And my email is [email protected]"
    chat << "I can't answer that."

I get "Hi! My name is John as well." Ideally, we should mock the actual behavior.
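A mock along the following lines could stand in for the real action. This is only a sketch: the name `mock_protect_text` comes from the commented-out line in the test above, but the keyword parameters and the shape of the returned dictionary (`is_blocked`, `is_modified`, `modified_text`) are assumptions, not the PR's confirmed contract.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Optional


def mock_protect_text(is_blocked: bool) -> Callable[..., Awaitable[Dict[str, Any]]]:
    """Build a stand-in for the protect_text action that returns a fixed verdict."""

    async def _protect_text(
        user_prompt: Optional[str] = None,
        bot_response: Optional[str] = None,
        **kwargs: Any,
    ) -> Dict[str, Any]:
        # Hypothetical result shape; adjust to whatever the real action returns.
        return {
            "is_blocked": is_blocked,
            "is_modified": False,
            "modified_text": None,
        }

    return _protect_text


# Calling the mock directly, outside of any rails configuration.
blocked = asyncio.run(mock_protect_text(True)(user_prompt="Hi! I am Mr. John!"))
allowed = asyncio.run(mock_protect_text(False)(user_prompt="hi"))
```

In the test above, such a mock would be registered with `chat.app.register_action(mock_protect_text(True), "protect_text")`, as in the commented-out line, so that the input rail blocks the message without calling the live service.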

nemoguardrails/library/prompt_security/actions.py

lior-ps · 2025-01-25T15:10:31Z

Hi @Pouyanpi, I fixed the pytest code. Can you please check again?

Pouyanpi · 2025-01-27T08:14:31Z

Thank you @lior-ps, it looks good (maybe we can add more tests later).

Would you please sign your commits and run pre-commit, per the contributing guidelines?

You can do an interactive rebase to just sign them and apply the pre-commit hooks.

lior-ps added 3 commits December 19, 2024 22:16

add prompt security integration

bd4c36e

use context and try to modify user_message or bot_message when needed

1c32fc5

option to modify user_message or bot_message

63b9339

Pouyanpi self-requested a review January 8, 2025 06:24

Pouyanpi self-assigned this Jan 8, 2025

Pouyanpi added the enhancement and status: in review labels Jan 8, 2025

Pouyanpi reviewed Jan 9, 2025

.gitignore

docs/user-guides/community/prompt-security.md

docs/user-guides/guardrails-library.md

nemoguardrails/library/prompt_security/actions.py (3 review threads)

Pouyanpi requested changes Jan 9, 2025

Pouyanpi assigned lior-ps Jan 9, 2025

add :

68b25d2

Pouyanpi reviewed Jan 13, 2025

nemoguardrails/library/prompt_security/actions.py (5 review threads)

nemoguardrails/library/prompt_security/flows.v1.co (2 review threads)

Pouyanpi requested changes Jan 13, 2025

resolve pull request comments

b4fd073

Pouyanpi reviewed Jan 20, 2025

docs/user-guides/community/prompt-security.md

nemoguardrails/library/prompt_security/actions.py

Pouyanpi added this to the v0.12.0 milestone Jan 21, 2025

lior-ps added 2 commits January 23, 2025 10:05

typo

a129e8e

fix prompt security pytest

e404927

Pouyanpi self-requested a review January 27, 2025 07:55

fix issue found by pre-commit

40eebd0
