Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prompt_security #920

Open
wants to merge 8 commits into
base: develop
Choose a base branch
from
Open

Prompt_security #920

wants to merge 8 commits into from

Conversation

lior-ps
Copy link

@lior-ps lior-ps commented Jan 4, 2025

Description

Prompt Security is a startup specializing in security services for LLMs and generative AI. We can protect prompts and responses a wide variaty of risks like prompt injection, jailbreak, sensitive data disclosure and inappropriate content by adding guardrails.

Related Issue(s)

None

Checklist

  • [v] I've read the CONTRIBUTING guidelines.
  • [v] I've updated the documentation if applicable.
  • [v] I've added tests if applicable.
  • [v] @mentions of the person or team responsible for reviewing proposed changes.

@Pouyanpi Pouyanpi self-requested a review January 8, 2025 06:24
@Pouyanpi Pouyanpi self-assigned this Jan 8, 2025
@Pouyanpi Pouyanpi added enhancement New feature or request status: in review labels Jan 8, 2025
.gitignore Outdated Show resolved Hide resolved
Copy link
Collaborator

@Pouyanpi Pouyanpi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @lior-ps for the PR. My review is not completed yet, but please feel free to have a look at the comments.

I think my recent comment here also applies to this PR. It'd be great to have the possibility to run live tests.

nemoguardrails/library/prompt_security/actions.py Outdated Show resolved Hide resolved
nemoguardrails/library/prompt_security/actions.py Outdated Show resolved Hide resolved
nemoguardrails/library/prompt_security/actions.py Outdated Show resolved Hide resolved
nemoguardrails/library/prompt_security/flows.co Outdated Show resolved Hide resolved
@Pouyanpi
Copy link
Collaborator

Another feedback:

As long as we have Colang 1.0 one should not use one flow name for both input and output which you are doing (e.g., protect prompt and protect response)

Currently when both input and output rails are activated when the interaction is multi round in the subsequent rounds of interaction both user and bot messages might be available in a context variable (one can argue that this is a bug). So passing them explicitly in action definition is the appropriate way to do it.

I will highlight the code line that need this change.

Copy link
Collaborator

@Pouyanpi Pouyanpi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied suggestion in this comment.

Colang 2.0 flows need change but I will provide the code.

@Pouyanpi
Copy link
Collaborator

thank you @lior-ps it looks great, just tried to run the test without the mocks and am facing some issues.

Would you please have a look?

For example once I comment relevant lines of test_prompt_secuirty_protection_input:

@pytest.mark.unit
def test_prompt_security_protection_input():
    config = RailsConfig.from_content(
        yaml_content="""
            models: []
            rails:
              input:
                flows:
                  - protect prompt
        """,
        colang_content="""
            define user express greeting
              "hi"

            define flow
              user express greeting
              bot express greeting

            define bot inform answer unknown
              "I can't answer that."
        """,
    )

    chat = TestChat(
        config,
        llm_completions=[
            "  express greeting",
            '  "Hi! My name is John as well."',
        ],
    )

    # chat.app.register_action(retrieve_relevant_chunks, "retrieve_relevant_chunks")
    # chat.app.register_action(mock_protect_text(True), "protect_text")
    chat >> "Hi! I am Mr. John! And my email is [email protected]"
    chat << "I can't answer that."

I get Hi! My name is John as well., ideally we should mock the actual behavior.

@Pouyanpi Pouyanpi added this to the v0.12.0 milestone Jan 21, 2025
@lior-ps
Copy link
Author

lior-ps commented Jan 25, 2025

Hi @Pouyanp, I fixed the pytest code, can you please check again?

@Pouyanpi Pouyanpi self-requested a review January 27, 2025 07:55
@Pouyanpi
Copy link
Collaborator

Thank you @lior-ps , It looks good (maybe we can add more tests later)

would you please just sign your commits and run pre-commit per contributing guidelines?

You can do an interactive rebase to jus sign them and apply pre-commit hooks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request status: in review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants