    Repositories list

    • ArtPrompt

      Public
      [ACL24] Official Repo of Paper `ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs`
      Python
      MIT License
      125000 Updated Dec 9, 2024
    • magpie

      Public
      Python
      MIT License
      57000 Updated Sep 5, 2024
    • Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
      Jupyter Notebook
      MIT License
      910821 Updated Jul 19, 2024
    • CleanGen

      Public
      Official Implementation of CLEANGEN: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models
      Python
      1900 Updated Jul 5, 2024
    • ChatBug

      Public
      [AAAI25] Official Repo of Paper `ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates`
      Python
      MIT License
      0600 Updated Jun 24, 2024
    • edc

      Public
      Source Code for "EDC: Effective and Efficient Dialog Comprehension For Dialog State Tracking" (NAACL 2024)
      Python
      0010 Updated Jun 18, 2024
    • ACE

      Public
      Official Repository for ACE: A Model Poisoning Attack on Contribution Evaluation Methods in Federated Learning
      MIT License
      1100 Updated May 21, 2024