Skip to content
This repository was archived by the owner on May 16, 2025. It is now read-only.
This repository was archived by the owner on May 16, 2025. It is now read-only.

Proposal: Add Puppetry Policy Detector as Optional Heuristic Module #118

@metawake

Description

@metawake

Hello Rebuff team,
I have developed a small open-source tool called Puppetry Detector. It detects policy puppetry and prompt injection attempts in LLM prompts using regular expressions. The tool is modular and already includes integration with Rebuff, so it can be adapted as an optional heuristic module.
If somebody gives it a look an confirms possibility of integration, I'd be happy.

I would be happy to prepare a pull request to add this as an optional feature, if you think it could be useful. Please let me know if you are open to this idea or if you have any suggestions before I make a PR.
Thank you for your time and for your great work on Rebuff!

The tool itself is here: https://github.com/metawake/puppetry-detector

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions