jawad111
Contributor

Motivation:
Some users only need to detect malicious or disallowed HTML, not sanitize it. A validation function lets an application reject the input, abort processing, or take corrective action when it encounters unsafe HTML, without modifying the content.

Use Cases:

  1. Prevent Malicious Submissions:
    Detect malicious content in user-generated inputs, such as comment fields or form submissions, and reject the input early without modifying the HTML.

  2. Abort Application Flow:
    Halt the execution of specific workflows (e.g., data processing or rendering) if unsafe HTML is detected, ensuring that the application does not proceed with invalid data.

  3. Custom Security Workflows:
    Integrate with security pipelines to log, monitor, or analyze the occurrence of unsafe HTML without sanitizing or altering the input.

  4. Audit User Content:
    Validate HTML against custom policies for compliance audits without altering the original content, useful for applications dealing with regulatory constraints or collaborative platforms.

Summary of Changes:

  • Introduced containsDisallowedContent, a validation function to detect prohibited HTML tags, attributes, or links.
  • Allows developers to define custom rules for id, class, and attribute handling.
  • Empowers applications to proactively handle invalid content without performing sanitization.

This feature extends the library's utility by providing a lightweight, focused mechanism for HTML validation.
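
To illustrate the detect-without-sanitizing idea described above, here is a minimal, language-agnostic sketch in Python using only the standard library. It is not the code from this PR or the package's API: the allowlists, the `contains_disallowed_content` name, and the link-scheme check are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical allowlists, for illustration only; not the package's real rules.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "a", "ul", "ol", "li", "br"}
ALLOWED_ATTRS = {"a": {"href"}}
# Assumption: only absolute links with these schemes are allowed, so
# relative links are reported as violations in this sketch.
ALLOWED_SCHEMES = ("http://", "https://", "mailto:")


class _Checker(HTMLParser):
    """Records violations while parsing; never modifies the input."""

    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag not in ALLOWED_TAGS:
            self.violations.append(f"tag <{tag}>")
            return
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()):
                self.violations.append(f"attribute {name} on <{tag}>")
            elif name == "href":
                link = (value or "").strip().lower()
                if not link.startswith(ALLOWED_SCHEMES):
                    self.violations.append(f"link {value!r}")


def contains_disallowed_content(html):
    """Return True if the HTML uses a tag, attribute, or link outside the allowlists."""
    checker = _Checker()
    checker.feed(html)
    checker.close()
    return bool(checker.violations)
```

A caller could then reject a comment or form submission as soon as `contains_disallowed_content(user_html)` returns `True`, leaving the original string untouched for logging or auditing.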

@jonasfj
Member

jonasfj commented Feb 4, 2025

At the moment this package is aimed at aligning with GitHub's GFM sanitization rules:
https://github.com/jch/html-pipeline/blob/master/lib/html/pipeline/sanitization_filter.rb

Arguably, these have changed; I don't think GitHub uses that code anymore.

But I'm hesitant to just add features. If there are a lot of community members who would rather have an HTML sanitization package with more advanced features, I'm inclined to suggest that you write such a package, publish it, and maintain it yourself.

Feel free to fork this package and give it a new better name, ideally collaborate with others.
I see that @dab246 has suggestions in #259.

Honestly, I'd be more than happy for package:sanitize_html to have a link in the README.md telling users that if they want advanced features they should consider one of the following packages... (assuming you make a good package obviously 🤣)


On the topic of this PR, I might get around to doing a review, but again, I'm hesitant to accept more features. Maybe, if it turns out we need them elsewhere too.

@jawad111
Contributor Author

jawad111 commented Apr 8, 2025

Hi @jonasfj,

Thanks for the feedback! I'll see whether I can create a new package with more advanced features. I appreciate the suggestion, though it's definitely going to take a lot of time to make a high-quality package 😁
