
Add setting to check catalog encoding #134


Draft
wants to merge 1 commit into
base: main


joshcooper

Summary

If RSpec.configuration.strict_catalog_encoding is set to true, check whether the catalog contains binary strings or strings with an invalid encoding. This only checks resources, not other catalog metadata like version, environment or edges.

Binary strings (aka ASCII_8BIT) commonly come from facts that call Socket.gethostname, read the Windows registry, or read only part of a file.
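As a rough illustration of what "binary" means here (plain Ruby, not code from this PR): a string forced to binary with `String#b`, or built from raw bytes with `Array#pack`, reports `Encoding::ASCII_8BIT`. The variable names and the `binary_string?` helper are made up for this example.

```ruby
# Illustrative only: the names below are hypothetical, not part of rspec-puppet.
utf8_name   = "host.example.com".encode(Encoding::UTF_8)  # ordinary text string
binary_name = utf8_name.b                                  # String#b returns an ASCII-8BIT copy
packed      = [0x68, 0x6F, 0x73, 0x74].pack("C*")          # raw bytes from pack are ASCII-8BIT

# True when the value is a String tagged with the binary encoding,
# regardless of whether its bytes happen to be printable ASCII.
def binary_string?(value)
  value.is_a?(String) && value.encoding == Encoding::ASCII_8BIT
end
```

Note that `binary_name` and `utf8_name` hold identical bytes; only the encoding tag differs, which is why a check on the tag is needed in addition to a check on the bytes.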

Strings with invalid encodings commonly occur when calling the file function to inline Kerberos keytab files or DER-encoded keys. Note that this check can't detect binary data whose bytes happen to form valid UTF-8. For example, if a string contains the 3-byte sequence 0xE2 0x82 0xAC, String#valid_encoding? will return true, since that sequence encodes the € code point (U+20AC). But if the string contains 0xC0, then valid_encoding? will return false, since 0xC0 never appears in well-formed UTF-8: it could only begin an overlong two-byte sequence.
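The two byte sequences mentioned above can be tried directly in Ruby. This is a small sketch using only core `String` and `Array` methods; the variable names are made up, and the truncated sequence is an extra case added for illustration.

```ruby
# Build strings from raw bytes, then tag them as UTF-8 without transcoding.
euro      = [0xE2, 0x82, 0xAC].pack("C*").force_encoding(Encoding::UTF_8)
lone_c0   = [0xC0].pack("C*").force_encoding(Encoding::UTF_8)
truncated = [0xE2, 0x82].pack("C*").force_encoding(Encoding::UTF_8)  # extra case: cut-off sequence

euro_ok      = euro.valid_encoding?       # bytes form U+20AC, so this passes the check
lone_c0_ok   = lone_c0.valid_encoding?    # 0xC0 is not a legal UTF-8 byte
truncated_ok = truncated.valid_encoding?  # missing the final continuation byte
```

The first case is the blind spot described above: binary data that happens to be well-formed UTF-8 is indistinguishable from text at this level.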

The setting defaults to false, because Facter has historically produced facts encoded as ASCII_8BIT, and those will be detected when running "puppet apply". To opt in, set this in your module's spec/spec_helper.rb:

RSpec.configure do |c|
  c.strict_catalog_encoding = true
end
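For readers wondering what such a check might look like, here is a minimal sketch, assuming resource parameters are nested Hashes, Arrays and Strings. `encoding_problems` is a hypothetical helper written for this illustration; it is not the implementation in this PR and not an rspec-puppet API.

```ruby
# Hypothetical helper, NOT the code in this PR: walk a nested value and
# collect a description of every binary or invalidly encoded string.
def encoding_problems(value, path = "", found = [])
  case value
  when String
    if value.encoding == Encoding::ASCII_8BIT
      found << "#{path}: binary string"
    elsif !value.valid_encoding?
      found << "#{path}: invalid #{value.encoding} byte sequence"
    end
  when Array
    value.each_with_index { |v, i| encoding_problems(v, "#{path}[#{i}]", found) }
  when Hash
    value.each do |k, v|
      encoding_problems(k, "#{path} key #{k.inspect}", found)
      encoding_problems(v, "#{path}[#{k.inspect}]", found)
    end
  end
  found
end

# Example parameters for an imaginary File resource.
params = {
  "content" => [0xC0].pack("C*").force_encoding(Encoding::UTF_8),  # invalid UTF-8
  "owner"   => "root",                                              # fine
  "tag"     => ["keytab".b],                                        # binary
}
problems = encoding_problems(params, "File[/etc/krb5.keytab]")
```

Carrying a path through the recursion is a design choice made for this sketch, so that each finding can point at the offending parameter.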

Additional Context

We can't rely on puppet-lint to check for binary data, because detecting it requires evaluating functions and inspecting the catalog that the evaluator produces.

Related Issues (if any)

Checklist

  • 🟢 Spec tests.
  • 🟢 Acceptance tests.
  • Manually verified.

@binford2k

Seems like a decent partial solution. It will catch cases in which the encoding errors occur at module test time. This can't hurt. But honestly, I'd prefer a more robust error message in the catalog compilation that gave more hints about where the invalid encoding came from.

@joshcooper
Author

> I'd prefer a more robust error message in the catalog compilation

Yeah agreed, especially because fact fixtures and stubbed functions can hide these issues. I am working on a complementary PR to do what you're talking about, including file and line info.
