TypeError: %b requires a bytes-like object, or an object that implements __bytes #630

Open
kkmuffme opened this issue Jan 13, 2025 · 4 comments

@kkmuffme

Traceback (most recent call last):
  File "C:\python311\lib\site-packages\git_filter_repo.py", line 4976, in <module>
    main()
  File "C:\python311\lib\site-packages\git_filter_repo.py", line 4973, in main
    filter.run()
  File "C:\python311\lib\site-packages\git_filter_repo.py", line 4892, in run
    self._parser.run(self._input, self._output)
  File "C:\python311\lib\site-packages\git_filter_repo.py", line 1527, in run
    self._parse_commit()
  File "C:\python311\lib\site-packages\git_filter_repo.py", line 1378, in _parse_commit
    self._commit_callback(commit, aux_info)
  File "C:\python311\lib\site-packages\git_filter_repo.py", line 4125, in _tweak_commit
    self._insert_into_stream(commit)
  File "C:\python311\lib\site-packages\git_filter_repo.py", line 4865, in _insert_into_stream
    self._parser.insert(obj)
  File "C:\python311\lib\site-packages\git_filter_repo.py", line 1505, in insert
    obj.dump(self._output)
  File "C:\python311\lib\site-packages\git_filter_repo.py", line 800, in dump
    file_.write((b'commit %s\n'
                ^^^^^^^^^^^^^^^
TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'NoneType'
fatal: stream ends early
fast-import: dumping crash report to .git/fast_import_crash_69216

For --force --refs "staging~17"..staging --email-callback '...'

@aswild

aswild commented Feb 10, 2025

I just ran into this too, looks like it's user error. The callback functions need to return byte-strings. From the man page:

One important thing to note for all callbacks is that filter-repo uses bytestrings (see https://docs.python.org/3/library/stdtypes.html#bytes) everywhere instead of strings.

In my case, I had to use --email-callback 'return b"..."' instead of --email-callback 'return "..."' and then it worked.

To the maintainers: perhaps git-filter-repo could produce a friendlier error message in cases like this?

For example, when the name/email callbacks are run in RepoFilter._tweak_commit, the code could check that the resulting values have a __bytes__ attribute, which (as far as I can tell) is what Python looks for when formatting b'%s' % (obj). That formatting is what git-filter-repo later does in Commit.dump(), as seen in OP's backtrace.

Alternatively, git-filter-repo could check if the callbacks return a str and if so, automatically run it through str.encode('utf-8') to get a bytes object. That'd be user-friendly in the simple case, but I understand if that much additional "magic" isn't desirable.
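
A minimal sketch of the behaviour discussed above, independent of git-filter-repo itself: bytes %-formatting accepts a bytes value but rejects both str and None with the TypeError from the traceback. The names format_commit_line, coerce_to_bytes and try_format are hypothetical and exist only for this illustration; coerce_to_bytes shows the "encode a str automatically" idea and is not something git-filter-repo provides.

```python
def format_commit_line(value):
    # Same style of bytes formatting as the b'commit %s\n' line in the traceback.
    return b'commit %s\n' % value


def coerce_to_bytes(value):
    """Hypothetical helper: keep bytes, encode str as UTF-8, reject anything else."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode('utf-8')
    raise TypeError(f'callback returned {type(value).__name__}, expected bytes')


def try_format(value):
    # Return the formatted line, or the error text if formatting fails.
    try:
        return format_commit_line(value)
    except TypeError as err:
        return str(err)


print(try_format(b'refs/heads/staging'))               # formats normally
print(try_format('refs/heads/staging'))                # error text, ends with: not 'str'
print(try_format(None))                                # error text, ends with: not 'NoneType'
print(try_format(coerce_to_bytes('refs/heads/staging')))  # formats after encoding
```

The sketch uses isinstance rather than hasattr(value, '__bytes__') because, as far as I know, the bytes type itself only gained a __bytes__ method in Python 3.11, so the attribute check may reject valid values on older interpreters.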

@newren
Owner

newren commented Feb 20, 2025

@kkmuffme

TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'NoneType'
[...]
For --force --refs "staging~17"..staging --email-callback '...'

You didn't specify your email callback, but I suspect you forgot to include a return statement in it. If you could specify your email callback, I could verify or look for other problems.
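
To illustrate the suspected cause: a Python function whose body has no return statement returns None, which would match the 'NoneType' in the error. The sketch below wraps a callback body in a function taking an email argument; make_callback is a hypothetical stand-in written for this example and should not be read as git-filter-repo's actual implementation.

```python
def make_callback(body):
    """Hypothetical stand-in: wrap a callback body in a function taking `email`."""
    indented = '\n'.join('    ' + line for line in body.splitlines())
    namespace = {}
    exec('def callback(email):\n' + indented, namespace)
    return namespace['callback']


# The body computes a new value but never returns it, so the call yields None.
no_return = make_callback('email.replace(b"old.example", b"new.example")')

# The same body with an explicit return of a bytes object.
with_return = make_callback('return email.replace(b"old.example", b"new.example")')

original = b'dev@old.example'
print(no_return(original))    # None
print(with_return(original))  # b'dev@new.example'
```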

I just ran into this too, looks like it's user error. The callback functions need to return byte-strings. From the man page:

One important thing to note for all callbacks is that filter-repo uses bytestrings (see https://docs.python.org/3/library/stdtypes.html#bytes) everywhere instead of strings.

In my case, I had to use --email-callback 'return b"..."' instead of --email-callback 'return "..."' and then it worked.

Thanks for commenting; my guess from the error they got is actually that they forgot the return statement, but this is a worthwhile problem to be aware of as well.

To the maintainers: perhaps git-filter-repo could produce a friendlier error message in cases like this?

maintainer, actually. There's just one of me. Anyway...

For callbacks like name, filename, message, email, or refname, it might be simple -- though potentially expensive since the callbacks are called so many times. And what about cases like commit or tag callbacks? Do we have to check every field of the resulting object that was operated on, since there's no way to know which field might have been modified? And do so every time the callback is called? While I like the idea of friendlier messages, I don't like the idea of introducing such overhead.

@aswild

aswild commented Feb 20, 2025

my guess from the error they got is actually that they forgot the return statement, but this is a worthwhile problem to be aware of as well.

Ah yeah, probably. I mainly saw the approximately-same stack trace here when I went searching for issues. Somewhere along the line seeing the phrase "bytes-like object" jogged my memory enough to remember the byte-strings requirement.

My instinct was that a check along these lines wouldn't be too expensive, but Python performance characteristics can be weird. It'd also be a lot of refactoring because, as you mention, there are a lot of callback fields.

thing = self._thing_callback(thing)
if not hasattr(thing, '__bytes__'):
    raise SomeError(f'be sure to return byte-strings from callbacks, not {type(thing)}')

Maybe exception handling would be easier and faster - catch a TypeError at the top and print a maybe-helpful message with the backtrace? Python doesn't give a ton of machine-readable info in its exceptions. (The add_note() method on exceptions would be nice to use, but that's too new and needs 3.11)

Maybe something like this would work? Or maybe localization prevents properly reading the message like this...

filter = RepoFilter(args)
try:
    filter.run()
except TypeError as err:
    if 'bytes-like object' in str(err):
        print('NOTE: callback functions must return bytestrings, not str')
    raise

There's just one of me

Thank you for your time spent making this tool :)

@newren
Owner

newren commented Feb 21, 2025

my guess from the error they got is actually that they forgot the return statement, but this is a worthwhile problem to be aware of as well.

Ah yeah, probably. I mainly saw the approximately-same stack trace here when I went searching for issues. Somewhere along the line seeing the phrase "bytes-like object" jogged my memory enough to remember the byte-strings requirement.

My instinct was that a check along these lines wouldn't be too expensive, but Python performance characteristics can be weird. It'd also be a lot of refactoring because, as you mention, there are a lot of callback fields.

thing = self._thing_callback(thing)
if not hasattr(thing, '__bytes__'):
    raise SomeError(f'be sure to return byte-strings from callbacks, not {type(thing)}')

I didn't expect each individual check to be expensive; rather, the sheer number of checks is my primary concern. I am worried about the number of codepaths that would need to be modified as well, but doing the checks for every commit (or multiple times per commit for e.g. name or email checks) is my bigger worry.

Also, this code example you provide only works for name, email, message, and filename callbacks. Other callbacks like commit and tag don't return an object but potentially mutate one of the arguments passed, and that argument has lots of subfields, most of which need to be byte-strings, so any commit callback for example would result in the need for several checks for each invocation of the callback.
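
A rough sketch of what checking a mutated object would involve, to make the "several checks per invocation" point concrete. FakeCommit is a hypothetical stand-in with a handful of byte-string fields, not git-filter-repo's real Commit class, and non_bytes_fields and example_commit_callback are likewise invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakeCommit:
    """Hypothetical stand-in for a commit object with byte-string fields."""
    author_name: bytes
    author_email: bytes
    committer_name: bytes
    committer_email: bytes
    message: bytes


BYTES_FIELDS = ('author_name', 'author_email',
                'committer_name', 'committer_email', 'message')


def non_bytes_fields(commit):
    """Return (field, type name) pairs for every field that is not bytes."""
    return [(name, type(getattr(commit, name)).__name__)
            for name in BYTES_FIELDS
            if not isinstance(getattr(commit, name), bytes)]


def example_commit_callback(commit):
    # Mutates the argument in place and returns nothing.
    commit.author_email = 'dev@example.com'  # bug: str instead of bytes
    commit.message = None                    # bug: None instead of bytes


commit = FakeCommit(b'A', b'a@example.com', b'C', b'c@example.com', b'msg\n')
example_commit_callback(commit)
print(non_bytes_fields(commit))  # [('author_email', 'str'), ('message', 'NoneType')]
```

Because the callback returns nothing, the only way to catch the mistake is to inspect every field after every call, which is one isinstance check per field per commit even in this reduced example.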

Maybe exception handling would be easier and faster - catch a TypeError at the top and print a maybe-helpful message with the backtrace? Python doesn't give a ton of machine-readable info in its exceptions. (The add_note() method on exceptions would be nice to use, but that's too new and needs 3.11)

Maybe something like this would work? Or maybe localization prevents properly reading the message like this...

filter = RepoFilter(args)
try:
    filter.run()
except TypeError as err:
    if 'bytes-like object' in str(err):
        print('NOTE: callback functions must return bytestrings, not str')
    raise

This assumes three things. (a) That the user returned a str, which isn't even true for this bug report: the error shows that they returned None (a NoneType), not a string, so your note would have been misleading for them. (b) That the user is only using callbacks which are supposed to return something; not all callbacks are designed that way (see the commit and tag callbacks, for example), though a slight rewording of the note might be able to address this. (c) That the "bytes-like object" error was triggered by a callback function; other "bytes-like object" errors have been triggered outside of user (or script) callbacks in the past, as can be found in this bug tracker, and I think this message could be confusing to users if it were shown in such a case.

It does have the appeal of being simple and a small code change, so maybe there's a tweak of an idea here that might help. I'll leave it open for now...
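
One possible tweak, sketched under assumptions: keep the top-level handler but word the note conditionally, so it does not claim that a str was returned or that a callback is definitely at fault. run_with_hint and NOTE are hypothetical names for this example, and the demo triggers the error with plain bytes formatting rather than a real RepoFilter run.

```python
import io
import sys

NOTE = ('NOTE: if you are using callbacks, check that every value they return '
        'or assign is a bytestring (b"..."), and that callbacks meant to '
        'return a value actually contain a return statement.')


def run_with_hint(run, stream=sys.stderr):
    """Call run(); on a bytes-formatting TypeError, print a hint and re-raise."""
    try:
        return run()
    except TypeError as err:
        if 'bytes-like object' in str(err):
            print(NOTE, file=stream)
        raise  # the original traceback is preserved


# Demo: simulate the failure from the report and capture the hint.
captured = io.StringIO()
try:
    run_with_hint(lambda: b'commit %s\n' % None, stream=captured)
except TypeError as err:
    print(captured.getvalue().strip())
    print('re-raised:', err)
```

This still matches on the exception text and cannot tell whether a callback caused the error, so it only softens points (a) and (b); point (c) remains, which is why the note is phrased as a suggestion.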

There's just one of me

Thank you for your time spent making this tool :)

I'm glad you like the tool. :-)
