[BUG] #3907

@eachimei

Bug Report: Console buffer corruption on encoding errors causes crash during cleanup

Description

When FileProxy.flush() or console.print() fails with a UnicodeEncodeError because the output stream cannot encode the text (e.g., an emoji written to a Windows cp1252 stream), the exception can be caught and handled. However, Rich's Console._buffer retains the unencodable content, which causes a second, unhandled UnicodeEncodeError during Progress.__exit__() cleanup.

This violates the context manager contract: if an error is caught inside the context, __exit__ should not raise a new exception for the same underlying issue.

Environment

  • Rich version: 14.2.0
  • Python version: 3.10.11
  • OS: Windows 10/11
  • Stream encoding: cp1252 (Windows default for non-TTY subprocess pipes)

Minimal Reproduction

import subprocess
import sys
import os

test_code = """
import sys
from rich.console import Console
from rich.file_proxy import FileProxy
from rich.progress import Progress

# Simulate a Windows subprocess with cp1252 encoding
# In a real scenario, this happens when the script runs as a subprocess with redirected stdout/stderr
console = Console()
file_proxy = FileProxy(console, sys.stderr)

with Progress(console=console) as progress:
    try:
        # Write a Unicode character that cannot be encoded as cp1252
        file_proxy.write("Hello 🌍")
        file_proxy.flush()  # Fails with UnicodeEncodeError (caught)
    except UnicodeEncodeError as e:
        print(f"Exception caught: {e}", file=sys.stderr)
    
    print("Exiting context...", file=sys.stderr)
# Crash happens HERE during Progress.__exit__()
"""

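# Clear encoding env vars to ensure cp1252 on Windows.
# PYTHONUTF8 is removed too, since UTF-8 mode would also force UTF-8 streams.
env = os.environ.copy()
for key in list(env.keys()):
    upper = key.upper()
    if ('PYTHON' in upper and 'ENCODING' in upper) or upper == 'PYTHONUTF8':
        del env[key]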

result = subprocess.run(
    [sys.executable, "-c", test_code],
    capture_output=True,
    text=True,
    env=env
)

print(result.stderr)
print(f"Exit code: {result.returncode}")

Expected Behavior

  1. flush() raises UnicodeEncodeError
  2. Exception is caught by try-except
  3. Progress.__exit__() completes gracefully
  4. Program exits with code 0

Actual Behavior

  1. flush() raises UnicodeEncodeError
  2. Exception is caught by try-except ✓
  3. "Exiting context..." is printed ✓
  4. Progress.__exit__() raises unhandled UnicodeEncodeError
  5. Program crashes with exit code 1 ✗

Traceback:

Traceback (most recent call last):
  File "<string>", line 17, in <module>
  File ".../rich/progress.py", line 1189, in __exit__
    self.stop()
  File ".../rich/progress.py", line 1175, in stop
    self.live.stop()
  File ".../rich/live.py", line 162, in stop
    with self.console:
  File ".../rich/console.py", line 870, in __exit__
    self._exit_buffer()
  File ".../rich/console.py", line 826, in _exit_buffer
    self._check_buffer()
  File ".../rich/console.py", line 2038, in _check_buffer
    self._write_buffer()
  File ".../rich/console.py", line 2074, in _write_buffer
    legacy_windows_render(buffer, LegacyWindowsTerm(self.file))
  ...
UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f30d' in position 6: character maps to <undefined>
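
The last line of the traceback can be reproduced without Rich or Windows, which helps confirm that the character itself is the trigger. A minimal sketch using only the standard library cp1252 codec (the variable names are illustrative):

```python
text = "Hello \U0001f30d"  # "Hello " followed by the globe emoji (U+1F30D)

try:
    text.encode("cp1252")  # strict error handling is the default
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

# The emoji has no cp1252 mapping, so strict encoding fails at index 6.
print(encode_error)

# A lossy error handler avoids the exception at the cost of the character.
print(text.encode("cp1252", errors="replace"))
```

The reported position (6) is the index of the emoji within the string, matching the traceback above.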

Root Cause

When FileProxy.flush() calls console.print():

  1. console.print() enters a nested with console: context
  2. Content is added to Console._buffer
  3. Write to stream fails with UnicodeEncodeError during nested console.__exit__()
  4. Exception propagates to user's try-except (first exception - handled ✓)
  5. BUT Console._buffer still contains the unencodable UTF-8 content
  6. Later, Progress.__exit__() → console.__exit__() → _write_buffer()
  7. Tries to write the buffered content → second UnicodeEncodeError (unhandled ✗)
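
The sequence above can be modelled without Rich. The sketch below uses a hypothetical ToyConsole class (not Rich's actual code) that, like the behaviour described, only clears its buffer after a successful write. It assumes a strict cp1252 text stream, built here from io.TextIOWrapper to stand in for a Windows pipe:

```python
import io


class ToyConsole:
    """Hypothetical stand-in for a buffered console; not Rich's real code."""

    def __init__(self, file):
        self.file = file
        self._buffer = []

    def print(self, text):
        self._buffer.append(text)
        self.flush()

    def flush(self):
        # Write first, clear afterwards: a failed write leaves the buffer intact.
        self.file.write("".join(self._buffer))
        self.file.flush()
        self._buffer.clear()


# A text stream that encodes to cp1252, like a Windows subprocess pipe.
stream = io.TextIOWrapper(io.BytesIO(), encoding="cp1252", errors="strict")
console = ToyConsole(stream)

errors = []
try:
    console.print("Hello \U0001f30d")  # first failure, handled by the caller
except UnicodeEncodeError as exc:
    errors.append(exc)

try:
    console.flush()  # cleanup re-sends the same buffered text
except UnicodeEncodeError as exc:
    errors.append(exc)  # second failure for the same content

print(len(errors), len(console._buffer))
```

In this model the second flush plays the role of Progress.__exit__(): the caller has already handled the error, but the stale buffer raises it again.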

Impact

  • Impossible to recover from encoding errors programmatically
  • Context manager cleanup raises exceptions for already-handled errors
  • No documented API to clear the buffer or prevent this crash
  • Affects any scenario with non-UTF-8 streams (Windows subprocesses, legacy systems)

Proposed Solutions

Option 1 (Defensive): Clear buffer on write failures

def _write_buffer(self) -> None:
    try:
        # existing write logic
        legacy_windows_render(buffer, LegacyWindowsTerm(self.file))
    except UnicodeEncodeError:
        self._buffer.clear()  # Prevent retry of unencodable content
        raise
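
To illustrate what Option 1 would change, here is a sketch of the same idea on a hypothetical ToyConsole class (a stand-in, not Rich's implementation). It assumes the buffer is a plain list and the stream is a strict cp1252 io.TextIOWrapper:

```python
import io


class ToyConsole:
    """Hypothetical buffered console; a stand-in, not Rich's implementation."""

    def __init__(self, file):
        self.file = file
        self._buffer = []

    def print(self, text):
        self._buffer.append(text)
        self.flush()

    def flush(self):
        try:
            self.file.write("".join(self._buffer))
            self.file.flush()
        except UnicodeEncodeError:
            self._buffer.clear()  # discard text that can never be written
            raise
        self._buffer.clear()


raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="cp1252", errors="strict")
console = ToyConsole(stream)

caught = 0
try:
    console.print("Hello \U0001f30d")
except UnicodeEncodeError:
    caught += 1  # the caller handles the one and only failure

console.flush()                # cleanup: nothing left to retry
console.print("still usable")  # later output is unaffected
print(caught, raw.getvalue())
```

The trade-off is that the unencodable text is lost, but the error still surfaces exactly once, where the caller can handle it.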

Option 2 (Comprehensive): Add public buffer management

def clear_buffer(self) -> None:
    """Clear the internal buffer. Useful for recovery from encoding errors."""
    self._buffer.clear()

Option 3 (Validation): Check encoding compatibility before buffering

def print(self, *objects, **kwargs):  # signature abbreviated
    # Validate that the content can be encoded before adding it to the buffer
    for obj in objects:
        # Raises UnicodeEncodeError here, before the buffer is touched
        str(obj).encode(self.file.encoding, errors='strict')
    # ... rest of print logic
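
The pre-check in Option 3 can also be done in user code today. The helper below is a hypothetical sketch (first_unencodable is not part of Rich) and only inspects plain strings; what Rich actually writes for a given renderable may differ from str(renderable):

```python
def first_unencodable(text, encoding):
    """Return (index, character) for the first character that `encoding`
    cannot represent, or None if the whole string encodes cleanly."""
    try:
        text.encode(encoding, errors="strict")
    except UnicodeEncodeError as exc:
        return exc.start, text[exc.start]
    return None


ok = first_unencodable("Hello world", "cp1252")
bad = first_unencodable("Hello \U0001f30d", "cp1252")

print(ok)
print(bad[0], ascii(bad[1]))
```

A caller could run this against the target stream's encoding and substitute or drop the offending characters before handing the text to the console.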

Workaround

Currently, the only workaround is to avoid the situation entirely:

  • Ensure streams are UTF-8 compatible
  • Don't use FileProxy with non-UTF-8 streams
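
One way to make a stream tolerant of unencodable characters is to change its error handler before Rich writes to it. The sketch below relies on io.TextIOWrapper.reconfigure() (Python 3.7+); make_lossy is a hypothetical helper name, and whether this fully avoids the crash described here has not been verified against Rich itself:

```python
import io


def make_lossy(stream):
    """Switch a text stream to errors='replace' when it supports reconfigure().

    Streams that are not io.TextIOWrapper instances are returned unchanged.
    """
    if isinstance(stream, io.TextIOWrapper):
        stream.reconfigure(errors="replace")
    return stream


# Demonstration on a cp1252 stream standing in for a Windows pipe.
raw = io.BytesIO()
stream = make_lossy(io.TextIOWrapper(raw, encoding="cp1252", errors="strict"))
stream.write("Hello \U0001f30d")  # no exception: the emoji becomes "?"
stream.flush()
print(raw.getvalue())
```

In a real script the helper would be applied to whichever stream the Console writes to (sys.stdout by default) before the Console is created. Unencodable characters are replaced with "?" instead of raising.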

Additional Context

This issue manifests commonly in:

  • Windows subprocess pipes (cp1252 encoding by default)
  • Logging handlers wrapped with FileProxy for Progress coordination
  • CI/CD environments with redirected output
  • Legacy terminal emulators with limited encoding support

Please let me know if I should go ahead and submit a PR with one of the proposed solutions.
