datalab-to/pykatex

PyKaTeX

Fast, standalone Python library for validating KaTeX equations.

Features

  • Strict KaTeX Conformance: Runs the actual KaTeX JavaScript library via QuickJS, so validation results match KaTeX itself
  • High Performance: Roughly 15,000 equations/second in fast mode
  • Batch Validation: Efficiently validate many equations at once
  • Fast Mode: Parse-only validation (skips HTML generation) for maximum throughput
  • Detailed Error Messages: Get the exact KaTeX error for invalid equations
  • Render to HTML: Generate HTML output for valid equations

Installation

pip install quickjs  # Required dependency
pip install -e .     # Install pykatex

Or just install quickjs and add the pykatex directory to your Python path.

Quick Start

from pykatex import validate, validate_batch, render_to_string

# Simple validation
is_valid, error = validate("E = mc^2")
print(f"Valid: {is_valid}")  # True

# Check invalid equation
is_valid, error = validate("\\unknownCommand")
print(f"Valid: {is_valid}, Error: {error}")  # False, with error message

# Batch validation
results = validate_batch(["x^2", "\\frac{a}{b}", "\\sin x"])
for r in results:
    print(f"{r.equation}: {'valid' if r.valid else r.error}")

# Render to HTML
html = render_to_string("E = mc^2", display_mode=True)

API Reference

Functions

validate(equation, display_mode=True) -> (bool, str | None)

Validate a single equation. Returns a tuple of (is_valid, error_message).

is_valid, error = validate("x^2")

validate_batch(equations, display_mode=True, fast=False) -> list[ValidationResult]

Validate multiple equations efficiently.

from pykatex import validate_batch

# Standard batch validation
equations = ["x^2", "y^2"]
results = validate_batch(equations)
for r in results:
    print(r.equation, r.valid, r.error)

# Fast mode (parse-only, about 4x faster; recommended for validation-only use cases)
results = validate_batch(equations, fast=True)

render_to_string(equation, display_mode=True, output="html", throw_on_error=True) -> str

Render an equation to HTML.

html = render_to_string("E = mc^2")

Classes

KaTeXValidator

Thread-safe validator instance. Each instance has its own QuickJS runtime.

from pykatex import KaTeXValidator

validator = KaTeXValidator(
    time_limit_ms=5000,    # JS execution timeout
    memory_limit_mb=64,    # JS memory limit
)

result = validator.validate("x^2", display_mode=True, return_html=False)
results = validator.validate_batch(["x^2", "y^2"])
html = validator.render_to_string("x^2")
is_valid = validator.is_valid("x^2")
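
The constructor above sets a JS execution timeout. If that limit applies to each call (an assumption; this README does not say), one option for very large inputs is to validate in fixed-size chunks. The sketch below is illustrative only: `chunked` and `validate_in_chunks` are hypothetical helpers, not part of pykatex, and the demo uses a stand-in function so it runs without pykatex installed.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def validate_in_chunks(validate_batch_fn, equations, size=500):
    """Call a validate_batch-style function chunk by chunk and concatenate the results."""
    results = []
    for chunk in chunked(equations, size):
        results.extend(validate_batch_fn(chunk))
    return results


# Stand-in for validator.validate_batch (hypothetical; marks every input as valid).
demo = validate_in_chunks(lambda eqs: [(e, True) for e in eqs], ["a", "b", "c"], size=2)
print(demo)  # [('a', True), ('b', True), ('c', True)]
```

With pykatex installed, `validator.validate_batch` could be passed as `validate_batch_fn`. The chunk size of 500 is an arbitrary placeholder.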

ValidationResult

Dataclass returned by validation methods.

@dataclass
class ValidationResult:
    equation: str        # The input equation
    valid: bool          # Whether it's valid
    error: str | None    # Error message if invalid
    html: str | None     # HTML output if requested
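
A minimal sketch of how a list of results might be split into valid equations and error messages. To stay runnable without pykatex, it declares a local stand-in dataclass with the same four fields as above. `split_results` is a hypothetical helper, and the error text is a placeholder rather than real KaTeX output.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ValidationResult:
    """Local stand-in mirroring the documented fields (not imported from pykatex)."""
    equation: str
    valid: bool
    error: str | None
    html: str | None


def split_results(results):
    """Hypothetical helper: return (valid equations, {invalid equation: error})."""
    ok = [r.equation for r in results if r.valid]
    failed = {r.equation: r.error for r in results if not r.valid}
    return ok, failed


# Hand-built sample results; the error string is a placeholder.
sample = [
    ValidationResult("x^2", True, None, None),
    ValidationResult("\\frac{a}{", False, "placeholder error message", None),
]
ok, failed = split_results(sample)
print(ok)
print(failed)
```

The same helper should work on the list returned by `validate_batch`, since it only reads the `equation`, `valid`, and `error` attributes.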

Performance

Run the benchmark script to measure throughput:

python benchmark.py --equations 2000 --iterations 3

Typical results:

  • Sequential validation: ~2,000 equations/second
  • Batch validation: ~4,000 equations/second
  • Fast batch validation: ~15,000 equations/second (recommended)

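Recommendation: Use validate_batch(equations, fast=True) for maximum throughput when you only need validation (not HTML output). Based on the typical figures above, this is roughly 4x faster than regular batch validation (15,000 vs. 4,000 equations/second) and roughly 8x faster than sequential validation (15,000 vs. 2,000 equations/second).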
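
The ratios follow directly from the typical figures listed above. The sketch below works through that arithmetic and adds a small timing helper for checking throughput on your own machine. `equations_per_second` is a hypothetical helper, not part of pykatex or benchmark.py.

```python
import time


def equations_per_second(fn, equations, repeats=3):
    """Return the best observed throughput of fn(equations) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(equations)
        best = min(best, time.perf_counter() - start)
    return len(equations) / best if best > 0 else float("inf")


# Speedups implied by the typical figures quoted above (equations/second).
sequential, batch, fast_batch = 2_000, 4_000, 15_000
speedup_vs_batch = fast_batch / batch
speedup_vs_sequential = fast_batch / sequential
print(speedup_vs_batch, speedup_vs_sequential)  # 3.75 7.5
```

With pykatex installed, something like `equations_per_second(lambda eqs: validate_batch(eqs, fast=True), equations)` would measure fast mode. Actual numbers will depend on hardware and on the equations used.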

Thread Safety

  • KaTeXValidator instances are thread-safe (each instance uses internal locking)
  • Module-level functions (validate, validate_batch, render_to_string) use thread-local validators
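
The thread-local approach can be illustrated with the standard library alone. This is a sketch of the general pattern, not pykatex's actual code: `StubValidator` and `get_validator` are hypothetical names, and the stub stands in for a real validator with its own JS runtime.

```python
import threading

_local = threading.local()


class StubValidator:
    """Stand-in for a per-thread validator; records which thread created it."""

    def __init__(self):
        self.owner = threading.get_ident()


def get_validator():
    """Return this thread's validator, creating it on first use."""
    validator = getattr(_local, "validator", None)
    if validator is None:
        validator = StubValidator()
        _local.validator = validator
    return validator


def worker(seen, index):
    # Two lookups in the same thread should return the same object,
    # and that object should have been created by this thread.
    first, second = get_validator(), get_validator()
    seen[index] = (first is second, first.owner == threading.get_ident())


seen = {}
threads = [threading.Thread(target=worker, args=(seen, i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(all(same and owned for same, owned in seen.values()))  # True
```

Each thread lazily builds its own instance and reuses it afterwards, so no runtime is ever touched from two threads.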

Testing

# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ -v --cov=pykatex

# Run performance tests
pytest tests/test_validator.py::TestPerformance -v -s

How It Works

  1. QuickJS Runtime: Uses the quickjs Python package to embed a JavaScript engine
  2. KaTeX Bundle: Loads the actual KaTeX JavaScript library (katex.min.js)
  3. Thread Isolation: Each thread gets its own JS runtime (QuickJS is thread-hostile)
  4. JSON Bridge: Equations and options are passed via JSON for safe escaping
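
The JSON bridge in step 4 can be sketched as follows. LaTeX is full of backslashes and may contain quotes, so serializing the equation with `json.dumps` produces a string literal that is safe to embed in JavaScript source. `build_js_call` is a hypothetical helper. The exact JS that pykatex evaluates is not shown in this README, so the `katex.renderToString` call shape here is an assumption based on KaTeX's public API.

```python
import json


def build_js_call(equation, display_mode=True):
    """Build a JS snippet with the equation embedded as a JSON string literal."""
    options = {"displayMode": display_mode, "throwOnError": True}
    return f"katex.renderToString({json.dumps(equation)}, {json.dumps(options)})"


# Backslashes, quotes, and newlines all survive a JSON round trip unchanged.
tricky = 'a"b \\frac{1}{2}\n'
assert json.loads(json.dumps(tricky)) == tricky

print(build_js_call("x"))
# katex.renderToString("x", {"displayMode": true, "throwOnError": true})
```

Naive string concatenation would break on the first quote or backslash, which is why the equation goes through JSON rather than being pasted in directly.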

Limitations

  • Requires the quickjs Python package
  • Each validator instance uses ~50-100MB of memory for the JS runtime
  • KaTeX's macros option is not yet supported
  • The strict option only supports boolean values

License

MIT License
