Explore PyPy, mypyc, Cython #113

Open
1 task
jpmckinney opened this issue Sep 28, 2024 · 1 comment

jpmckinney commented Sep 28, 2024

The original description below dates from when this issue was only about PyPy.

We do a fair bit of JSON deserializing (and serializing). On CPython, we use orjson. The json standard library on PyPy is slower, but not by a huge amount, so using PyPy could still be a performance gain. In principle, orjson could be made to support PyPy. ijl/orjson#90 (comment)
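One way to let the same code run on both interpreters is a small compatibility shim that prefers orjson and falls back to the standard library when it cannot be imported. This is only a sketch: the `loads`/`dumps` wrapper names are hypothetical, and the fallback matches orjson only in the basics (compact output, UTF-8 bytes), not in its handling of dates, dataclasses or options.

```python
import json

try:
    import orjson  # not expected to be installable on PyPy (assumption, per the linked issue)
except ImportError:
    orjson = None


def loads(data):
    """Deserialize JSON from bytes or str with whichever backend is available."""
    if orjson is not None:
        return orjson.loads(data)
    return json.loads(data)


def dumps(obj):
    """Serialize to UTF-8 bytes, mirroring orjson's return type."""
    if orjson is not None:
        return orjson.dumps(obj)
    # Compact separators and ensure_ascii=False approximate orjson's default output.
    return json.dumps(obj, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
```

Callers would then import `loads` and `dumps` from this module instead of importing orjson directly, so a benchmark can swap interpreters without touching application code.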

Candidates with heavy processing:

Next steps

  • Benchmark our own code on CPython and PyPy (keep in mind that the JIT needs to be warm, which suits long-running workers)
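A minimal sketch of such a benchmark, assuming a JSON round trip as a stand-in workload: the payload shape and the `make_payload`, `work` and `benchmark` names are hypothetical, and real measurements should use our own code paths and data. The untimed warm-up loop gives PyPy's JIT a chance to compile the hot path before timing starts; on CPython it has little effect.

```python
import json
import platform
import time


def make_payload(n=1000):
    """Build a synthetic JSON document (hypothetical stand-in for real data)."""
    releases = [{"id": str(i), "value": {"amount": i * 1.5}} for i in range(n)]
    return json.dumps({"releases": releases})


def work(payload):
    """The workload under test: one deserialize/serialize round trip."""
    return json.dumps(json.loads(payload))


def benchmark(func, arg, warmup=200, repeat=200):
    """Return mean seconds per call, measured after `warmup` untimed calls."""
    for _ in range(warmup):
        func(arg)  # untimed: lets a JIT compile the hot loop first
    start = time.perf_counter()
    for _ in range(repeat):
        func(arg)
    return (time.perf_counter() - start) / repeat


if __name__ == "__main__":
    mean = benchmark(work, make_payload())
    print(f"{platform.python_implementation()}: {mean * 1e6:.1f} µs per call")
```

Running the same script under each interpreter and comparing the printed means gives a first rough signal. Varying `warmup` shows how much of any difference depends on the JIT being warm.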

PyPy profiler support

If PyPy really is faster than CPython for our workloads, then there is no point profiling and optimizing on CPython. So, we'd need profilers on PyPy.

Supported

Memory

  • memory_profiler

TBD

  • austin
  • psrecord

Not supported

Memory


Comparison and discussion: plasma-umass/scalene#423

@jpmckinney jpmckinney added python and removed python labels Sep 28, 2024

jpmckinney commented Oct 14, 2024

PyPy disadvantages:

  • Not all libraries support PyPy (orjson, in our case)
  • Lags behind CPython releases (e.g. it currently targets Python 3.10)
  • Profiling tools are not as mature (corroborated above)
  • Uses x86_64 emulation (slow) on M1

A useful review of other compilers, followed by mypyc-specific content: https://glyph.twistedmatrix.com/2022/04/you-should-compile-your-python-and-heres-why.html

These tools benefit from comprehensive (and narrow, i.e. few Any) typing. To add types, can try:

Other compilers:

  • Cython
    • Disadvantages: To see the greatest performance improvement (e.g. C-like), we might need to
      • Use its specific types like cython.int instead of int (would a pre-processor to replace type annotations work at the build stage? I assume not), or
      • Write in Cython (increases contribution barrier and maintenance burden – but, if the alternative is rewriting it all in Rust, it might make sense to use Cython).
  • mypyc
  • Codon
  • Nuitka
    • Disadvantages: Performance is not the goal. Its performance improvement is inconsistent or minimal.

Worth trying out PyPy and mypyc, as they can more or less work with existing code. Can still consider Cython in cases where the alternative is a full rewrite in Rust. Can try these on ocds-merge for learning.
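To illustrate what "work with existing code" could look like, here is a small, narrowly typed module that runs unchanged on CPython and PyPy and could also be handed to mypyc. It is a toy stand-in, not actual ocds-merge code: the function names and the `merge_sketch.py` file name are hypothetical.

```python
def merge_amounts(releases: list[dict[str, float]]) -> dict[str, float]:
    """Keep the last amount seen per key (toy stand-in for a merge routine)."""
    merged: dict[str, float] = {}
    for release in releases:
        for key, amount in release.items():
            merged[key] = amount
    return merged


def total(merged: dict[str, float]) -> float:
    """Sum the merged amounts; concrete types (no Any) help a compiler specialize."""
    result: float = 0.0
    for amount in merged.values():
        result += amount
    return result
```

Saved as `merge_sketch.py`, this could be compiled with `mypyc merge_sketch.py` (assuming mypy is installed with mypyc support) and then imported as usual. Whether compilation actually speeds up our workloads would still need to be measured.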

@jpmckinney jpmckinney changed the title Consider PyPy for some projects Explore PyPy, mypyc, Cython Oct 16, 2024