https://www.udemy.com/course/python-3-deep-dive-part-1/learn/lecture/7192348#overview
https://www.udemy.com/course/python-3-deep-dive-part-1/learn/lecture/7368670#overview
https://www.udemy.com/course/python-3-deep-dive-part-1/learn/lecture/7368672#overview
https://www.udemy.com/course/python-3-deep-dive-part-1/learn/lecture/7649326#overview
https://learning.oreilly.com/library/view/high-performance-python/9781492055013/ch02.html
- py-spy: Sampling profiler for Python programs
How to approach a Python optimization task:
1. Understand the problem: Analyze the problem you want to solve and determine the input-output relationship. Make sure you understand the requirements and constraints of the problem.
2. Write a baseline implementation: Write a simple, working implementation of the problem without focusing on optimization. This gives you something to test and compare optimized versions against later.
3. Profile the baseline code: Use profiling tools like cProfile, py-spy, or the built-in timeit module to identify bottlenecks and the parts of the code that consume the most time or resources (see the profiling sketch after this list).
4. Optimize the code:
   a. Algorithmic optimization: Analyze the algorithm used in your implementation and look for a more efficient algorithm for the problem. This usually has the biggest impact on performance.
   b. Data structures: Choose appropriate data structures for the problem, such as sets, dictionaries, or lists, depending on the requirements (see the membership-test sketch after this list).
   c. Pythonic optimizations: Apply Python best practices such as list comprehensions, the map() and filter() functions, or functools.lru_cache for caching the results of expensive functions (see the caching sketch after this list).
   d. Parallelism: Consider dividing the problem into smaller tasks and executing them concurrently, using libraries like concurrent.futures, multiprocessing, or asyncio (see the process-pool sketch after this list).
   e. External libraries: Look for third-party libraries that can optimize specific parts of the code, such as NumPy for numerical computations or Cython for compiling Python code to C.
   f. Profiling-guided optimizations: Apply the findings from the profiling step to target the identified bottlenecks and optimize the relevant code sections.
5. Validate the optimized code: Ensure that the optimized code still produces correct results by comparing its output to the baseline implementation. Consider using unit tests to automate the validation (see the validation sketch after this list).
6. Measure performance improvements: Profile the optimized code and compare its performance to the baseline. If the improvement is not satisfactory, return to step 3 and iterate.
7. Document the optimization process: Keep a record of the optimizations applied, the performance improvements achieved, and any trade-offs made. This helps you and others understand the rationale behind the changes and makes future optimizations easier.
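
A minimal sketch of step 3, assuming a hypothetical baseline() function as the code under test; cProfile, pstats, and timeit are all in the standard library:

```python
import cProfile
import pstats
import timeit


def baseline(data):
    # hypothetical stand-in for the real baseline implementation
    return sorted(set(data))


sample = list(range(10_000)) * 3

# Deterministic profile: which functions dominate the runtime?
profiler = cProfile.Profile()
profiler.enable()
baseline(sample)
profiler.disable()
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)

# Wall-clock number to compare optimized versions against later
elapsed = timeit.timeit(lambda: baseline(sample), number=100)
print(f"baseline: {elapsed / 100:.6f} s per call")
```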
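
For steps 4a and 4b, a sketch of how the choice of data structure changes the algorithmic cost of a membership test; the haystack and needle values are made up purely for illustration:

```python
import timeit

haystack_list = list(range(10_000))
haystack_set = set(haystack_list)
needles = range(9_000, 11_000)  # half present, half absent


def count_hits(haystack):
    # O(n) per lookup against a list, O(1) on average against a set
    return sum(1 for n in needles if n in haystack)


print("list:", timeit.timeit(lambda: count_hits(haystack_list), number=3))
print("set: ", timeit.timeit(lambda: count_hits(haystack_set), number=3))
```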
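
For step 4c, a sketch combining a list comprehension with functools.lru_cache; the recursive Fibonacci function stands in for any pure, repeatedly called expensive computation:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    # Memoizing the results turns the exponential recursion into linear work
    return n if n < 2 else fib(n - 1) + fib(n - 2)


# List comprehension instead of an explicit loop with .append()
values = [fib(n) for n in range(100)]
print(values[-1])
```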
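
For step 4d, a sketch of CPU-bound parallelism with concurrent.futures; cpu_bound_task is a placeholder for the real work and must live at module level so the process pool can pickle it:

```python
from concurrent.futures import ProcessPoolExecutor


def cpu_bound_task(n):
    # placeholder CPU-bound work
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    inputs = [2_000_000] * 8
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(cpu_bound_task, inputs))
    print(results[:2])
```

For I/O-bound work, ThreadPoolExecutor or asyncio is usually the better fit, since the GIL is released while waiting on I/O.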
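
For steps 5 and 6, a sketch that checks a candidate implementation against the baseline with a unit test and then times both; baseline() and optimized() are hypothetical placeholders for the real functions:

```python
import random
import timeit
import unittest


def baseline(data):
    return sorted(set(data))            # reference implementation


def optimized(data):
    return sorted(dict.fromkeys(data))  # candidate replacement


class TestOptimized(unittest.TestCase):
    def test_matches_baseline(self):
        data = [random.randrange(1000) for _ in range(10_000)]
        self.assertEqual(optimized(data), baseline(data))


if __name__ == "__main__":
    data = [random.randrange(1000) for _ in range(100_000)]
    for fn in (baseline, optimized):
        t = timeit.timeit(lambda: fn(data), number=20)
        print(f"{fn.__name__}: {t / 20:.6f} s per call")
    unittest.main()
```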
Profiling and measurement tools:
- cProfile: A built-in Python module for deterministic profiling of Python programs.
- py-spy: A sampling profiler that can profile running Python processes without modifying or restarting the code.
- line_profiler: A line-by-line profiler that measures the execution time of individual lines of code.
- memory_profiler: A module for monitoring the memory usage of a Python program on a line-by-line basis.
- pstats: A built-in Python module for processing the data generated by cProfile.
- snakeviz: A web-based viewer that visualizes the output of cProfile.
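
A sketch of driving line_profiler and memory_profiler from Python code (both are third-party packages installed with pip); build_table is a made-up workload:

```python
from line_profiler import LineProfiler
from memory_profiler import memory_usage


def build_table(n):
    # hypothetical workload: materialize and square a list of values
    values = list(range(n))
    squares = [v * v for v in values]
    return squares


# Line-by-line timings for one call
lp = LineProfiler()
profiled = lp(build_table)  # wraps the function with instrumentation
profiled(1_000_000)
lp.print_stats()

# Memory samples (in MiB) taken while the function runs
usage = memory_usage((build_table, (1_000_000,), {}), interval=0.1)
print(f"peak memory: {max(usage):.1f} MiB")
```

Both tools also have command-line entry points (kernprof -l -v for line_profiler, python -m memory_profiler for memory_profiler), while py-spy and snakeviz are driven entirely from the command line, e.g. py-spy top --pid <PID> and snakeviz <cprofile-output-file>.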
Libraries for performance optimization:
- NumPy: A powerful library for numerical computing with a focus on performance.
- SciPy: A library built on NumPy for scientific and technical computing.
- Polars: A high-performance library for data manipulation and analysis.
- Cython: A programming language that makes writing C extensions for Python as easy as writing Python itself. It can significantly speed up Python code by compiling it to C.
- Numba: A Just-In-Time (JIT) compiler for Python that translates a subset of Python and NumPy code into fast machine code.
- Dask: A parallel computing library for out-of-core and distributed computing that can parallelize NumPy, Pandas, and other libraries.
- concurrent.futures: A built-in Python module for asynchronously executing callables, which can be used for parallelism.
- multiprocessing: A built-in Python module for parallelism using multiple processes.
- asyncio: A built-in Python module for asynchronous I/O and concurrency using coroutines.
- joblib: A set of tools for lightweight pipelining and parallelism in Python.
- PyO3: A library for writing Rust code that can be used in Python programs, which can help speed up performance-critical sections.
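
A sketch of the kind of speedup these libraries target: the same sum of squares as a plain Python loop, as a single NumPy call, and, if Numba happens to be installed, as a JIT-compiled loop (the array size is arbitrary):

```python
import timeit

import numpy as np


def loop_sum_of_squares(values):
    # plain Python loop: one interpreter dispatch per element
    total = 0.0
    for v in values:
        total += v * v
    return total


data = np.random.default_rng(0).random(1_000_000)

t_loop = timeit.timeit(lambda: loop_sum_of_squares(data), number=3)
t_vec = timeit.timeit(lambda: float(np.dot(data, data)), number=3)
print(f"python loop: {t_loop:.3f} s   numpy dot: {t_vec:.3f} s")

try:
    from numba import njit

    jit_sum_of_squares = njit(loop_sum_of_squares)
    jit_sum_of_squares(data)  # first call triggers JIT compilation
    t_jit = timeit.timeit(lambda: jit_sum_of_squares(data), number=3)
    print(f"numba njit:  {t_jit:.3f} s")
except ImportError:
    pass  # Numba is optional; the NumPy comparison stands on its own
```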