docs/3. Handling CPU and IO Bound Tasks/index.md
+39-15
@@ -6,12 +6,12 @@ permalink: /chapter3/
6
6
7
7
# Handling CPU and I/O Bound Tasks
8
8
## CPU-bound vs I/O-bound
9
-
So, a very basic question: What are CPU-bound and I/O-bound tasks, and how do they differ from each other?
9
+
So, a very basic question: What are CPU-bound and I/O-bound tasks, and how do they differ from each other?
10
10
It's quite simple. CPU-bound tasks are those that primarily consume CPU resources to be handled, while I/O-bound tasks
11
11
are related to the input/output devices of your system, such as the network card, keyboard, and others.
12
12
13
-
A CPU-bound task typically involves intensive mathematical calculations.
14
-
In contrast, I/O-bound tasks involve operations like calling different APIs and waiting for their responses.
13
+
A CPU-bound task typically involves intensive mathematical calculations.
14
+
In contrast, I/O-bound tasks involve operations like calling different APIs and waiting for their responses.
15
15
For example, opening a text file and reading it into memory is also an I/O-bound task.
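To make the distinction concrete, here is a minimal sketch that times one task of each kind. The function names are hypothetical, and `time.sleep` is only an assumed stand-in for a real network or disk wait. The CPU-bound task should report CPU time close to its wall-clock time, while the I/O-bound task should spend almost all of its wall-clock time waiting.

```python
import time


def cpu_bound_task(n: int) -> int:
    """Keeps the CPU busy: sums the squares of the integers below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def io_bound_task(delay: float) -> str:
    """Mostly waits: time.sleep stands in for a network or disk wait."""
    time.sleep(delay)
    return "done"


def timed(func, *args):
    """Return (result, wall-clock seconds, CPU seconds) for one call."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return result, wall, cpu


if __name__ == "__main__":
    for func, arg in ((cpu_bound_task, 200_000), (io_bound_task, 0.2)):
        _, wall, cpu = timed(func, arg)
        print(f"{func.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s")
```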
16
16
17
17
@@ -20,11 +20,11 @@ For example, opening a text file and reading it into memory is also an I/O-bound
20
20
21
21
## Performing asynchronous I/O operations with asyncio
22
22
23
-
There are several robust libraries that handle I/O-bound tasks for requesting an endpoint,
23
+
There are several robust libraries that handle I/O-bound tasks for requesting an endpoint,
24
24
such as aiohttp, urllib3, and HTTPX.
25
25
26
-
I am going to provide some examples from the HTTPX library on how you can handle requests
27
-
to different endpoints concurrently.
26
+
I am going to provide some examples from the HTTPX library on how you can handle requests
27
+
to different endpoints concurrently.
28
28
HTTPX can manage both asynchronous and synchronous requests, allowing us to benchmark them.
29
29
Here is example 3_1 to get started with HTTPX:
30
30
@@ -63,11 +63,11 @@ which takes roughly three times longer than the previous examples.
63
63
64
64
## Performing asynchronous CPU operations
65
65
66
-
In Python, the multiprocessing library is used to parallelize CPU-bound tasks.
66
+
In Python, the multiprocessing library is used to parallelize CPU-bound tasks.
67
67
We achieve this by utilizing just two of our CPU's cores in the following example.
68
-
First, we define a CPU-bound task that simply adds a value to the `_sum` variable.
69
-
To utilize the multiprocessing library, we use partial functions,
70
-
which are the same functions with some variables pre-set.
68
+
First, we define a CPU-bound task that simply adds a value to the `_sum` variable.
69
+
To utilize the multiprocessing library, we use partial functions,
70
+
which are the same functions with some variables pre-set.
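As a rough illustration of the idea, the sketch below pre-sets one argument with `functools.partial` and hands the resulting one-argument callable to a two-process pool. It is not the chapter's own example: `add_range` and `add_from_zero` are hypothetical names, and the workload sizes are arbitrary.

```python
from functools import partial
from multiprocessing import Pool


def add_range(start: int, stop: int, step: int = 1) -> int:
    """CPU-bound toy task: sum the integers in range(start, stop, step)."""
    _sum = 0
    for value in range(start, stop, step):
        _sum += value
    return _sum


# partial() pre-sets `start`, leaving a one-argument callable for Pool.map.
add_from_zero = partial(add_range, 0)


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        # Each worker process receives one `stop` value.
        results = pool.map(add_from_zero, [1_000_000, 2_000_000])
    print(results)
```

The function is defined at module level so that it can be pickled and sent to the worker processes.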
71
71
Running the code in the next example, we see the speed double.
72
72
```python
73
73
# ex_3_5
@@ -78,14 +78,38 @@ Running the code in the next example, we see the speed double.
78
78
This is a good article talking about this subject:
79
79
[How to Boost Your App Performance with Asyncio](https://blog.cellenza.com/en/software-development/how-to-boost-your-apps-performance-with-asyncio-a-practical-guide-for-python-developers/)
80
80
81
-
In summary, when dealing with CPU-bound tasks, it's generally advisable to utilize multiprocessing,
82
-
with some exceptions we'll discuss later.
81
+
In summary, when dealing with CPU-bound tasks, it's generally advisable to utilize multiprocessing,
82
+
with some exceptions we'll discuss later.
83
83
For I/O-bound tasks, the choice typically lies between the `asyncio` and `threading` modules.
84
-
While we didn't cover the multithreading module in this section for simplicity,
85
-
it's worth noting that it can also be used for I/O-bound tasks.
86
-
If feasible, asyncio is often preferred over threading.
84
+
While we didn't cover the `threading` module in this section for simplicity,
85
+
it's worth noting that it can also be used for I/O-bound tasks.
86
+
If feasible, asyncio is often preferred over threading.
87
87
We conclude this section by referencing a table from an article,
88
88
which summarizes the key differences between threading and asyncio.
89
89
90
90
91
91

92
+
93
+
## Process Creation in Python: Fork vs. Spawn
94
+
When working with CPU-bound tasks in Python, you can parallelize workloads by creating child processes using the multiprocessing module.
95
+
Two common methods for starting processes are:
96
+
- **Fork**: Duplicates the parent process, inheriting its memory space.
97
+
- **Spawn**: Starts a new, fresh process, without sharing memory with the parent process.
98
+
99
+
### Fork
100
+
The `fork` method copies the parent process's memory, including variables, and works with **copy-on-write (COW)**. This makes it efficient and
101
+
[up to 20 times faster than `spawn`](https://superfastpython.com/fork-faster-than-spawn/),
102
+
but it is unsafe when the parent process is multithreaded and can crash child processes on macOS. Also, note that `fork` is not available on Windows.
103
+
104
+
**Copy-on-write** here means that the child shares the parent's memory pages for reading, and a page is copied only when one of the processes writes to it.
105
+
106
+
To use `fork`, call `multiprocessing.set_start_method('fork')` once, inside the `if __name__ == '__main__':` block, before creating any processes.
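A minimal sketch of that call is shown below. Because `fork` is not offered on every platform, the hypothetical helper `pick_start_method` falls back to the platform default; `square` is only an assumed stand-in for real CPU-bound work.

```python
import multiprocessing as mp


def square(x: int) -> int:
    """Tiny stand-in for a CPU-bound task."""
    return x * x


def pick_start_method(preferred: str = "fork") -> str:
    """Return `preferred` if this platform offers it, else the platform default."""
    available = mp.get_all_start_methods()  # the first entry is the default
    return preferred if preferred in available else available[0]


if __name__ == "__main__":
    # Call set_start_method() at most once, before any process is created.
    mp.set_start_method(pick_start_method("fork"))
    with mp.Pool(processes=2) as pool:
        print(pool.map(square, range(5)))
```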
107
+
108
+
### Spawn
109
+
The `spawn` method creates a fresh process, which starts execution from the very beginning.
110
+
This method is slower but avoids the potential pitfalls of shared memory between parent and child processes.
111
+
112
+
This is the default start method on **Windows** and, since Python 3.8, on **macOS**.
113
+
114
+
To use `spawn`, call `multiprocessing.set_start_method('spawn')` once, inside the `if __name__ == '__main__':` block, before creating any processes.
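The difference between the two start methods can be observed with a small experiment, sketched here under a few assumptions: the names `state`, `report`, and `child_view` are hypothetical, and `multiprocessing.get_context` is used instead of `set_start_method` so that both methods can be tried in one run. The parent changes a module-level variable before starting each child. A forked child should see the changed value because it inherits the parent's memory, while a spawned child re-imports the module and should see the import-time value.

```python
import multiprocessing as mp

state = "set at import time"


def report(queue) -> None:
    """Runs in the child: send back the value of `state` that the child sees."""
    queue.put(state)


def child_view(method: str) -> str:
    """Start one child with the given start method and return what it saw."""
    ctx = mp.get_context(method)
    queue = ctx.Queue()
    proc = ctx.Process(target=report, args=(queue,))
    proc.start()
    seen = queue.get(timeout=30)  # give up instead of hanging if the child fails
    proc.join()
    return seen


if __name__ == "__main__":
    # Runs only in the parent; a spawned child skips this block on re-import.
    state = "changed by the parent"
    for method in ("fork", "spawn"):
        if method in mp.get_all_start_methods():
            print(f"{method}: child sees {child_view(method)!r}")
```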