Commit 75b0b3a

feat(ch3): fork vs spawn.
1 parent 685413b commit 75b0b3a

2 files changed

+40
-15
lines changed

docs/3. Handling CPU and IO Bound Tasks/index.md

+39-15
@@ -6,12 +6,12 @@ permalink: /chapter3/
66

77
# Handling CPU and I/O Bound Tasks
88
## CPU-bound vs I/O-bound
9-
So, a very basic question: What are CPU-bound and I/O-bound tasks, and how do they differ from each other?
9+
So, a very basic question: What are CPU-bound and I/O-bound tasks, and how do they differ from each other?
1010
It's quite simple. CPU-bound tasks are those that primarily consume CPU resources to be handled, while I/O-bound tasks
1111
are related to the input/output devices of your system, such as the network card, keyboard, and others.
1212

13-
A CPU-bound task typically involves intensive mathematical calculations.
14-
In contrast, I/O-bound tasks involve operations like calling different APIs and waiting for their responses.
13+
A CPU-bound task typically involves intensive mathematical calculations.
14+
In contrast, I/O-bound tasks involve operations like calling different APIs and waiting for their responses.
1515
For example, opening a text file and reading it into memory is also an I/O-bound task.
1616

1717

@@ -20,11 +20,11 @@ For example, opening a text file and reading it into memory is also an I/O-bound
2020

2121
## Performing asynchronous I/O operations with asyncio
2222

23-
There are several robust libraries that handle I/O-bound tasks for requesting an endpoint,
23+
There are several robust libraries that handle I/O-bound tasks for requesting an endpoint,
2424
such as aiohttp, Starlette, urllib3, and HTTPX.
2525

26-
I am going to provide some examples from the HTTPX library on how you can handle requests
27-
to different endpoints concurrently.
26+
I am going to provide some examples from the HTTPX library on how you can handle requests
27+
to different endpoints concurrently.
2828
HTTPX can manage both asynchronous and synchronous requests, allowing us to benchmark them.
2929
Here is example 3_1 to get started with HTTPX:
3030

@@ -63,11 +63,11 @@ which takes roughly three times longer than the previous examples.
6363

6464
## Performing asynchronous CPU operations
6565

66-
In Python, the multiprocessing library is used to parallelize CPU-bound tasks.
66+
In Python, the multiprocessing library is used to parallelize CPU-bound tasks.
6767
We achieve this by utilizing just two of our CPU's cores in the following example.
68-
First, we define a CPU-bound task that simply adds a value to the `_sum` variable.
69-
To utilize the multiprocessing library, we use partial functions,
70-
which are the same functions with some variables pre-set.
68+
First, we define a CPU-bound task that simply adds a value to the `_sum` variable.
69+
To utilize the multiprocessing library, we use partial functions,
70+
which are the same functions with some variables pre-set.
7171
Running the code in the next example, we see the speed double.
7272
```python
7373
# ex_3_5
@@ -78,14 +78,38 @@ Running the code in the next example, we see the speed double.
7878
This is a good article talking about this subject:
7979
[How to Boost Your App Performance with Asyncio](https://blog.cellenza.com/en/software-development/how-to-boost-your-apps-performance-with-asyncio-a-practical-guide-for-python-developers/)
8080

81-
In summary, when dealing with CPU-bound tasks, it's generally advisable to utilize multiprocessing,
82-
with some exceptions we'll discuss later.
81+
In summary, when dealing with CPU-bound tasks, it's generally advisable to utilize multiprocessing,
82+
with some exceptions we'll discuss later.
8383
For I/O-bound tasks, the choice typically lies between asyncio and the multithreading modules.
84-
While we didn't cover the multithreading module in this section for simplicity,
85-
it's worth noting that it can also be used for I/O-bound tasks.
86-
If feasible, asyncio is often preferred over threading.
84+
While we didn't cover the multithreading module in this section for simplicity,
85+
it's worth noting that it can also be used for I/O-bound tasks.
86+
If feasible, asyncio is often preferred over threading.
8787
We conclude this section by referencing a table from an article,
8888
which effectively delineates the nuanced distinctions between threading and asyncio.
8989

9090

9191
![Screenshot from 2024-05-24 19-39-15](https://github.com/aligheshlaghi97/asynchronous-python/assets/121802083/935a265a-aa5f-4e35-b311-d9d810e9f5c1)
92+
93+
## Process Creation in Python: Fork vs. Spawn
94+
When working with CPU-bound tasks in Python, you can parallelize workloads by creating child processes using the multiprocessing module.
95+
Two common methods for starting processes are:
96+
- **Fork**: Duplicates the parent process, inheriting its memory space.
97+
- **Spawn**: Starts a new, fresh process, without sharing memory with the parent process.
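Which start methods exist depends on the platform, so it can help to inspect them before choosing one. The sketch below is a minimal illustration; `describe_start_methods` is a hypothetical helper name, and it relies on `multiprocessing.get_all_start_methods()` listing the platform default first.

```python
import multiprocessing as mp


def describe_start_methods():
    """Report the start methods supported on this platform."""
    methods = mp.get_all_start_methods()  # the default method is listed first
    return {"default": methods[0], "available": sorted(methods)}


if __name__ == "__main__":
    print(describe_start_methods())
```

On Windows only `spawn` should appear in the list, which matches the note that `fork` is unavailable there.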
98+
99+
### Fork
100+
The `fork` method copies the parent process's memory, including variables, and works with **copy-on-write (COW)**. This makes it efficient and
101+
[up to 20 times faster than `spawn`](https://superfastpython.com/fork-faster-than-spawn/),
102+
but it is unsafe when the parent process has multiple threads and can lead to crashes, especially on macOS. Also, note that `fork` is not available on Windows.
103+
104+
**Copy-on-write** here means that the child shares the parent's memory pages while they are only read, and the operating system copies a page only when one of the processes writes to it.
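The observable effect of this can be sketched with a small experiment: a forked child sees the parent's data without it being passed explicitly, and a write in the child does not change the parent's copy. The names `counter`, `bump_in_child`, and `run_fork_demo` are hypothetical, and the demo assumes a platform where `fork` is available (it returns `None` otherwise).

```python
import multiprocessing as mp

counter = [0]  # module-level state that a forked child inherits


def bump_in_child(queue):
    """Runs in the child: modify the inherited state and report it."""
    counter[0] += 1
    queue.put(counter[0])


def run_fork_demo():
    """Return (value seen in child, value kept in parent), or None without fork."""
    if "fork" not in mp.get_all_start_methods():
        return None
    ctx = mp.get_context("fork")  # local context, global default untouched
    queue = ctx.Queue()
    counter[0] = 41  # set before forking, so the child starts from 41
    child = ctx.Process(target=bump_in_child, args=(queue,))
    child.start()
    child_value = queue.get()  # read the result before joining
    child.join()
    return child_value, counter[0]


if __name__ == "__main__":
    print(run_fork_demo())
```

The child reports the incremented value while the parent still holds the original one, because the write happened in the child's private copy.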
105+
106+
To use `fork`, call `multiprocessing.set_start_method('fork')` once, inside the `if __name__ == '__main__':` block, before creating any processes.
107+
108+
### Spawn
109+
The `spawn` method starts a fresh Python interpreter process, which re-imports the main module and inherits only the resources it needs to run the target.
110+
This method is slower but avoids the pitfalls of a child inheriting the parent's entire state, such as held locks and open file descriptors.
111+
112+
This is the default method to create a child process on **Windows** and **macOS**.
113+
114+
To use `spawn`, call `multiprocessing.set_start_method('spawn')` once, inside the `if __name__ == '__main__':` block, before creating any processes.
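As a rough sketch of that call in context, the script below sets `spawn` once under the main guard and then runs a small pool. The helper names `count_digits` and `factorials_in_parallel` are hypothetical, and `math.factorial` stands in for a CPU-bound task because it is importable by name from a freshly started child.

```python
import math
import multiprocessing as mp


def count_digits(values):
    """Return the number of decimal digits of each integer."""
    return [len(str(value)) for value in values]


def factorials_in_parallel(numbers, workers=2):
    """Compute n! for each n in a pool that uses the current start method."""
    with mp.Pool(processes=workers) as pool:
        # math.factorial lives in an importable module, so a fresh
        # child interpreter can look it up by name.
        return pool.map(math.factorial, numbers)


if __name__ == "__main__":
    # Set the start method once, before any process is created.
    mp.set_start_method("spawn")
    print(count_digits(factorials_in_parallel([10, 20, 30])))
```

The guard matters with `spawn`: each child re-imports the main module, so process creation left at module level would run again inside every child.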

docs/index.md

+1
@@ -35,6 +35,7 @@ This repository serves as a comprehensive resource for learning about asynchrono
3535
- Performing asynchronous I/O operations with asyncio and httpx
3636
- Performing asynchronous CPU operations
3737
- Strategies for balancing CPU and I/O-bound workloads in async Python applications
38+
- Process Creation in Python: Fork vs. Spawn
3839

3940
4. * **Synchronization and Coordination**
4041
- Managing shared resources and avoiding race conditions
