Commit 6b0dbcb: use sphinx doc

Author: Fabien Coelho
1 parent: 0ba8371

File tree: 11 files changed, +305 -162 lines changed

`.github/workflows/doc.yml` (new file, +35 lines)

```yaml
name: ProxyPatternPool documentation publication on GitHub

on:
  push:
    branches: [ "main" ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
      # a step cannot combine "uses" and "run", so lint in its own step
      - name: Check documentation
        run: make check.docs
      - name: Generate documentation
        run: |
          make doc
          ln -s docs/_build/html _site
          find docs/_build -type d -print | xargs chmod a+rx
          find docs/_build -type f -print | xargs chmod a+r
      - name: Upload to GitHub Pages
        uses: actions/upload-pages-artifact@v3
  deploy:
    needs: build
    environment:
      name: github-pages
    permissions:
      pages: write
      id-token: write
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
```

`Makefile` (+17 lines)

```diff
@@ -61,12 +61,18 @@ check.pymarkdown: dev
 	source venv/bin/activate
 	pymarkdown scan $(F.md)
 
+.PHONY: check.docs
+check.docs: venv/.doc
+	source venv/bin/activate
+	sphinx-lint docs/
+
 # check.black check.pyright
 .PHONY: check
 check: check.pyright check.pymarkdown check.ruff check.pytest check.coverage
 
 .PHONY: clean
 clean:
+	$(MAKE) -C docs clean
 	$(RM) -r __pycache__ */__pycache__ dist build .mypy_cache .pytest_cache .coverage htmlcov .ruff_cache
 	$(RM) $(F.pdf)
 
@@ -90,6 +96,17 @@ venv/.dev: venv
 	pip install -e .[dev,local]
 	touch $@
 
+# documentation
+venv/.doc: venv
+	source venv/bin/activate
+	pip install -e .[doc]
+	touch $@
+
+.PHONY: doc
+doc: venv/.doc check.docs
+	source venv/bin/activate
+	$(MAKE) -C docs html
+
 .PHONY: pub
 pub: venv/.pub
```

`README.md` (+1 -158 lines)

````diff
@@ -49,7 +49,7 @@ This module provides two classes:
 This generic pool class can be used independently of the `Proxy` class.
 
 It provides numerous hooks to provide callbacks for creation, deletion,
-stats, tracing, health check… which make it ideal to manage any kind
+stats, tracing, health check… which makes it ideal to manage any kind
 of expensive resources within a process.
 
 ```python
@@ -75,163 +75,6 @@ This module provides two classes:
 del pool
 ```
 
-## Documentation
-
-### Proxy
-
-Class `Proxy` manages accesses to one or more objects, possibly using
-a `Pool`, depending on the expected scope of said objects.
-
-The `Proxy` constructors expects the following parameters:
-
-- `obj` a *single* object `SHARED` between all threads.
-- `fun` a function called for object creation, each time it is needed,
-  for all other scopes.
-- `scope` object scope as defined by `Proxy.Scope`:
-  - `SHARED` one shared object (process level)
-  - `THREAD` one object per thread (`threading` implementation)
-  - `WERKZEUG` one object per greenlet (`werkzeug` implementation)
-  - `EVENTLET` one object per greenlet (`eventlet` implementation)
-  - `GEVENT` one object per greenlet (`gevent` implementation)
-  - `VERSATILE` same as `WERKZEUG`
-  default is `SHARED` or `THREAD` depending on whether an object
-  of a function was passed for the object.
-- `set_name` the name of a function to set the proxy contents,
-  default is `set`. This parameter allows to avoid collisions with
-  the proxied methods, if necessary.
-  It is used as a prefix to have `set_obj` and `set_fun` functions
-  which allow to reset the internal `obj` or `fun`.
-- `log_level` set logging level, default *None* means no setting.
-- `max_size` of pool, default _None_ means **no** pooling.
-- `max_size` and _all_ other parameters are forwarded to `Pool`.
-
-When `max_size` is not *None*, a `Pool` is created to store the created
-objects so as to reuse them. It is the responsability of the user to
-return the object when not needed anymore by calling `_ret_obj` explicitely.
-This is useful for code which keeps creating new threads, eg `werkzeug`.
-For a database connection, a good time to do that is just after a `commit`.
-
-The proxy has a `_has_obj` method to test whether an object is available
-without extracting anything from the pool: this is useful to test whether
-returning the object is needed in some error handling pattern.
-
-### Pool
-
-Class `Pool` manages a pool of objects in a thread-safe way.
-Its constructor expects the following parameters:
-
-- `fun` how to create a new object; the function is passed the creation number.
-- `max_size` maximum size of pool, *0* for unlimited (the default).
-- `min_size` minimum size of pool, that many are created and maintained in advance.
-- `timeout` maximum time to wait for something, only active under `max_size`.
-- `max_use` after how many usage to discard an object.
-- `max_avail_delay` when to discard an unused object.
-- `max_using_delay` when to warn about object kept for a long time.
-- `max_using_delay_kill` when to kill objects kept for a long time.
-- `health_freq` run health check this every house keeper rounds.
-- `hk_delay` force house keeping delay.
-- `log_level` set logging level, default *None* means no setting.
-- `opener` function to call when creating an object, default *None* means no call.
-- `getter` function to call when getting an object, default *None* means no call.
-- `retter` function to call when returning an object, default *None* means no call.
-- `closer` function to call when discarding an object, default *None* means no call.
-- `stats` function to call to generate a JSON-compatible structure for stats.
-- `health` function to call to check for an available object health.
-- `tracer` object debug helper, default *None* means less debug.
-
-Objects are created on demand by calling `fun` when needed.
-
-## Proxy Example
-
-Here is an example of a flask application with blueprints and a shared
-resource.
-
-First, a shared module holds a proxy to a yet unknown object:
-
-```python
-# file "Shared.py"
-from ProxyPatternPool import Proxy
-stuff = Proxy()
-def init_app(s):
-    stuff.set_obj(s)
-```
-
-This shared object is used by module with a blueprint:
-
-```python
-# file "SubStuff.py"
-from Flask import Blueprint
-from Shared import stuff
-sub = Blueprint(…)
-
-@sub.get("/stuff")
-def get_stuff():
-    return str(stuff), 200
-```
-
-Then the application itself can load and initialize both modules in any order
-without risk of having some unitialized stuff imported:
-
-```python
-# file "App.py"
-from flask import Flask
-app = Flask("stuff")
-
-from SubStuff import sub
-app.register_blueprint(sub, url_prefix="/sub")
-
-import Shared
-Shared.init_app("hello world!")
-```
-
-## Notes
-
-This module was initially rhetorical: because of the GIL Python was very bad as
-a parallel language, so the point of creating threads which would mostly not
-really run in parallel was moot, thus the point of having a clever pool of
-stuff to be shared by these thread was even mooter!
-However, as the GIL is scheduled to go away in the coming years, starting from
-_Python 3.13_ (Fall 2024), it is startng to make sense to have such a thing!
-
-In passing, it is interesting to note that the foremost
-[driving motivation](https://peps.python.org/pep-0703/) for getting
-read of the GIL is… _data science_. This tells something.
-In the past, people interested in parallelism, i.e. performance, say myself,
-would probably just turn away from this quite slow language.
-People from the networking www world would be satisfied with the adhoc
-asynchronous model, and/or just create many processes because
-in this context the need to communicate between active workers is limited.
-Now come the data scientist, who is not that interested in programming, is
-happy with Python and its ecosystem, in particular with the various ML libraries
-and the commodity of web-centric remote interfaces such as Jupyter. When
-confronted with a GIL-induced performance issue, they are more interested at
-fixing the problem than having to learn another language and port their stuff.
-
-Shared object *must* be returned to the pool to avoid depleting resources.
-This may require some active cooperation from the infrastructure which may
-or may not be reliable. Consider monitoring your resources to detect unexpected
-status, eg database connections remaining _idle in transaction_ and the like.
-
-See Also:
-
-- [Psycopg Pool](https://www.psycopg.org/psycopg3/docs/advanced/pool.html)
-  for pooling Postgres database connexions.
-- [Eventlet db_pool](https://eventlet.net/doc/modules/db_pool.html)
-  for pooling MySQL or Postgres database connexions.
-- [Discussion](https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing)
-  about database pool sizing (spoiler: small is beautiful: you want threads
-  waiting for expensive resources used at full capacity rather than
-  many expensive resources under used).
-
-Example of resources to put in a pool: connections to databases, authentication
-services (eg LDAP), search engine…
-
-For a typical REST backend, most requests will require one DB connection, thus
-having an in-process pool with less connections is not very usefull, and more is
-useless as well, so we may only have _#conns == #threads_ which make sense.
-The only point of having a pool is that the thread may be killed independently
-and avoiding recreating connections in such cases.
-
 ## License
 
 This code is [Public Domain](https://creativecommons.org/publicdomain/zero/1.0/).
````
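The removed README text describes the `Pool` behaviour this project provides: objects created on demand via `fun` (which receives the creation number), bounded by `max_size`, and reused after being returned. A toy sketch in plain Python illustrates the idea; `MiniPool` and its method names are hypothetical, not the module's actual implementation:

```python
import threading

class MiniPool:
    """Toy thread-safe pool: objects are created on demand and reused."""

    def __init__(self, fun, max_size=0):
        self._fun = fun              # object factory, passed the creation number
        self._max = max_size         # 0 means unlimited
        self._lock = threading.Lock()
        self._avail = []             # objects ready for reuse
        self._nobjs = 0              # number of objects created so far

    def get(self):
        with self._lock:
            if self._avail:          # reuse a returned object first
                return self._avail.pop()
            if self._max and self._nobjs >= self._max:
                raise RuntimeError("pool exhausted")
            self._nobjs += 1
            return self._fun(self._nobjs)

    def ret(self, obj):
        with self._lock:             # make the object available again
            self._avail.append(obj)

pool = MiniPool(lambda n: f"conn-{n}", max_size=2)
a = pool.get()   # creates conn-1
b = pool.get()   # creates conn-2
pool.ret(a)      # conn-1 goes back to the pool
c = pool.get()   # reuses conn-1, no new creation
```

The real `Pool` additionally blocks up to `timeout` when `max_size` is reached, and runs house-keeping (health checks, discarding idle or over-used objects), which this sketch omits.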

`docs/DISCUSSION.md` (new file, +47 lines)

```markdown
# Discussion

The `ProxyPatternPool` module was initially rhetorical: because of the GIL,
Python was very bad as a parallel language, so the point of creating threads
which would mostly not really run in parallel was moot, thus the point of having
a clever pool of stuff to be shared by these threads was even mooter!
However, as the GIL is scheduled to go away in the coming years, starting from
_Python 3.13_ (Fall 2024), it is starting to make sense to have such a thing!

In passing, it is interesting to note that the foremost
[driving motivation](https://peps.python.org/pep-0703/) for getting
rid of the GIL is… _data science_. This tells something.
In the past, people interested in parallelism, i.e. performance, say myself,
would probably just turn away from this quite slow language.
People from the networking www world would be satisfied with the ad hoc
asynchronous model, and/or just create many processes, because
in this context the need to communicate between active workers is limited.
Now comes the data scientist, who is not that interested in programming, is
happy with Python and its ecosystem, in particular with the various ML libraries
and the commodity of web-centric remote interfaces such as Jupyter. When
confronted with a GIL-induced performance issue, they are more interested in
fixing the problem than in having to learn another language and port their stuff.

Shared objects *must* be returned to the pool to avoid depleting resources.
This may require some active cooperation from the infrastructure, which may
or may not be reliable. Consider monitoring your resources to detect unexpected
statuses, eg database connections remaining _idle in transaction_ and the like.

See Also:

- [Psycopg Pool](https://www.psycopg.org/psycopg3/docs/advanced/pool.html)
  for pooling Postgres database connections.
- [Eventlet db_pool](https://eventlet.net/doc/modules/db_pool.html)
  for pooling MySQL or Postgres database connections.
- [Discussion](https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing)
  about database pool sizing (spoiler: small is beautiful: you want threads
  waiting for expensive resources used at full capacity rather than
  many expensive resources under-used).

Examples of resources to put in a pool: connections to databases, authentication
services (eg LDAP), search engines…

For a typical REST backend, most requests will require one DB connection, thus
an in-process pool with fewer connections than threads is not very useful, and
more is useless as well, so only _#conns == #threads_ makes sense.
The only point of having a pool is then that a thread may be killed
independently without the need to recreate a connection in such cases.
```