Skip to content

Python 3.13 Compatibility (cgi, dbm.sqlite) #978

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions src/saml2/httputil.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,3 @@
import cgi
import hashlib
import hmac
from http.cookies import SimpleCookie
Expand Down Expand Up @@ -182,7 +181,10 @@ def extract(environ, empty=False, err=False):
:param empty: Stops on empty fields (default: Fault)
:param err: Stops on errors in fields (default: Fault)
"""
formdata = cgi.parse(environ["wsgi.input"], environ, empty, err)
input_stream = environ["wsgi.input"]
content_length = int(environ.get("CONTENT_LENGTH", 0))
formdata_bytes = input_stream.read(content_length)
formdata = parse_qs(formdata_bytes.decode('utf-8'))
# Remove single entries from lists
for key, value in iter(formdata.items()):
if len(value) == 1:
Expand Down
9 changes: 2 additions & 7 deletions src/saml2/pack.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,14 +8,9 @@
"""

import base64


try:
import html
except Exception:
import cgi as html # type: ignore[no-redef]

import html
import logging

from urllib.parse import urlencode
from urllib.parse import urlparse
from xml.etree import ElementTree as ElementTree
Expand Down
11 changes: 11 additions & 0 deletions src/saml2/server.py
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,18 @@
}


def _avoid_dbm_sqlite():
"""
Force dbm.gnu to be used instead of dbm.sqlite because of threading issues when
running idp_server.py. The dbm.sqlite is the default dbm module used by Python >= 3.13
"""
import dbm.gnu
import dbm
dbm._defaultmod = dbm.gnu
Copy link

@kushaldas kushaldas Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dbm._defaultmod = dbm.gnu

This should be removed. Other we get errors like following:

❯ python idp.py idp_conf
[2025-02-13 18:46:45,113] [DEBUG] [saml2.assertion.__init__] policy restrictions: None
[2025-02-13 18:46:45,113] [DEBUG] [saml2.assertion.__init__] policy restrictions: None
[2025-02-13 18:46:45,113] [DEBUG] [saml2.assertion.__init__] policy restrictions: {'default': {'lifetime': {'minutes': 15}, 'attribute_restrictions': None, 'name_form': 'urn:oasis:names:tc:SAML:2.0:attrname-format:uri', 'entity_categories': None}}
Traceback (most recent call last):
  File "/home/kdas/code/learning/saml/pysaml2/example/idp2/idp.py", line 1076, in <module>
    IDP = server.Server(args.config, cache=Cache())
  File "/home/kdas/code/learning/saml/pysaml2/example/idp2/.venv/lib64/python3.13/site-packages/saml2/server.py", line 95, in __init__
    self.init_config(stype)
    ~~~~~~~~~~~~~~~~^^^^^^^
  File "/home/kdas/code/learning/saml/pysaml2/example/idp2/.venv/lib64/python3.13/site-packages/saml2/server.py", line 149, in init_config
    idb = _shelve_compat(dbspec, writeback=True, protocol=2)
  File "/home/kdas/code/learning/saml/pysaml2/example/idp2/.venv/lib64/python3.13/site-packages/saml2/server.py", line 71, in _shelve_compat
    return shelve.open(name, *args, **kwargs)
           ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/shelve.py", line 250, in open
    return DbfilenameShelf(filename, flag, protocol, writeback)
  File "/usr/lib64/python3.13/shelve.py", line 227, in __init__
    Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
                         ~~~~~~~~^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/dbm/__init__.py", line 91, in open
    raise error[0]("db type is {0}, but the module is not "
                   "available".format(result))
dbm.error: db type is dbm.gnu, but the module is not available

If we remove that, I can use the idp2.

Copy link
Author

@ephes ephes Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you all think about these approaches?

  • Switch to dbm.dump as the default across all Python versions
  • Keep berkeleydb as default for Python <3.13, but use dbm.dump for 3.13+ (Side note: Anyone know what happened to berkeleydb support in 3.13?)
  • The nuclear option: Scrap the _avoid_dbm_sqlite function entirely and tackle the underlying SQLite threading issues head-on (definitely the most challenging path)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've updated the PR to use dbm.dumb for Python 3.13+. Here are some approaches I tried that didn’t work out:

  • Setting sqlite3.threadsafety = 2
  • Using SRV = WSGIServer((HOST, PORT), application, numthreads=1)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should push everyone to use the new format. We have two cases

  1. users that have used pysaml2 and this backend before and have a file available.
  2. users that have not used pysaml2 or are switching to this backend for the first time.

If the file is not available, then we are on case (2) and we can just use Shelf with the new backend. If the file is there, we should try to open it with Shelf assuming it is an sqlite3 db. If that fails, we emit a warning and then convert it to be an sqlite3 db.

Doing the conversion at runtime could be expensive, but it will only be done once. To avoid paying this at runtime, we can provide update instructions with a script that will read the configuration, locate the database and convert it.

At a later update we can remove the conversion code.
Does that sound reasonable to you?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new format would be dbm.dumb then, right? But I'm not really sure if it's a good choice, because of the fail safe issues mentioned in the related cPython-thread.

And the goal of having a conversion script would be that you could re-use your existing db when upgrading lets say from Python 3.12 to 3.13? I don't know if that's possible. It seems that it's not possible to read the Berkeley DB 1.85 files that are written by 3.12 and below by default when calling shelve.open. At least not by using just the standard library.



def _shelve_compat(name, *args, **kwargs):
_avoid_dbm_sqlite()
try:
return shelve.open(name, *args, **kwargs)
except dbm.error[0]:
Expand Down