Skip to content

PYTHON-5309 Async client no longer works on Atlas mongodb Cloud #2286

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 44 commits into
base: master
Choose a base branch
from

Conversation

sleepyStick
Copy link
Contributor

@sleepyStick sleepyStick commented Apr 15, 2025

jira: https://jira.mongodb.org/browse/PYTHON-5309
problem: if a user had pyopenssl installed on their machine, the async client wouldnt configure ssl properly
brief description of my changes:

  • ssl_context will try to load both stdlib ssl and pyopenssl
  • the driver will now dynamically determine which ssl to use (if the user has both) depending on async vs sync context
  • to get this to work, AutoEncryptionOpts delays the resolution of _kms_ssl_contexts

@@ -420,3 +420,21 @@ partially-converted asynchronous version of the same name to the `test/asynchron
Use this generated file as a starting point for the completed conversion.

The script is used like so: `python tools/convert_test_to_async.py [test_file.py]`

## Running PyMongo with SSL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation for adding this section?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh uh, I was struggling to get it working and asked Noah a bunch of questions, so he suggested I add a section here.

@@ -58,6 +58,7 @@ def __init__(
bypass_query_analysis: bool = False,
encrypted_fields_map: Optional[Mapping[str, Any]] = None,
key_expiration_ms: Optional[int] = None,
is_sync: bool = True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a public api so we can't add "is_sync" here. What's the motivation for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I added is_sync as a param to a few functions because in ssl_support.get_ssl_context(), we'll need to know which version of ssl should be used,, is there another way to go about this that is preferred?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, we still can not add "is_sync" here so we need to find another way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way would be to lazily init the async SSLContext in kms_request().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I did it? Not sure if this is what you meant, but what I have now would leave the params to AutoEncryptOpts unchanged.
I basically delayed the parse_kms_tls_options because that's where is_sync was needed. I believe i appropriately delayed the definition of kms_tls_options but let me know if I missed something. I'm not super familiar with how encryption in the driver works >.<

@sleepyStick sleepyStick marked this pull request as ready for review April 21, 2025 17:51
@sleepyStick sleepyStick requested a review from NoahStapp April 21, 2025 17:51
@NoahStapp
Copy link
Contributor

Can you schedule a run of all the pyopenssl variants/tasks?

@sleepyStick
Copy link
Contributor Author

Can you schedule a run of all the pyopenssl variants/tasks?

oops yeah, sorry i'll add it to the latest run with the updated evergreen patch changes from steve's PR

BLOCKING_IO_WRITE_ERROR,
_sslConn,
)
from pymongo.pyopenssl_context import _sslConn as _pysslConn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation behind the renaming to _pysslConn here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When i was making these changes, I started interpreting py* to mean it would refer to pyopenssl (if it existed, and stdlib ssl if it did not) whereas ssl related vars without py in the front meant it was strictly referring to stdlib ssl. I'm not strongly attached to the renaming and wouldn't mind undoing it, but it helped my brain read the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer the old name (_sslConn). A larger refactor could be a good idea but I'd prefer this PR be as minimal as possible to avoid regressions because it's going into a bugfix release.

@@ -57,6 +60,20 @@
BLOCKING_IO_WRITE_ERROR = _ssl.BLOCKING_IO_WRITE_ERROR
BLOCKING_IO_LOOKUP_ERROR = BLOCKING_IO_READ_ERROR

if HAVE_PYSSL:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use Union types here to remove the need for two separate exports of everything? Something like:

BLOCKING_IO_ERRORS = _ssl.BLOCKING_IO_ERRORS | _pyssl.BLOCKING_IO_ERRORS

Are there any situations where we specifically care if an exception type is from PyOpenSSL or stdlib SSL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooo good call out. I don't think we particularly care if its pyopenssl vs stdlib ssl error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think python3.9 didn't love the union types? so i just made them tuples. But its a similar idea.
I couldn't apply this to SSLError though because sometimes we'd raise SSLError so it needed to be one specific type.

IPADDR_SAFE = True

if HAVE_PYSSL:
HAS_SNI = _pyssl.HAS_SNI | _ssl.HAS_SNI
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic for HAS_SNI is problematic since this happens at import time but the actual ssl impl at runtime (stdlib vs pyopenssl) may or may not support SNI. I believe we need something like:

        # We have to pass hostname / ip address to wrap_socket
        # to use SSLContext.check_hostname.
        if ssl_context.has_sni:
            ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I don't think I understand the problem? HAS_SNI was previously defined in the same spot. Did my changes introduce this problem? Or did the previous code also have the same problem?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new logic is incorrect. If pyopenssl is install and _pyssl.HAS_SNI=False and _ssl.HAS_SNI=True, the sync code will incorrectly think that pyopenssl supports SNI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, okay i understand how the logic is incorrect now (I previously thought that the two would be the same value regardless of pyopenssl vs stdlibssl)
would you mind elaborating on your suggested fix though? I'm not sure I understand how that would resolve pyopenssl's has_sni and stdlib ssl's has_sni being two different values and how to navigate having those two potentially different values in the rest of the code?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where we actually use HAS_SNI in pool_shared.py, we need to delegate to the appropriate _pyssl.HAS_SNI or _ssl.HAS_SNI property. Perhaps via a new helper function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I did it the way you envisioned it, but I created a helper function, has_sni(is_sync) -> bool, and call that everywhere HAS_SNI was being used.

{}, "keyvault.datakeys", mongocryptd_spawn_args=["--idleShutdownTimeoutSecs=88"]
{},
"keyvault.datakeys",
mongocryptd_spawn_args=["--idleShutdownTimeoutSecs=88"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we revert these format-only changes?

self._bypass_query_analysis = bypass_query_analysis
self._key_expiration_ms = key_expiration_ms

def _parse_kms_tls_options(self, is_sync: bool) -> None:
self._kms_ssl_contexts = _parse_kms_tls_options(self._kms_tls_options, is_sync)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is problematic because a single AutoEncryptionOpts can be shared between multiple clients (even sync + async) at the same time. Currently the _kms_ssl_contexts will clash.

Another problem is that _parse_kms_tls_options is expensive and should only be called at most once for sync and async.

What if we do this at construction time?:

        # Maps KMS provider name to a SSLContext.
        self._kms_ssl_contexts = _parse_kms_tls_options(kms_tls_options, True)
        if pyopenssl_enabled:
            self._async_kms_ssl_contexts = _parse_kms_tls_options(kms_tls_options, False)
        else:
            self._async_kms_ssl_contexts = self._kms_ssl_contexts

Hmm actually, I thought of a better idea. _EncryptionIO can parse and store the contexts:

class _EncryptionIO(AsyncMongoCryptCallback):
    def __init__(
        self,
        client: Optional[AsyncMongoClient[_DocumentTypeArg]],
        key_vault_coll: AsyncCollection[_DocumentTypeArg],
        mongocryptd_client: Optional[AsyncMongoClient[_DocumentTypeArg]],
        opts: AutoEncryptionOpts,
    ):
...
        self.opts = opts
        self._kms_ssl_contexts = _parse_kms_tls_options(opts._kms_tls_options, IS_SYNC)
...

if is_sync and HAVE_PYSSL:
ssl_in_use: types.ModuleType = _pyssl
else:
ssl_in_use = _ssl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename ssl_in_use -> _ssl that way there's less code churn.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so? because _ssl is referring to stdlib ssl and i don't think we'd want it to be replaced by pyopenssl just because pyopenssl is installed?

Copy link
Member

@ShaneHarvey ShaneHarvey Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a local variable assignment so what's the issue?

        if is_sync and HAVE_PYSSL:
            _ssl: types.ModuleType = _pyssl
        else:
            _ssl = _ssl

Or even just:

        if is_sync and HAVE_PYSSL:
            _ssl: types.ModuleType = _pyssl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, no you're right -- I've been conditioned to think that shadowing a global var with a local var is generally bad for code readability(?) so my brain forgot that it was something we could do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, am i doing it incorrectly? I'm seeing the error "UnboundLocalError: local variable '_ssl' referenced before assignment"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, I swore I've done this before. I guess not. The only way around that is like this:

        if is_sync and HAVE_PYSSL:
            _ssl: types.ModuleType = _pyssl
        else:
            _ssl = globals()["_ssl"]

Or by renaming the global "_ssl".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually kinda wanted to rename the global _ssl except i think there were some other modules that imported _ssl from ssl_support and then i wasn't sure if i was allowed to change it...

Copy link
Member

@ShaneHarvey ShaneHarvey Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know what, I think it's simpler to go back to the the local ssl_in_use var or just rename ssl_in_use to ssl? Apologies for the back and forth.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good, went with ssl because its shorter.

@@ -12,6 +12,7 @@ Version 4.12.1 is a bug fix release.
- Fixed a bug that caused direct use of ``pymongo.uri_parser`` to raise an ``AttributeError``.
- Removed Eventlet testing against Python versions newer than 3.9 since
Eventlet is actively being sunset by its maintainers and has compatibility issues with PyMongo's dnspython dependency.
- Fixed a bug that would try to use PyOpenSSL with AsyncMongoClient, causing the client to fail.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this more specific to the exact error reported, something like:

Fixed a bug that would cause AsyncMongoClient to attempt to use PyOpenSSL when available, resulting in errors such as "pymongo.errors.ServerSelectionTimeoutError: 'SSLContext' object has no attribute 'wrap_bio'"

@@ -138,7 +138,7 @@ def _raise_connection_failure(
msg += format_timeout_details(timeout_details)
if isinstance(error, socket.timeout):
raise NetworkTimeout(msg) from error
elif isinstance(error, SSLError) and "timed out" in str(error):
elif isinstance(error, (SSLError, PYSSLError)) and "timed out" in str(error):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these need to be separate SSLError and PYSSLError types, or can we export a shared one from ssl_support to reduce code churn?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Err, I tried to do something like SSLError = (_ssl.SSLError, _pyssl.SSLError) in ssl_support but i think it was in network_layer where we do a raise SSLError() so it needed to be a specific type. Is there another way to accomplish what you're describing that would work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The combined type could be here in pool_shared.py, as that's where it would get used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only explicitly raise SSLError in network_layer for our socket-based Windows SSL I/O. Those will go away if we do PYTHON-5215.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh i see, makes sense!

@blink1073
Copy link
Member

drivers-pr-bot please backport to v4.12

ShaneHarvey
ShaneHarvey previously approved these changes Apr 23, 2025
Copy link
Member

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Could you schedule the all the mac, windows, PyOpenSSL, OCSP, and encryption tasks to check for regressions before merging?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants