-
Notifications
You must be signed in to change notification settings - Fork 1.1k
PYTHON-5309 Async client no longer works on Atlas mongodb Cloud #2286
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…ure if this is correct or not, we'll seeeeee)
@@ -420,3 +420,21 @@ partially-converted asynchronous version of the same name to the `test/asynchron | |||
Use this generated file as a starting point for the completed conversion. | |||
|
|||
The script is used like so: `python tools/convert_test_to_async.py [test_file.py]` | |||
|
|||
## Running PyMongo with SSL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the motivation for adding this section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh uh, I was struggling to get it working and asked Noah a bunch of questions, so he suggested I add a section here.
pymongo/encryption_options.py
Outdated
@@ -58,6 +58,7 @@ def __init__( | |||
bypass_query_analysis: bool = False, | |||
encrypted_fields_map: Optional[Mapping[str, Any]] = None, | |||
key_expiration_ms: Optional[int] = None, | |||
is_sync: bool = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a public api so we can't add "is_sync" here. What's the motivation for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I added is_sync
as a param to a few functions because in ssl_support.get_ssl_context()
, we'll need to know which version of ssl should be used,, is there another way to go about this that is preferred?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, we still can not add "is_sync" here so we need to find another way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way would be to lazily init the async SSLContext in kms_request().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I did it? Not sure if this is what you meant, but what I have now would leave the params to AutoEncryptOpts unchanged.
I basically delayed the parse_kms_tls_options because that's where is_sync was needed. I believe i appropriately delayed the definition of kms_tls_options but let me know if I missed something. I'm not super familiar with how encryption in the driver works >.<
Can you schedule a run of all the pyopenssl variants/tasks? |
oops yeah, sorry i'll add it to the latest run with the updated evergreen patch changes from steve's PR |
pymongo/network_layer.py
Outdated
BLOCKING_IO_WRITE_ERROR, | ||
_sslConn, | ||
) | ||
from pymongo.pyopenssl_context import _sslConn as _pysslConn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the motivation behind the renaming to _pysslConn
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When i was making these changes, I started interpreting py* to mean it would refer to pyopenssl (if it existed, and stdlib ssl if it did not) whereas ssl related vars without py in the front meant it was strictly referring to stdlib ssl. I'm not strongly attached to the renaming and wouldn't mind undoing it, but it helped my brain read the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer the old name (_sslConn). A larger refactor could be a good idea but I'd prefer this PR be as minimal as possible to avoid regressions because it's going into a bugfix release.
@@ -57,6 +60,20 @@ | |||
BLOCKING_IO_WRITE_ERROR = _ssl.BLOCKING_IO_WRITE_ERROR | |||
BLOCKING_IO_LOOKUP_ERROR = BLOCKING_IO_READ_ERROR | |||
|
|||
if HAVE_PYSSL: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use Union
types here to remove the need for two separate exports of everything? Something like:
BLOCKING_IO_ERRORS = _ssl.BLOCKING_IO_ERRORS | _pyssl.BLOCKING_IO_ERRORS
Are there any situations where we specifically care if an exception type is from PyOpenSSL or stdlib SSL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooo good call out. I don't think we particularly care if its pyopenssl vs stdlib ssl error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think python3.9 didn't love the union types? so i just made them tuples. But its a similar idea.
I couldn't apply this to SSLError though because sometimes we'd raise SSLError so it needed to be one specific type.
pymongo/ssl_support.py
Outdated
IPADDR_SAFE = True | ||
|
||
if HAVE_PYSSL: | ||
HAS_SNI = _pyssl.HAS_SNI | _ssl.HAS_SNI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic for HAS_SNI is problematic since this happens at import time but the actual ssl impl at runtime (stdlib vs pyopenssl) may or may not support SNI. I believe we need something like:
# We have to pass hostname / ip address to wrap_socket
# to use SSLContext.check_hostname.
if ssl_context.has_sni:
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I don't think I understand the problem? HAS_SNI was previously defined in the same spot. Did my changes introduce this problem? Or did the previous code also have the same problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new logic is incorrect. If pyopenssl is install and _pyssl.HAS_SNI=False
and _ssl.HAS_SNI=True
, the sync code will incorrectly think that pyopenssl supports SNI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay i understand how the logic is incorrect now (I previously thought that the two would be the same value regardless of pyopenssl vs stdlibssl)
would you mind elaborating on your suggested fix though? I'm not sure I understand how that would resolve pyopenssl's has_sni and stdlib ssl's has_sni being two different values and how to navigate having those two potentially different values in the rest of the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where we actually use HAS_SNI in pool_shared.py, we need to delegate to the appropriate _pyssl.HAS_SNI or _ssl.HAS_SNI property. Perhaps via a new helper function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I did it the way you envisioned it, but I created a helper function, has_sni(is_sync) -> bool
, and call that everywhere HAS_SNI
was being used.
test/asynchronous/test_encryption.py
Outdated
{}, "keyvault.datakeys", mongocryptd_spawn_args=["--idleShutdownTimeoutSecs=88"] | ||
{}, | ||
"keyvault.datakeys", | ||
mongocryptd_spawn_args=["--idleShutdownTimeoutSecs=88"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we revert these format-only changes?
pymongo/encryption_options.py
Outdated
self._bypass_query_analysis = bypass_query_analysis | ||
self._key_expiration_ms = key_expiration_ms | ||
|
||
def _parse_kms_tls_options(self, is_sync: bool) -> None: | ||
self._kms_ssl_contexts = _parse_kms_tls_options(self._kms_tls_options, is_sync) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is problematic because a single AutoEncryptionOpts can be shared between multiple clients (even sync + async) at the same time. Currently the _kms_ssl_contexts
will clash.
Another problem is that _parse_kms_tls_options is expensive and should only be called at most once for sync and async.
What if we do this at construction time?:
# Maps KMS provider name to a SSLContext.
self._kms_ssl_contexts = _parse_kms_tls_options(kms_tls_options, True)
if pyopenssl_enabled:
self._async_kms_ssl_contexts = _parse_kms_tls_options(kms_tls_options, False)
else:
self._async_kms_ssl_contexts = self._kms_ssl_contexts
Hmm actually, I thought of a better idea. _EncryptionIO can parse and store the contexts:
class _EncryptionIO(AsyncMongoCryptCallback):
def __init__(
self,
client: Optional[AsyncMongoClient[_DocumentTypeArg]],
key_vault_coll: AsyncCollection[_DocumentTypeArg],
mongocryptd_client: Optional[AsyncMongoClient[_DocumentTypeArg]],
opts: AutoEncryptionOpts,
):
...
self.opts = opts
self._kms_ssl_contexts = _parse_kms_tls_options(opts._kms_tls_options, IS_SYNC)
...
pymongo/ssl_support.py
Outdated
if is_sync and HAVE_PYSSL: | ||
ssl_in_use: types.ModuleType = _pyssl | ||
else: | ||
ssl_in_use = _ssl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename ssl_in_use
-> _ssl
that way there's less code churn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so? because _ssl is referring to stdlib ssl and i don't think we'd want it to be replaced by pyopenssl just because pyopenssl is installed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a local variable assignment so what's the issue?
if is_sync and HAVE_PYSSL:
_ssl: types.ModuleType = _pyssl
else:
_ssl = _ssl
Or even just:
if is_sync and HAVE_PYSSL:
_ssl: types.ModuleType = _pyssl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, no you're right -- I've been conditioned to think that shadowing a global var with a local var is generally bad for code readability(?) so my brain forgot that it was something we could do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Er, am i doing it incorrectly? I'm seeing the error "UnboundLocalError: local variable '_ssl' referenced before assignment"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, I swore I've done this before. I guess not. The only way around that is like this:
if is_sync and HAVE_PYSSL:
_ssl: types.ModuleType = _pyssl
else:
_ssl = globals()["_ssl"]
Or by renaming the global "_ssl".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually kinda wanted to rename the global _ssl
except i think there were some other modules that imported _ssl
from ssl_support
and then i wasn't sure if i was allowed to change it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know what, I think it's simpler to go back to the the local ssl_in_use
var or just rename ssl_in_use
to ssl
? Apologies for the back and forth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good, went with ssl
because its shorter.
doc/changelog.rst
Outdated
@@ -12,6 +12,7 @@ Version 4.12.1 is a bug fix release. | |||
- Fixed a bug that caused direct use of ``pymongo.uri_parser`` to raise an ``AttributeError``. | |||
- Removed Eventlet testing against Python versions newer than 3.9 since | |||
Eventlet is actively being sunset by its maintainers and has compatibility issues with PyMongo's dnspython dependency. | |||
- Fixed a bug that would try to use PyOpenSSL with AsyncMongoClient, causing the client to fail. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this more specific to the exact error reported, something like:
Fixed a bug that would cause AsyncMongoClient to attempt to use PyOpenSSL when available, resulting in errors such as "pymongo.errors.ServerSelectionTimeoutError: 'SSLContext' object has no attribute 'wrap_bio'"
pymongo/pool_shared.py
Outdated
@@ -138,7 +138,7 @@ def _raise_connection_failure( | |||
msg += format_timeout_details(timeout_details) | |||
if isinstance(error, socket.timeout): | |||
raise NetworkTimeout(msg) from error | |||
elif isinstance(error, SSLError) and "timed out" in str(error): | |||
elif isinstance(error, (SSLError, PYSSLError)) and "timed out" in str(error): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these need to be separate SSLError
and PYSSLError
types, or can we export a shared one from ssl_support
to reduce code churn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Err, I tried to do something like SSLError = (_ssl.SSLError, _pyssl.SSLError)
in ssl_support
but i think it was in network_layer
where we do a raise SSLError()
so it needed to be a specific type. Is there another way to accomplish what you're describing that would work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The combined type could be here in pool_shared.py
, as that's where it would get used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only explicitly raise SSLError
in network_layer
for our socket-based Windows SSL I/O. Those will go away if we do PYTHON-5215.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh i see, makes sense!
drivers-pr-bot please backport to v4.12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Could you schedule the all the mac, windows, PyOpenSSL, OCSP, and encryption tasks to check for regressions before merging?
jira: https://jira.mongodb.org/browse/PYTHON-5309
problem: if a user had pyopenssl installed on their machine, the async client wouldnt configure ssl properly
brief description of my changes:
AutoEncryptionOpts
delays the resolution of_kms_ssl_contexts