@berelian1 commented Apr 23, 2025

Handle non-ASCII recipients by including rcpt_options=("SMTPUTF8")

More detail: issue #143 reports a bug where mailmerge crashes with a UnicodeEncodeError if an email address contains non-ASCII characters. I added a needs_smtputf8() helper function that detects non-ASCII characters in the sender or recipient email addresses. If any are detected, the sendmail() call includes rcpt_options=("SMTPUTF8"), since some SMTP servers support the SMTPUTF8 extension and can handle non-ASCII addresses. This ensures that mailmerge no longer crashes, and problematic emails can be skipped or handled in the future. In terms of the code, I updated the sendmail_ssltls() method to call needs_smtputf8(sender, *recipients) before sending.
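
As a rough illustration of the detection step described above, here is a minimal sketch. It is not the code from this PR: `sendmail_kwargs` is a hypothetical helper, and it assumes the option is passed through `mail_options`, which is where the `smtplib.SMTP.sendmail()` documentation describes `SMTPUTF8`, rather than through `rcpt_options` as in the description. It also shows that parentheses alone do not create a tuple in Python.

```python
def needs_smtputf8(*addresses):
    """Return True if any address contains a character outside ASCII."""
    return any(ord(char) > 127 for address in addresses for char in address)


def sendmail_kwargs(sender, recipients):
    """Build extra keyword arguments for smtplib.SMTP.sendmail().

    Hypothetical helper. Assumes SMTPUTF8 belongs in mail_options.
    """
    if needs_smtputf8(sender, *recipients):
        return {"mail_options": ["SMTPUTF8"]}
    return {}


# Parentheses alone do not make a tuple; a trailing comma (or a list) does.
as_string = ("SMTPUTF8")   # still a plain str
as_tuple = ("SMTPUTF8",)   # a one-element tuple
```

A caller could then write `smtp.sendmail(sender, recipients, message_flattened, **sendmail_kwargs(sender, recipients))`, so ASCII-only messages are sent exactly as before.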

Closes #143

@awdeorio
Owner

Thanks for your help, @berelian1! Would you be willing to add a unit test for this? Ideally, add a new unit test to test_sendmail_client.py and also one to test_main.py.

@berelian1
Author

Hi! Yes, I will create a new pull request with the changes reflected.

@berelian1
Author

I had to remove some of the whitespace in other test cases in test_main.py because pylint was complaining about the number of lines in the file; the functionality of those test cases is unchanged. I added test_utf8smtp_trigger in test_main.py and test_utf8smtp_trigger in test_sendmail_client.py.

@awdeorio changed the title from "automation of adding config parameter when email contains non-ASCII" to "Fix non-ASCII recipients" on Apr 24, 2025
Owner

@awdeorio left a comment

I did a more detailed review. This is a great start!

""") # noqa: E501


def test_utf8smtp_trigger(tmpdir):
Owner

How about renaming it to test_utf8_recipient and moving it to right after test_utf8_headers?

[smtp_server]
host = open-smtp.example.com
"""), encoding="utf8")

Owner

Instead of changing the whitespace, how about making a new file test_database.py and moving the test_database_* unit tests into it?

SUBJECT: UTF8SMTP Test
Hello{{name}}!
"""), encoding="utf8")
database_path = Path(tmpdir/"mailmerge_database.csv")
Owner

Adding some comments would be nice

username = awdeorio
"""))

sendmail_client = SendmailClient(config_path, dry_run=False)
Owner

Comments please!

sendmail_client = SendmailClient(config_path, dry_run=False)

message = email.message_from_string(textwrap.dedent("""\
TO: mü[email protected]
Owner

Include a different CC and BCC to check those too
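
One way a test could cover CC and BCC is to collect every recipient header before checking for non-ASCII characters. The sketch below is only an illustration under stated assumptions: `all_recipients` is a hypothetical helper, the addresses are made up, and it does not use mailmerge's `SendmailClient`.

```python
import email
import email.utils
import textwrap


def all_recipients(message):
    """Collect addresses from the TO, CC and BCC headers of a message."""
    headers = []
    for name in ("TO", "CC", "BCC"):
        headers.extend(message.get_all(name, []))
    return [address for _name, address in email.utils.getaddresses(headers)]


# Only the BCC address is non-ASCII, so a check that looks at TO alone
# would miss it.
message = email.message_from_string(textwrap.dedent("""\
    TO: to@example.com
    CC: cc@example.com
    BCC: bücc@example.com
    FROM: from@example.com

    Hello world
"""))
recipients = all_recipients(message)
has_non_ascii = any(ord(char) > 127 for addr in recipients for char in addr)
```

Giving each header a different address makes it easier to tell from a failing assertion which header was dropped.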

"""Verify triggers to rcpt_options=['UTF8SMTP']."""
template_path = Path(tmpdir/"mailmerge_template.txt")
template_path.write_text(textwrap.dedent("""\
TO: mü[email protected]
Owner

Include CC and BCC to check those too

template_path = Path(tmpdir/"mailmerge_template.txt")
template_path.write_text(textwrap.dedent("""\
TO: mü[email protected]
FROM: [email protected]
Owner

Could you advise what should happen if the FROM address contains a non-ASCII character?

Author

@berelian1 Apr 25, 2025

Hi, I have been reading the SMTP documentation, and although it is somewhat unclear, I think the sender must be checked along with the recipients.

3.2. The SMTPUTF8 Extension

"An SMTP server that announces the SMTPUTF8 extension MUST be prepared to accept a UTF-8 string [RFC3629] in any position in which RFC 5321 specifies that a <mailbox> can appear."

"An SMTP client that receives the SMTPUTF8 extension keyword in response to the EHLO command MAY transmit mailbox names within SMTP commands as internationalized strings in UTF-8 form."

3.2. Client Initiation

"Once the server has sent the greeting (welcoming) message and the
client has received it, the client normally sends the EHLO command to
the server, indicating the client's identity. In addition to opening
the session, use of EHLO indicates that the client is able to process
service extensions and requests that the server provide a list of the
extensions it supports."

Since the client must also open the session and identify itself with EHLO, and the reply to EHLO is where the service extensions are listed, sending from a non-ASCII address also depends on verifying whether the server can support non-ASCII characters.
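
The EHLO exchange quoted above is what `smtplib` records in `esmtp_features`, and `SMTP.has_extn()` queries it. A minimal sketch of using that before sending follows. `choose_mail_options` and `any_non_ascii` are hypothetical names, and the sketch assumes that the sender is checked together with the recipients and that the option goes in `mail_options`.

```python
import smtplib


def any_non_ascii(*strings):
    """Return True if any string contains a non-ASCII character."""
    return any(ord(char) > 127 for text in strings for char in text)


def choose_mail_options(smtp, sender, recipients):
    """Return mail_options for sendmail(), or raise if the server cannot help.

    Assumes EHLO has already been sent on this connection (for example by
    smtp.ehlo() or smtp.login()), so has_extn() reflects the server's reply.
    """
    if not any_non_ascii(sender, *recipients):
        return []
    if not smtp.has_extn("smtputf8"):
        raise ValueError("server did not advertise SMTPUTF8")
    return ["SMTPUTF8"]
```

Raising early gives the caller a chance to skip the message with a clear error instead of failing partway through the SMTP dialogue.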

with smtplib.SMTP_SSL(host, port, context=ctx) as smtp:
    smtp.login(self.config.username, self.password)
    smtp.sendmail(sender, recipients, message_flattened)
    if needs_smtputf8(sender, *recipients):
Owner

See earlier comment about whether sender should be here or not. I'm not sure.

host, port = (self.config.host, self.config.port)

def needs_smtputf8(*string):
    return any(any(ord(c) > 127 for c in part) for part in string)
Owner

Leave a note in a comment that this can be improved if/when we move to Python 3.7. Currently mailmerge supports Python 3.6+.

https://stackoverflow.com/questions/196345/how-to-check-if-a-string-in-python-is-in-ascii
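
The improvement referred to here is presumably `str.isascii()`, which was added in Python 3.7. A minimal sketch of a helper that uses it when available and falls back to encoding on Python 3.6 is shown below. The name `any_non_ascii` is illustrative, not necessarily the name used in the merged code.

```python
def any_non_ascii(*strings):
    """Return True if any argument contains a non-ASCII character."""
    for text in strings:
        if hasattr(text, "isascii"):
            # Python 3.7+: str.isascii() is the direct check.
            if not text.isascii():
                return True
        else:
            # Python 3.6 fallback: encoding to ASCII fails on other characters.
            try:
                text.encode("ascii")
            except UnicodeEncodeError:
                return True
    return False
```

Once Python 3.6 support is dropped, the body could shrink to `return not all(text.isascii() for text in strings)`.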

raise exceptions.MailmergeError(f"SSL Error: {err}")
host, port = (self.config.host, self.config.port)

def needs_smtputf8(*string):
Owner

How about any_non_ascii?

Move it to a "vanilla" function outside the class.

Development

Successfully merging this pull request may close these issues.

Crashes on email adresses with illegal characters

2 participants