Commit df978fd

cclauss authored and Taylor Robie committed
Use six and feature detection in string conversion (#4740)
* Use six and feature detection in string conversion

  Leverage [**six.ensure_text()**](https://github.com/benjaminp/six/blob/master/six.py#L890) to deliver Unicode text in both Python 2 and Python 3. Follow the Python porting best practice, [use feature detection instead of version detection](https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection), in **_unicode_to_native()**.

* Revert the use of six.ensure_text()

  Thanks for catching that! I jumped the gun. It is I who have brought shame...
1 parent 7515032 commit df978fd

1 file changed: +6 -6 lines changed

official/transformer/utils/tokenizer.py

Lines changed: 6 additions & 6 deletions
@@ -202,17 +202,17 @@ def _load_vocab_file(vocab_file, reserved_tokens=None):
 
 def _native_to_unicode(s):
   """Convert string to unicode (required in Python 2)."""
-  if six.PY2:
-    return s if isinstance(s, unicode) else s.decode("utf-8")  # pylint: disable=undefined-variable
-  else:
+  try:  # Python 2
+    return s if isinstance(s, unicode) else s.decode("utf-8")
+  except NameError:  # Python 3
     return s
 
 
 def _unicode_to_native(s):
   """Convert string from unicode to native format (required in Python 2)."""
-  if six.PY2:
-    return s.encode("utf-8") if isinstance(s, unicode) else s  # pylint: disable=undefined-variable
-  else:
+  try:  # Python 2
+    return s.encode("utf-8") if isinstance(s, unicode) else s
+  except NameError:  # Python 3
     return s
 
 
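
The feature-detection pattern in this diff can be tried in isolation. The sketch below is a standalone approximation, not the file's actual code: the function names drop the leading underscore, and `text` and `round_tripped` are names chosen for illustration. It relies only on the fact that the `unicode` builtin exists in Python 2 and raises `NameError` in Python 3.

```python
def native_to_unicode(s):
    """Return text; decode UTF-8 bytes only where the `unicode` builtin exists."""
    try:  # Python 2: `unicode` is a builtin, so the isinstance check runs
        return s if isinstance(s, unicode) else s.decode("utf-8")  # noqa: F821
    except NameError:  # Python 3: `unicode` is undefined, so s is returned as-is
        return s


def unicode_to_native(s):
    """Return the native str type; encode to UTF-8 only under Python 2."""
    try:  # Python 2: native str is bytes, so unicode text must be encoded
        return s.encode("utf-8") if isinstance(s, unicode) else s  # noqa: F821
    except NameError:  # Python 3: native str is already text
        return s


# Hypothetical sample value: the round trip should preserve the text
# on either interpreter.
text = u"caf\u00e9"
round_tripped = native_to_unicode(unicode_to_native(text))
```

One consequence worth noting: under Python 3 the `NameError` is raised before the argument is inspected, so any value, including `bytes`, passes through both helpers unchanged. That matches the `else: return s` branch the diff replaces.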
