Commit f0eb6cf

PKEuS authored and jeffreystarr committed
StataWriter: Replace missing values in string columns by an empty string
1 parent 8314b4f commit f0eb6cf

3 files changed: +15 -3 lines changed

doc/source/release.rst (+3 -3)
@@ -156,7 +156,8 @@ API Changes
 - ``DataFrame.sort`` now places NaNs at the beginning or end of the sort according to the ``na_position`` parameter. (:issue:`3917`)

 - all offset operations now return ``Timestamp`` types (rather than datetime), Business/Week frequencies were incorrect (:issue:`4069`)
-
+- ``Series.iteritems()`` is now lazy (returns an iterator rather than a list). This was the documented behavior prior to 0.14. (:issue:`6760`)
+- ``Panel.shift`` now uses ``NDFrame.shift``. It no longer drops the ``nan`` data and retains its original shape. (:issue:`4867`)

 Deprecations
 ~~~~~~~~~~~~
@@ -305,8 +306,7 @@ Bug Fixes
 - Bug in downcasting inference with empty arrays (:issue:`6733`)
 - Bug in ``obj.blocks`` on sparse containers dropping all but the last items of same for dtype (:issue:`6748`)
 - Bug in unpickling ``NaT (NaTType)`` (:issue:`4606`)
-- Bug in ``DataFrame.replace()`` where regex metacharacters were being treated
-  as regexs even when ``regex=False`` (:issue:`6777`).
+- Bug in setting a tz-aware index directly via ``.index`` (:issue:`6785`)

 pandas 0.13.1
 -------------

pandas/io/stata.py (+2)
@@ -1319,6 +1319,8 @@ def _write_data_nodates(self):
             for i, var in enumerate(row):
                 typ = ord(typlist[i])
                 if typ <= 244:  # we've got a string
+                    if var is None or var == np.nan:
+                        var = _pad_bytes('', typ)
                     if len(var) < typ:
                         var = _pad_bytes(var, typ)
                     self._write(var)
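
Note on the added check: in Python, NaN never compares equal to anything, itself included, so `var == np.nan` evaluates to false even when `var` is NaN. For a scalar `var`, only the `var is None` half of the condition can trigger. A minimal sketch of a stricter check follows. The names `is_missing`, `pad_bytes` and `encode_string_cell` are hypothetical and not part of this commit, and `pad_bytes` is only a stand-in that assumes `_pad_bytes` right-pads with null characters.

```python
import math


def is_missing(var):
    """Return True for None or a float NaN (hypothetical helper, not in the commit)."""
    return var is None or (isinstance(var, float) and math.isnan(var))


def pad_bytes(value, length):
    """Stand-in for _pad_bytes; assumed to right-pad with null characters."""
    return value + "\x00" * (length - len(value))


def encode_string_cell(var, typ):
    """Sketch of the string branch: missing values become an empty, padded string."""
    if is_missing(var):
        var = ""
    if len(var) < typ:
        var = pad_bytes(var, typ)
    return var


# NaN is not equal to itself, which is why an equality test cannot detect it.
nan = float("nan")
nan_equals_nan = (nan == nan)
```

With this sketch, `encode_string_cell(float("nan"), 2)` takes the missing-value path instead of failing on `len(var)`. The commit's own test only covers `None`, so whether NaN can reach this branch in practice is left open here.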

pandas/io/tests/test_stata.py (+10)
@@ -508,6 +508,16 @@ def test_date_export_formats(self):
             tm.assert_frame_equal(written_and_read_again.set_index('index'),
                                   expected)

+    def test_write_missing_strings(self):
+        original = DataFrame([["1"], [None]], columns=["foo"])
+        expected = DataFrame([["1"], [""]], columns=["foo"])
+        expected.index.name = 'index'
+        with tm.ensure_clean() as path:
+            original.to_stata(path)
+            written_and_read_again = self.read_dta(path)
+            tm.assert_frame_equal(written_and_read_again.set_index('index'),
+                                  expected)
+

 if __name__ == '__main__':
     nose.runmodule(argv=[__file__, '-vvs', '-x', '--pdb', '--pdb-failure'],
