
pandas None/NaN mappings #251

Open
LuchiLucs opened this issue Feb 4, 2025 · 1 comment

@LuchiLucs

I'm using the write_dataframe function to write a pandas DataFrame. My context is that this DataFrame contains columns of three different types:

  1. object (string) columns have None as missing data
  2. Int64 columns have np.nan as missing data
  3. float64 columns have np.nan as missing data
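For illustration, a minimal sketch of a frame with these three column kinds (the column names are hypothetical; note that the nullable Int64 extension dtype actually represents missing values as pd.NA rather than np.nan):

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame mirroring the three column types above
df = pd.DataFrame({
    "name": pd.Series(["a", None, "c"], dtype=object),        # object (string), missing -> None
    "count": pd.array([1, None, 3], dtype="Int64"),           # nullable Int64, missing -> pd.NA
    "score": pd.Series([1.0, np.nan, 3.0], dtype="float64"),  # float64, missing -> np.nan
})
print(df.dtypes)
print(df.isna().sum())  # one missing value per column
```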

When writing to Redshift, these values are converted as such:

  • None as NULL using varchar(20) with bytedict encoding
  • NaN as -9223372036854775808 using BIGINT with az64 encoding
  • NaN as "NaN" using DOUBLE PRECISION with RAW encoding

When I try to query using SQL, based on the column, I have to filter with:

  1. IS NULL
  2. = -9223372036854775808
  3. ::text = 'NaN'

Is this intended? I would like all pandas None/NaN values to be mapped to NULL. Is this possible?
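One workaround sketch on the pandas side (plain pandas, not part of the library's API): cast to object and substitute Python None for every missing value before calling write_dataframe, so the driver can emit SQL NULL for all three column types:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["a", None, "c"],
    "count": pd.array([1, None, 3], dtype="Int64"),
    "score": [1.0, np.nan, 3.0],
})

# where() keeps values where the mask is True and substitutes None elsewhere;
# casting to object first prevents None from being coerced back to NaN.
cleaned = df.astype(object).where(pd.notnull(df), None)
print(cleaned)
```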

@LuchiLucs
Author

When reading back into a pandas DataFrame, for instance, the int64 column has missing data as -9223372036854775808 instead of np.nan, so the mapping is not reproducible on a round trip.
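A reading-side sketch of the same issue, assuming the sentinel is INT64 min as observed above: replace it with a real missing value and convert to the nullable Int64 dtype so the column stays integral:

```python
import numpy as np
import pandas as pd

SENTINEL = -9223372036854775808  # INT64 min, what the driver wrote for NULL (per above)

col = pd.Series([1, SENTINEL, 3])  # column as read back from Redshift
# Sentinel -> NaN (upcasts to float64), then Int64 turns NaN into <NA>
restored = col.replace(SENTINEL, np.nan).astype("Int64")
print(restored)
```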
