an error I do not understand #19

Open
albangabillon opened this issue Mar 5, 2020 · 5 comments

@albangabillon

Unable to add relationship because LotArea_LandContour in LotArea_LandContour is Pandas dtype int32 and LotArea_LandContour in index is Pandas dtype int64.

@rwedge
Contributor

rwedge commented Mar 5, 2020

Hi @albangabillon, thanks for the error report.

Could you post the full stack trace of the error you encountered?

@albangabillon
Author

housing_df = load_housing_data("train.csv")
housing_df = housing_df.drop(columns=housing_df.columns[10:], axis=1)
an.auto_entityset(housing_df, accuracy=1, name="esHousing")
`100%|████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 33.33it/s]

ValueError Traceback (most recent call last)
in
----> 1 an.auto_entityset(housing_df, accuracy=1, name="esHousing")

c:\users\alban\anaconda3\envs\geron\lib\site-packages\autonormalize\autonormalize.py in auto_entityset(df, accuracy, index, name, time_index)
133 entityset (ft.EntitySet) : created entity set
134 """
--> 135 return make_entityset(df, find_dependencies(df, accuracy, index), name, time_index)
136
137

c:\users\alban\anaconda3\envs\geron\lib\site-packages\autonormalize\autonormalize.py in make_entityset(df, dependencies, name, time_index)
108 relationships.append((child.index[0], child.index[0], current.index[0], child.index[0]))
109
--> 110 return ft.EntitySet(name, entities, relationships)
111
112

c:\users\alban\anaconda3\envs\geron\lib\site-packages\featuretools\entityset\entityset.py in __init__(self, id, entities, relationships)
86 child_variable = self[relationship[2]][relationship[3]]
87 self.add_relationship(Relationship(parent_variable,
---> 88 child_variable))
89 self.reset_data_description()
90

c:\users\alban\anaconda3\envs\geron\lib\site-packages\featuretools\entityset\entityset.py in add_relationship(self, relationship)
265 if not is_dtype_equal(parent_dtype, child_dtype):
266 raise ValueError(msg.format(parent_v, parent_e.id, parent_dtype,
--> 267 child_v, child_e.id, child_dtype))
268
269 self.relationships.append(relationship)

ValueError: Unable to add relationship because LandContour_LotArea in LandContour_LotArea is Pandas dtype int32 and LandContour_LotArea in index is Pandas dtype int64.`

@rwedge
Contributor

rwedge commented Mar 6, 2020

It looks to be an issue with handling the underlying datatypes while creating the entityset. Are you able to share this data so I could try to replicate?
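For anyone following along, here is a minimal sketch of the kind of mismatch the traceback points to. It does not reproduce the autonormalize code path; the two small DataFrames and the `key` column are made up for illustration, and only the `is_dtype_equal` comparison is taken from the traceback above.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_dtype_equal

# Made-up key values; only the dtypes matter here.
parent = pd.DataFrame({"key": np.array([0, 1, 2], dtype="int32")})
child = pd.DataFrame({"key": np.array([0, 1, 1, 2], dtype="int64")})

# Same kind of comparison as the one shown in the traceback:
# int32 and int64 are not considered equal.
dtypes_match = is_dtype_equal(parent["key"].dtype, child["key"].dtype)
print(dtypes_match)

# Casting one side to the other's dtype makes the comparison pass.
parent["key"] = parent["key"].astype(child["key"].dtype)
dtypes_match_after_cast = is_dtype_equal(parent["key"].dtype, child["key"].dtype)
print(dtypes_match_after_cast)
```

Whether a cast like this belongs in user code or inside the library is a separate question; the sketch only shows why the check fails.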

@albangabillon
Author

albangabillon commented Mar 6, 2020 via email

@rwedge
Contributor

rwedge commented Mar 9, 2020

Hi,
I didn't receive the attached file, but I presume it was from the "House Prices: Advanced Regression Techniques" contest.

Two more questions:

  1. Is load_housing_data("train.csv") a pd.read_csv call?
  2. Can you run
featuretools info

from the command line and share the output? It'll help make sure I've got the same environment for testing.

Thanks!
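Related to question 1, a quick way to see which dtypes a plain `pd.read_csv` call produces is to print `df.dtypes` right after loading. The sketch below uses a tiny in-memory CSV with made-up rows (the column names are borrowed from the error message) rather than the real `train.csv`, so it only illustrates the inspection step.

```python
import io

import pandas as pd

# Tiny made-up CSV standing in for the real file.
csv_text = "LotArea,LandContour\n8450,Lvl\n9600,Lvl\n11250,Bnk\n"

df = pd.read_csv(io.StringIO(csv_text))

# One line per column with its inferred dtype.
print(df.dtypes)

# Integer columns parsed by read_csv normally come back as int64.
lot_area_dtype = str(df["LotArea"].dtype)
print(lot_area_dtype)
```

Running the same `print(df.dtypes)` on the actual DataFrame, before and after the `drop` call, could help narrow down where the `int32` column comes from.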
