Skip to content

Add chapter 11 of Rethinking 2 #76

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Apr 23, 2020

Conversation

AlexAndorra
Copy link
Contributor

@AlexAndorra AlexAndorra commented Mar 21, 2020

Ready for review.
There are two main points of attention:

  1. The scientific model of the number of tools for the Pacific Islands (page 356 in book, cell Code 11.49 in notebook) runs smoothly but makes terrible predictions, especially for high-contact islands. I suspect it's because the b coefficient for high contact societies is unidentifiable at high population ranges (because there are no samples), but I tried a bunch of different Exponential and HalfNormal priors for this parameter and nothing seems to work. The weird thing is that I think I used the same parametrization as the book.

  2. The first multinomial model behaves weirdly and doesn't replicate the book's results -- possible bug? I added more comments in the notebook.

Thanks a lot for the reviews ✌️ I'll re-run the whole NB once we've fixed all the issues.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

You'll be able to see Jupyter notebook diff and discuss changes. Powered by ReviewNB.

@review-notebook-app
Copy link

review-notebook-app bot commented Mar 22, 2020

View / edit / reply to this conversation on ReviewNB

canyon289 commented on 2020-03-22T18:58:37Z
----------------------------------------------------------------

Nice trick


@review-notebook-app
Copy link

review-notebook-app bot commented Mar 22, 2020

View / edit / reply to this conversation on ReviewNB

canyon289 commented on 2020-03-22T18:58:38Z
----------------------------------------------------------------

Did the chapters switch here?


AlexAndorra commented on 2020-03-22T19:08:50Z
----------------------------------------------------------------

If you're referring to all the cells Code 10.XX, they are just code from the first edition that I have here for reference -- as I'm still working on the end of this chapter they could prove useful. I'll delete them when I'm finished.

@AlexAndorra AlexAndorra changed the title [WIP] -- Add chapter 11 of Rethinking 2 Add chapter 11 of Rethinking 2 Mar 25, 2020
@AlexAndorra
Copy link
Contributor Author

@aloctavodia and @canyon289 I found a way to improve the scientific model of Oceanic tools 🎉
I used a different trick to scale the population data, and it seems to work -- although I'm not sure why the difference is so big compared to the previous method 🤷
I would very much appreciate @junpenglao's opinion on this, to make sure we didn't miss something -- see the PR's description for more details or tell me if you have questions. And a big thank you in advance!

@canyon289 canyon289 self-requested a review April 14, 2020 03:29
@review-notebook-app
Copy link

review-notebook-app bot commented Apr 14, 2020

View / edit / reply to this conversation on ReviewNB

canyon289 commented on 2020-04-14T03:30:50Z
----------------------------------------------------------------

Is the +1 missing here because we changed actor to 0 index?


AlexAndorra commented on 2020-04-14T09:19:14Z
----------------------------------------------------------------

Yes, we don't need it here because of zero-indexing

@review-notebook-app
Copy link

review-notebook-app bot commented Apr 14, 2020

View / edit / reply to this conversation on ReviewNB

canyon289 commented on 2020-04-14T04:20:42Z
----------------------------------------------------------------

I only did cursory googling but for 11.3 xtabs would pandas cross tabs reproduce the same result? https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.crosstab.html


AlexAndorra commented on 2020-04-14T10:17:27Z
----------------------------------------------------------------

Don't know why I forgot this cell -- it's done now ;)

@review-notebook-app
Copy link

review-notebook-app bot commented Apr 14, 2020

View / edit / reply to this conversation on ReviewNB

canyon289 commented on 2020-04-14T04:20:43Z
----------------------------------------------------------------

What is m11_1bis? It looks like the dnorm(0, 1.5) model but since it doesnt appaer in the code block its a bit confusing. Maybe add comment saying not part of code block but used to recreate figure 11.3?

Also out of curiosity was does bis mean?


AlexAndorra commented on 2020-04-14T10:22:39Z
----------------------------------------------------------------

You're right -- just added this precision.

"bis" is a latin prefix we use in French, Spanish, Italian, etc. to indicate something is repeated, or is the second instance of something.

@review-notebook-app
Copy link

review-notebook-app bot commented Apr 14, 2020

View / edit / reply to this conversation on ReviewNB

canyon289 commented on 2020-04-14T04:20:43Z
----------------------------------------------------------------

Similar note here. The model is defined in 11.7. Would add comment or label to explain some of the code has switched blocks from the book


AlexAndorra commented on 2020-04-14T10:25:43Z
----------------------------------------------------------------

Done

@review-notebook-app
Copy link

review-notebook-app bot commented Apr 14, 2020

View / edit / reply to this conversation on ReviewNB

canyon289 commented on 2020-04-14T04:20:44Z
----------------------------------------------------------------

This is my own curiosity , what does the intX and pm.Data do in these lines? actor_id = pm.intX(pm.Data("actor_id", actor_idx))


AlexAndorra commented on 2020-04-14T09:28:24Z
----------------------------------------------------------------

Currently, you can't use shared variables for indexing in a model, because index variables have to be ints but pm.Data casts them to floats under the hood.

There is a PR currently working on that, but in the meantime the workaround is what I'm doing above.

@review-notebook-app
Copy link

review-notebook-app bot commented Apr 14, 2020

View / edit / reply to this conversation on ReviewNB

canyon289 commented on 2020-04-14T04:20:45Z
----------------------------------------------------------------

Nothing you need to change but ths margin in this forest plot on top is excessive. Something we should address in arviz code


AlexAndorra commented on 2020-04-14T09:29:14Z
----------------------------------------------------------------

Yeah I agree -- Osvaldo made the same comment.

@review-notebook-app
Copy link

review-notebook-app bot commented Apr 14, 2020

View / edit / reply to this conversation on ReviewNB

canyon289 commented on 2020-04-14T04:20:45Z
----------------------------------------------------------------

Truly a masterpiece


AlexAndorra commented on 2020-04-14T09:30:15Z
----------------------------------------------------------------

Ha ha ha, thanks! This was a monster!

@review-notebook-app
Copy link

review-notebook-app bot commented Apr 14, 2020

View / edit / reply to this conversation on ReviewNB

canyon289 commented on 2020-04-14T04:20:46Z
----------------------------------------------------------------

Being picky here this would be easier to read if the data load occurred again like it is in the book even if the outcome is the same


AlexAndorra commented on 2020-04-14T10:31:48Z
----------------------------------------------------------------

Done

@review-notebook-app
Copy link

review-notebook-app bot commented Apr 20, 2020

View / edit / reply to this conversation on ReviewNB

aloctavodia commented on 2020-04-20T12:57:00Z
----------------------------------------------------------------

seaborn: unused import


@review-notebook-app
Copy link

review-notebook-app bot commented Apr 20, 2020

View / edit / reply to this conversation on ReviewNB

aloctavodia commented on 2020-04-20T12:57:01Z
----------------------------------------------------------------

Here you can also add one dimension to avoid the warnings as I mentioned in my other review


@AlexAndorra
Copy link
Contributor Author

  • Blackified the code and adressed @aloctavodia's and @canyon289's comments -- about the ArviZ warning, I chose not to add one dimension, as it clutters the code and it probably comes from a bug. Once this bug is resolved I can rerun the NB.

  • I solved the problem with the first Multinomial model by taking the first category as pivot instead of the last one (as the last one is very probable compared to the others, it switches the sign of the slope when taken as the pivot). However, this is still a mystery to me why this works in the book -- is it a typo in the book or a bug on PyMC3? 🤷‍♂️ I think we can merge, but maybe we'll have to revisit that.

If all this is good for you, I think this is ready to merge 🎉

@aloctavodia
Copy link
Member

aloctavodia commented Apr 23, 2020

It would be a good idea to open issues so we do not forget about this stuff.

@aloctavodia aloctavodia merged commit 1e21f8c into pymc-devs:master Apr 23, 2020
@AlexAndorra
Copy link
Contributor Author

True, but I don't know if these are still open issues: maybe what I did solved the problems 🤷‍♂️ I'm counting on the community's peer-reviewing to find out 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants