Analysing ordinal data in PyMC #277

Open
drbenvincent opened this issue Feb 4, 2022 · 5 comments
Labels
proposal New notebook proposal still up for discussion

Comments

@drbenvincent
Contributor

drbenvincent commented Feb 4, 2022

Notebook proposal

Title: Analysing ordinal data in PyMC

Why should this notebook be added to pymc-examples?

Ordinal outcome variables are common in many data analysis situations. Example measures include:

  • BMI: underweight, normal, overweight, obese
  • Likert scale data, e.g. strongly agree, agree, neutral, disagree, strongly disagree

People are often lazy in their analysis of ordinal data and fall back on treating it as continuous.
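
To make the risk concrete, here is a minimal sketch with made-up numbers (the thresholds, means and standard deviations are all hypothetical, and this is plain Python rather than PyMC). Two groups share the same latent mean but differ in latent spread. After the latent variable is cut into five ordered categories, the mean of the category codes differs between the groups, so a model that treats the codes as continuous would report a difference in means that is not present on the latent scale.

```python
from statistics import NormalDist


def ordinal_probs(mu, sigma, thresholds):
    """Category probabilities when a latent Normal(mu, sigma) is cut at fixed thresholds."""
    latent = NormalDist(mu, sigma)
    cdf = [0.0] + [latent.cdf(t) for t in thresholds] + [1.0]
    return [upper - lower for lower, upper in zip(cdf[:-1], cdf[1:])]


def mean_code(probs):
    """Expected response when the categories are coded 1, 2, ..., K."""
    return sum(k * p for k, p in enumerate(probs, start=1))


# Hypothetical thresholds for a 5-point scale (assumption for illustration only).
thresholds = [1.5, 2.5, 3.5, 4.5]

# Same latent mean, different latent standard deviation.
group_a = ordinal_probs(mu=1.0, sigma=1.0, thresholds=thresholds)
group_b = ordinal_probs(mu=1.0, sigma=3.0, thresholds=thresholds)

print(f"mean code, group A: {mean_code(group_a):.2f}")
print(f"mean code, group B: {mean_code(group_b):.2f}")
```

An ordinal model that estimates the latent mean and spread separately can, in principle, tell these two situations apart.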

The goal of this example is to demonstrate current best practice for ordinal regression in PyMC. In particular, it will make use of the new pm.OrderedProbit and pm.OrderedLogistic distributions. Once #5418 is merged, we can go ahead with an example notebook.
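
For readers new to these likelihoods, the following plain-Python sketch shows the calculation behind an ordered logistic model: a linear predictor `eta` and a sorted vector of cutpoints are turned into category probabilities by differencing the cumulative logistic function. The probit version is the same idea with the standard normal CDF in place of the logistic. The function names and numbers are hypothetical, and this is not PyMC code.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def ordered_logistic_probs(eta, cutpoints):
    """P(y = k) for k = 0..K-1, given K-1 increasing cutpoints.

    The cumulative probability P(y <= k) is sigmoid(cutpoints[k] - eta),
    so each category probability is a difference of adjacent cumulative values.
    """
    cumulative = [sigmoid(c - eta) for c in cutpoints]
    bounds = [0.0] + cumulative + [1.0]
    return [upper - lower for lower, upper in zip(bounds[:-1], bounds[1:])]


cutpoints = [-1.0, 0.0, 1.5]  # three cutpoints give four ordered categories
baseline = ordered_logistic_probs(eta=0.0, cutpoints=cutpoints)
shifted = ordered_logistic_probs(eta=1.0, cutpoints=cutpoints)

print([round(p, 3) for p in baseline])
print([round(p, 3) for p in shifted])
```

Raising `eta` moves probability mass towards the higher categories while the cutpoints stay fixed, which is how a predictor or group effect would enter a regression.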

The plan is to put it in the GLM section. Current rough outline would be something like:

  • What is ordinal data?
  • Why is it crucial to analyse it properly?
  • Priors over cutpoints: This could be an involved topic, but the short version is that some constraints on the cutpoint parameters are needed (see Discussion #5055). It will probably use my proposed ConstrainedUniform distribution (see pymc-extras#32, "propose new ConstrainedUniform distribution"). We can always circle back and update this if a more polished solution presents itself.
  • Testing for group differences. Models of the form response ~ group are useful for testing for differences in response distributions between groups.
  • When you have a continuous predictor. E.g. response ~ continuous_predictor
  • Maybe include the combination, response ~ continuous_predictor + group if the notebook is not getting bloated, and if it seems necessary.
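
As a rough illustration of the cutpoint constraint mentioned in the outline, the sketch below builds increasing cutpoints with the first and last pinned to fixed values and the interior ones spaced by a draw from a flat Dirichlet. This is one possible construction and an assumption for illustration; it is not necessarily what pymc-extras#32 implements, and the function names are hypothetical. It uses only the Python standard library.

```python
import random
from itertools import accumulate


def sample_simplex(k, rng):
    """Draw from a flat Dirichlet(1, ..., 1) by normalising Gamma(1, 1) draws."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def constrained_cutpoints(n_categories, lower, upper, rng):
    """Return n_categories - 1 increasing cutpoints with the ends pinned at lower and upper."""
    if n_categories < 3:
        raise ValueError("need at least 3 categories to pin both end cutpoints")
    n_cutpoints = n_categories - 1
    # The gaps between consecutive cutpoints are a simplex draw scaled to the interval width.
    gaps = sample_simplex(n_cutpoints - 1, rng)
    rest = [lower + (upper - lower) * s for s in accumulate(gaps)]
    rest[-1] = upper  # guard against floating-point drift in the final cumulative sum
    return [lower] + rest


rng = random.Random(0)
cuts = constrained_cutpoints(n_categories=5, lower=0.0, upper=4.0, rng=rng)
print([round(c, 3) for c in cuts])
```

Pinning the end cutpoints fixes the location and scale of the latent variable, so a latent mean and standard deviation can be estimated alongside the interior cutpoints.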

Related notebooks

As far as I understand there are no existing notebooks which provide examples for the analysis of ordinal data. The closest I can find is an old PyMC port of Chapter 23 of Kruschke, but that's totally independent of pymc-examples.

References

  • Liddell, T. M. & Kruschke, J. K. Analyzing ordinal data with metric models: What could possibly go wrong? J Exp Soc Psychol 79, 328–348 (2018).
  • Bürkner, P.-C. & Vuorre, M. Ordinal Regression Models in Psychology: A Tutorial. Advances in Methods and Practices in Psychological Science 2, 77–101 (2019).
@drbenvincent drbenvincent added the proposal New notebook proposal still up for discussion label Feb 4, 2022
@drbenvincent drbenvincent self-assigned this Feb 4, 2022
@drbenvincent drbenvincent removed their assignment Oct 28, 2022
@NathanielF
Contributor

NathanielF commented Feb 25, 2023

Is there anything blocking this one? I'm interested in this class of models. I couldn't tell whether there is still an issue with setting priors on the cutpoints; it seems it is now possible to pass in a vector. Happy to pick this one up if you like, @drbenvincent, but I'm also conscious that you seem to have done a lot of work on it already.

@drbenvincent
Contributor Author

I initially wanted to work on it, but my plate is full at the moment. So no objections from me. No major blocker as far as I can tell.

@NathanielF
Contributor

Cool. I'll pick it up after the longitudinal one is done.

@NathanielF
Contributor

Just had a quick look at this one. It seems that even the docstring example for the ordered logistic distribution breaks now. It seems related to the shape attribute of the random variable.

image

I'm on the latest version, I think:
image

@NathanielF
Contributor

NathanielF commented Mar 17, 2023

Opened a ticket: pymc-devs/pymc#6610

In the meantime I'll experiment a bit more with your constrained uniform function.

NathanielF added 16 commits to NathanielF/pymc-examples that referenced this issue between Mar 20 and Jun 1, 2023
@NathanielF NathanielF mentioned this issue Mar 20, 2023
3 tasks
drbenvincent pushed a commit that referenced this issue Jun 2, 2023
* [Ordinal Regression #277] example ordinal regression using ordered univariate transform

* [Ordinal Regression #277] added some text

* [Ordinal Regression #277] small update to text

* [Ordinal Regression #277] added movie data analysis

* [Ordinal Regression #277] added some write up

* [Ordinal Regression #277] tidied final plot

* [Ordinal Regression #277] changed movie comparison to ppc

* [Ordinal Regression #277] added forward sampling interrogating the implications of the model

* [Ordinal Regression #277] added more write up about design v model based inference

* [Ordinal Regression #277] added bibtex ref

* [Ordinal Regression #277] fixed bib merge conflict

* [Ordinal Regression #277] trying to fix bib file for pre-commit checks

* [Ordinal Regression #277] Small text changes and updated bib

* [Ordinal Regression #277] added note about non-collapsability

* [Ordinal Regression #277] updated with Ben's comments