Error in using contrasts and model matrix as shown in the tutorial #6
Comments
Thanks for your question @hok-lee and for providing a nicely reproducible example. I believe you may have missed one important aspect of interpreting the coefficients of models with two factors.

    results(dds_age_treatment, contrast = c("age", "30", "20"))

Here you are comparing the difference between ages 30 and 20 for the reference level of treatment, i.e. for treatment A only. To define these contrasts with numeric vectors, you should create numeric vectors for every combination of both factors you want to account for:

    matrix_model <- model.matrix(design(dds_age_treatment), colData(dds_age_treatment))
    age_20_treatment_a <- colMeans(matrix_model[dds_age_treatment$age == "20" & dds_age_treatment$treatment == "A", ])
    age_30_treatment_a <- colMeans(matrix_model[dds_age_treatment$age == "30" & dds_age_treatment$treatment == "A", ])
    results(dds_age_treatment, contrast = age_30_treatment_a - age_20_treatment_a)

The same logic then applies to other comparisons. The comparison you were doing:

    results(dds_age_treatment, contrast = array_for_contrast_30 - array_for_contrast_20)

is for the average difference between 30 and 20 across all the levels of treatment. I'll leave this issue open, as I may add a note to the materials to highlight this for future readers.
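To make the difference between the two numeric contrasts concrete, here is a minimal base-R sketch. It assumes a small, deliberately imbalanced sample table (`samples` is hypothetical, not the data from this issue) and an additive design `~ age + treatment`; only `model.matrix()` and `colMeans()` are used, so no DESeq2 object is needed.

```r
# Hypothetical sample table (NOT the data from the issue); imbalanced on purpose
samples <- data.frame(
  age       = factor(c("20", "20", "20", "30", "30", "30", "30")),
  treatment = factor(c("A",  "A",  "B",  "A",  "B",  "B",  "B"))
)

# Additive design: columns are (Intercept), age30, treatmentB
mm <- model.matrix(~ age + treatment, data = samples)

# Contrast within treatment A only.
# drop = FALSE keeps a one-row subset as a matrix so colMeans() still works.
age_20_a <- colMeans(mm[samples$age == "20" & samples$treatment == "A", , drop = FALSE])
age_30_a <- colMeans(mm[samples$age == "30" & samples$treatment == "A", , drop = FALSE])
within_a <- age_30_a - age_20_a

# Contrast averaged over whatever treatments each age group happens to contain
age_20_all <- colMeans(mm[samples$age == "20", , drop = FALSE])
age_30_all <- colMeans(mm[samples$age == "30", , drop = FALSE])
across_all <- age_30_all - age_20_all

print(within_a)    # weight only on the age30 coefficient
print(across_all)  # also carries a non-zero weight on treatmentB
```

In this sketch the within-treatment contrast puts weight only on `age30`, whereas the averaged contrast also picks up part of `treatmentB`, because the two age groups contain different proportions of treatment B samples. With a balanced design that extra weight would be zero.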
Thank you for the detailed explanation. I understand the issue now; it makes a lot more sense with the subsetting of the model matrix on both the age and the treatment condition. I appreciate you taking the time to explain it to me!
Hi @tavareshugo, I have a quick question about interpreting the coefficient.

"The interaction terms genotypeII.conditionB and genotypeIII.conditionB give the difference between the condition effect for a given genotype and the condition effect for the reference genotype."

This is from The design, This contradicts what the vignette says... I'm a bit confused and am making a wild guess that this is because the design is an imbalanced design. (6 samples are age10 and 7 samples are age30)
Yes, it can be a bit confusing and I might have made it more confusing with my earlier solution. Anyway, let's take a step back and consider what the terms in the model mean. You have the following levels in each variable:
So, your model will have the following coefficients:
Now, because there is no interaction, the coefficient Equally, the coefficient If that assumption doesn't seem right to you, then you should fit a model with an interaction.
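As an illustration of what fitting an interaction changes, the following base-R sketch builds the model matrix for `~ age * treatment` on a hypothetical balanced sample table (the `samples` data frame and the `cell_mean()` helper are assumptions for illustration, not objects from this thread). It shows that the age contrast is then no longer the same vector in both treatments.

```r
# Hypothetical balanced sample table: 2 replicates per age x treatment cell
samples <- expand.grid(
  replicate = 1:2,
  age       = factor(c("20", "30")),
  treatment = factor(c("A", "B"))
)

# Design with interaction: (Intercept), age30, treatmentB, age30:treatmentB
mm <- model.matrix(~ age * treatment, data = samples)

# Helper (hypothetical name): average model-matrix row for one cell
cell_mean <- function(age, treatment) {
  keep <- samples$age == age & samples$treatment == treatment
  colMeans(mm[keep, , drop = FALSE])
}

# Age effect (30 vs 20) within each treatment, as numeric contrast vectors
age_effect_in_a <- cell_mean("30", "A") - cell_mean("20", "A")
age_effect_in_b <- cell_mean("30", "B") - cell_mean("20", "B")

print(age_effect_in_a)                    # age30 only
print(age_effect_in_b)                    # age30 plus the interaction term
print(age_effect_in_b - age_effect_in_a)  # the interaction term alone
```

Under these assumptions, the age effect in treatment B is the `age30` coefficient plus the interaction coefficient, and the difference between the two age effects is the interaction coefficient itself. Vectors built this way have the same form as the numeric contrasts passed to `results()` earlier in the thread, provided the model was fitted with the matching design.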
Thank you for taking the time to explain it to me. I should probably review your tutorial materials as well as my linear regression book again. Thanks again and have a great weekend.
First of all, your tutorial is great. It gave me great insight into the inner workings of DESeq2.
However, when I tried to follow the tutorial, I did not get the expected result when I used the model matrix. I wrote some sample code (R Markdown and the resulting HTML) to show the disagreement.
https://github.com/hok-lee/error_tutorial_deseq_contrast.git
I am quite puzzled. I hope you could help. Thank you!