Issue using combine_CpG_sites.py with supplied example #16

Open
nickyph opened this issue Aug 31, 2023 · 1 comment


nickyph commented Aug 31, 2023

Hi @jsh58

Before trying my own data, I wanted to run DMRfinder on the supplied examples to make sure I understand the workflow. The sample data I used is:

C1.cov            C2.cov
chrZ 100 0 1      chrZ 100 1 2
chrZ 120 1 2      chrZ 120 2 3
chrZ 200 0 4      chrZ 200 1 5
chrZ 300 3 3      chrZ 300 0 2
chrZ 401 2 6      chrZ 401 3 3
chrZ 450 3 5      chrZ 450 4 5
chrZ 600 5 2      chrZ 600 6 3
chrZ 625 4 2      chrZ 625 8 0
chrZ 650 5 1      chrZ 650 7 1
chrZ 700 3 2      chrZ 700 3 4

This is from Appendix A, Illustrative examples with combine_CpG_sites.py. I acknowledge that columns 3 and 4 are missing. I created the C1.cov and C2.cov files by putting the data in separate .txt files and then running the following commands to make sure the files are tab-delimited.

$ cat C1.txt | tr ' ' '\t' > C1.cov
$ cat C2.txt | tr ' ' '\t' > C2.cov

When running combine_CpG_sites.py I run into the following error:

python combine_CpG_sites.py -o result_31aug C1.cov C2.cov
path/combine_CpG_sites.py:59: DeprecationWarning: 'U' mode is deprecated
  f = open(filename, 'rU')
Error! Poorly formatted record: chrZ 100 0 1

I have tried using my own data, .bed file data, the plain C1.txt/C2.txt files, and space-delimited versions of the .cov files I made, all to no avail. The reason I want to know is that for my own study I use ONT data, and I have all the needed columns mentioned in Extracting methylation counts, which are:

chrom | chromosome name
chromStart | 1-based position of the cytosine in the CpG
chromEnd | chromStart + 1
percent | percent methylation at this site
methylated | count of methylated cytosines
unmethylated | count of unmethylated cytosines

So I do acknowledge that my starting point differs. Do you have any guesses as to what the issue might be? I apologize if I am missing something very obvious. This area of research is new to me, so I am exploring for now :)
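One way to narrow down a "Poorly formatted record" error is to count the tab-delimited fields on each line of an input file. The sketch below is only a diagnostic aid, not the check that combine_CpG_sites.py itself performs (that code is not shown in this thread). The helper names `count_fields` and `find_bad_records` are hypothetical, and the default of six expected fields is taken from the README sentence quoted in this issue.

```python
import io


def count_fields(handle, expected=6):
    """Yield (line_number, n_fields, record) for each non-empty line whose
    tab-delimited field count differs from `expected`."""
    for number, line in enumerate(handle, start=1):
        record = line.rstrip("\n")
        if not record:
            continue  # ignore blank lines
        n_fields = len(record.split("\t"))
        if n_fields != expected:
            yield number, n_fields, record


def find_bad_records(handle, expected=6):
    """Collect every record that does not have `expected` fields."""
    return list(count_fields(handle, expected))


# Hypothetical in-memory sample: one four-field line, one six-field line.
sample = io.StringIO("chrZ\t100\t0\t1\nchrZ\t100\t101\t0\t0\t1\n")
for number, n_fields, record in find_bad_records(sample):
    print(f"line {number}: {n_fields} fields: {record!r}")
```

To check a real file, pass an open file handle instead of the `io.StringIO` sample, for example `find_bad_records(open("C1.cov"))`.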

Owner

jsh58 commented Sep 4, 2023

The input files for combine_CpG_sites.py must be of the correct format, which is described in the README:

These headerless files must have six tab-delimited fields per line, with the first, second, fifth, and sixth columns being chromosome, position, methylated counts, and unmethylated counts, respectively
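A four-field record such as `chrZ 100 0 1` could be expanded to the six-field layout along the following lines. This is a minimal sketch under stated assumptions: the two middle columns are filled with chromEnd (position + 1) and percent methylation, following the field list given in the question, and whether combine_CpG_sites.py actually reads those two columns is not stated in this thread. The function names `to_six_columns` and `convert` are hypothetical.

```python
def to_six_columns(line):
    """Turn 'chrom pos methylated unmethylated' into a six-field,
    tab-delimited record: chrom, pos, pos + 1, percent, methylated,
    unmethylated. The middle two fields are assumed filler values."""
    chrom, pos, meth, unmeth = line.split()
    meth, unmeth = int(meth), int(unmeth)
    total = meth + unmeth
    # Avoid dividing by zero for sites with no reads at all.
    percent = 100.0 * meth / total if total else 0.0
    end = int(pos) + 1
    return "\t".join(
        [chrom, pos, str(end), f"{percent:.2f}", str(meth), str(unmeth)]
    )


def convert(lines):
    """Convert every non-blank line of an iterable of four-field records."""
    return [to_six_columns(line) for line in lines if line.strip()]


if __name__ == "__main__":
    for record in convert(["chrZ 100 0 1", "chrZ 120 1 2"]):
        print(record)
```

Because `str.split()` with no argument splits on any whitespace, the same function accepts both the space-delimited .txt files and the tab-delimited four-column .cov files described above.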
