updating gene pages #29

Open
raynamharris opened this issue Jul 14, 2022 · 0 comments
Labels
gene cv term

Comments

Contributor

raynamharris commented Jul 14, 2022

as per #28 (comment)

The GTEx, Transcripts, and UCSC browser widgets are displayed on only a few gene pages, as listed in these files:

  • data/input/gene_IDs_for_expression_widget.txt
  • data/input/gene_IDs_for_transcript_widget.txt
  • data/input/gene_IDs_for_UCSC_genome_browser_widget.txt

The alias table and the Appyter widgets are displayed on nearly all gene pages. The lists of available genes differ between dev and staging:

  • data/input/DEV_PORTAL__available_genes__2022-07-01.txt
  • data/input/STAGING_PORTAL__available_genes__2022-07-13.txt
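
To see exactly how the dev and staging lists differ, the two files can be compared with base R set operations. This is only a sketch: `compare_gene_lists` is a hypothetical helper, not something in the repository, and the IDs below are placeholders rather than real portal content.

```r
# Hypothetical helper: report which gene IDs are unique to each list.
compare_gene_lists <- function(dev_ids, staging_ids) {
  list(
    only_dev     = setdiff(dev_ids, staging_ids),     # in dev, not in staging
    only_staging = setdiff(staging_ids, dev_ids),     # in staging, not in dev
    shared       = intersect(dev_ids, staging_ids)    # present in both
  )
}

# Placeholder IDs for illustration only
dev_ids     <- c("ENSG00000000001", "ENSG00000000002", "ENSG00000000003")
staging_ids <- c("ENSG00000000002", "ENSG00000000003", "ENSG00000000004")

result <- compare_gene_lists(dev_ids, staging_ids)
print(result)
```

With the real files, the two vectors could come from `readLines()` on each path, assuming each file holds one ENSG ID per line and no header.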

Here's what I've done to create these files:

  1. Visit the Gene page on the CFDE Search Portal
  2. Export the search results to a csv
  3. Read the CSV file into R. Extract only the ENSG IDs. Save the list as a .txt doc
  4. Update the Snakefile with the new data. Attempt to submit. Get error messages about failed genes.
  5. Make a list of the failed genes. Remove the failed genes from the list of ENSG IDs in the .txt doc. Update the Snakefile. Resubmit. Repeat until successful.

[Screenshot: Screen Shot 2022-07-14 at 2 58 56 PM]

library(tidyverse)

# Genes reported as failed in the error messages (steps 4 and 5)
failedgenes <- c("ENSG00000093134", "ENSG00000164393", "ENSG00000184293",
                 "ENSG00000203812", "ENSG00000188707", "ENSG00000221995",
                 "ENSG00000214534", "ENSG00000225932", "ENSG00000244693",
                 "ENSG00000256374", "ENSG00000263464", "ENSG00000105501",
                 "ENSG00000161133")

# Genes listed in missing.txt are excluded as well
missinggenes <- read.table("./data/inputs/missing.txt")

# Gene.csv is the search-result export from the portal's Gene page
df <- read.csv("./data/inputs/Gene.csv") %>%
  arrange(id) %>%
  filter(!id %in% failedgenes) %>%
  filter(!id %in% missinggenes$V1) %>%
  select(id)
head(df)

# Write one ENSG ID per line, with no header, row names, or quotes
write.table(df,
            "./data/inputs/STAGING_PORTAL__available_genes__2022-07-13.txt",
            row.names = FALSE, col.names = FALSE, quote = FALSE)
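
Because each resubmission round is slow, it may help to sanity-check the ID list before updating the Snakefile. The sketch below is one possible check, not part of the existing workflow: `validate_gene_ids` is a hypothetical helper, the sample IDs are placeholders, and the pattern assumes unversioned human Ensembl gene IDs (`ENSG` followed by 11 digits), so versioned IDs such as `ENSG...` with a `.1` suffix would be flagged as malformed.

```r
# Hypothetical helper: flag entries that would likely cause trouble later.
validate_gene_ids <- function(ids, failed = character(0)) {
  list(
    # Anything that is not "ENSG" + 11 digits (e.g. a stray header line)
    malformed    = ids[!grepl("^ENSG[0-9]{11}$", ids)],
    # IDs that appear more than once
    duplicated   = unique(ids[duplicated(ids)]),
    # Known-failed IDs that are still in the list
    still_failed = intersect(ids, failed)
  )
}

# Placeholder input for illustration only
ids <- c("ENSG00000000001", "ENSG00000000002", "ENSG00000000002",
         "ENSG0000001", "V1")
failed <- c("ENSG00000000001")

check <- validate_gene_ids(ids, failed)
print(check)
```

In practice `ids` could be read back from the written .txt file with `readLines()`, and `failed` would be the `failedgenes` vector; all three elements of the result should be empty before resubmitting.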

raynamharris added the gene cv term label Aug 13, 2022