v0.4b

@hitz hitz released this 04 Feb 00:29
· 8 commits to dev since this release

This is the first ClickHouse-ready release of the IGVF Catalog backend.

Notable changes:
regulatory_regions renamed to genomic_elements (BREAKING CHANGE)
Numeric fields that carried the :long string no longer include it (BREAKING CHANGE)
Underscores replaced with dashes in API endpoints (BREAKING CHANGE)
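
For clients written against earlier releases, the three breaking changes above could be handled with a small migration helper. This is only a sketch under stated assumptions: the function names are hypothetical, the example path and field name are illustrative rather than taken from the Catalog API, and it assumes that ":long" appeared as a suffix on field names and that query parameter names are unchanged.

```python
def migrate_endpoint(url: str) -> str:
    """Rewrite a pre-v0.4b endpoint path to its assumed v0.4b form."""
    # Only touch the path; query parameter names are assumed unchanged.
    path, sep, query = url.partition("?")
    # regulatory_regions was renamed to genomic_elements.
    path = path.replace("regulatory_regions", "genomic_elements")
    # Underscores in endpoint paths were replaced with dashes.
    path = path.replace("_", "-")
    return path + sep + query


def strip_long_suffix(record: dict) -> dict:
    """Drop an assumed ':long' suffix from the keys of a response record."""
    suffix = ":long"
    return {
        (key[: -len(suffix)] if key.endswith(suffix) else key): value
        for key, value in record.items()
    }


# Hypothetical inputs, for illustration only.
print(migrate_endpoint("/regulatory_regions/genes?gene_id=X"))
print(strip_long_suffix({"pos:long": 5, "chr": "chr1"}))
```

Keeping the path rewrite separate from the field rename lets a client adopt each change independently.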

Full rewrite of loading to JSONL
ClickHouse DB in alpha release: https://datastore.catalog.igvf.org/
Partial release of Starita VAMP-seq data
All edges have “name” and “inverse name” representing semantic nature of connection
Bug fixes and API updates

  • variants/coding-variants API endpoint added
  • coding-variants/phenotypes (VAMP-seq data, only for CYP2C19) API endpoint added
  • pathways and pathways/pathways API endpoints added
  • genes/predictions API endpoint added
  • variant rsid added to /variants/phenotypes response
  • motif endpoints return protein complexes
  • added ClinGen allele registry numbers to variant nodes
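
To illustrate the JSONL loading format and the new edge naming, here is a minimal sketch of reading edge records and rendering them in either direction. The key names `name` and `inverse_name` are assumptions based on the description in these notes, `_from` and `_to` follow the ArangoDB edge convention, and the sample record is invented for illustration; the actual Catalog files may differ.

```python
import io
import json


def read_jsonl(stream):
    """Yield one parsed JSON object per non-empty line of a JSONL stream."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


def describe_edge(edge: dict, reverse: bool = False) -> str:
    """Render an edge as a sentence using its name or inverse name."""
    if reverse:
        return f"{edge['_to']} {edge['inverse_name']} {edge['_from']}"
    return f"{edge['_from']} {edge['name']} {edge['_to']}"


# Hypothetical edge record; identifiers and labels are not from the Catalog.
sample = io.StringIO(
    '{"_from": "genes/G1", "_to": "pathways/P1", '
    '"name": "belongs to", "inverse_name": "has part"}\n'
)
for edge in read_jsonl(sample):
    print(describe_edge(edge))
    print(describe_edge(edge, reverse=True))
```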

LONG VERSION:

Release notes - Catalog - v0.4b

Story

DSERV-504 change coding_variants name/API

DSERV-507 coding_variants_proteins edge names

DSERV-508 complexes_proteins edge names

DSERV-510 GO annotations (complexes_terms, go_terms_annotations) edge names

DSERV-511 diseases_genes edge names

DSERV-512 genes_genes & mm_genes_mm_genes edge names

DSERV-514 genes_mm_genes edge names

DSERV-515 genes_pathways edge names

DSERV-516 genes_terms (depMap, rename to genes_biosamples) edge names

DSERV-517 genes_transcripts, transcripts_proteins edge names

DSERV-519 motifs_proteins edge names

DSERV-521 ontology_terms edge names

DSERV-523 pathways_pathways edge names

DSERV-524 proteins_proteins edge names

DSERV-527 fix all regulatory_region biosample hyper edges

DSERV-531 variants_coding_variants edge names

DSERV-532 variants_diseases + _genes edge names

DSERV-533 variants_drugs +_genes edge names

DSERV-534 variants_genes edge names + _terms (biosamples) (QTLs)

DSERV-535 variants_phenotypes (+_studies) GWAS edge names

DSERV-536 variants_proteins (+terms/biosamples) pQTLs/ASB edge names

DSERV-538 variants_variants edge names

DSERV-541 Add ClinGen allele registry to variants JSONL

DSERV-545 data loading inconsistency in eQTL, sQTL, caQTL and pQTL

DSERV-558 Create /variants/coding_variants endpoint

DSERV-564 Make gene query filters consistent for gene edge endpoints

DSERV-565 Add query filters for proteins/proteins endpoint

DSERV-567 Add biosample filter for regulatory_regions_genes endpoints

DSERV-569 set up monitoring for catalog servers

DSERV-570 load clickhouse db from JSONL

DSERV-571 test load arangodb from JSONL

DSERV-580 deduplicate pathway data

DSERV-582 ESLint not working from pre-commit

DSERV-589 Find adapters that use biocypher yield pattern and replace with JSON writing and loading

DSERV-590 gene related endpoints should avoid override behavior

DSERV-595 Create /genes/predictions endpoint similar to /variants/predictions

DSERV-596 Create aggregate allele frequency per region endpoint

DSERV-597 Build API for genes structure

DSERV-598 need edges from gene_structure collections to transcripts

DSERV-601 load SEM predictions from Boyle lab

DSERV-605 Create API for pathway

DSERV-609 update organism field for pathways, pathways_pathways, and genes_pathways.

DSERV-611 Load JSONLs into S3 for each collection and create a source file with data.igvf.org links for each dataset

DSERV-613 API for genes_pathways

DSERV-614 API for pathways_pathways

DSERV-615 load coding variants abundance scores from Starita

DSERV-619 adjust motifs/proteins end points to accommodate complexes

DSERV-623 remove proteins_transcripts from transcripts_proteins collection

DSERV-631 adjust variants/proteins end points to accommodate complexes

DSERV-632 Include variant rsid in the /variants/phenotype response

DSERV-637 create system for release tags or branches

DSERV-640 Refactor SEM motif and SEM prediction adapters.

DSERV-666 get rid of :long fields in adapters

DSERV-674 reload Starita data while enumerating amino acid variants based on 2x and 3x mutations

DSERV-680 get rid of :long fields in APIs

DSERV-705 update index used in API

DSERV-717 Rename regulatory regions API endpoints to genomic elements

DSERV-729 import genomic elements data into clickhouse

DSERV-740 simple coding_variants/phenotypes API

Task

DSERV-618 document JSONL loading process accounts, buckets, instances

Epic

DSERV-649 CATALOG: actually load all ArangoDB collections from JSONL

Bug

DSERV-566 Fix orphanet_association_type filter for diseases/genes endpoint

DSERV-591 remove unused gnomad adapter

DSERV-592 ReactomePathway adapter failing due to 404 from Reactome API

DSERV-606 Variants/predictions endpoint failing for retired genes AND missing pagination query param

DSERV-621 genes/diseases working differently on filters from clinGen vs orphanet

DSERV-624 fix limit bug in query code

DSERV-625 in API for coding_variants coding_variants/variants changes

DSERV-629 replace underscores with dash in API

DSERV-702 fix collection variants_proteins and variants_genes index

DSERV-732 coding_variants_proteins collection points to ENSP not UNIPROT