-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Postgress database query question #634
Comments
Here another example. You have two tables "rnc_sequence_regions_active_mapped" and "rnc_sequence_regions_active_provided" within the rnacen schema. I am assuming one or both of these is related to "SequenceRegionActive" in the online schema, but it is not clear which, and the tables contain a column not listed online, "providing_databases." |
Hi, the query we use is: SELECT
json_build_object(
'assembly_id', :'assembly_id',
'region_id', max(regions.region_name),
'rna_id', max(pre.id),
'description', max(pre.short_description),
'rna_type', max(pre.rna_type),
'databases', regexp_split_to_array(max(pre."databases"), ','),
'providing_databases', array_agg(ac.database),
'chromosome', max(regions.chromosome),
'strand', max(regions.strand),
'identity', max(regions.identity),
'was_mapped', bool_or(regions.was_mapped),
'exons', array_agg(distinct exons.*)
)
FROM rnc_rna_precomputed pre
JOIN rnc_sequence_regions_active regions
ON
regions.urs_taxid = pre.id
JOIN rnc_sequence_exons exons
ON
exons.region_id = regions.id
join rnc_accession_sequence_region sra
on sra.region_id = regions.id
join rnc_accessions ac
on sra.accession = ac.accession
WHERE
pre.is_active = true
AND regions.assembly_id = :'assembly_id'
GROUP BY regions.id
ORDER BY max(regions.chromosome), max(regions.region_start), regions.id The result of this query is turned into GFF/BED as needed. In the database we have a table of provided coordinates, and of mapped coordinates. We combine these to produce the final coordinates. Note that the table name in the schema diagram follows python naming conventions, while the table names do not. We probably should update the diagram to indicate table names, but they are generally a simple modification of the table names. |
Could you provide an example query for how you create the GFF or GTF download for the human annotations.
Please could you also confirm that https://rnacentral.org/static/img/rnacentral_latest_schema.png is the latest schema. For example, the schema shows "RnaPrecomputed" as the main table, yet the table available through the postgress interface is called "rnc_rna_precomputed".
Thank you!
The text was updated successfully, but these errors were encountered: