All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and as of version 1.0.0, follows semantic versioning.
- Replace Pandas `NaN` with Python `None`. When sending to MISO, `None` gets converted to `null`, which is what MISO expects.
- Fix MISO URL formatting for MISO 2.23.0
- Update Werkzeug package due to security issue
- Bump qc-etl to v1.28
- Remove crosscheckfingerprints kludge from previous release
- Hide closest library from swap view and show expected library instead
- Insert an annoying kludge because gsi-qc-etl does not handle crosscheckfingerprints correctly
- Bump gevent version due to security issue
- Fixed serious logic bug during swap filtering. Before this, the closest LOD was used to filter for swaps. This was wrong. The furthest-away LOD needs to be looked at for swap filtering. This bug hid orphan swaps where the closest match was in the "can't make a call" ambiguous zone of LOD -20 to 20.
- Additional columns to the export table. Needed for top-up prevention calculations.
- Add comments to requirements.txt to explain `~=` operator
- Fixed deprecated `DataFrame.max` parameter
- Bumped gsi-qc-etl version to 1.27 and removed temporary PyYAML version fix
- One more column for the All Samples table: Coverage for single lane TAR
- Bcl2barcode view has been greatly simplified by using the `bcl2barcodecaller` gsiqcetl cache
- Downgrade PyYAML dependency to sidestep Cython 3.0 breaking change
- Yet more columns for the All Samples tables
- Removed unnecessary Docker instructions
- Update gsiqcetl to V1.24, which allows for Flask to be updated to latest version (removes `click` dependency conflict)
- Remove all ichorcna usage.
- Made `dnaseqqc` an optional archival source
- Update Dashi to gsiqcetl V1.23
- More columns in All Samples tables
- Fixed dependency versions to speed up `pip install`
- Allow Dashi to pull from more than one cache source
- Update Dashi to gsiqcetl V1.22
- Edited "All Samples" tables after user feedback
- Allow `crosscheckfingerprints` cache to be missing from archival source
- Remove the Raw Data Table
- Reverted to rnaseqqc2 V2 cache due to column naming bug
- Fixed Merged Pinery Lims ID being blank in exported csv
- Introduce All Samples table
- Removed SARS-CoV-2 view
- Dash -> 2.8.1
- werkzeug -> 2.2.3
- dash_bootstrap_components -> 1.4.0
- Remove JIRA buttons due to compatibility issue
- Corrected caches displayed in view footer
- Removed dangling ichorcna usage in single lane and call ready WG
- Add the PG code (Plasma Whole Genome) to the WG view
- Changed `Total Clusters (Passed Filter)` to `Pipeline Filtered Clusters`
- Removed `Total Reads (Passed Filter)`, as it's not used and (like clusters) does not count total reads.
- Switched Call Ready TAR Median Target Coverage to Mean Bait Coverage
- No data for swap view doesn't cause crash (stage cache has become empty)
- Remove 'Purity' from Call-Ready and Single-Lane WGS
- Fix bug where libraries without swaps were completely excluded from swap view
- Switch Dashi to qc-etl v1 caches
- Update Python Docker version to 3.10
- Swap view no longer loads huge data into memory and does heavy computation. Done by qc-etl.
- Use gsi-qc-etl v1.9, which uses a Pandas version supported by Python 3.10
- Remove bamqc3 caches from Dashi
- Fixed werkzeug version
- Change on target graphs for call-ready and single-lane tar to use HsMetrics PCT_SELECTED_BASES
- Switch median insert size to mean insert size
- Display Shallow Whole Genome libraries
- dnaSeqQC cache loading alongside BamQC4 (picking the newest record if it's found in both caches)
- Renamed columns in swap view
- Unticking 'only show swaps' shows all projects in swap view
- Added project name to swap view (does not always match Alias)
- Showing meta data along library alias in swap view
- Sample provenance is loaded from cache on disk rather than Mongo DB
- TS acronym is TAR now
- Mongo Provenance can be supplied as a hd5 file via MONGO_FILE env
- Wrong swap was being shown if patient has had only one library sequenced.
- Special columns for Single Lane TS were not exported in CSV table.
- Added "Coverage per Gb" column to Call Ready TS data table
- User messages can be displayed in each view.
- Add checkbox to swap view to enable seeing all comparisons, not just those marked as swaps.
- Added sample hierarchy information to swap view (hidden by default to avoid clutter)
- Made swap view more compact to see all columns on the screen
- Fixed swap view bug where libraries from patients with only a single library were ignored
- `fill_in_color_col` and `fill_in_shape_col` take an arbitrary column for color/shape
- Added Runscanner Illumina Flow Cell view (not turned on until GDI-2080 is solved)
- Remove bcl2barcode WIP and fix x-axis being cut off/missing tick labels
- Clicking on Processing count on front page shows library names
- Sample Swap view shows LOD scores of closest libraries + better column formatting
- Dashi license
- Sample Swap algorithm. The original approach used LOD cutoff, which produced too many false positives (especially with WG/WT comparisons). New algorithm looks for the most similar libraries and tags a swap if those are not from the same patient.
- Shesmu input link to the `status` page to view JSON that's passed to ETL
- QC-ETL caches are now loaded in functions. Failure to load a cache will only impact views that call function. Previously, failure to load cache crashed all views.
- Sample swap view
- Gracefully dealing with caches that fail to load. Affected view shows error.
- Bumped QC-ETL to 0.53.0
- Fixed broken status page
- Bumped QC-ETL to 0.51.0
- `/status` page showing date of the latest cache and cache errors
- Alert pop up when run requested via URL query does not exist
- Fix calculation of tumour/normal coverage cutoff values for Call Ready WGS
- Add " + Intron" to Mean Insert Size fields on RNA reports
- 'view all' link leading to page listing all runs on front page
- Any Call-Ready graphs or variables based on unmapped or non-primary reads. These are filtered out during BAM merging.
- Unmapped and non-primary read percentages are calculated using the BamQC `meta` columns (https://github.com/oicr-gsi/bam-qc-metrics/blob/master/metrics.md#summary-of-fields). `non primary reads` column will always be 0, with `non primary reads meta` having the actual number
- Use BamQC Total Reads as denominator for On Target Percentage rather than FastQC. This is because BamQC On Target Reads calculations cannot be directly compared to the actual total (FastQC) machine reads (due to BamQC filtering and non-primary reads)
- Total PF Clusters in cfmedip come from fastqc
- Spelling fixes
- Fix colour scheme on png export
- Sort by run date
- Fix data labels
- Bumped gsi-qc-etl to 0.44.2 (has correct median coverage calculations)
- Total PF Clusters to cfmedip view
- Bumped gsi-qc-etl to 0.43.2 (has correct median calculations)
- Switched cfmedip insert size median percentile to correct columns
- MISO URLs now stored in config repo
- 'QC in MISO' button now points to MISO Prod
- Added insert size graph to cfmedip
- Bumped gsi-qc-etl to 0.42.0
- Added 'QC in MISO' button
- MISO_URL in .env required
- Add clusters-per-sample thresholds
- 'Missing' info no longer appears in Failed Samples table
- Bumped gsi-qc-etl to 0.41.0
- Call ready graphs now use clusters instead of reads
- Single lane TS cutoffs were switched to clusters
- Use Total Clusters instead of Total Reads in single lane views
- New thresholds for cfMeDIP report
- Added Tissue Origin as colour-by criteria
- Updated thresholds in preparation for QC Handoff feature
- Tumor purity graph for Call Ready WGS
- Replace BamQC's TotalReads with FastQC's TotalSequences in applicable reports
- Renamed 'Approve this run in MISO' button to 'View run in MISO' for clarity
- Every data point now has a unique x-axis value, no longer stack vertically
- Row count at bottom of data tables
- Bumped ETL version to 0.38.0
- `bamqc` cache dependency
- Callability graphs show normal and tumor coverage cutoffs
- Coverage graphs added to Single Lane WG and TS
- Remaining Call-Ready reports converted to use subplots
- Call-Ready TS 1 and 2 combined into one report via subplots
- Single-lane pages converted to use subplots
- Added columns to SARS-CoV-2 raw data table
- Requires grouped_run_status and grouped_project_status files for front page
- Displays only date instead of datetimes on front page
- Bumped to ETL v0.37.0 (bcl2barcode fix)
- Home button on navbar
- Removed Insert Size columns from raw data table on RNAseq reports
- Updated Dash to 1.13 (supports subplot sorting)
- Navbar now appears on front page, eliminates need for Reports list
- Converted bcl2fastq report to bcl2barcode
- New required environment variable 'BARCODES_STREXPAND'
- Call ready pages use median coverage
- Adjusted JIRA button behaviour and wording
- Empty graphs start at 0
- Sorting is independent of colour/shape by
- First and second sort options are the same
- The Call Ready linking Color to First Sort
- Alphabetical second sort option (Sample Name, Merged Lane)
- 2 SARS-CoV-2 mapping % graphs: % of host depleted, % of total reads
- SARS-CoV-2 On Target % graph
- Removed FastQC dependency for Total Reads from Single Lane RNA-Seq
- Made date range clearable
- Help button to projects list
- Cutoffs are medians on the Median Insert Size graph
- Single-Lane RNA-seq Correct Strand % is a percentage
- Removed Purity from Call-Ready TS 2
- Pulls run and project dump from Shesmu to create tables on front page
- Switched Insert Mean to Insert Median with 10/90 percentiles
- y-axis on graphs is always set to auto-scale, including on % graphs
- Projects may now be specified through the url for all reports. Add 'project=' to the query portion of the URL.
- Style changes to bar plots
- Single Lane RNAseq now uses RNASeqQC2
- JIRA link will prompt for login if not logged into JIRA
- Show only On & Near Bases on TS Bar graph
- Use BamQC4 as well as BamQC3
- cfMeDIP report
- Locks down dependency versions to specific versions to avoid breaking changes. Please run `pip install -r requirements.txt --upgrade --no-cache-dir`
- Improved styling for graphs
- SARS-CoV-2 coverage graph uses median coverage with 10/90 percentile error bars
- No legends for cutoff lines nor highlighted samples, to preserve graph widths
- SARS-CoV-2 On Target bar chart no longer shows unmapped numbers
- Graphs now have white background colour
- SARS-CoV-2 report now more accurately gets all Samples
- SARS-CoV-2 data table works
- SARS-CoV-2 Coverage Percentiles graph has x axis labels
- SARS-CoV-2 Percentile graph is colourable
- SARS-CoV-2 has adjustable cutoff lines
- SARS-CoV-2 report sortable, colourable by Sequencing Control Type
- SARS-CoV-2 sorting fixed
- SARS-CoV-2 report
- Purity & Ploidy removed from Single Lane WGS
- Pinery URL in docker
- Proof-of-principle Genome Build features for Single Lane reports
- RNA-seq 5/3 bias is on log scale
- Single-Lane TS uses bamqc3
- RNA-seq uses FastQC for Total Reads
- Colour palette now colourblindness-friendly
- add cr-WGS Coverage per Gb graph
- add stacked bar chart for Percentage On/Near/Off Bait to Call-Ready TS 2
- Standardized graph axes and titles
- Updated some underlying data calculations
- Removed Purity & Ploidy graphs from call-ready WGS
- Wrapped graphs and tables in tabs for better table access
- Buttons to create JIRA tickets
- Update how % rRNA contamination is calculated for call-ready RNA
- Use merged library fields as x-axis on call-ready graphs
- On call-ready pages, pin "Colour By" value to "First Sort" value
- Bcl2Fastq Index QC graph
- Set `DASHI_LOG_TO_CONSOLE=True` in `.env` file to log to console
- Set `USE_BLEEDING_EDGE_ETL=1` in `.env` file to use gsi-qc-etl@master (development only)
- Filter logs now remove `end_date` if it is the current date
- Filter logs now report `end_date` as a date rather than datetime
- Filter logs now report `["all_runs"]` when all runs have been selected; etc for other dropdowns
- Pinery entries with no QC data are excluded from plots and data table
- Show Names drop down menu can display info from multiple fields
- Put newline in `plot_builder.generate` to separate color by and shape by
- `nav_handler` and `content_handler` no longer throw exception on empty path
- `Add All` button for Library Designs on WGS report now works
- Remove blank Run from TS and RNA dropdowns
- `content_handler` returns two values, matching the callback promise
- Added footer containing version number
- Added ability to search for data of runs from last X days using URL query `?last=Xdays`
- Log filter parameters for each search
- Added AS, CH, CM, NN libraries to pre-WGS page
- Loading animation no longer freezes during page load
- Changed 'Pre-X' page titles to 'Single-Lane X' and updated URLs to match
- Negative numbers are no longer valid for cutoff inputs
- pre-WGS: BamQC and ichorCNA data are now reported independently
- pre-RNA: rRNA contamination now fails above cutoff (not below)
- pre-RNA: Get total reads data from FastQC (unique reads)
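
For readers unfamiliar with the `?last=Xdays` URL query mentioned above, the following is a minimal sketch of how such a query could be turned into a date range. The function name `date_range_from_query` and the choice to return `None` for missing or malformed values are assumptions made for illustration; this is not Dashi's actual implementation.

```python
import re
from datetime import date, timedelta
from typing import Optional, Tuple
from urllib.parse import parse_qs


def date_range_from_query(query: str, today: date) -> Optional[Tuple[date, date]]:
    """Hypothetical helper: map '?last=7days' to a (start, end) date pair.

    Returns None when the 'last' parameter is absent or not of the
    form '<digits>days'.
    """
    params = parse_qs(query.lstrip("?"))
    values = params.get("last")
    if not values:
        return None
    match = re.fullmatch(r"(\d+)days", values[0])
    if match is None:
        return None
    days = int(match.group(1))
    return today - timedelta(days=days), today


# Example: runs from the 7 days before 10 March 2020
week = date_range_from_query("?last=7days", date(2020, 3, 10))
```

Passing `today` explicitly, rather than calling `date.today()` inside the helper, keeps the sketch deterministic and easy to test.
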
Dashi alpha release. Many features were added, and many changes were made. See commit history for more details...
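
As background for the `NaN`-to-`None` entry at the top of this changelog, the sketch below shows why the replacement matters, assuming the data is sent to MISO as JSON. It uses plain Python records as a stand-in for a Pandas table (Pandas stores missing floats as float `NaN`). The helper `nan_to_none` and the sample record are hypothetical and not taken from Dashi's code.

```python
import json
import math


def nan_to_none(record: dict) -> dict:
    """Hypothetical helper: copy a record, replacing float NaN with None."""
    return {
        key: None if isinstance(value, float) and math.isnan(value) else value
        for key, value in record.items()
    }


# Illustrative record with a missing coverage value
record = {"library": "LIB_1", "coverage": float("nan")}

# Python's json module emits the non-standard token NaN by default
raw = json.dumps(record)

# After replacement, None is serialized as JSON null
clean = json.dumps(nan_to_none(record))
```

Here `raw` contains the bare token `NaN`, which is not valid JSON, while `clean` contains `null`.
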