Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Added

  • Improved V&V interface

    • Plugin support for protocols & checks
    • Specification generation via dpt validation spec
    • Generalized validation runner via dpt validation run
    • Manual check interface via dpt validation manual-checks
    • CLI interface implemented using click
  • Started to use logging via loguru
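The command names listed above suggest a nested click layout. The following is only a minimal sketch of how such a layout can be declared with click groups. The three subcommand names come from the entries above; the function names and the placeholder bodies are assumptions and do not reflect the actual dp_tools implementation.

```python
import click


@click.group()
def dpt():
    """Top-level group (stands in for the `dpt` entry point)."""


@dpt.group()
def validation():
    """Groups the validation subcommands."""


@validation.command("spec")
def spec():
    # Placeholder body: the real command generates a specification.
    click.echo("placeholder: validation spec")


@validation.command("run")
def run():
    # Placeholder body: the real command runs a validation protocol.
    click.echo("placeholder: validation run")


@validation.command("manual-checks")
def manual_checks():
    # Placeholder body: the real command handles manual checks.
    click.echo("placeholder: validation manual-checks")


if __name__ == "__main__":
    dpt()
```

Nesting one group inside another is what lets the three commands share the `dpt validation` prefix.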

Removed

  • Microarray specific protocols/checks (now implemented as plugins outside dp_tools)

Added

  • Ability to inject columns during runsheet generation

Added

  • Microarray (Agilent 1 Channel) V&V protocol
  • Pandera as dependency for better validation tooling

Changed

  • BulkRNASeq runsheet validation enhanced
    • Upgraded from Schema to Pandera
    • Added checks for dataset metadata columns like 'paired_end'
    • Added sanity check for the optional 'read2_path' column

Fixed

Added

  • Runsheet generation for methylSeq ISA archives

Changed

  • GLDS API usage now treats the 'OSD' accession ID, rather than the 'GLDS' accession ID, as the study ID. This is consistent with the recent release of the OSDR.

Fixed

Fixed

  • Incorrect unit detection for runsheet generation (#14)

Added

  • Stdout logging for scripts, to better explain what is happening while a script runs

Fixed

  • Missing Microarray technology valid combination and handling of multiple valid combinations

Fixed

  • Staging runsheets failing to extract unit columns
  • V&V crash caused by factor columns being inferred as numeric. They are now correctly inferred as string values.

Added

  • Integrity check for gzipped files to bulkRNASeq checks and protocol
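One common way to test gzip integrity is to decompress the whole stream and see whether an error is raised. The sketch below uses only the standard library; the function name and chunk size are illustrative assumptions, and the actual check in dp_tools may work differently.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1024 * 1024):
    """Return True if the file at `path` decompresses without error."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end in chunks so large files are never fully in memory.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers gzip.BadGzipFile and unreadable or missing files;
        # EOFError signals a truncated stream.
        return False
    return True
```

Reading to the end matters because a truncated file often has a valid header and only fails once the decompressor reaches the missing trailer.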

Changed

  • Pinned Pandas version to 1.4.4 (prior: no pin, most recent version installed)
    • Pandas 1.5 changes the checksums of pandas objects, which would require updating every test that includes a checksum (planned for the future)

Fixed

  • False V&V halt flagging: the micro sign is now whitelisted (better in sync with R's make.names function)
  • Expected location of SampleTable.csv and ERCC_SampleTable.csv in

Fixed

  • False V&V halt flagging: Greek characters are now whitelisted (better in sync with R's make.names function)
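To illustrate the idea behind such a whitelist: R's make.names keeps letters, digits, '.' and '_' and replaces other characters. A rough Python approximation is sketched below. The function name is hypothetical, Python's Unicode-based `str.isalnum` only approximates R's locale-dependent notion of a letter, and this is not the check used in dp_tools.

```python
def chars_outside_whitelist(name):
    """Return the sorted characters in `name` that are not letters, digits, '.' or '_'.

    str.isalnum() is Unicode-aware, so the micro sign and Greek letters
    count as letters and are not reported.
    """
    return sorted({ch for ch in name if not (ch.isalnum() or ch in "._")})


if __name__ == "__main__":
    print(chars_outside_whitelist("dose_10µg.rep1"))
    print(chars_outside_whitelist("liver-rep 1"))
```

A name that yields an empty list here would not be expected to trigger a halt on character grounds alone.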

Fixed

  • Incorrect detection of has_ERCC from ISA Archives
    • Example Impacted GLDS: 161,162,163,173
  • Runsheet generation failing for different variations of raw reads data column names
    • Example Impacted GLDS: 105,138

1.1.0

First Production Release

  • Prior versions tagged 1.0.0 were actually development-style releases
  • Going forward, only production releases will have tags without 'rc' (release candidate) in the name

Quality Updates

  • Various flag messages improved
  • Documentation updated

Added

  • Check related to multiQC samples inclusion

1.0.8rc

Changed

  • Updated GeneLab filename to url mapping to utilize the GeneLab public API
    • Addresses removal of prior-used deprecated endpoints

1.0.7rc

Dockerfile

  • Added samtools, which is needed for certain checks

Checks

Fixed

  • check_contrasts_table_rows: message no longer introduces extra newlines into log

Planned

rc1.0.6

Added

BulkRNASeq V&V Reporting

  • A validation protocol that runs on a BulkRNASeq dataset model
  • Includes generation of report files

BulkRNASeq Data Model From Nextflow RNASeq Consensus Pipeline

  • A set of multi-stage loaders to create a data model
  • Includes: validation system and multiQC powered data extraction

Fixed

BulkRNASeq Reporter File Generation

  • Data assets tagged with file categories for reporter file export including:
    • md5sum table
    • curation tables [GeneLab internal use]