An integration analysis of bulk RNA-seq data from human skeletal muscles (1221 muscles x 9231 genes).
- Data size: 1221 muscles x 9231 genes
- Data source: GTEx database (n = 803), GEO database (n = 291), Helsinki (n = 127)
- Phenotype: 292 myopathies, 929 controls
- Sequencing method: 930 samples sequenced from mRNA (polyA) and 291 samples sequenced from total RNA (ribo)
- Only human skeletal muscle tissue (no cell lines or organoids)
- Bulk RNA sequencing by high-throughput techniques (no chip arrays or single-cell data)
- Raw count data preserved (datasets shared in transformed count format were excluded)
- DEG: Differential gene expression (DEG) analysis results exported from edgeR.
- Meta: Metadata (Data_source/Geo_accession/Author_Date/PMID/Sample_id/Gsm_accession/Casual_gene/Phenotype/Biopsy site/sequencing method/sequencing platform/Se/Age range) for the integration dataset.
- TAPE: Tissue deconvolution results annotated with two human skeletal muscle single-cell datasets (Tabula Sapiens and GSE143704).
- Helsinki data: 127 skeletal muscle bulk RNA-seq samples from Helsinki (Group Udd, Folkhälsan Research Center, University of Helsinki). Among these samples, 39 have also been reported as GSE151757.
- GEO data: 291 skeletal muscle bulk RNA-seq samples downloaded from the GEO database (GSE115650, GSE140261, GSE175861, GSE184951, GSE201255, GSE202745).
- GTEx data: 803 skeletal muscle bulk RNA-seq samples downloaded from GTEx Analysis V8 (dbGaP Accession phs000424.v8.p2). The main biopsy site is the gastrocnemius muscle, 2 cm below the patella.
- Integration data: Data processed during the integration process.
- Validation: Validation data, either downloaded from the supplementary files of the GEO datasets used or generated from the integration dataset.
- Python (3.8.1): Scanpy (high-dimensional data processing), gseapy (pathway analysis), TAPE (cell-type deconvolution), conorm (count normalization).
- R (4.2.2): edgeR (DEG analysis), ComBat-seq (batch adjustment), DescTools (Jonckheere trend test).
Phenotype | Sample size | Data source | Sequencing method |
---|---|---|---|
Control (accident death) | 31 | GTEx | mRNA |
Control (unexpected death) | 203 | GTEx | mRNA |
Control (intermediate death) | 46 | GTEx | mRNA |
Control (ventilator case) | 424 | GTEx | mRNA |
Control (slow death) | 87 | GTEx | mRNA |
Control (others) | 111 | GTEx/GEO | mRNA/total RNA |
Control (amputee) | 24 | Helsinki | mRNA |
Control (hyperCKemia) | 3 | Helsinki | mRNA |
FSHD | 61 | GEO | total RNA |
DM1 | 44 | GEO | total RNA |
LGMD R12 | 41 | GEO | total RNA |
CDM | 36 | GEO | total RNA |
Titinopathy | 31 | Helsinki | mRNA |
IBM | 28 | Helsinki (GEO) | mRNA |
DMD | 5 | GEO | total RNA |
BMD | 5 | GEO | total RNA |
Actinin-2 myopathy | 5 | Helsinki | mRNA |
Myopathy (HNRNPA1) | 5 | Helsinki | mRNA |
SMPX myopathy | 4 | Helsinki | mRNA |
Myopathy (OBSCN) | 1 | Helsinki | mRNA |
Myopathy (TNPO3) | 1 | Helsinki | mRNA |
Distal ABD-filaminopathy | 1 | Helsinki | mRNA |
Myopathy (Unsolved) | 24 | Helsinki | mRNA |
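
The table can be cross-checked against the totals stated in the summary (1221 samples, 929 controls, 292 myopathies, and the per-source counts). The following is a minimal sketch of that arithmetic, with the sample sizes copied from the table by hand. The names `ROWS`, `STATED_SOURCE_TOTALS`, and `tally` are illustrative and not part of the dataset, and the split of the mixed "Control (others)" row is inferred from the stated totals rather than reported by the authors.

```python
# (phenotype, sample size, data source, is_control), copied from the table above.
ROWS = [
    ("Control (accident death)", 31, "GTEx", True),
    ("Control (unexpected death)", 203, "GTEx", True),
    ("Control (intermediate death)", 46, "GTEx", True),
    ("Control (ventilator case)", 424, "GTEx", True),
    ("Control (slow death)", 87, "GTEx", True),
    ("Control (others)", 111, "GTEx/GEO", True),  # mixed-source row
    ("Control (amputee)", 24, "Helsinki", True),
    ("Control (hyperCKemia)", 3, "Helsinki", True),
    ("FSHD", 61, "GEO", False),
    ("DM1", 44, "GEO", False),
    ("LGMD R12", 41, "GEO", False),
    ("CDM", 36, "GEO", False),
    ("Titinopathy", 31, "Helsinki", False),
    ("IBM", 28, "Helsinki", False),  # listed as "Helsinki (GEO)" in the table
    ("DMD", 5, "GEO", False),
    ("BMD", 5, "GEO", False),
    ("Actinin-2 myopathy", 5, "Helsinki", False),
    ("Myopathy (HNRNPA1)", 5, "Helsinki", False),
    ("SMPX myopathy", 4, "Helsinki", False),
    ("Myopathy (OBSCN)", 1, "Helsinki", False),
    ("Myopathy (TNPO3)", 1, "Helsinki", False),
    ("Distal ABD-filaminopathy", 1, "Helsinki", False),
    ("Myopathy (Unsolved)", 24, "Helsinki", False),
]

# Per-source totals as stated in the dataset summary.
STATED_SOURCE_TOTALS = {"GTEx": 803, "GEO": 291, "Helsinki": 127}


def tally(rows):
    """Return (n_controls, n_myopathies, samples per data-source label)."""
    controls = sum(n for _, n, _, is_control in rows if is_control)
    myopathies = sum(n for _, n, _, is_control in rows if not is_control)
    by_source = {}
    for _, n, source, _ in rows:
        by_source[source] = by_source.get(source, 0) + n
    return controls, myopathies, by_source


controls, myopathies, by_source = tally(ROWS)

# Inference, not a reported figure: samples each source must contribute to
# the mixed "Control (others)" row for the stated source totals to hold.
implied_split = {
    src: STATED_SOURCE_TOTALS[src] - by_source[src] for src in ("GTEx", "GEO")
}

print("controls:", controls, "myopathies:", myopathies)
print("per-source subtotals:", by_source)
print("implied split of 'Control (others)':", implied_split)
```

If the stated totals are correct, the two inferred contributions should add up to the 111 samples of the "Control (others)" row, which gives a quick consistency check on the table.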
- Control (accident death): Violent and fast death. Deaths due to accident, blunt force trauma, or suicide, with a terminal phase estimated at < 10 min [Healthy].
- Control (unexpected death): Fast death of natural causes. Sudden unexpected deaths of people who had been reasonably healthy, after a terminal phase estimated at < 1 hr (with sudden death from a myocardial infarction as a model cause of death for this category) [Healthy].
- Control (intermediate death): Intermediate death. Death after a terminal phase of 1 to 24 hrs (not classifiable as Hardy scale 2 or 4, i.e., fast death of natural causes or slow death); patients who were ill but whose death was unexpected [Diseased].
- Control (ventilator case): Ventilator case. All cases on a ventilator immediately before death [Diseased].
- Control (slow death): Slow death. Death after a long illness, with a terminal phase longer than 1 day (commonly cancer or chronic pulmonary disease); deaths that are not unexpected [Wasting].
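
The five GTEx control categories above follow the GTEx death classification (Hardy scale, codes 0 to 4). As a minimal sketch, the mapping from a Hardy code to the control label and bracketed health tag used here could be written as below. The names `HARDY_TO_CONTROL` and `label_control` are hypothetical, and sending missing or unrecognised codes to "Control (others)" is an assumption for illustration, not a documented rule of this dataset.

```python
# GTEx Hardy scale code -> (control label, health tag) as described above.
HARDY_TO_CONTROL = {
    0: ("Control (ventilator case)", "Diseased"),
    1: ("Control (accident death)", "Healthy"),
    2: ("Control (unexpected death)", "Healthy"),
    3: ("Control (intermediate death)", "Diseased"),
    4: ("Control (slow death)", "Wasting"),
}

# Assumption: samples without a usable Hardy code fall into this group.
FALLBACK = ("Control (others)", None)


def label_control(hardy_code):
    """Map a Hardy scale code (int, or a string such as "3") to a label pair."""
    if hardy_code is None:
        return FALLBACK
    try:
        code = int(hardy_code)
    except (TypeError, ValueError):
        return FALLBACK
    return HARDY_TO_CONTROL.get(code, FALLBACK)


for code in (0, 1, 2, 3, 4, None):
    print(code, "->", label_control(code))
```

Keeping the health tag next to the label makes it straightforward to group controls into healthy, diseased, and wasting subsets in downstream comparisons.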
We thank the participants and their families who donated their muscle tissues for research purposes. We would also like to extend special thanks to the authors of these publicly available muscle datasets, which will facilitate further research in the future.
Zhong, H., Sian, V., Johari, M., Katayama, S., Oghabian, A., Jonson, P. H., Hackman, P., Savarese, M., & Udd, B. (2024). Revealing myopathy spectrum: integrating transcriptional and clinical features of human skeletal muscles with varying health conditions. Communications Biology, 7(1), 438. https://doi.org/10.1038/s42003-024-06143-3