symbol column dropped from rowData of SegerstolpePancreasData (devel) #45

lgeistlinger · 2024-03-07T03:00:51Z

In Bioc release:

> library(scRNAseq)
> sce.seger <- SegerstolpePancreasData()
> rowData(sce.seger)
DataFrame with 26179 rows and 2 columns
                symbol                 refseq
           <character>            <character>
SGIP1            SGIP1              NM_032291
AZIN2            AZIN2 NM_052998+NM_001293562
CLIC4            CLIC4              NM_013943
AGBL4            AGBL4              NM_032785
NECAP2          NECAP2 NM_001145277+NM_0011..
...                ...                    ...
KIR2DL4        KIR2DL4 NM_001080772+NM_0022..
KIR2DS3        KIR2DS3              NM_012313
KIR2DS2        KIR2DS2 NM_001291696+NM_0123..
BIVM-ERCC5  BIVM-ERCC5           NM_001204425
eGFP              eGFP                   eGFP

In Bioc devel:

> library(scRNAseq)
> sce.seger <- SegerstolpePancreasData()
> rowData(sce.seger)
DataFrame with 26179 rows and 1 column
                           refseq
                      <character>
SGIP1                   NM_032291
AZIN2      NM_052998+NM_001293562
CLIC4                   NM_013943
AGBL4                   NM_032785
NECAP2     NM_001145277+NM_0011..
...                           ...
KIR2DL4    NM_001080772+NM_0022..
KIR2DS3                 NM_012313
KIR2DS2    NM_001291696+NM_0123..
BIVM-ERCC5           NM_001204425
eGFP                         eGFP

I think this causes OSCA.advanced and OSCA.workflows to break in devel @PeteHaitch @alanocallaghan


LTLA · 2024-03-07T03:11:30Z

Hm. I think I must have deemed the row names to be redundant with the symbol column and removed the latter to reduce the file size. To avoid breaking stuff, I can dynamically add it back in for the SegerstolpePancreasData function; however, fetchDataset() will still return the sans-symbol version, so people loading the dataset directly from the files (i.e., not through the per-dataset getters) will get a slightly different version of the dataset.

FYI fetchDataset() is going to be the way forward as it (i) avoids the need for contributors to write a getter function and (ii) eliminates the involvement of dataset-specific logic that can't be easily replicated in other frameworks like Python or JS.

Is Segerstolpe the only one? FWIW you can set legacy=TRUE and it'll pull from ExperimentHub for now.

lgeistlinger · 2024-03-07T03:19:48Z

If that's the way forward, we can also adapt the corresponding parts of the OSCA book to take the symbols from the row names. I can't say at this point whether other datasets are affected. The breakage comes from looking up the symbol column for ID mapping, which can be replaced by using the row names instead.
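For illustration, a minimal sketch of what such a lookup could look like. It assumes the row names hold the gene symbols, as in the output above. `get_symbols()` is a hypothetical helper, not part of scRNAseq, and the small mock object only stands in for the real dataset so that nothing needs to be downloaded.

```r
library(SummarizedExperiment)

# Hypothetical helper (not part of scRNAseq): use the "symbol" column of
# rowData when it exists, otherwise fall back to the row names.
get_symbols <- function(se) {
    rd <- rowData(se)
    if ("symbol" %in% colnames(rd)) rd$symbol else rownames(se)
}

# Tiny mock standing in for the devel SegerstolpePancreasData() output,
# i.e. rowData with a refseq column only.
genes <- c("SGIP1", "AZIN2", "CLIC4")
counts <- matrix(0L, nrow = 3, ncol = 2,
                 dimnames = list(genes, c("cell1", "cell2")))
mock <- SummarizedExperiment(
    assays = list(counts = counts),
    rowData = DataFrame(
        refseq = c("NM_032291", "NM_052998+NM_001293562", "NM_013943"),
        row.names = genes
    )
)

# Works without a symbol column ...
symbols <- get_symbols(mock)

# ... and the column can be restored for code that still expects it.
rowData(mock)$symbol <- symbols
```

Written this way, the same code should run against both the release and the devel version of the dataset, since it only reads the symbol column when it is present.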

LTLA · 2024-03-07T03:50:00Z

Added symbol back in 2.19.4, but only in SegerstolpePancreasData(), so fetchDataset() will still return the version without symbol.

alanocallaghan · 2024-03-07T13:49:35Z

Yeah, seems sensible to just use the rownames for OSCA purposes moving forward.

alanocallaghan · 2024-06-11T14:04:43Z

Think this is resolved now?

PeteHaitch mentioned this issue Apr 11, 2024

colData change for GrunPancreasData in BioC 3.19; worth making legacy = TRUE the default in BioC 3.19 #47

Closed
