pathway_heatmap can't extract columns past the end #118

luigallucci · 2024-09-18T12:40:10Z

Describe the Bug
Hi, I'm trying to make the heatmap directly from a picrust2 output file. I tried renaming the sample column to sample_name and several other variants, but nothing worked.
Error in pull():
! Can't extract columns past the end.
ℹ Location 1 doesn't exist.
ℹ There are only 0 columns.
Reproducible Example

annotated_kegg <- pathway_annotation(file = abundance_file, pathway = "KO", ko_to_kegg = TRUE)

heat <- pathway_heatmap(annotated_kegg, metadata, "Type")

Environment Information:

Operating System: macOS (osx-arm64)
R Version: 4.4.0
Package Version: latest


cafferychen777 · 2024-09-18T14:04:43Z

Dear l.gallucci,

Thank you for reporting this issue with the pathway_heatmap function in the ggpicrust2 package. To better assist you, I'll need some additional information:

Could you please share the first few lines of your abundance_file and metadata file? This will help me understand the structure of your data.
What are the dimensions (number of rows and columns) of your annotated_kegg and metadata dataframes?
Can you provide the full error message and traceback you're receiving?
To facilitate debugging, it would be extremely helpful if you could send your abundance_file and metadata file to [email protected]. Please make sure to remove any sensitive information before sharing.
Could you also share the output of sessionInfo() to provide more details about your R environment?

Once I have this information, I'll be able to reproduce the issue and work on a solution more effectively.

Thank you for your patience and cooperation in resolving this issue.

Best regards,
Chen Yang

luigallucci · 2024-09-18T14:12:09Z

sure.

<error/vctrs_error_subscript_oob>
Error in `pull()`:
! Can't extract columns past the end.
ℹ Location 1 doesn't exist.
ℹ There are only 0 columns.
---
Backtrace:
     ▆
  1. ├─ggpicrust2::pathway_heatmap(annotated_kegg, metadata, "Type")
  2. │ └─metadata %>% select(all_of(c(sample_name_col))) %>% pull()
  3. ├─dplyr::pull(.)
  4. ├─dplyr:::pull.data.frame(.)
  5. │ └─tidyselect::vars_pull(names(.data), !!enquo(var))
  6. │   └─tidyselect:::pull_as_location2(...)
  7. │     ├─tidyselect:::with_subscript_errors(...)
  8. │     │ └─base::withCallingHandlers(...)
  9. │     └─vctrs::num_as_location2(...)
 10. │       ├─vctrs:::result_get(...)
 11. │       └─vctrs:::vec_as_location2_result(...)
 12. │         ├─base::tryCatch(...)
 13. │         │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
 14. │         │   └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 15. │         │     └─base (local) doTryCatch(return(expr), name, parentenv, handler)
 16. │         └─vctrs::vec_as_location(i, n, names = names, arg = arg, call = call)
 17. └─vctrs (local) `<fn>`()
 18.   └─vctrs:::stop_subscript_oob(...)
 19.     └─vctrs:::stop_subscript(...)
 20.       └─rlang::abort(...)

annotated_kegg: 2,173 rows, 41 columns
metadata: 39 rows, 19 columns

sessionInfo:

R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.4

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ALDEx2_1.28.0         zCompositions_1.5.0-4 truncnorm_1.0-9       NADA_1.6-1.1          survival_3.7-0       
 [6] MASS_7.3-60.0.1       patchwork_1.3.0       ggprism_1.0.5         lubridate_1.9.3       forcats_1.0.0        
[11] stringr_1.5.1         dplyr_1.1.4           purrr_1.0.2           tidyr_1.3.1           tidyverse_2.0.0      
[16] tibble_3.2.1          readr_2.1.5           ggpicrust2_1.7.3      ggthemes_5.1.0        ggplot2_3.5.1        

loaded via a namespace (and not attached):
  [1] splines_4.3.2               later_1.3.2                 bitops_1.0-8                lifecycle_1.0.4            
  [5] edgeR_4.0.16                doParallel_1.0.17           vroom_1.6.5                 lattice_0.22-6             
  [9] magrittr_2.0.3              limma_3.58.1                remotes_2.5.0               httpuv_1.6.15              
 [13] Wrench_1.20.0               sessioninfo_1.2.2           pkgbuild_1.4.4              metagenomeSeq_1.43.0       
 [17] DBI_1.2.3                   RColorBrewer_1.1-3          ade4_1.7-22                 multcomp_1.4-26            
 [21] abind_1.4-8                 pkgload_1.4.0               zlibbioc_1.48.2             quadprog_1.5-8             
 [25] GenomicRanges_1.54.1        BiocGenerics_0.48.1         RCurl_1.98-1.16             TH.data_1.1-2              
 [29] phyloseq_1.48.0             sandwich_3.1-1              circlize_0.4.16             GenomeInfoDbData_1.2.11    
 [33] IRanges_2.36.0              S4Vectors_0.40.2            vegan_2.6-8                 permute_0.9-7              
 [37] codetools_0.2-20            getopt_1.20.4               coin_1.4-3                  DelayedArray_0.28.0        
 [41] tidyselect_1.2.1            shape_1.4.6.1               farver_2.1.2                matrixStats_1.4.1          
 [45] stats4_4.3.2                jsonlite_1.8.8              GetoptLong_1.0.5            multtest_2.58.0            
 [49] ellipsis_0.3.2              iterators_1.0.14            foreach_1.5.2               tools_4.3.2                
 [53] Rcpp_1.0.13                 glue_1.7.0                  SparseArray_1.2.4           DESeq2_1.42.1              
 [57] mgcv_1.9-1                  MatrixGenerics_1.14.0       usethis_3.0.0               GenomeInfoDb_1.38.8        
 [61] withr_3.0.1                 BiocManager_1.30.25         fastmap_1.2.0               GGally_2.2.1               
 [65] latticeExtra_0.6-30         rhdf5filters_1.14.1         fansi_1.0.6                 Maaslin2_1.16.0            
 [69] caTools_1.18.3              digest_0.6.37               timechange_0.3.0            R6_2.5.1                   
 [73] mime_0.12                   colorspace_2.1-1            gtools_3.9.5                jpeg_0.1-10                
 [77] utf8_1.2.4                  generics_0.1.3              data.table_1.16.0           robustbase_0.99-4          
 [81] httr_1.4.7                  htmlwidgets_1.6.4           S4Arrays_1.2.1              ggstats_0.6.0              
 [85] pkgconfig_2.0.3             gtable_0.3.5                modeltools_0.2-23           ComplexHeatmap_2.18.0      
 [89] XVector_0.42.0              pcaPP_2.0-5                 htmltools_0.5.8.1           profvis_0.3.8              
 [93] biomformat_1.30.0           clue_0.3-65                 scales_1.3.0                Biobase_2.62.0             
 [97] png_0.1-8                   optparse_1.7.5              rstudioapi_0.16.0           tzdb_0.4.0                 
[101] reshape2_1.4.4              rjson_0.2.23                curl_5.2.2                  nlme_3.1-166               
[105] zoo_1.8-12                  cachem_1.1.0                rhdf5_2.46.1                GlobalOptions_0.1.2        
[109] KernSmooth_2.23-24          parallel_4.3.2              miniUI_0.1.1.1              libcoin_1.0-10             
[113] RcppZiggurat_0.1.6          pillar_1.9.0                grid_4.3.2                  vctrs_0.6.5                
[117] gplots_3.1.3.1              urlchecker_1.0.1            promises_1.3.0              xtable_1.8-4               
[121] cluster_2.1.6               mvtnorm_1.3-1               cli_3.6.3                   locfit_1.5-9.10            
[125] compiler_4.3.2              rlang_1.1.4                 crayon_1.5.3                lefser_1.12.1              
[129] labeling_0.4.3              interp_1.1-6                plyr_1.8.9                  fs_1.6.4                   
[133] stringi_1.8.4               deldir_2.0-4                BiocParallel_1.36.0         munsell_0.5.1              
[137] Biostrings_2.70.3           devtools_2.4.5              glmnet_4.1-8                Matrix_1.6-5               
[141] hms_1.1.3                   bit64_4.0.5                 Rhdf5lib_1.24.2             KEGGREST_1.42.0            
[145] statmod_1.5.0               shiny_1.9.1                 SummarizedExperiment_1.32.0 Rfast_2.1.0                
[149] igraph_2.0.3                memoise_2.0.1               RcppParallel_5.1.9          biglm_0.9-3                
[153] bit_4.0.5                   DEoptimR_1.1-3              directlabels_2024.1.21      ape_5.8

cafferychen777 · 2024-09-18T14:42:37Z

Dear l.gallucci,

Thank you for reporting this issue with the pathway_heatmap function in the ggpicrust2 package. I believe I understand the problem now:

The column names in your abundance_file don't match the sample IDs in your metadata file. Specifically:

Your metadata file has sample IDs like "sample_id", "Ex2", "Ex4", "Ex_6", "Ex_7", etc.
Your abundance_file has column names like "1", "10", "11", "12", "13", "15", "16", "17", etc.

This mismatch is likely causing the error you're seeing. To resolve this, you need to modify the column names in your abundance_file to match the sample IDs in your metadata file.

Here's a suggested solution:

First, check your metadata file to confirm the exact sample IDs.
Then, modify your abundance_file column names to match these sample IDs.

You can do this using the colnames() function in R. Here's an example of how you might do this:

# Assuming your abundance_file is loaded into a dataframe called 'abundance_df'
# and your metadata is loaded into a dataframe called 'metadata_df'

# Get the sample IDs from your metadata
sample_ids <- metadata_df$sample_id  # or whatever column contains your sample IDs

# Make sure the number of samples matches
if(length(sample_ids) == ncol(abundance_df) - 1) {  # -1 because the first column is likely feature IDs
  # Set the column names of abundance_df
  colnames(abundance_df)[-1] <- sample_ids
} else {
  stop("The number of samples in metadata doesn't match the number of columns in abundance file")
}
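
Renaming columns by position assumes that the abundance columns and the metadata rows are already in the same order. A minimal sketch of a check that could be run first, using only base R. The helper name compare_sample_ids and the toy data frames are hypothetical, and the first abundance column is assumed to hold feature IDs:

```r
# Hypothetical helper: report which sample IDs appear on only one side.
compare_sample_ids <- function(abundance_df, metadata_df, id_col = "sample_id") {
  # Assumption: the first column of abundance_df holds feature IDs, not a sample.
  abundance_ids <- colnames(abundance_df)[-1]
  metadata_ids <- as.character(metadata_df[[id_col]])
  list(
    shared = intersect(abundance_ids, metadata_ids),
    only_in_abundance = setdiff(abundance_ids, metadata_ids),
    only_in_metadata = setdiff(metadata_ids, abundance_ids)
  )
}

# Toy data for illustration only.
abundance_df <- data.frame(
  feature = c("K00001", "K00002"),
  Ex2 = c(10, 0),
  Ex4 = c(3, 7),
  `11` = c(1, 2),
  check.names = FALSE
)
metadata_df <- data.frame(
  sample_id = c("Ex2", "Ex4", "Ex_6"),
  Type = c("control", "treatment", "treatment")
)

result <- compare_sample_ids(abundance_df, metadata_df)
print(result)
```

If only_in_abundance or only_in_metadata is non-empty, it is probably safer to build an explicit old-name to new-name mapping than to overwrite the column names by position.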

After making this change, try running your original code again:

annotated_kegg <- pathway_annotation(file = abundance_df, pathway = "KO", ko_to_kegg = TRUE)
heat <- pathway_heatmap(annotated_kegg, metadata_df, "Type")

If you're still encountering issues after making these changes, please let me know and provide:

The first few lines of your abundance_file and metadata file (after making the changes).
The dimensions of your annotated_kegg and metadata_df dataframes.
Any error messages you're still seeing.

This should help resolve the "Can't extract columns past the end" error you were experiencing. Let me know if you need any further assistance!

Best regards,
Chen Yang

luigallucci · 2024-09-18T15:38:55Z

Dear @cafferychen777 , thank you for the reply.

This is what I did. Sorry, I forgot to mention that I'm using dada_id as the sample ID column.

Unfortunately, even after changing this, the result is still the same.

The problem seems to be related to this line:

metadata %>% select(all_of(c(sample_name_col))) %>% pull()
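
A minimal sketch of how this expression can fail, assuming that sample_name_col ends up empty when no sample ID column is recognised. That is an assumption drawn from the backtrace; the ggpicrust2 internals are not shown in this thread. The toy metadata is hypothetical:

```r
library(dplyr)

# Toy metadata for illustration only.
metadata <- data.frame(
  dada_id = c("s1", "s2"),
  Type = c("control", "treatment")
)

# Assumption: no sample ID column was identified, so the selection is empty.
sample_name_col <- character(0)

# Selecting zero columns succeeds, but pull() then has nothing to extract.
msg <- tryCatch(
  metadata %>% select(all_of(c(sample_name_col))) %>% pull(),
  error = function(e) conditionMessage(e)
)
print(msg)

# With a column name supplied, the same expression returns the sample IDs.
sample_name_col <- "dada_id"
ids <- metadata %>% select(all_of(c(sample_name_col))) %>% pull()
print(ids)
```

Under that assumption, "There are only 0 columns" would refer to the empty selection, not to the 19-column metadata itself.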

cafferychen777 · 2024-09-18T18:29:58Z

Hi @luigallucci ,

Could you send the data file to [email protected]?

Best,

Niyuh04 · 2024-11-13T06:24:58Z

I have the same problem. I made sure that the row names in the sample.id column of my metadata and the column names in the abundance file match, but I’m still getting the same error. Sorry if something isn’t clear; I’m not very fluent in English, and I’m a bioinformatics enthusiast. Thanks, and great work!

Backtrace:
▆

├─ggpicrust2::pathway_heatmap(...)
│ └─metadata %>% select(all_of(c(sample_name_col))) %>% pull()

cafferychen777 · 2024-11-13T21:10:37Z

Hi @luigallucci and @Niyuh04,

Thank you for reporting this issue. Based on the error messages and screenshots you've provided, I can help resolve the sample name matching problem in the pathway_heatmap function.

The error occurs because the function cannot find matching sample names between your abundance data and metadata. Here's how to fix it:

First, please check that your sample names match exactly between your abundance data and metadata:

# Check your data
head(colnames(annotated_kegg))  # These should match your metadata sample IDs
head(metadata$sample_id)        # Or whatever column contains your sample IDs

Make sure your metadata has one of these column names for sample IDs:

sample_id
SampleID
Sample_ID
sample_name
Sample
dada_id
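
A quick way to check a metadata frame against this list, sketched in base R. The helper name find_sample_col is hypothetical, and the candidate names are copied from the list above, not from the package source:

```r
# Column names listed in the comment above; whether pathway_heatmap checks
# exactly this set is taken from that comment, not from the package source.
candidate_cols <- c("sample_id", "SampleID", "Sample_ID",
                    "sample_name", "Sample", "dada_id")

# Hypothetical helper: return the first candidate present in the metadata.
find_sample_col <- function(metadata, candidates = candidate_cols) {
  found <- intersect(candidates, colnames(metadata))
  if (length(found) == 0) {
    stop("No sample ID column found; expected one of: ",
         paste(candidates, collapse = ", "))
  }
  found[1]
}

# Toy metadata for illustration only.
metadata <- data.frame(
  dada_id = c("s1", "s2"),
  Type = c("control", "treatment")
)
sample_col <- find_sample_col(metadata)
print(sample_col)

# A frame whose ID column uses another spelling, e.g. "sample.id", fails the check.
bad_metadata <- data.frame(sample.id = c("s1", "s2"))
bad_result <- tryCatch(find_sample_col(bad_metadata), error = function(e) "not found")
print(bad_result)
```

If the check fails, renaming the ID column to one of the listed names before calling pathway_heatmap may be enough.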

Example of correct format:

# Metadata format
metadata <- data.frame(
  sample_id = c("sample1", "sample2", "sample3"),  # Must match abundance colnames
  Type = c("control", "treatment", "treatment")
)

# Then call the function
pathway_heatmap(
  abundance = annotated_kegg,
  metadata = metadata,
  group = "Type"
)

If you're still experiencing issues, please share:

The output of head(colnames(annotated_kegg))
The output of head(metadata)
The exact column name in your metadata that contains sample IDs

I'll be implementing a fix in the next package update to make the sample name matching more robust and provide clearer error messages.

Best regards,
Caffery

P.S. @Niyuh04 - Your English is perfectly clear, no worries! Thank you for providing the detailed error information.

luigallucci added the bug label on Sep 18, 2024
