Include the percentage values in the stacked plots #100

Open
antoniobio opened this issue Aug 18, 2023 · 5 comments

Comments

@antoniobio

Hello,
Is it possible to include the percentage values in the stacked plots for each OTU, or at least to extract those values?

Best regards

@xiangpin
Member

Yes. Many of the plot objects are ggplot or ggtree objects, so you can add other layers (from the ggplot2 ecosystem) to the original plot. For example, to do what you want, first use mp_plot_abundance to obtain the ggplot object, then use geom_fit_text from ggfittext to display the percentage values.

> library(ggplot2)
> library(ggfittext)
> library(MicrobiotaProcess)
MicrobiotaProcess v1.13.2.993 For help:
https://github.com/YuLab-SMU/MicrobiotaProcess/issues

If you use MicrobiotaProcess in published research, please cite the
paper:

Shuangbin Xu, Li Zhan, Wenli Tang, Qianwen Wang, Zehan Dai, Lang Zhou,
Tingze Feng, Meijun Chen, Tianzhi Wu, Erqiang Hu, Guangchuang Yu.
MicrobiotaProcess: A comprehensive R package for deep mining
microbiome. The Innovation. 2023, 4(2):100388. doi:
10.1016/j.xinn.2023.100388

Export the citation to BibTex by citation('MicrobiotaProcess')

This message can be suppressed by:
suppressPackageStartupMessages(library(MicrobiotaProcess))

Attaching package: ‘MicrobiotaProcess’

The following object is masked from ‘package:stats’:

    filter

> data(mouse.time.mpse)
> mouse.time.mpse %>%
+     mp_rrarefy(.abundance = Abundance) %>%
+     mp_plot_abundance(
+         .abundance = RareAbundance,
+         .group = time,
+         taxa.class = Phylum,
+         topn = 20,
+         order.by.feature = "p__Firmicutes",
+         width = 4/5
+     ) -> p1
> p2 <- p1 + geom_fit_text(
+     aes(label = paste0(round(RelRareAbundanceBySample, 1), "%")),
+     position = position_stack(vjust = .5),
+     show.legend = F,
+     color = 'white'
+ )
> p1 / p2

[image: output of p1 / p2]
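When many taxa are plotted, thin segments may not have room for a readable label. One possible refinement, sketched below with base R only, is to format the percentages through a small helper that blanks out values under a chosen cutoff. The helper name pct_label and its min_pct argument are hypothetical; they are not part of MicrobiotaProcess or ggfittext.

```r
# Hypothetical helper (not part of MicrobiotaProcess or ggfittext):
# format numbers as percentage labels, returning an empty string for
# missing values and for values below `min_pct`.
pct_label <- function(x, digits = 1, min_pct = 1) {
  keep <- !is.na(x) & x >= min_pct
  ifelse(keep, paste0(round(x, digits), "%"), "")
}

# Example values on the percentage scale
labels <- pct_label(c(59.5, 0.137, 2.24, 0))
print(labels)
```

Under these assumptions, the helper could replace the paste0(round(...)) expression in the label mapping, for example aes(label = pct_label(RelRareAbundanceBySample)).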

@xiangpin
Member

You can also use mp_cal_abundance and mp_extract_abundance to obtain the abundance or relative abundance at a specific taxonomic level.

> # This calculates the relative abundance (relative = T) directly from `Abundance`, without rarefaction (force = T)
> mouse.time.mpse %>% mp_cal_abundance(.abundance=Abundance, force=T, relative=T) -> mpse2
> mpse2 %>% mp_extract_abundance(taxa.class=OTU, topn='all')
# A tibble: 218 × 3
   label   nodeClass AbundanceBySample
   <fct>   <chr>     <list>
 1 OTU_67  OTU       <tibble [19 × 4]>
 2 OTU_231 OTU       <tibble [19 × 4]>
 3 OTU_188 OTU       <tibble [19 × 4]>
 4 OTU_150 OTU       <tibble [19 × 4]>
 5 OTU_207 OTU       <tibble [19 × 4]>
 6 OTU_5   OTU       <tibble [19 × 4]>
 7 OTU_1   OTU       <tibble [19 × 4]>
 8 OTU_2   OTU       <tibble [19 × 4]>
 9 OTU_3   OTU       <tibble [19 × 4]>
10 OTU_4   OTU       <tibble [19 × 4]>
# ℹ 208 more rows
# ℹ Use `print(n = ...)` to see more rows
> mpse2 %>% mp_extract_abundance(taxa.class=OTU, topn='all') %>% tidytree::unnest(AbundanceBySample)
# A tibble: 4,142 × 6
   label  nodeClass Sample Abundance RelAbundanceBySample time
   <fct>  <chr>     <chr>      <int>                <dbl> <chr>
 1 OTU_67 OTU       F3D0          24               0.368  Early
 2 OTU_67 OTU       F3D1           0               0      Early
 3 OTU_67 OTU       F3D141        16               0.329  Late
 4 OTU_67 OTU       F3D142        28               1.11   Late
 5 OTU_67 OTU       F3D143        10               0.397  Late
 6 OTU_67 OTU       F3D144        21               0.602  Late
 7 OTU_67 OTU       F3D145         7               0.120  Late
 8 OTU_67 OTU       F3D146         3               0.0773 Late
 9 OTU_67 OTU       F3D147        29               0.223  Late
10 OTU_67 OTU       F3D148        68               0.684  Late
# ℹ 4,132 more rows
# ℹ Use `print(n = ...)` to see more rows
>

Abundance is the original abundance used to calculate the relative abundance, and RelAbundanceBySample is the relative abundance calculated from Abundance. The tibble is in long format, so it can be processed and visualized with the tidyverse ecosystem.
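As a sketch of that kind of downstream processing, the per-group mean of the per-sample relative abundance can be computed from the long table. The example uses base R and a small data frame hand-copied from the first four rows printed above (not the full result); on the real data the same call would be applied to the unnested tibble. The column name MeanRelAbundance is made up for illustration, and averaging over replicates is an assumption about the summary wanted, not necessarily how mp_plot_abundance aggregates groups.

```r
# Subset of the long-format output above, copied by hand for illustration
long_tbl <- data.frame(
  label = "OTU_67",
  Sample = c("F3D0", "F3D1", "F3D141", "F3D142"),
  RelAbundanceBySample = c(0.368, 0, 0.329, 1.11),
  time = c("Early", "Early", "Late", "Late")
)

# Mean relative abundance of each OTU within each level of `time`
group_means <- aggregate(
  RelAbundanceBySample ~ label + time,
  data = long_tbl,
  FUN = mean
)
names(group_means)[3] <- "MeanRelAbundance"  # illustrative column name
print(group_means)
```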

@antoniobio
Author

Thank you very much.
When I use plot.group = TRUE, I get an error: object 'RelRareAbundanceBySample' not found

Is it possible to plot by group, using the replicate means of the relative abundance?

Best regards,

@xiangpin
Member

With plot.group = TRUE, RelRareAbundanceBySample is renamed to RelRareAbundanceBy + YourGroupName. In the demo dataset, for example, YourGroupName is time. The suffix distinguishes per-sample values from values computed for a given group variable.

> library(ggplot2)
> library(ggfittext)
> library(MicrobiotaProcess)
> data(mouse.time.mpse)
> mouse.time.mpse %>%
+     mp_rrarefy(.abundance = Abundance) %>%
+     mp_plot_abundance(
+         .abundance = RareAbundance,
+         .group = time,
+         taxa.class = Phylum,
+         topn = 20,
+         width = 4/5,
+         plot.group = T
+     ) -> p1
> p1$data
# A tibble: 18 × 5
   Phylum             nodeClass time  RareAbundanceBytime RelRareAbundanceBytime
   <fct>              <chr>     <chr>               <int>                  <dbl>
 1 p__Actinobacteria  Phylum    Early                  31                0.137
 2 p__Actinobacteria  Phylum    Late                  127                0.504
 3 p__Bacteroidetes   Phylum    Early               13489               59.5
 4 p__Bacteroidetes   Phylum    Late                18333               72.8
 5 p__Cyanobacteria   Phylum    Early                  15                0.0662
 6 p__Cyanobacteria   Phylum    Late                    5                0.0199
 7 p__Deinococcus-Th… Phylum    Early                   0                0
 8 p__Deinococcus-Th… Phylum    Late                    1                0.00397
 9 p__Firmicutes      Phylum    Early                8473               37.4
10 p__Firmicutes      Phylum    Late                 6471               25.7
11 p__Patescibacteria Phylum    Early                  69                0.304
12 p__Patescibacteria Phylum    Late                   32                0.127
13 p__Proteobacteria  Phylum    Early                  71                0.313
14 p__Proteobacteria  Phylum    Late                    8                0.0318
15 p__Tenericutes     Phylum    Early                 508                2.24
16 p__Tenericutes     Phylum    Late                  203                0.806
17 p__Verrucomicrobia Phylum    Early                   6                0.0265
18 p__Verrucomicrobia Phylum    Late                    0                0

RelRareAbundanceBySample has been replaced by RelRareAbundanceBytime in the data of the ggplot object, so the mapping of geom_fit_text also needs to be adjusted.

> p2 <- p1 + geom_fit_text(
+     aes(label = paste0(round(RelRareAbundanceBytime, 1), "%")),
+     position = position_stack(vjust = .5),
+     show.legend = F,
+     color = 'white'
+ )
> p1 + p2

[image: output of p1 + p2]
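To avoid hard-coding the column name when switching between per-sample and per-group plots, the name can be built from its parts. The sketch below follows the naming seen in this thread (RelRareAbundanceBySample, RelRareAbundanceBytime); the helper rel_abundance_col is hypothetical, and the assumption that the pattern holds for other abundance columns or group variables should be checked against names(p1$data).

```r
# Hypothetical helper: build the relative-abundance column name following
# the "Rel" + abundance + "By" + (Sample | group name) pattern seen above.
rel_abundance_col <- function(abundance = "RareAbundance", group = NULL) {
  suffix <- if (is.null(group)) "Sample" else group
  paste0("Rel", abundance, "By", suffix)
}

col <- rel_abundance_col(group = "time")

# Stand-in for p1$data, holding two values from the table above;
# with a real plot, check against names(p1$data) instead.
plot_data <- data.frame(RelRareAbundanceBytime = c(59.5, 37.4))
if (!col %in% names(plot_data)) {
  stop("Column not found in plot data: ", col)
}
print(col)
```

With ggplot2, the resulting name could then be used in the mapping through the .data pronoun, for example aes(label = paste0(round(.data[[col]], 1), "%")).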

@antoniobio
Author

Many thanks,

Best regards
