sORFdb - A database for sORFs, small proteins, and small protein families in bacteria

sORFdb is a comprehensive, taxonomically independent database dedicated to sORF and small protein sequences, small protein families and related information in bacteria. It aims to improve the findability and classification of sORFs, small proteins, and their functions in bacteria, thereby supporting their future detection and consistent annotation.

The website of sORFdb is available at https://sorfdb.computational.bio/.

The database can be downloaded from Zenodo.

Description

This repository contains the workflows used to create the sORFdb database: data aggregation (01_download), data processing (02_processing), clustering of small proteins and identification of small protein families (03_autoclust), and helper scripts that prepare the database for the server (04_website_helper). The Jupyter notebook used to analyze the data for the manuscript is also available (05_analysis).

To create the database from scratch, access to a SLURM cluster is highly recommended. Alternatively, extend the Nextflow config files with another Nextflow executor.
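For readers without SLURM access, one possible way to select a different executor without editing the shipped config files is an additional config file passed with Nextflow's `-c` option. The following is only a minimal sketch: the file name `local.config` and the `<workflow>.nf` placeholder are hypothetical, and it assumes the workflows do not pin the executor per process or inside a profile.

```shell
# Write a small override config that selects Nextflow's built-in local executor.
# The file name local.config is a hypothetical choice, not part of the repository.
cat > local.config <<'EOF'
process {
    executor = 'local'
}
EOF

# Pass the override when launching a workflow. <workflow>.nf is a placeholder:
# the actual script names are given in each subdirectory's README.
# nextflow run <workflow>.nf -c local.config
```

Running a full database build on a single machine is likely to be slow, so this override is mainly useful for testing individual steps.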

Installation

Clone this GitHub repository to your local system, then run the corresponding Nextflow scripts in the subdirectories. Their usage is described in the respective README files.

Requirements

Nextflow
Conda or Mamba
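
Before running any workflow, it can help to confirm that the requirements are on the PATH. The following is a small sketch, not part of the repository; the helper name `have` is hypothetical.

```shell
# Return 0 if the given command is on PATH, non-zero otherwise.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Nextflow is required.
if have nextflow; then
    echo "nextflow: found"
else
    echo "nextflow: missing"
fi

# Either Conda or Mamba is sufficient.
if have conda || have mamba; then
    echo "conda/mamba: found"
else
    echo "conda/mamba: missing"
fi
```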

