Hierarchical learning reveals ketogenic diet reprograms cancer metabolism through lysine β-hydroxybutyrylation

We developed a hierarchical learning framework, namely prediction of functionally important lysine modification sites (pFunK), to address the challenge of identifying functionally important lysine β-hydroxybutyrylation (Kbhb) sites in cancer metabolism.

Model Architecture

pFunK utilizes a novel hierarchical learning framework implemented in three steps to minimize bias and over-fitting:

pFunK-P:
- This pre-training model employs the state-of-the-art Transformer algorithm to learn “in-context” information from a large training dataset of 145,657 non-redundant lysine modification sites spanning 29 types of lysine modifications.
- By focusing on short sequences around lysine modification sites, pFunK-P captures contextual sequence features crucial for downstream predictions.
pFunK-T:
- The transformer-based model from pFunK-P is fine-tuned to capture Kbhb-specific characteristics through transfer learning.
- Additionally, pFunK-T integrates 10 types of sequence and structural features, further refined using the Model-Agnostic Meta-Learning (MAML) algorithm.
- This step uses a benchmark dataset of 6,932 non-redundant Kbhb sites, obtained from 6,318 reported and 5,304 identified sites after homologous elimination.
pFunK:
- In the final step, MAML is employed to fine-tune the model using only 9 functionally important Kbhb sites, ensuring the functional relevance of the predictions.

If you are interested in pFunK and want to know more details, please check the article.

Usage

models: Pre-trained models referenced in the article.
models_buildings_codes: Scripts to train your own models.

Quick Start

To predict using your fasta file, simply run predict.py.

Instructions for Use

Clone the Repository:

git clone https://github.com/your-repo/pFunK.git
cd pFunK

Install Dependencies:
```
pip install -r requirements.txt
```

Run Predictions:

Prepare your fasta file.

Run the prediction script:

python predict.py --input your_fasta_file.fasta --output results.txt

Training Your Own Model:
- Modify the configuration files as needed.
- Run the training scripts located in models_buildings_codes directory.

Additional Files

Omics Analysis:
- We have provided scripts and data for comprehensive omics analysis to complement the predictions made by pFunK.
- These resources are located in the omics_analysis directory.
Feature Extraction:
- The feature_extraction directory contains scripts for extracting various sequence and structural features used in model training and prediction.
- These scripts ensure the reproducibility of our feature extraction process.

Code Availability

In addition to pFunK, other custom code used in the study, including scripts for omics analysis and feature extraction, are available in the respective directories. This ensures transparency and facilitates other researchers to reproduce and build upon our work.

Tutorial

Clone the Repository:

git clone https://github.com/your-repo/pFunK.git
cd pFunK

Install Dependencies:
```
pip install -r requirements.txt
```

Run Predictions:

Prepare your fasta file.

Run the prediction script:

python predict.py --input your_fasta_file.fasta --output results.txt

Training Your Own Model:
- Modify the configuration files as needed.
- Run the training scripts located in models_buildings_codes directory.
Omics Analysis:
- Navigate to the omics_analysis directory for scripts and data related to comprehensive omics analysis.
Feature Extraction:
- Use the scripts in the feature_extraction directory for extracting sequence and structural features.

Contact

Xinhe Huang: [email protected]
Dr. Yu Xue: [email protected]

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Hierarchical learning reveals ketogenic diet reprograms cancer metabolism through lysine β-hydroxybutyrylation

Model Architecture

Usage

Quick Start

Instructions for Use

Additional Files

Code Availability

Tutorial

Contact

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
Feature Extraction		Feature Extraction
Omics Analysis		Omics Analysis
models		models
models_buildings_codes		models_buildings_codes
Omics Analysis.zip		Omics Analysis.zip
README.md		README.md
predict.py		predict.py

biocuckooHXH/pFunK

Folders and files

Latest commit

History

Repository files navigation

Hierarchical learning reveals ketogenic diet reprograms cancer metabolism through lysine β-hydroxybutyrylation

Model Architecture

Usage

Quick Start

Instructions for Use

Additional Files

Code Availability

Tutorial

Contact

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages