From 21d61e3769b4f98a260a19181798b5d9392612a6 Mon Sep 17 00:00:00 2001
From: Adibvafa Fallahpour <90617686+Adibvafa@users.noreply.github.com>
Date: Sat, 21 Sep 2024 15:17:48 -0400
Subject: [PATCH 1/2] Update README.md

---
 README.md | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 598c7bb..5554bfe 100644
--- a/README.md
+++ b/README.md
@@ -61,6 +61,7 @@ output = predict_dna_sequence(
     tokenizer=tokenizer,
     model=model,
     attention_type="original_full",
+    deterministic=True
 )
 print(format_model_output(output))
 ```
@@ -86,13 +87,23 @@ M_UNK A_UNK L_UNK W_UNK M_UNK R_UNK L_UNK L_UNK P_UNK L_UNK L_UNK A_UNK L_UNK L_
 -----------------------------
 ATGGCTTTATGGATGCGTCTGCTGCCGCTGCTGGCGCTGCTGGCGCTGTGGGGCCCGGACCCGGCGGCGGCGTTTGTGAATCAGCACCTGTGCGGCAGCCACCTGGTGGAAGCGCTGTATCTGGTGTGCGGTGAGCGCGGCTTCTTCTACACGCCCAAAACCCGCCGCGAAGCGGAAGATCTGCAGGTGGGCCAGGTGGAGCTGGGCGGCTAA
 ```
+
+### Generating Multiple Variable Sequences
+
+Set `deterministic=False` to generate variable sequences. Control the variability using `temperature`:
+
+- `temperature`: (recommended between 0.2 and 0.8)
+  - Lower values (e.g., 0.2): More conservative predictions
+  - Higher values (e.g., 0.8): More diverse predictions
+
+Using very high temperatures might result in prediction of DNA sequences that do not translate to the exact input protein.
+Generate multiple sequences by setting `num_sequences` to a value greater than 1.
**You can use the [inference template](https://github.com/Adibvafa/CodonTransformer/raw/main/src/CodonTransformer_inference_template.xlsx) for batch inference in [Google Colab](https://adibvafa.github.io/CodonTransformer/GoogleColab).**
-
 ## Installation
 Install CodonTransformer via pip:

From c452219421d36229187b0cebebbab44f12e5e70f Mon Sep 17 00:00:00 2001
From: Adibvafa Fallahpour <90617686+Adibvafa@users.noreply.github.com>
Date: Sat, 21 Sep 2024 15:18:12 -0400
Subject: [PATCH 2/2] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 5554bfe..e39da48 100644
--- a/README.md
+++ b/README.md
@@ -96,7 +96,7 @@ Set `deterministic=False` to generate variable sequences. Control the variabilit
   - Lower values (e.g., 0.2): More conservative predictions
   - Higher values (e.g., 0.8): More diverse predictions
-Using very high temperatures might result in prediction of DNA sequences that do not translate to the exact input protein.
+Using high temperatures might result in prediction of DNA sequences that do not translate to the input protein.
Generate multiple sequences by setting `num_sequences` to a value greater than 1.
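
The README text added by these patches says that lower `temperature` values give more conservative predictions and higher values give more diverse ones. The sketch below illustrates that general effect with a plain temperature-scaled softmax. It is a generic illustration, not CodonTransformer's implementation: the helper `softmax_with_temperature` and the example `logits` are hypothetical and do not come from the library.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by a temperature.

    Hypothetical helper for illustration only; not part of CodonTransformer.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate codons of a single amino acid.
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.2)   # lower end of the recommended range
high = softmax_with_temperature(logits, 0.8)  # upper end of the recommended range

print("temperature=0.2:", [round(p, 3) for p in low])
print("temperature=0.8:", [round(p, 3) for p in high])
```

At the lower temperature nearly all probability mass sits on the top-scoring candidate, so repeated sampling returns almost the same choice each time. At the higher temperature the alternatives gain mass, which is where the added diversity comes from.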