steps for training #18

brianwalenz · 2019-05-08T21:46:20Z

I like this approach to simulation; it lets me easily control the sequences simulated (e.g., letting me make chimeric reads, reads with garbage in the middle, etc.) and, in theory, lets me try different base callers on those signals. My use case is to generate reads with various levels of junk in them to test assembly algorithms.

I'm confused about how training is accomplished. Two questions:

If I have a big pile of fast5 files, what steps do I need to convert that into inputs for training? I see I need to supply a 'rawsig' file and a 'fasta' file. Where do those come from?

In particular, if the fasta is the result of base calling the fast5, aren't you then training to make signal that will result in the correct sequence for that particular base caller?

realbigws · 2019-05-29T23:45:06Z

Dear Brian, our team is currently working hard on updating the simulator to version 2.0. The updated version will come with complete step-by-step instructions for training the simulator, which will use a much simpler machine learning model. We will let you know as soon as the update is finalized. Please be patient. Best, -Sheng


Merritt-Brian · 2020-01-31T19:33:00Z

I am having the same problem as brianwalenz: I need to retrain, going from fast5 and fastq files to a rawsig and a fasta file. Has there been any progress on this issue?
