steps for training #18
Dear Brian.
Our team is currently working hard to update our simulator to version 2.0.
The updated version will include complete step-by-step instructions on
how to train the simulator, using a much simpler machine learning model.
We will let you know as soon as we finalize the update.
Please be patient.
Best,
-Sheng
On Wed, May 8, 2019 at 4:46 PM Brian Walenz wrote:
I like this approach to simulation; it lets me easily control the
sequences simulated (e.g., letting me make chimeric reads, reads with
garbage in the middle, etc) and, in theory, lets me try different base
callers on those signals. My use case is to generate reads with various
levels of junk in them to test assembly algorithms.
I'm confused about how training is accomplished. Two questions:
1. If I have a big pile of fast5 files, what steps do I need to convert that
   into inputs for training? I see I need to supply a 'rawsig' file and a
   'fasta' file. Where do those come from?
2. In particular, if the fasta is the result of base calling the fast5,
   aren't you then training to make signal that will result in the correct
   sequence *for that particular base caller*?
I am having the same problem as you, brianwalenz: I need to retrain starting from fast5 and fastq files, which means converting them into a rawsig file and a fasta file. Has there been any progress on this issue?
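One part of that conversion is purely mechanical: turning basecalled FASTQ into FASTA. The sketch below shows only that step, in plain Python with no third-party libraries. The function names `fastq_records` and `fastq_to_fasta` are made up for illustration and are not part of the simulator. It assumes standard four-line FASTQ records, and it says nothing about how the 'rawsig' file should be produced, which this thread leaves unanswered.

```python
from typing import Iterable, Iterator, Tuple


def fastq_records(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, sequence) pairs from four-line FASTQ records.

    Hypothetical helper: assumes each record is exactly four lines
    (header, sequence, '+' separator, qualities), i.e. no wrapped sequences.
    """
    block = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line and not block:
            continue  # ignore blank lines between records
        block.append(line)
        if len(block) == 4:
            header, seq, plus, _qual = block
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed FASTQ record: {header!r}")
            fields = header[1:].split()
            if not fields:
                raise ValueError("FASTQ header has no read id")
            # Keep only the read id; drop the free-text description after it.
            yield fields[0], seq
            block = []
    if block:
        raise ValueError("truncated FASTQ record at end of input")


def fastq_to_fasta(lines: Iterable[str]) -> str:
    """Return FASTA text with one '>id' line and one sequence line per read."""
    return "".join(f">{rid}\n{seq}\n" for rid, seq in fastq_records(lines))
```

A typical call would pass an open file handle, e.g. `fastq_to_fasta(open("reads.fastq"))`, where `reads.fastq` is a placeholder name. Note that a FASTA produced this way still contains whatever the base caller emitted, so it does not address the concern raised above about training toward one particular base caller.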