Discussion points
Hannes Hauswedell edited this page Feb 9, 2022 · 7 revisions
A list of things we should discuss before a 1.0 release (create individual pages with details if more than a paragraph is needed):
- Should all `bio::field` enum entries be distinct for simplicity?
- "Deep records":
  - Shallow records are default and recommended.
  - Deep records are required sometimes, e.g. in combination with `views::async_input_buffer`.
  - Currently, one can select deep records via template parameter and the options. This means all formats need to implement it (not that much work actually) and that the options and `dynamic_type` are more complicated (a little annoying).
  - An alternative design would be to have the formats always output shallow records and offer a generic `.make_deep_copy()` on the record that returns a self-contained record (this would automatically turn views into vectors...).
    - PRO: the overall design becomes easier to understand; a little less work for format input handlers
    - CON: you cannot specify the specific "deep" types anymore, the record always picks e.g. vector for views; certain optimisations are no longer possible (e.g. deep FASTA reading into `std::string` can currently avoid a copy by swapping buffers with output strings; this wouldn't be possible in the changed design)
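To make the alternative "always shallow plus `.make_deep_copy()`" design concrete, here is a minimal sketch. The types `shallow_record` and `deep_record`, their two fields, and `demo_deep_copy()` are hypothetical and only illustrate the idea; they are not the library's actual record type. Note how the deep types are fixed by the record, which is exactly the CON listed above.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical deep record: every field owns its data.
struct deep_record
{
    std::string       id;
    std::vector<char> seq;
};

// Hypothetical shallow record: fields are views into the reader's buffer
// and are only valid until the reader advances to the next record.
struct shallow_record
{
    std::string_view id;
    std::string_view seq;

    // Generic deep copy: the record itself decides the owning types
    // (here: std::string for the ID, std::vector<char> for the sequence).
    deep_record make_deep_copy() const
    {
        return deep_record{std::string{id}, std::vector<char>(seq.begin(), seq.end())};
    }
};

// Usage: the deep copy stays intact after the underlying buffer changes.
inline void demo_deep_copy()
{
    std::string buffer = "seq1ACGT"; // stands in for the reader's internal buffer
    std::string_view const whole{buffer};
    shallow_record const shallow{whole.substr(0, 4), whole.substr(4)};

    deep_record const deep = shallow.make_deep_copy();
    buffer.assign("XXXXXXXX"); // the reader moves on; the shallow views must not be used anymore

    assert(deep.id == "seq1");
    assert((deep.seq == std::vector<char>{'A', 'C', 'G', 'T'}));
}
```

The copy in `make_deep_copy()` is unavoidable in this design, whereas a format that fills a deep record directly could swap buffers instead.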
- Should all concepts that constrain public interfaces be public? Especially the field_types have lots of requirements. I am afraid this will clutter the documentation significantly.
- What happens when a writer receives no record?
  - Currently, nothing happens. But it should write the header to create a valid "empty" file. → The destructor should write the header. This is now implemented for VCF and BCF.
  - What happens when there is no header, because e.g. you stream `reader | std::views::filter(foo) | writer;` and the filter removes all records? → The assignment/pipe operator should "unpeel" the input range and try to find out whether the most-underlying range is a reader and, if yes, get that reader's header and set it. Not yet implemented.
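The "destructor writes the header" behaviour can be sketched as follows. `sketch_writer`, its `push_back()` member and `demo_empty_output()` are hypothetical names chosen for illustration; the actual VCF/BCF writers are more involved. The sketch assumes the header is already known to the writer, so it does not cover the "unpeeling" case.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical minimal writer that guarantees the header is written exactly once,
// even if no record is ever passed to it.
class sketch_writer
{
public:
    sketch_writer(std::ostream & out, std::string header) :
      out_{&out}, header_{std::move(header)}
    {}

    sketch_writer(sketch_writer const &)             = delete;
    sketch_writer & operator=(sketch_writer const &) = delete;

    // Writing the first record triggers writing the header.
    void push_back(std::string const & record)
    {
        write_header_once();
        *out_ << record << '\n';
    }

    // If no record was written, the header is still emitted here,
    // so the output is a valid "empty" file.
    // (std::ostream does not throw unless exceptions were enabled on it.)
    ~sketch_writer() { write_header_once(); }

private:
    void write_header_once()
    {
        if (!header_written_)
        {
            *out_ << header_ << '\n';
            header_written_ = true;
        }
    }

    std::ostream * out_;
    std::string    header_;
    bool           header_written_ = false;
};

// Usage: a writer that never receives a record still produces the header.
inline void demo_empty_output()
{
    std::ostringstream out;
    {
        sketch_writer writer{out, "##fileformat=VCFv4.3"};
    } // no record was written; the destructor emits the header
    assert(out.str() == "##fileformat=VCFv4.3\n");
}
```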
- Do we want to gracefully accept `char const *` as a string type? I am trying to do this everywhere, but it is a real hassle, because the type is not a range.
- What to do about non-IO exceptions thrown within IO, e.g. failure to convert a string to a number? This currently results in an error without context. Proper solution: catch the exception and rethrow with context. But we don't want to `catch` inside the library. Alternative solution: check in the destructor of the respective `format_input_handler` whether the stack is unwinding and a non-IO exception is being thrown; if yes, print context to stderr.
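The destructor-based alternative from the last point could look roughly like this. It is a sketch under assumptions: `sketch_input_handler` is a hypothetical stand-in for a `format_input_handler`, and the log stream is a constructor parameter (defaulting to `std::cerr`) only so the behaviour can be observed. It detects unwinding by comparing `std::uncaught_exceptions()` (C++17) at construction and destruction. It does not distinguish IO from non-IO exceptions; as far as I can tell, the destructor cannot inspect the type of the in-flight exception, so a real handler would probably need extra state for that, e.g. a flag set wherever it throws its own IO errors.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <iostream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a format_input_handler; only the
// "print context while unwinding" logic is shown.
class sketch_input_handler
{
public:
    explicit sketch_input_handler(std::ostream & log = std::cerr) : log_{&log} {}

    void parse_line(std::string const & line)
    {
        ++line_no_;
        // std::stoi throws std::invalid_argument (a non-IO exception) on bad input.
        last_value_ = std::stoi(line);
    }

    int last_value() const { return last_value_; }

    ~sketch_input_handler()
    {
        // More uncaught exceptions than at construction means this object
        // is being destroyed as part of stack unwinding.
        if (std::uncaught_exceptions() > exceptions_at_construction_)
            *log_ << "[context] error while reading line " << line_no_ << '\n';
    }

private:
    std::ostream * log_;
    int            exceptions_at_construction_ = std::uncaught_exceptions();
    std::size_t    line_no_                    = 0;
    int            last_value_                 = 0;
};

// Usage: the library never catches, but the context is still reported.
inline std::string demo_context_on_unwind()
{
    std::ostringstream log;
    try
    {
        sketch_input_handler handler{log};
        handler.parse_line("42");
        handler.parse_line("not a number"); // throws; handler is destroyed during unwinding
    }
    catch (std::invalid_argument const &)
    {
        // user code handles the exception; the context was already printed
    }
    return log.str();
}
```

The exception itself propagates unchanged, so user code still sees the original `std::invalid_argument`; only the context goes to the log stream.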