-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: mvp for lance version 0.2 reader / writer #1965
Conversation
protos/file2.proto
Outdated
// | Global Buffers | | ||
// | |C| Global Meta Buffer 0* | | ||
// | ... | | ||
// | Global Meta Buffer GN* | | ||
// ├──────────────────────────────────┤ | ||
// | Global Buffers Offset Table | | ||
// | |D| Global Buffer 0 Position* | | ||
// | Global Buffer 0 Size | | ||
// | ... | | ||
// | Global Buffer GN Position | | ||
// | Global Buffer GN Size | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My initial thought here is that it's unclear why this is preferable to just having a protobuf message for global data. Protobuf (or some other serialization framework) gives us an easy way to describe messages and handling changes to the message. I worry we are taking on some engineering overhead to add new types of metadata to the format.
cc32490
to
d6eb25f
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1965 +/- ##
==========================================
+ Coverage 81.02% 81.08% +0.05%
==========================================
Files 179 181 +2
Lines 50933 51357 +424
Branches 50933 51357 +424
==========================================
+ Hits 41271 41645 +374
+ Misses 7277 7263 -14
- Partials 2385 2449 +64
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
d6eb25f
to
5fe31cb
Compare
4c6cb7d
to
6be30bb
Compare
6be30bb
to
956709b
Compare
rust/lance-file/src/v2/reader.rs
Outdated
// We don't use this today because we get here by subtracting backward from global_buf_start | ||
#[allow(dead_code)] | ||
column_meta_offsets_start: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason not to use this? Is there any risk this becomes a bug later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can update this comment. We will need this if/when we want to support "true column projection" where we don't load any of the metadata for columns we aren't interested in.
I'm not sure we will want this anytime soon because then:
- If you are not using cached metadata then you are adding at least one extra IOP (first read footer, then read column metadata offsets table, then read column metadatas of interest, this last step is only one IOP of you coalesce which is non trivial)
- If you are using cached metadata then your metadata caching gets trickier (what happens if you have cached metadata for the file for columns X, and Z and the user now wants column Y?) You will need to cache on a per-column basis and I wanted to avoid that complexity for now.
let (tx, rx) = mpsc::unbounded_channel(); | ||
|
||
let scheduler = self.scheduler.clone() as Arc<dyn EncodingsIo>; | ||
// FIXME: spawn this, change this method to sync |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's your philosophy here on what should be sync and what should be async?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decode happens by polling the returned stream. This gives a rust user "what they expect" when it comes to parallelism / readahead. If the user does not call buffered
then there is one decode thread and no "batch parallelism". If the user calls buffered
then there are N decode threads / batch parallelism.
Right now that decode won't start until the scheduling has completely finished which is not ideal but not terrible (scheduling is pretty fast). It'll be more of an issue when there is a chance of backpressure being needed (e.g. decoder falls behind I/O).
If this method is async then it means some kind of "potentially slow" task is happening before the decode thread starts and I can't think of any valid reason for that.
Also, the schedule_range method itself is roughly synchronous. Ideally it would be entirely synchronous. Scheduling should never have to pause and wait for I/O (we go through great lengths to avoid this in the list scheduler). This is because any pause here runs the risk of starving our I/O parallelism.
I haven't yet concluded if I can safely make schedule_range
a synchronous method though. My only remaining concern is backpressure. The scheduler may need to pause if it gets too far ahead of the decoder. On the other hand, I don't think this is very important because the size of "all scheduled tasks" should be roughly equivalent to the size of the file metadata (e.g. not too large) and so backpressure is more of a concern for the I/O scheduler than it is for the scheduling thread.
Prerequisites: - [x] #1965
The motivation and bigger picture are covered in more detail in #1929
This PR builds on top of #1918 and #1964 to create a new version of the Lance file format.
There is still much to do, but this end-to-end MVP should provide the overall structure for the work.
It can currently read and write primitive columns and list columns and supports some very basic encodings.