Table names for multi-part inserts #186
Hi Sundeep, the current Shark master doesn't include support for partitioned cached tables.
Hi Harvey, the current patch is meant to let users track storage/memory usage on the Shark Storage UI per table, as opposed to per 'rdd_###'. Inserts/overwrites to cached tables make the current Storage UI quite hard to follow. The patch does not handle drop partitions and overwrites in any special way, but it does guarantee that each block of data is identified by a unique number and has the table name associated with it on the UI. I am planning on submitting another patch once we have partition support, with naming conventions derived from Hive's partition information.
Can one of the admins verify this patch?
Yeah, the storage UI is a bit confusing right now :(
Based on Hive's documentation, shouldn't an insert overwrite on a table unpersist the existing RDDs? (For partitions, it would unpersist just the overwritten partitions.) If this is the case, I can push a fix on that front.
Yeah, that sounds good - created a ticket for that here: https://spark-project.atlassian.net/browse/SHARK-202.
Sure. I do not seem to have permissions to assign myself the ticket. If you can help with that, I will take on the ticket. :)
Done - assigned it to you. Thx!
Oh, it looks like the assignments were concurrent....
What's the status of this PR?
Fix to ensure table names for multi-insert/partitioned cached tables get reflected on the Shark UI.
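For readers following the thread, the naming guarantee described in the conversation (each block of data gets a unique number and carries its table name, rather than showing up as 'rdd_###') could be sketched roughly as below. This is only an illustration in plain Python, not the patch itself: the `CachedTableNamer` class, its method names, and the `<table>_<n>` label format are all hypothetical.

```python
import itertools


class CachedTableNamer:
    """Hypothetical helper: labels each cached block with its table name.

    Assumption: labels look like '<table>_<n>'; the actual patch may use
    a different format.
    """

    def __init__(self):
        # One shared sequence, so a number is never reused across tables.
        self._counter = itertools.count()
        # Table name -> labels of the blocks registered for that table.
        self._labels = {}

    def register_insert(self, table):
        """Return a unique, table-scoped label for a newly inserted block."""
        label = f"{table}_{next(self._counter)}"
        self._labels.setdefault(table, []).append(label)
        return label

    def labels_for(self, table):
        """Return a copy of the labels registered for one table."""
        return list(self._labels.get(table, []))
```

With a scheme like this, two inserts into the same cached table produce two distinct labels that both start with the table name, so a storage view can group them per table. As noted in the thread, overwrites and dropped partitions are not treated specially here either.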