Interactive usage of csv_export2 #308

Closed
berland opened this issue Mar 17, 2021 · 8 comments

@berland
Collaborator

berland commented Mar 17, 2021

Allow interactive usage of csv_export2 outside ERT
For interactive usage outside of ERT, a command line API to csv_export2 can be a good idea

Describe the solution you'd like
csv_export2 already has a command line API, but it requires an existing RUNPATHFILE, which is non-trivial to create ad hoc. I suggest allowing a glob path to be supplied instead of a RUNPATHFILE: if csv_export2 encounters a non-existing runpath file, it will attempt to interpret the argument as a glob path that can be passed to the EnsembleSet initialization code as its frompath argument.
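
The fallback described above could be sketched roughly as follows. This is only an illustration, not the code from the PR: the helper name `ensemble_source_kwargs` is hypothetical, and the keyword name `runpathfile` is an assumption; only `frompath` is named in this issue.

```python
import os


def ensemble_source_kwargs(path_or_glob):
    """Return keyword arguments describing where to load the ensemble from.

    Hypothetical helper: an existing file is treated as a runpath file,
    anything else is passed on unchanged as a glob pattern.
    """
    if os.path.isfile(path_or_glob):
        # The keyword name "runpathfile" is an assumption for illustration.
        return {"runpathfile": path_or_glob}
    # "frompath" is the argument name mentioned in this issue.
    return {"frompath": path_or_glob}
```

Under these assumptions, the returned dictionary could be unpacked into the EnsembleSet initialization call, so the rest of csv_export2 would not need to know which kind of argument was given.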

@berland berland self-assigned this Mar 17, 2021
@berland berland added the enhancement New feature or request label Mar 17, 2021
@berland berland changed the title Interactive u sage Interactive usage of csv_export2 Mar 17, 2021
@markusdregi
Contributor

This will move the export further away from the current direction of ERT development, which is towards no shared disk for data communication. I suggest that this command line tool is instead implemented closer to the EnsembleSet, maybe in fmu-ensemble?

@berland
Collaborator Author

berland commented Mar 17, 2021

Fair point. Such an endpoint has always been relevant for fmu-ensemble, but it has not been made because the scope for such a tool in fmu-ensemble is potentially so big.

csv_export2.py is the path of least resistance: the unpublished PR to support this starts with only three extra lines, and it turns out to be handy for debugging/testing.

@markusdregi
Contributor

Could you please elaborate on why the situation arises when you do not have a runpath file?

@berland
Collaborator Author

berland commented Mar 18, 2021

Could you please elaborate on why the situation arises when you do not have a runpath file?

  1. If a user requests support but in the first pass only provides the scratch directory, this PR will make it easier to generate a CSV file quickly for simple tests/analysis/debugging. When a user needs support on CSV exporting, it is also possible that the problem is in the runpathfile, so a way to skip that is handy, if not to solve the problem, then at least to isolate it.
  2. When the user has rerun only a failed part of an ensemble, there are situations where the runpathfile becomes incorrect (or at least not what you want), and the ability to work around this using the command line can be handy.
  3. Yesterday I needed it (and used this PR) to test whether fmu-ensemble#206 ("Concurrency for 2.0. Based on ecl2df and no eclsum caching") gave any speedup on csv_export2.py on real-life ensembles from /scratch/. Digging up the associated runpathfiles for a random directory on /scratch/ is a no-go.

This functionality is not intended to be used in ERT-config, so it is not documented as such.

@markusdregi
Contributor

Those are good user stories that I understand that you are motivated to cater for! Still, semeio is exactly for functionality that is intended to be used within ERT setups. Hence, I still believe this tool would fit better within fmu-ensemble, or possibly subscript...

@berland
Collaborator Author

berland commented Mar 18, 2021

User story 1 is to aid debuggers for problems related to csv_export in an ERT setup, so this points to including this in semeio.
User story 2 is for users who need something outside what a normal ERT setup can provide. This points to not having this in semeio, but in e.g. subscript.
User story 3 gives a reason to implement a similar tool in fmu-ensemble.

Let's ignore user story 3 for now, and add that we don't want to duplicate this code anywhere. ERT setups are the core target for csv_export2.py, which means it should be in semeio, not in subscript. Claim: User story 2 is not important enough to move csv_export2.py to subscript.

I am left with either dropping this entirely or merging it into semeio.

@markusdregi
Contributor

My main concern here is that we are building upon the export capability where the user will have to work explicitly on the runpath(file), instead of exporting a case. This is in my opinion an anti-pattern that is orthogonal to the introduction of a data API and to starting to depend less on the shared disks, both of which are key to the upcoming FMU efforts.

My hope earlier was that we could actually start backing CSV_EXPORT2 by the data API already. However, this will require the user to stop pointing to the runpath(file) and instead point to the case. This breaks with the current approach of CSV_EXPORT2 to the extent that perhaps it should be introduced as CSV_EXPORT(3), together with a deprecation of CSV_EXPORT2. Notice that this should also solve the use cases described above.

If we agree that CSV_EXPORT2 is export by runpath, and the next iteration of export is export by case, then a feature in the direction sketched here makes sense...

Any thoughts on this @lars-petter-hauge or @oyvindeide?

@oyvindeide
Contributor

I think we can close this issue at this point. We are hopefully close to implementing a csv_export in ert, which will make the csv_export2 workflow superfluous. If the functionality sketched out here is still needed, it would make sense to implement it elsewhere.
