This repository has been archived by the owner on Jun 18, 2023. It is now read-only.

Release planning: v0.7.0 #90

Closed
3 tasks done
ceholden opened this issue May 12, 2016 · 0 comments

ceholden commented May 12, 2016

Pinging @valpasq @bullocke @parevalo
I've created a maintenance branch for v0.6.x because the next steps I want to take with this project will be very backward incompatible. Any bugs we discover, or (very) minor features we want to add, can go into v0.6.x while we work toward this new goal. I'd like to discuss most of this in person, but to get the ball rolling I've written up a rough roadmap for eliminating much of the technical debt that is blocking progress toward what we want to be able to do. I'd appreciate any comments or questions you might have; at the very least, this should help get everyone on the same page.

This release will focus on incorporating new technology that will greatly expand what YATSM can do.

These improvements can be structured into three general categories:

  1. Increase dataset IO flexibility by utilizing xarray for labeled nd-arrays
    • Define one or more datasets that provide a set of labeled dataset bands (e.g., Landsat provides 'red', 'nir', 'swir' while PRISM provides 'ppt' and 'tmean')
    • An xarray Dataset will allow multiple datasets to be analyzed in one object, making it easier to fuse data from multiple sensors
    • Labeling the band dimension of our datasets will simplify how we refer to these time series data (e.g., CCDC needs the bands labeled 'green' and 'swir' for cloud masking, so we won't have to provide the indices of these bands)
    • xarray datasets are easy to serialize and will resolve many of the sticking points we currently have with "caching" our time series
  2. Enhance result storage capabilities by using an indexed, hierarchical data storage format (likely pytables)
    • Indexing our results storage format will greatly increase the speed with which we can extract information from these results (see Result file IO abstractions #69)
    • Hierarchical data storage will keep the results of different science models separate while still allowing them to nest within one another (e.g., temporal segmentation at the top of the hierarchy, with long term phenology estimates and classification labels nested within the segmentation results)
    • A more robust serialization format will allow for "picking up" or "resuming" of model runs if a user wants to add another step in their analysis
  3. Allow users to easily chain together science models in a data analysis pipeline (likely with luigi) - Topic: pipeline framework #91
    • Right now we have something of a pipeline -- running CCDCesque, fixing change results with the "commission test", re-estimating time series model attributes using the refitting steps, estimating phenology attributes for each segment, and then classifying the land cover condition for each segment
    • Unfortunately, this existing pipeline is hard-coded and is a poor framework for adding steps
    • Leverage existing pipeline technology if possible (luigi) to allow users to define pipeline step "requirements" and "outputs"
      • Requirements and outputs may be either "data" (time series observations) or "record" (time series segment model outputs) information
    • Once a new storage format is implemented, the pipeline will be able to resume from existing model results by checking if any given step has its requirements satisfied in these stored results
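
To make the first category more concrete, here is a rough sketch of labeled bands and multi-sensor fusion with xarray. The toy data, the array shapes, and the variable names (`landsat`, `prism`, `fused`) are made up for illustration and are not YATSM code; only the band labels come from the list above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy data: 4 acquisition dates over a 2x3 pixel window
time = pd.date_range("2000-01-01", periods=4, freq="16D")
shape = (time.size, 2, 3)
rng = np.random.default_rng(0)


def band(values):
    """Attach (time, y, x) dimension names to a raw array."""
    return (("time", "y", "x"), values)


# One dataset per sensor, with each band stored under its label
landsat = xr.Dataset(
    {
        name: band(rng.integers(0, 10000, size=shape).astype("int16"))
        for name in ("green", "nir", "swir")
    },
    coords={"time": time},
)
prism = xr.Dataset(
    {name: band(rng.random(shape)) for name in ("ppt", "tmean")},
    coords={"time": time},
)

# Fuse both sensors into a single object that shares the time coordinate
fused = xr.merge([landsat, prism])

# Refer to bands by label instead of by positional index
masking_bands = fused[["green", "swir"]]
pixel_series = fused["swir"].isel(y=0, x=0)
```

An algorithm that needs 'green' and 'swir' can ask for them by name and does not need to know where they sit in the band stack.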

Each of these general tasks will be discussed in further detail in separate "YATSM Enhancement Proposals" (YEPs 😜) issues.
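
The "requirements", "outputs", and resume behaviour described for the pipeline can be sketched in plain Python. This is a minimal sketch under stated assumptions, not luigi's API and not YATSM code: the `Step` class, the `run_pipeline` function, the step names, and the path-like keys are all hypothetical, and a plain dict stands in for the indexed, hierarchical results store.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumption: a dict keyed by path-like strings stands in for the real store
Store = Dict[str, object]


@dataclass
class Step:
    name: str
    requires: Tuple[str, ...]  # keys that must already exist in the store
    outputs: Tuple[str, ...]   # keys this step promises to write
    run: Callable[[Store], Dict[str, object]]


def run_pipeline(steps: List[Step], store: Store) -> List[str]:
    """Run steps in order and return the names of the steps that executed.

    A step is skipped when all of its outputs are already in the store,
    which is what lets a later run resume from existing results.
    """
    executed = []
    for step in steps:
        if all(key in store for key in step.outputs):
            continue  # outputs already present: nothing to redo
        missing = [key for key in step.requires if key not in store]
        if missing:
            raise KeyError(f"{step.name} is missing requirements: {missing}")
        store.update(step.run(store))
        executed.append(step.name)
    return executed


# "data/..." keys hold observations; "record/..." keys hold model outputs
steps = [
    Step("segment", ("data/landsat",), ("record/segments",),
         lambda s: {"record/segments": ["seg0", "seg1"]}),
    Step("phenology", ("data/landsat", "record/segments"),
         ("record/segments/phenology",),
         lambda s: {"record/segments/phenology":
                    {seg: "placeholder" for seg in s["record/segments"]}}),
]

store: Store = {"data/landsat": "time series observations"}
first_run = run_pipeline(steps, store)   # both steps execute
second_run = run_pipeline(steps, store)  # nothing to do: outputs exist

# Adding a step later only runs the new step
steps.append(
    Step("classify", ("record/segments",), ("record/segments/class",),
         lambda s: {"record/segments/class":
                    {seg: "placeholder" for seg in s["record/segments"]}}))
third_run = run_pipeline(steps, store)
```

The nested key names mirror the hierarchy idea: phenology and classification results sit underneath the segmentation results they depend on.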

@ceholden ceholden added this to the v0.7.0 milestone May 12, 2016
@ceholden ceholden closed this as completed Apr 3, 2017