
Commit

updated
dpressel committed Nov 29, 2018
1 parent 0efec53 commit 010af90
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions docs/v1.md
@@ -29,13 +29,14 @@ The underlying changes have simplified mead considerably, making it easier to de
- The encoder-decoder (seq2seq) task has been overhauled to split the encoders and decoders and allow them to be configurable from mead. It's possible to create your own encoders and decoders. For RNN-based decoders, which need to support some policy for how/if to transfer hidden state, we have added the concept of an `arc state policy`, which is also extensible. We also now enforce a tensor ordering on inputs and outputs: batch as the first dimension, temporal length as the second

- **Services**: The API supports a user-friendly concept of a service (vs a component like the `ClassifierModel`) that has access to all 3 components required for inference (the vocabularies, the vectorizers and the models). It delegates deserialization to each component and can load any backend framework model with a simple `XXXService.load(model)` API. Inference is done using the `predict` method on the service class (a short sketch follows the diff). Full examples can be found for [classify](https://github.com/dpressel/baseline/blob/feature/v1/api-examples/classify-text.py), [tagging](https://github.com/dpressel/baseline/blob/feature/v1/api-examples/tag-text.py), [encoder-decoders](https://github.com/dpressel/baseline/blob/feature/v1/api-examples/ed-text.py) and [language modeling](https://github.com/dpressel/baseline/blob/feature/v1/api-examples/lm-text.py)
-- Services can also provide an abstraction to TF serving remote models (in production). These can be accessed by passing a `host` and `port`. The vectorizer and vocab handling works exactly as for local services, but the actual model is executed on the server instead.
+- Services can also provide an abstraction to TF serving remote models (in production). These can be accessed by passing a `remote` argument. The vectorizer and vocab handling works exactly as for local services, but the actual model is executed on the server instead.
+- Both HTTP/REST and gRPC are supported. HTTP/REST requires the `requests` package. To use gRPC, the `grpc` package is required, and there are stubs included in Baseline to support it (see the remote-service sketch after the diff)
- **Data/Reader Simplifications**: The readers have been simplified to use the `vectorizers`, and the `DataFeed` and underlying support components have been greatly simplified
- The readers delegate counting, as well as featurization, to the vectorizers. This is more DRY than in previous releases, and it means that readers can handle much more complex features without special casing (see the vectorizer sketch after the diff)
- The `Examples` and `DataFeed` objects are largely reused between tasks wherever possible
- The batching operation on the examples is now completely generalized, which makes adding custom features simple
- **Easier Extension Points**: We have removed the complexity of `addon` registration, preferring simple decorators over the previous convention-based plugins (a registration sketch follows the diff). Documentation can be found [here](https://github.com/dpressel/baseline/blob/feature/v1/docs/addons.md)
-- **Training Simplifications**: A design goal was that a user should easily be able to train a model without using `mead`. [It should be easier to use the Baseline API to train](https://github.com/dpressel/baseline/blob/feature/v1/api-examples/tf-train-classifier-from-scratch.py)
+- **Training Simplifications**: A design goal was that a user should easily be able to train a model without using `mead`. It should be easier to use the Baseline API to [train directly](https://github.com/dpressel/baseline/blob/feature/v1/api-examples/tf-train-classifier-from-scratch.py) or to [use external software to train a Baseline model](https://github.com/dpressel/baseline/blob/feature/v1/api-examples/tf-estimator.py) (a training sketch follows the diff)
- Multi-GPU support is consistent and defaults to all `CUDA_VISIBLE_DEVICES`
- **More Documentation**: There is more code documentation, as well as API examples that show how to use the **Baseline** API directly. These are also used to self-verify that the API is as simple to use as possible. There is forthcoming documentation on the way that `addons` work under the hood, as this has been a point of confusion for some users
- **Standardized Abstractions**: We have attempted to unify a set of patterns for each model/task, ensuring that the routines making up execution share a common naming convention and flow across frameworks
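The service pattern above can be illustrated with a short sketch. This is a minimal example, assuming the model bundle path and the pre-tokenized input format; the linked api-examples show the exact invocation for each task.

```python
# A sketch of the service API: load() pulls in the vocabularies, the
# vectorizers and the model together; predict() runs end-to-end inference.
# './classify-model.zip' is a placeholder path.
import baseline as bl

svc = bl.ClassifierService.load('./classify-model.zip')

# predict() takes pre-tokenized text; for a classifier it returns
# (label, score) pairs per input
tokens = [['this', 'was', 'a', 'great', 'movie']]
for label, score in svc.predict(tokens)[0]:
    print(label, score)
```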
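For the remote TF Serving case, the same service call is used but execution happens on the server. A sketch, assuming a gRPC endpoint on `localhost:8500`, that the local bundle still supplies the vocabularies and vectorizers, and that `name` is whatever the model was exported under:

```python
import baseline as bl

# Same service API as the local case, but the model runs remotely.
# The endpoint and export name are placeholders.
svc = bl.ClassifierService.load(
    './classify-model.zip',    # local bundle for vocabs and vectorizers
    backend='tf',
    remote='localhost:8500',   # gRPC endpoint; an http:// URL for REST
    name='classify-model'      # the name the model was served under
)
print(svc.predict([['the', 'service', 'was', 'fast']]))
```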
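The counting/featurization delegation described in the reader bullets can be sketched with a token vectorizer. The two-pass flow is the point here; the class name and the `count`/`run` signatures are my reading of the v1 API and may differ in detail:

```python
from collections import Counter
from baseline.vectorizers import Token1DVectorizer

vectorizer = Token1DVectorizer(mxlen=100)

# Pass 1: the reader delegates counting to the vectorizer
counts = Counter()
for tokens in [['a', 'small', 'example'], ['another', 'example']]:
    counts.update(vectorizer.count(tokens))

# Build a vocabulary from the counts (index 0 reserved for padding)
vocab = {w: i for i, w in enumerate(counts, 1)}

# Pass 2: featurization to a fixed-width, batch-first array plus length
arr, length = vectorizer.run(['another', 'small', 'example'], vocab)
print(arr.shape, length)
```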
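Addon registration with decorators might look like the following sketch of a custom classifier. The decorator and base-class names follow the linked addons doc as I understand it; the mean-pooling body is a made-up example that also shows the batch-first tensor convention:

```python
import tensorflow as tf
from baseline.model import register_model
from baseline.tf.classify import ClassifierModelBase

@register_model(task='classify', name='meanpool')
class MeanPoolClassifier(ClassifierModelBase):
    """Registered under the name 'meanpool' for use from mead configs."""
    def pool(self, word_embeddings, dsz, init, **kwargs):
        # Inputs are [batch, time, dsz] per the v1 ordering convention,
        # so mean-pooling over time is a reduction on axis 1
        return tf.reduce_mean(word_embeddings, axis=1)
```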
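Training without `mead` compresses to roughly the following shape. This is a loose sketch of the linked from-scratch example; the reader construction, the embedding loading, and the `fit` keyword arguments are all assumptions, so consult that example for exact signatures:

```python
import baseline as bl
from baseline.embeddings import load_embeddings
from baseline.vectorizers import Token1DVectorizer

# Vectorizers define how tokens become tensors; the reader delegates
# counting to them (file paths and hyperparameters are placeholders)
vectorizers = {'word': Token1DVectorizer(mxlen=100)}
reader = bl.TSVSeqLabelReader(vectorizers)
vocabs, labels = reader.build_vocab(['train.tsv', 'valid.tsv'])

# Load pretrained embeddings restricted to the observed vocabulary
embeddings = load_embeddings('word', embed_file='glove.6B.100d.txt',
                             known_vocab=vocabs['word'])

model = bl.model.create_model({'word': embeddings['embeddings']}, labels)

ts = reader.load('train.tsv', vocabs, batchsz=50)
vs = reader.load('valid.tsv', vocabs, batchsz=50)

# fit() drives the framework-specific training loop
bl.train.fit(model, ts, vs, epochs=5, optim='adam', eta=0.001)
```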
