To Do List (by Method)
agorajek edited this page Mar 22, 2011 · 17 revisions
Do not edit this document! Its content has been moved to jira.madlib.net as Tasks or New Features. To add a new TODO item, create it in JIRA: http://jira.madlib.net
- Add info output about target db and schema during installation/updates etc.
- DONE: Allow dense arrays as input (not SVEC only)
- DONE: Allow the method to start processing a source table without a "PointID" column (currently it needs both PID and POSITION columns)
- Process dense arrays w/o rewriting them into SVEC.
- Implement other distance measures for k-means.
- Fix the "goodness of fit" test to be scale-, rotation- and translation-invariant
- Rewrite the following PL/pgSQL stored procedures in C:
  - UDF: __kmeans_bestCentroid
  - UDA: __kmeans_meanPosition
- Overload the main kmeans_run PL/Python function with a second version that takes an additional dictionary argument to override the algorithm constants (e.g. sampling_size, max_iterations)
- Add multi-user / multi-session support (output tables based on RUN_ID parameter).
- Modify API to take a table/view name with predefined columns, instead of table and column names as parameters.
- Add validation for all parameters.
- Overload the main svdmf_run PL/Python function with a second version that takes an additional dictionary argument to override the algorithm constants (e.g. original_step, num_iterations)
- Convert all support tables to temporary tables.
- Support for other base types.
- Indexing on svecs. This requires indexing capability on arrays, which is currently unsupported in GP.
- Adjust documentation to our standards.
- Add support for sparse vectors (currently only arrays of float8 are supported).
- Prefix support functions with "__"
- Review API, add support for multi-user/session environment.
- Convert to Python
- Add parameter validation
- Add measures for the goodness of fit
- Add raw fmsketch and mfvsketch methods, and client-side Python to parse their output. This will allow the sketches to be pre-materialized and used in clients (e.g. Wrangler)
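
Several items above ask for a dictionary argument that overrides algorithm constants (for kmeans_run and svdmf_run), and for parameter validation. A minimal Python sketch of how the two could fit together is shown below. The helper name `resolve_constants`, the default values, and the "positive integer" rule are assumptions made for illustration only; they are not taken from the MADlib source.

```python
# Hypothetical defaults: the real constant names and values live in the
# MADlib source; these numbers are placeholders for illustration.
DEFAULT_CONSTANTS = {
    "sampling_size": 1000,
    "max_iterations": 20,
}


def resolve_constants(overrides=None, defaults=DEFAULT_CONSTANTS):
    """Return a copy of `defaults` with validated `overrides` applied."""
    resolved = dict(defaults)  # never mutate the shared defaults
    for name, value in (overrides or {}).items():
        # Reject keys that are not known constants (catches typos early).
        if name not in defaults:
            raise ValueError("unknown constant: %s" % name)
        # Assumed rule: every constant is a positive integer (bool excluded).
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(
                "%s must be a positive integer, got %r" % (name, value)
            )
        resolved[name] = value
    return resolved
```

An overloaded entry point could then call `resolve_constants(user_dict)` once at the start and read every constant from the returned dictionary, so the original no-dictionary signature keeps working by passing nothing.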