API Compatibility with NumPy Arrays and SciPy Matrices for features #16

uwaisiqbal · 2017-06-28T14:51:22Z

At the moment the library only accepts a list of feature dictionaries, which for our purposes can consume an enormous amount of memory, even when using generators. Would it be possible to extend the API to accept NumPy arrays or SciPy sparse matrices generated by the sklearn DictVectorizer?

kmike · 2017-06-28T17:37:30Z

@Oasis789 crfsuite implements vectorization itself; that's why dicts are currently exposed. I wonder why you prefer DictVectorizer: the sklearn-crfsuite data format is largely compatible, with a few extra features usable for sequential models.

It could be possible to implement what you're suggesting using the crfsuite C API (https://github.com/jakevdp/pyCRFsuite did that), but it requires work.

See also: scrapinghub/python-crfsuite#38

uwaisiqbal · 2017-06-29T14:29:01Z

I wanted to put together a feature-generation pipeline that would include the CRF model and make use of sklearn feature unions. Feature unions concatenate the outputs of their transformers in the form of sparse matrices. I wanted to be able to feed this directly to the CRF model within the pipeline.
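As a rough illustration of a possible stopgap (a sketch under stated assumptions, not something the library provides), the rows of a CSR sparse matrix could be turned back into feature dictionaries lazily and then grouped into sequences. The helper names `csr_rows_to_dicts` and `split_into_sequences`, the feature names, and the sample data are all hypothetical. The functions take the raw CSR arrays, so the example runs without SciPy installed.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Sequence


def csr_rows_to_dicts(
    indptr: Sequence[int],
    indices: Sequence[int],
    data: Sequence[float],
    feature_names: Sequence[str],
) -> Iterator[Dict[str, float]]:
    """Yield one {feature_name: value} dict per row of a CSR matrix.

    Hypothetical helper: only the stored (non-zero) entries of each
    row are emitted, so the dicts stay as sparse as the matrix.
    """
    for row in range(len(indptr) - 1):
        start, end = indptr[row], indptr[row + 1]
        yield {
            feature_names[indices[k]]: float(data[k])
            for k in range(start, end)
        }


def split_into_sequences(
    rows: Iterable[Dict[str, float]], lengths: Sequence[int]
) -> Iterator[List[Dict[str, float]]]:
    """Group a flat stream of per-token dicts into sequences.

    `lengths` holds the number of tokens in each sequence (assumed known).
    """
    rows = iter(rows)
    for n in lengths:
        yield list(islice(rows, n))


# Made-up 3x4 matrix in CSR form:
# [[1.0, 0.0, 0.0,  0.5],
#  [0.0, 2.0, 0.0,  0.0],
#  [0.0, 0.0, 0.0, -1.0]]
indptr = [0, 2, 3, 4]
indices = [0, 3, 1, 3]
data = [1.0, 0.5, 2.0, -1.0]
names = ["bias", "is_upper", "word_len", "score"]

# Two sequences: the first has two tokens, the second has one.
X_seq = list(
    split_into_sequences(csr_rows_to_dicts(indptr, indices, data, names), [2, 1])
)
```

With a SciPy `csr_matrix` `X`, the same arrays are available as `X.indptr`, `X.indices` and `X.data`. This only bridges the interface gap inside a pipeline: the dictionaries are still created, so it does not by itself address the memory concern raised at the start of the thread.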

albertoandreottiATgmail · 2018-07-11T05:14:20Z

Hi @kmike, are floats used as features in dictionaries taken as they are, or do they undergo any transformation? I'm asking because I'm concerned about data sparsity: for example, if I encode my feature in a [-1, 1] range, I wouldn't want the vectorizer to create a separate feature for every possible value.
