Orange3 Text
============
Orange add-on for text mining. It provides access to publicly available data,
like NY Times, Twitter and PubMed. Further, it provides tools for preprocessing,
constructing vector spaces (like bag-of-words, topic modeling and word2vec) and
visualizations like word cloud and geo map. All features can be combined with
powerful data mining techniques from the Orange data mining framework.

See [documentation](http://orange3-text.readthedocs.org/).

Features
--------
#### Access to data
* Load a corpus of text documents
* Access publicly available data (The Guardian, NY Times, Twitter, Wikipedia, PubMed)
#### Text analysis
* Preprocess corpus
* Generate bag of words
* Embed documents into vector space
* Perform sentiment analysis
* Detect emotions in tweets
* Discover topics in the text
* Compute document statistics
* Visualize frequent words in the word cloud
* Find words that enrich selected documents
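
#### Example: what a bag of words looks like

The sketch below illustrates the idea behind the "Generate bag of words" feature using only the Python standard library. It does not use the add-on's own API: the `tokenize` and `bag_of_words` helpers and the two sample documents are hypothetical and exist only for illustration. The add-on's actual preprocessing and vectorization are likely more elaborate.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts word j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is the sorted set of all words seen in any document.
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, matrix


# Hypothetical sample corpus, not data shipped with the add-on.
docs = ["Text mining with Orange", "Orange makes text mining visual"]
vocabulary, matrix = bag_of_words(docs)
print(vocabulary)
for row in matrix:
    print(row)
```

Each row is one document and each column is one vocabulary word, so the rows can be fed to any method that expects a numeric feature table.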