This is a simple project that attempts to expose Chrome's natural language segmentation to JavaScript. It's particularly useful for languages like Chinese that aren't easily parsed into 'words'.
This is a jQuery plugin.
To tokenize the text in all 'p' tags on a page, use:
var tokens = $('p').findTokens();
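For a rough idea of what word segmentation produces, here is a minimal standalone sketch. It is not the plugin's implementation: it assumes the standard `Intl.Segmenter` API (available in recent Chrome and Node.js 16+), and `segmentWords` is a hypothetical helper name chosen for illustration.

```javascript
// Standalone sketch, NOT the plugin's code.
// Assumes the standard Intl.Segmenter API is available.
// segmentWords is a hypothetical helper name.
function segmentWords(text, locale) {
  var segmenter = new Intl.Segmenter(locale, { granularity: 'word' });
  var words = [];
  for (var part of segmenter.segment(text)) {
    // isWordLike is false for whitespace and punctuation segments.
    if (part.isWordLike) {
      words.push(part.segment);
    }
  }
  return words;
}

console.log(segmentWords('I like watermelon.', 'en'));
console.log(segmentWords('我很喜歡西瓜。', 'zh-Hant'));
```

How the Chinese sentence is split depends on the dictionary data shipped with the browser or runtime, so the result may vary between environments.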
To surround each token in a span so you can access them later...
<p> I like watermelon. </p>
<p> 我很喜歡西瓜。</p>
// $('.word-watermelon').text() === 'watermelon';
// $('.word-西瓜').text() === '西瓜';
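The `word-<token>` class names in the comments above suggest markup along the following lines. This is only a sketch of the idea under stated assumptions, not the plugin's method: `wrapTokens` and `escapeHtml` are hypothetical helpers, and segmentation again relies on the standard `Intl.Segmenter` API rather than on the plugin itself.

```javascript
// Hypothetical helpers, NOT part of the plugin's API.
// Escape characters that are unsafe inside HTML text or attribute values.
function escapeHtml(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Wrap every word-like segment in <span class="word-TOKEN">TOKEN</span>;
// whitespace and punctuation are copied through unchanged.
function wrapTokens(text, locale) {
  var segmenter = new Intl.Segmenter(locale, { granularity: 'word' });
  var html = '';
  for (var part of segmenter.segment(text)) {
    var safe = escapeHtml(part.segment);
    html += part.isWordLike
      ? '<span class="word-' + safe + '">' + safe + '</span>'
      : safe;
  }
  return html;
}

console.log(wrapTokens('I like watermelon.', 'en'));
```

With markup like this in the page, a selector such as `$('.word-watermelon')` can find the matching token later.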
Example here: http://codepen.io/psytau/pen/sjJKl
To install Karma and run the tests, clone the repository, then run:
npm install
karma run tests/karma.conf.js