concrete-js is a Node.js/JavaScript library for working with Concrete, a set of NLP data types defined by a Thrift schema. Thrift makes it easy to use a shared set of data structures across multiple programming languages.
This document is available online alongside an API reference.
The concrete-js library is designed for visualization and annotation. While it is possible to implement NLP algorithms in JavaScript, consider using concrete-python or concrete-java instead.
In general, you should use concrete-js when you have a Concrete Communication object that was created by another tool, and need a user interface that visually represents data in the Communication. You may also need the user or annotator to interactively modify Concrete data structures and then save their modifications.
There are two supported distributions of concrete-js: a Node.js distribution and a JavaScript + jQuery distribution. These two distributions provide similar functionality but are designed to support different approaches to web development.
If you use the npm or yarn commands to install third-party libraries, the Node.js distribution will likely fit best in your workflow.
If you don't know which distribution you should use, we suggest using the JavaScript + jQuery distribution.
Both distributions provide serialization and deserialization code for Concrete data structures that is generated by Thrift according to the Concrete schema. They also provide utility functions for working with Concrete objects. The JavaScript + jQuery distribution additionally provides visualization code that maps Concrete data structures to DOM elements using jQuery.
The rest of this document describes the Node.js distribution of concrete-js. For documentation of the JavaScript + jQuery distribution, go to the JavaScript + jQuery concrete-js documentation page.
The Node.js distribution of concrete-js is published as @hltcoe/concrete.
To use concrete-js in your Node.js project, use npm install to add it to your dependencies:
npm install @hltcoe/concrete
The following code provides an example of using concrete-js: fetching and displaying the first ten Communications from a FetchCommunicationService. To use this example, replace the example URL with the URL of your own FetchCommunicationService:
const concrete = require("@hltcoe/concrete");

/**
 * Returns up to the first maxNumCommunications (ten by default)
 * Communications from a Fetch service.
 *
 * @param fetchClient A Fetch client for the desired Fetch service
 * @param maxNumCommunications The maximum number of Communications to
 *   retrieve
 * @returns A Promise resolving to a list of at most maxNumCommunications
 *   Communications (fewer if the Fetch service does not have that many).
 */
async function headCommunications(fetchClient, maxNumCommunications = 10) {
  // Ask the service how many Communications it holds so we never
  // request more IDs than exist.
  const numCommunications = await fetchClient.getCommunicationCount();
  const communicationIds = await fetchClient.getCommunicationIDs(
    0,
    Math.min(maxNumCommunications, numCommunications)
  );
  const request = new concrete.access.FetchRequest({
    communicationIds: communicationIds,
  });
  const result = await fetchClient.fetch(request);
  return result.communications;
}

// Replace this URL with the URL of your own FetchCommunicationService.
const client = concrete.util.createXHRClientFromURL(
  "http://localhost:8080/fetch_http_endpoint/",
  concrete.access.FetchCommunicationService,
);

headCommunications(client)
  .then((communications) =>
    communications.forEach((communication) => {
      console.log(`--- ${communication.id}`);
      console.log(communication.text);
      console.log("---");
    })
  )
  .catch((error) => console.error(error));
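The example above requests a single batch of IDs. If you need every Communication ID from a large service, one option is to page through them in fixed-size batches. The following is a minimal sketch of that idea, not part of concrete-js: listAllCommunicationIds and makeFakeFetchClient are hypothetical names introduced here, and the fake client only imitates the getCommunicationCount and getCommunicationIDs methods used above so that the sketch runs without a server. It assumes getCommunicationIDs(offset, count) returns at most count IDs starting at offset.

```javascript
// Hypothetical in-memory stand-in for a Fetch client (not part of
// concrete-js). It imitates only the two methods this sketch needs.
function makeFakeFetchClient(ids) {
  return {
    getCommunicationCount: async () => ids.length,
    getCommunicationIDs: async (offset, count) =>
      ids.slice(offset, offset + count),
  };
}

// Hypothetical helper: collects every Communication ID by requesting
// pages of at most pageSize IDs.
// Assumption: getCommunicationIDs(offset, count) returns at most `count`
// IDs starting at position `offset`.
async function listAllCommunicationIds(fetchClient, pageSize = 100) {
  const total = await fetchClient.getCommunicationCount();
  const allIds = [];
  for (let offset = 0; offset < total; offset += pageSize) {
    // The last page may be shorter than pageSize.
    const count = Math.min(pageSize, total - offset);
    const page = await fetchClient.getCommunicationIDs(offset, count);
    allIds.push(...page);
  }
  return allIds;
}

// Usage with the fake client. A real client would be created with
// concrete.util.createXHRClientFromURL, as in the example above.
const fakeClient = makeFakeFetchClient(["a", "b", "c", "d", "e"]);
listAllCommunicationIds(fakeClient, 2).then((ids) => console.log(ids));
```

Fetching IDs in pages keeps each request small. The resulting list can then be split into batches and passed to FetchRequest in the same way as in headCommunications.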
The concrete-js source code can be accessed from GitHub.
You do not need to build concrete-js yourself in order to use it. The Node.js package is available on the npm registry as @hltcoe/concrete, and a working copy of the JavaScript + jQuery library is available in the dist/ directory in the concrete-js repository or here: readable version, minified version. The concrete-js library only needs to be (re)built when the Thrift schema files for Concrete have been updated or when the code in src/ or src_nodejs/ is modified.
Requirements for building concrete-js:
- A clone of the Concrete schema repository at ../concrete
- Thrift
- Node.js (a JavaScript backend runtime and packaging tool)
First, install the Node.js packages required for building concrete-js:
npm ci
Then build concrete-js:
npx grunt
This command will build both the JavaScript + jQuery and the Node.js versions of the library and then regenerate the documentation:
- Build Node.js library
  - call the Thrift compiler to generate Node.js code for the Thrift schema under ~/concrete
  - run JSHint on the code to check for problems
  - combine the Thrift-generated code with the hand-written utility code in dist_nodejs
- Build JavaScript + jQuery library
  - call the Thrift compiler to generate JavaScript + jQuery code for the Thrift schema under ~/concrete
  - run JSHint on the code to check for problems
  - combine the Thrift-generated code with the hand-written utility code in dist
  - minify the combined code to reduce download size
  - download a supported version of thrift.js to dist/thrift.js
- Generate documentation in docs
Code linting will be performed automatically when concrete-js is built.
To run tests for both libraries, do:
npm test
Or to run the tests for just the Node.js library or just the JavaScript + jQuery library, do npm run test_nodejs or npm run test_js, respectively.
Read this section carefully, as the publication process is complex.
Publishing a new version of @hltcoe/concrete involves the following steps:
- Update the package version
- Perform a clean build of the package with the new version
- Add changes to git and commit
- Tag the commit
- Push the built package in dist_nodejs to NPM
- Update the package version to a pre-release version
- Add changes to git and commit
- Push main and the new tag to GitHub and GitLab
These steps are automated in publish.bash. Note that a failure at any point in the process will cause the script to exit with an error. The script echoes each command as it runs to help you recover from a failure.
Example usage:
./publish.bash patch
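The patch argument in the example presumably names which part of the version number to increment, following the major/minor/patch convention of semantic versioning. That is an assumption, since the internals of publish.bash are not shown here. The sketch below illustrates the convention only: bumpVersion is a hypothetical helper, the version "1.2.3" is an illustrative value rather than the real package version, and pre-release suffixes are deliberately left unsupported.

```javascript
// Hypothetical helper illustrating the semantic-versioning convention.
// It is NOT taken from publish.bash. It handles only plain "x.y.z"
// versions and rejects pre-release suffixes such as "-0".
function bumpVersion(version, kind) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) {
    throw new Error(`Unsupported version string: ${version}`);
  }
  const [major, minor, patch] = match.slice(1).map(Number);
  switch (kind) {
    case "major":
      // A major bump resets the minor and patch components.
      return `${major + 1}.0.0`;
    case "minor":
      // A minor bump resets the patch component.
      return `${major}.${minor + 1}.0`;
    case "patch":
      return `${major}.${minor}.${patch + 1}`;
    default:
      throw new Error(`Unknown bump kind: ${kind}`);
  }
}

// Illustrative version number, not the actual package version.
console.log(bumpVersion("1.2.3", "patch")); // prints "1.2.4"
```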