Other functions supported by the General Refine Expression Language (GREL)
See also: GREL Functions.
Returns the type of o, such as undefined, string, number, etc.
Returns a boolean indicating whether o has a field called name. For example, cell.hasField("value") always returns true, as every cell has a value field.
Quotes a value as a JSON literal value.
Parses s as JSON. get can then be used with parseJson, e.g.,
parseJson(" { 'a' : 1 } ").get("a")
returns 1.
To get all instances from a JSON array called "keywords" having the same object string name of "text", combine with the forEach() function to iterate over the array, e.g.,
JSON:
{
"status":"OK",
"url":"",
"language":"english",
"keywords":[
{
"text":"York en route",
"relevance":"0.974363"
},
{
"text":"Anthony Eden",
"relevance":"0.814394"
},
{
"text":"President Eisenhower",
"relevance":"0.700189"
}
]
}
GREL expression:
forEach(value.parseJson().keywords,v,v.text).join(":::")
will output like this:
York en route:::Anthony Eden:::President Eisenhower
Returns an array of zero or more rows in the project projectName for which the cells in their column columnName have the same content as cell c (similar to a lookup).
NOTE: There is also a VIB-BITS OpenRefine extension that automates some of the GREL cross() functionality for ease of use with its "Add column(s) from other projects..." function.
- https://groups.google.com/forum/#!topic/openrefine/unN6dQsyYUs
- https://www.bits.vib.be/index.php/software-overview/openrefine
NOTE: If you don't get matches when using cross(), then you might need to first do a few things, such as
- trim() your key column before doing cross()
- Deduplicate values in your key column if necessary
OK, with that out the way, lets show you an example of how to use the cross() function. Consider 2 projects with the following data:
My Address Book
friend | address |
---|---|
john | 120 Main St. |
mary | 50 Broadway Ave. |
anne | 17 Morning Crescent |
Christmas Gifts
gift | recipient |
---|---|
lamp | mary |
clock | john |
Now in the project "Christmas Gifts", we want to add a column containing the addresses of the recipients, which are in project "My Address Book". So we can invoke the command Add column based on this column on the "recipient" column in "Christmas Gifts" and enter this expression:
cell.cross("My Address Book", "friend")[0].cells["address"].value
When that command is applied, the result is
Christmas Gifts
gift | recipient | address |
---|---|---|
lamp | mary | 50 Broadway Ave. |
clock | john | 120 Main St. |
To extract more than one value from the result of a cross, combine with the forEach() function to iterate over the array and the forNonBlank() function to prevent errors caused by null values present in the looked up column, e.g.
forEach(cell.cross("My Address Book", "friend"),r,forNonBlank(r.cells["address"].value,v,v,"")).join("|")
Returns the facet count corresponding to the given choice value
gift | recipient | price |
---|---|---|
lamp | mary | 20 |
clock | john | 54 |
watch | amit | 80 |
clock | claire | 62 |
If we wanted to show how many of each gift we had given, we could use the following expression:
facetCount(value, "value", "gift")
This would then complete the table as so:
gift | recipient | price | count |
---|---|---|---|
lamp | mary | 20 | 1 |
clock | john | 54 | 2 |
watch | amit | 80 | 1 |
clock | claire | 62 | 2 |
A tutorial of the below HTML functions is shown StrippingHTML
returns a full HTML document and adds any missing closing tags.
You can then extract or .select() which portions of the HTML document you need for further splitting, partitioning, etc.
An example of extracting all table rows <tr> from a <div> using parseHtml().select() together is described more in StrippingHTML.
returns an element from an HTML doc element using selector syntax.
Example:
value.parseHtml().select("div#content")[0].select("tr").toString()
This function can be used with most of the Jsoup selector syntax as documented here: http://jsoup.org/cookbook/extracting-data/selector-syntax
returns a value from an attribute on an HTML Element.
returns the text from within an element (including all child elements).
returns the innerHtml of an HTML element.
returns the outerHtml of an HTML element. (Note: outerHtml Indicates the rendered text and HTML tags including the start and end tags, of the current element.)
Gets the text owned by this HTML element only; does not get the combined text of all children.