This subdirectory contains the Cypher TCK.
The TCK consists of a number of Cucumber .feature
files, which specify four things:
-
The required initial state of the graph; i.e. the graph prior to the execution of
Q
, the Cypher query of interest -
Q
, along with any required parameter names and values -
Expected results from executing
Q
, or expected errors ifQ
was an invalid query-
See Format of the expected results for a description of how the results are formatted
-
-
Expected side effects from executing
Q
For JVM-based implementations, there is a Scala API which parses and provides the TCK as an in-memory object structure, where the implementation simply has to hook in via implementing a small interface. For details, see the TCK tools README.
If this API is unavailable on your platform, you can integrate the TCK via Cucumber, either by depending on the TCK via Maven, or by copying the feature files directly from this repository.
<dependency>
<groupId>org.opencypher</groupId>
<artifactId>tck</artifactId>
</dependency>
Each TCK feature file is made up of scenarios (see the Cucumber documentation for more on this), and these all follow the schematic setup described here.
Q
which creates a node with one property, where the property value is given by a parameter.Scenario: Creating a node
Given an empty graph // (1)
And after having executed:
"""
CREATE () // (2)
"""
And parameter values are: // (3)
| parameter | 0 |
When executing query:
"""
CREATE ({property: $parameter}) // (4)
"""
Then the result should be empty // (5)
And the side effects should be: // (6)
| +nodes | 1 |
| +properties | 1 |
Q
- matching all nodes and returning one of them - on the graph <GRAPH_NAME>.Scenario: Returning a single node
Given the <GRAPH_NAME> graph // (1)
When executing query:
"""
MATCH (n)
RETURN n
LIMIT 1 // (4)
"""
Then the result should be, in any order: // (5)
| n |
| () |
Q
- matching all nodes and returning one of them - on the graph <GRAPH_NAME>.Scenario: Indexing a list with a string
Given any graph // (1)
When executing query: WITH [0] AS expr, 'x' AS idx RETURN expr[idx] // (4)
Then a TypeError should be raised at runtime: ListElementAccessByNonInteger // (7)
-
This step specifies the required initial state of the graph.
<GRAPH_NAME>
is the name of one of the specified graphs for initial states (read more in Graphs for initial states). -
This step is used to specify an initialization query by scenarios that require certain patterns to exist in the graph.
-
If
Q
uses parameters, this step will specify a two-column table detailing the required parameter names and values. The parameter values are in the same format as the expected results. -
The actual Cypher query,
Q
. -
This step specifies the expected results of executing
Q
, represented in a table format if not empty. The first row contains the column names, and subsequent rows contain values. However, ifQ
contains aRETURN
clause with anORDER BY
subclause, then this line should instead beThen the result should be, in order:
, thus ensuring that the result table will be ordered by the first column appearing afterORDER BY
. -
This step specifies the expected side effects of executing
Q
, with the side effect name and the relevant quantity. Read about the possible side effects in Side effects of executing a query. -
This step specifies the expected error that should be raised by executing the invalid
Q
. This step is described in detail in Cypher errors.
The keyword Given
in the TCK’s scenarios specifies the required initial state for the scenario.
In order to not obfuscate the purpose of the scenario - which is to display the behaviour of the query - the initial state is usually simple.
Certain queries do require a more complex graph structure in order to yield less obvious behavior, and in these cases a named graph included in the TCK must be set up by the implementation before the query is executed.
The TCK currently contains these named graphs:
-
binary-tree-1
-
A binary tree of depth 4, with only label
:X
on non-root nodes, and three different relationship types.
-
-
binary-tree-2
-
A binary tree of depth 4, with labels
:X
or:Y
on non-root nodes, and three different relationship types.
-
For instructions on how to use the named graph descriptions and metadata, please see Named Graphs.
A Cypher query that contains write clauses may have side effects that are persisted in the graph.
A side effect is either the addition (denoted by +
) or the removal (-
) of one of the following:
-
A node, denoted by
nodes
-
A relationship, denoted by
relationships
-
A property, denoted by
properties
-
A label, denoted by
labels
An unspecified quantity implies that it is expected to be zero.
If the side effects step reads And no side effects
, this implies that all quantities are expected to be zero.
For 'negative' tests, where errors are expected (see Cypher errors), it is implied that the graph suffers no side effects.
In order for a side effect to be reported, it has to be observable from the point of view of a subsequent Cypher query executed against the same graph. This means that side effects that are only temporarily in effect during the execution of a query are not measured in this metric.
For example, the query CREATE (n) DELETE n
, which creates a node only to immediately delete it, may be correctly implemented as a no-op by a Cypher implementation, and a TCK scenario featuring it should not specify any side effects.
Concretely, observability of each metric is defined by one Cypher query per metric, which will present the metric as the difference in returned records from executing the query before and after the query Q
under test.
These defining queries are listed in the following.
properties
metric:MATCH (n)
UNWIND keys(n) AS key
WITH properties(n) AS properties, key, n
RETURN n AS entity, key, properties[key] AS value
UNION ALL
MATCH ()-[r]->()
UNWIND keys(r) AS key
WITH properties(r) AS properties, key, r
RETURN r AS entity, key, properties[key] AS value
Note that in the definition above, a property is defined as the triple of containing entity, key, and value. Therefore the operation of moving a property from one entity to another will be noted as one removal and one addition in the side effects. Likewise, the operation of changing the value of a property is noted as one removal and one addition in the side effects.
Values that can be returned from Cypher can be categorized into three groups: primitives (integers, floats, booleans, and strings), containers (lists and maps), and graph elements (nodes, relationships, and paths). Please refer to the Cypher Type System specification for more information about types and values in Cypher.
Unless there is an ORDER BY
present in the RETURN
clause of the Cypher query, Cypher provides no guarantees as to the order in which the records are returned.
In theory, this means that executing the same query twice could yield the same records returned in different orders.
For this reason, the rows of the expected results table are to be considered a set, rather than a list, unless the above condition on the RETURN
clause is met.
-
Primitives:
-
An integer will be written as a simple string of decimal digits.
-
A float will be written in decimal form with all present decimals, or in scientific form, or with the strings
NaN
,Inf
, or-Inf
for the IEEE 754 special values. -
A string will be written as a string of unicode characters, wrapped in single quotes.
-
Note that Cypher makes no difference between single and double quotes (when used as string indicators), but the TCK will always use single quotes in the expected values.
-
-
A boolean will be written as the string
true
orfalse
. -
A null value will be written as the string
null
.
-
-
Containers:
-
A list will be written as
[v0, v1, …, vn]
, wherevi
are the values contained in the list.-
Lists in Cypher may contain any combination of values, including lists (nesting).
-
-
A map will be written as
{k0: v0, k1: v1, …, kn: vn}
, whereki
are the keys andvi
the values of the map.-
Map keys in Cypher are strings (with some constraints), while values may be of any type.
-
-
-
Graph elements:
-
A node with labels
L1
andL2
, and propertiesp
andq
with values0
and'string'
, respectively, will be written as(:L1:L2 {p: 0, q: 'string'})
. -
A relationship with type
T
, and properties as the node above, will be written as[:T {p: 0, q: 'string'}]
. -
A path will be written as
<n0-r1->n1<-r2- … -rk->nk>
, whereni
andri
are the nodes and relationships, respectively, that make up the path.-
Note that the relationship direction is always specified and may be left-to-right or right-to-left, as exemplified in the outline.
-
Note that the smallest possible path, with length zero, consists of one node and zero relationships.
-
-
In order to implement the Cypher TCK, you will have to retrieve the full suite of TCK feature files, which are best found in this GitHub repository under features
.
The TCK feature files are also included in the resources
path of a Maven JAR archive that is periodically released as part of the openCypher release process.
Find the latest version via Maven Central.
The Then
step used to specify expected errors from running a given invalid query follows this schematic setup:
Then a TYPE should be raised at PHASE: DETAIL
TYPE will be one of the following error types:
-
SyntaxError "The statement contains invalid or unsupported syntax."
-
SemanticError "The statement is syntactically valid, but expresses something that the database cannot do."
-
ParameterMissing "The statement refers to a parameter that was not provided in the request."
-
ConstraintVerificationFailed "A constraint imposed by the statement is violated by the data in the database."
-
ConstraintValidationFailed "A constraint imposed by the database was violated."
-
EntityNotFound "The statement refers to a non-existent entity."
-
PropertyNotFound "The statement refers to a non-existent property."
-
LabelNotFound "The statement refers to a non-existent label."
-
TypeError "The statement is attempting to perform operations on values with types that are not supported by the operation."
-
ArgumentError "The statement is attempting to perform operations using invalid arguments."
-
ArithmeticError "Invalid use of an arithmetic operation, such as dividing by zero."
PHASE will be either runtime
or compile time
.
DETAIL is a more fine-grained categorization of the error, and will describe the actual circumstance that caused the error to happen.
TCK might be executed on graph databases with a schema. To avoid errors when values with different types are assigned to properties with the same name, following property names are suggested:
Type | Property name |
---|---|
Variable types |
|
Integer |
|
String |
|
Float |
|
Temporal |
|
Boolean |
|
List |
plural of type e.g. |
The Cypher TCK is licensed with Apache license 2.0, which is inherited from the containing openCypher
project.
Read more in the openCypher
README.