A repository containing the Python interface for accessing the HIRAX Layout DB, as well as the code for the web interface.
Assuming that JanusGraph, Flask, and React are installed and configured (see sections below), we can start the JanusGraph server, followed by the Flask server, and then the React server for testing.
To start the JanusGraph server, do
cd janusgraph-full-0.x.xx/
bin/janusgraph.sh start
Once the JanusGraph server is started, we start the Flask server (using port 4300, which is currently the default port that the React server communicates with):
cd flask-interface/
flask run --no-debugger -p 4300
Once that is finished, open up another terminal and start the OAuth proxy server:
cd oauth-proxy-server
npm install # For the first time running the project
npm start
Finally, we start the React server:
cd web-interface/
npm start
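Before opening the web interface, it can help to confirm that the backend servers are actually listening. The following is a minimal sketch and not part of the repository: it assumes everything runs on localhost, that JanusGraph uses its default port 8182, and that Flask uses port 4300 as above. The helper name port_is_open is hypothetical.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed ports: 8182 is the JanusGraph default, 4300 is the Flask port used above.
SERVICES = {"JanusGraph": 8182, "Flask": 4300}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "listening" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

If a service is reported as not reachable, start it (or wait for it to finish starting) before launching the next one in the sequence.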
The following steps outline how to install JanusGraph along with the Apache Cassandra storage backend and the Elasticsearch indexing backend, and how to set up Gremlin-Python, which is used for querying the JanusGraph backend from a Python interface. The operating system used is Windows 10 running Windows Subsystem for Linux with Ubuntu 20.04.2 LTS; however, the following instructions should also work on native Ubuntu 20.04.2 LTS.
JanusGraph is built on top of Apache TinkerPop, which, in turn, is built on top of Java and hence requires Java SE 8. The implementation of Java that we will install is OpenJDK 1.8. First, refresh the list of available packages:
sudo apt update
Next, install OpenJDK 1.8:
sudo apt install openjdk-8-jdk
To verify that the correct version has been installed, run java -version. A version similar to openjdk version "1.8.0_292" should be displayed.
Head to /usr/lib/jvm/ and locate the installation of the JDK. It should look similar to /usr/lib/jvm/java-8-openjdk-amd64. Set the $JAVA_HOME environment variable to point to the installation of the JDK:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
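The version check above can also be scripted. This is a minimal sketch under the assumption that java is on the PATH; the function name java_major_version is hypothetical. It parses the version string printed by java -version and handles both the legacy "1.8.0_292" scheme and the newer "11.0.11" scheme.

```python
import os
import re
import shutil
import subprocess


def java_major_version(version_output: str) -> int:
    """Extract the major Java version from `java -version` output.

    Legacy scheme: "1.8.0_292" -> 8. Modern scheme: "11.0.11" -> 11.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a version string")
    first, second = match.group(1), match.group(2)
    if first == "1" and second is not None:
        return int(second)
    return int(first)


if __name__ == "__main__":
    print("JAVA_HOME =", os.environ.get("JAVA_HOME", "(not set)"))
    java = shutil.which("java")
    if java is None:
        print("java not found on PATH")
    else:
        # `java -version` writes its output to stderr, not stdout.
        result = subprocess.run([java, "-version"], capture_output=True, text=True)
        print("Java major version:", java_major_version(result.stderr))
```

For the setup described here, the reported major version should be 8.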
From the JanusGraph Releases, download the .zip of the "full" installation of the latest JanusGraph version (the file name should resemble janusgraph-full-X.X.X.zip, where X.X.X is the version number), and extract the contents. This "full" installation includes pre-configured JanusGraph, Apache Cassandra and Elasticsearch.
Important: You should follow all the steps below for proper functioning of padloper.
- Add schema.default=none to the graph properties, i.e., to conf/janusgraph-cql-es.properties. This allows only vertices/edges of the right type to be added and throws an exception if you break the schema.
- The default memory settings can cause JanusGraph to use quite a bit of RAM. Here are the parameters you can/should tune:
  - With the default settings, the biggest hog is the Cassandra backend (which uses 25% of the RAM or 8 GB, whichever is less!). For something more reasonable, edit the MAX_HEAP_SIZE setting in cassandra/conf/cassandra-env.sh. You can set it to something like 512M or 1024M. So far 512 MB has proven perfectly adequate.
  - You can also set the JVM heap size for JanusGraph and the Elasticsearch backend. The relevant parameters are -Xms and -Xmx in each of elasticsearch/config/jvm.options, conf/jvm-8.options and conf/jvm-11.options. These represent the initial and maximum heap size allowed, respectively, and you should set them to the same value. For instance, for a heap size of 1 GB, set them to -Xms1g and -Xmx1g; for 512 MB, -Xms512m and -Xmx512m. If you use the 1 GB settings in each of the configuration files listed above, then a total of 2 GB would be used.
- From here, start the JanusGraph server by running
bin/janusgraph.sh start
- Finally, you need to define the schema. Open the Gremlin console as described in the next section, and execute the commands in the index_setup.txt file. This tells JanusGraph which vertex/edge properties are allowed and what their types are, and also creates indices for faster searching.
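The -Xms and -Xmx heap settings described above have to be edited consistently in several files, which is easy to get wrong by hand. As an illustration only, here is a minimal sketch of a helper (the name set_heap_size is hypothetical) that rewrites both flags in the text of a JVM options file. It assumes the flags already appear as uncommented lines; commented-out lines are left untouched, and nothing is added if the flags are missing.

```python
import re

# Matches an uncommented -Xms... or -Xmx... flag at the start of a line.
_HEAP_FLAG = re.compile(r"^([ \t]*-Xm[sx])\S*", flags=re.MULTILINE)


def set_heap_size(options_text: str, size: str) -> str:
    """Return options_text with every -Xms/-Xmx line set to the given size."""
    return _HEAP_FLAG.sub(lambda m: m.group(1) + size, options_text)


if __name__ == "__main__":
    sample = "# JVM heap\n-Xms4g\n-Xmx4g\n-XX:+UseG1GC\n"
    print(set_heap_size(sample, "512m"))
```

To apply it, read one of the options files into a string, pass it through set_heap_size, and write the result back after checking the output by eye.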
Once the JanusGraph server is running, we can open the Gremlin console by running
bin/gremlin.sh
Next, we may create a remote connection to the JanusGraph server. To keep variables available between requests when accessing the Gremlin server remotely from the Gremlin console, we connect to the server with a session:
:remote connect tinkerpop.server conf/remote.yaml session
From here, we can send commands to the JanusGraph server by preceding them with :>. We can avoid this by running
:remote console
which sends all queries directly to the JanusGraph server and removes the need for :>.
We can also access the JanusGraph server from a Python interface. First, we install the gremlinpython Python module by running
pip install gremlinpython
Important: make sure the version of gremlinpython is supported by your version of JanusGraph. For instance, for v0.6.2 of JanusGraph, more recent versions of gremlinpython are not supported and you have to run:
pip install gremlinpython==3.5.3
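To confirm which gremlinpython version actually ended up in the environment, a small check like the following can be used. This is a sketch with hypothetical helper names; it assumes JanusGraph 0.6.2 and treats 3.5.3 as the newest supported version, as stated above.

```python
import re
from importlib import metadata

# Assumption taken from the text above: newest gremlinpython usable with JanusGraph 0.6.2.
MAX_SUPPORTED = "3.5.3"


def version_tuple(version: str) -> tuple:
    """Turn '3.5.3' into (3, 5, 3); suffixes such as 'rc1' are ignored."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_supported(version: str, max_supported: str = MAX_SUPPORTED) -> bool:
    """Return True if version is not newer than max_supported."""
    return version_tuple(version) <= version_tuple(max_supported)


if __name__ == "__main__":
    found = installed_version("gremlinpython")
    if found is None:
        print("gremlinpython is not installed")
    else:
        status = "ok" if is_supported(found) else "too new for JanusGraph 0.6.2"
        print(f"gremlinpython {found}: {status}")
```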
Now, we may create a Python file to connect to and query the graph:
# Import full gremlinpython functionality
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
# Instantiate a Gremlin Graph clientside
graph = Graph()
# Connect to the server, instantiate traversal of graph. Note that the server is opened on port 8182 by default.
g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
# Get the vertices of the graph as a list, and print them.
print(g.V().toList())
This code will print a list of the vertices of the graph.
As of at least JanusGraph 0.6.2, this recommendation is no longer necessary, but it is left here in case a similar issue arises in the future.
Using JanusGraph 0.5.3 or later versions, it is possible that a Netty version between 4.1.44 and 4.1.46 is used. To check, navigate to lib/ and check the version numbers of the netty-all, netty-common, netty-buffer, netty-codec, netty-handler, netty-resolver, and netty-transport files. These versions have a bug that may cause intermittent Apache Cassandra exceptions that become more frequent as the number of vertices and edges gets larger. Read this GitHub issue for more details. If your Netty version is newer than 4.1.46, this section may be skipped.
Otherwise, head to the Maven repository for Netty and download the latest (stable) versions of netty-all, netty-common, netty-buffer, netty-codec, netty-handler, netty-resolver, and netty-transport, and replace the old .jar files in lib/ with these new files.
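Checking the Netty jars by eye is tedious, so the check can be sketched in Python. The helper below is hypothetical and makes two assumptions: the jar files are named like netty-all-4.1.46.Final.jar, and the affected range 4.1.44 to 4.1.46 is inclusive at both ends.

```python
import os
import re

# Assumed inclusive range of affected Netty versions.
AFFECTED_MIN = (4, 1, 44)
AFFECTED_MAX = (4, 1, 46)

# Assumed jar naming scheme: <artifact>-<major>.<minor>.<patch>[.<suffix>].jar
_NETTY_JAR = re.compile(r"^(netty-[a-z0-9-]+?)-(\d+)\.(\d+)\.(\d+)\b.*\.jar$")


def affected_netty_jars(filenames):
    """Return the Netty jar file names whose version is in the affected range."""
    affected = []
    for name in filenames:
        match = _NETTY_JAR.match(name)
        if match is None:
            continue  # not a Netty jar, or an unexpected naming scheme
        version = tuple(int(match.group(i)) for i in (2, 3, 4))
        if AFFECTED_MIN <= version <= AFFECTED_MAX:
            affected.append(name)
    return affected


if __name__ == "__main__":
    # Run from the JanusGraph installation directory so that lib/ resolves.
    if os.path.isdir("lib"):
        for jar in affected_netty_jars(sorted(os.listdir("lib"))):
            print("affected:", jar)
    else:
        print("lib/ not found; run this from the JanusGraph directory")
```

If nothing is printed for an existing lib/ directory, none of the jars matched the affected range under the assumptions above.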
To install Flask, run the following commands:
pip install -Iv Flask==2.0.1
pip install -Iv python-dotenv==0.19.0
This will install Flask version 2.0.1 and python-dotenv version 0.19.0 (see the TODOs; updating these will break Flask). python-dotenv reads the .flaskenv file in the flask-interface folder to configure the Flask server.
In web-interface, run npm install to install all dependencies. However, react-scripts must be set to version 4.0.3 (see the TODOs).