Monday, April 03, 2017

Oracle Linux - Install Neo4j

Neo4j is a graph database. A graph database is a database that uses graph structures for semantic queries, with nodes, edges and properties to represent and store data. A key concept of the system is the graph (or edge or relationship), which directly relates data items in the store. The relationships allow data in the store to be linked together directly, and in many cases retrieved with one operation.

This contrasts with conventional relational databases, where links between data are stored in the data, and queries search for this data within the store and use the join concept to collect the related data. Graph databases, by design, allow simple and fast retrieval of complex hierarchical structures that are difficult to model in relational systems. Graph databases are similar to 1970s network model databases in that both represent general graphs, but network-model databases operate at a lower level of abstraction and lack easy traversal over a chain of edges.

When developing a solution that depends heavily on the relationships between data points, a graph database such as Neo4j is a good choice. Examples are a solution where you need to gain insight into the relationships between historical events, the relationships between people and actions, or the relationships between events in a complex system. The last might be an alternative way of logging in a solution based on a distributed microservice architecture.

Install Neo4j on Oracle Linux
For those who would like to set up Neo4j and explore the options it might give your company, the short set of instructions below shows how to set up Neo4j on Oracle Linux. For those who use Red Hat, the instructions will most probably also work on Red Hat Enterprise Linux. However, the installation was done and tested on Oracle Linux.

The first thing we need to do is ensure we are able to use yum for the installation of Neo4j on our system. Other ways of obtaining Neo4j are also available and can be used; however, yum is the easiest way and provides the quickest result. A word of caution: Neo4j currently states that the yum-based installation is experimental. We have, however, not found any issues while using yum.

To ensure we have the GPG key associated with the Neo4j yum repository we have to import it. Shown below is an example of how you can download the key.

[root@oracle-65-x64 tmp]#
[root@oracle-65-x64 tmp]# wget http://debian.neo4j.org/neotechnology.gpg.key
--2017-04-02 13:14:07--  http://debian.neo4j.org/neotechnology.gpg.key
Resolving debian.neo4j.org... 52.0.233.188
Connecting to debian.neo4j.org|52.0.233.188|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4791 (4.7K) [application/octet-stream]
Saving to: “neotechnology.gpg.key”

100%[===========================================>] 4,791 --.-K/s   in 0.005s 

2017-04-02 13:14:08 (1.01 MB/s) - “neotechnology.gpg.key” saved [4791/4791]
[root@oracle-65-x64 tmp]#

As soon as we have obtained the key by downloading it from the Neo4j download location we can import the key by using the import option from the rpm command as shown below:

[root@oracle-65-x64 tmp]#
[root@oracle-65-x64 tmp]# rpm --import neotechnology.gpg.key
[root@oracle-65-x64 tmp]#

Having the key allows yum to validate the packages we download from the Neo4j repository before they are installed on our Oracle Linux machine. We do, however, have to ensure yum is able to locate the Neo4j repository. This is done by adding a repo file to the yum repository directory. Below is shown how the repository is added to the yum configuration. This command creates the file neo4j.repo in /etc/yum.repos.d, where yum can locate it and include the repository as a valid source to search for packages.

cat <<EOF > /etc/yum.repos.d/neo4j.repo
[neo4j]
name=Neo4j Yum Repo
baseurl=http://yum.neo4j.org/stable
enabled=1
gpgcheck=1
EOF
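Before running yum it can be worth confirming that the repo file contains what we expect, in particular that gpgcheck is switched on so the imported key is actually used. The function below is a minimal sketch of such a check. The name check_neo4j_repo is hypothetical, and the default path assumes the file was written exactly as in the command above.

```shell
# Sanity check for the repo file created above.
# check_neo4j_repo is a hypothetical helper name, not part of yum or Neo4j.
# Usage: check_neo4j_repo [path-to-repo-file]
check_neo4j_repo() {
    repo_file="${1:-/etc/yum.repos.d/neo4j.repo}"

    [ -r "$repo_file" ] || { echo "cannot read $repo_file" >&2; return 1; }

    # Each of these must appear as a complete line in the file.
    for setting in '[neo4j]' \
                   'baseurl=http://yum.neo4j.org/stable' \
                   'enabled=1' \
                   'gpgcheck=1'; do
        if ! grep -qxF -- "$setting" "$repo_file"; then
            echo "missing line: $setting" >&2
            return 1
        fi
    done

    echo "$repo_file looks complete"
}
```

Calling check_neo4j_repo without arguments inspects /etc/yum.repos.d/neo4j.repo. Pass another path to check a copy of the file elsewhere.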

Having both the key and the repository present on your system enables you to use yum for the installation of Neo4j. This means you can now use a standard yum command to install Neo4j on Oracle Linux. An example of the command is shown below.

[root@oracle-65-x64 tmp]#
[root@oracle-65-x64 tmp]# yum install neo4j

This should ensure you have Neo4j installed on your Oracle Linux instance.

Configuring Neo4j
As soon as you have completed the installation, a number of tasks need to be executed to ensure you have a properly working Neo4j installation.

By default Neo4j will not allow external connections to be made. This means that you can only connect to Neo4j by using the 127.0.0.1 address for the localhost. Even though this might very well be enough for a development or local test environment, it is not what you want when deploying a server. The Neo4j instance will also need to be accessible from the outside world. This requires a change to the Neo4j configuration file. The standard location for the configuration file, when Neo4j is deployed on an Oracle Linux machine, is /etc/neo4j. In this location you will find the neo4j.conf file, which holds all the configuration data for the Neo4j instance.

By default the line below is commented out. Uncomment it so that Neo4j accepts non-local connections:

dbms.connectors.default_listen_address=0.0.0.0
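If you prefer not to open the file in an editor, the change can be scripted. The function below is a minimal sketch. It assumes GNU sed (as shipped with Oracle Linux) and that the line is commented out with a single leading #. The name enable_remote_connections is hypothetical.

```shell
# Uncomment the listen-address line in neo4j.conf.
# enable_remote_connections is a hypothetical helper name.
# Usage: enable_remote_connections [path-to-neo4j.conf]
enable_remote_connections() {
    conf="${1:-/etc/neo4j/neo4j.conf}"

    # Keep a backup next to the original before editing in place.
    cp -p "$conf" "$conf.bak" || return 1

    # Strip the leading '#' (and any whitespace after it) from the setting.
    sed -i 's/^#[[:space:]]*\(dbms\.connectors\.default_listen_address=\)/\1/' "$conf" || return 1

    # Print the active line so the result can be confirmed by eye;
    # grep returns non-zero if the setting is still not active.
    grep '^dbms\.connectors\.default_listen_address=' "$conf"
}
```

Run without arguments, the function edits /etc/neo4j/neo4j.conf and leaves the previous version in /etc/neo4j/neo4j.conf.bak. If Neo4j is already running when you make the change, restart the service so the new setting is picked up.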

Additionally, you will want Neo4j to start during boot. For this you have to ensure Neo4j is registered as a service and activated. You can do so by executing the command below:

[root@oracle-65-x64 tmp]#
[root@oracle-65-x64 tmp]# chkconfig neo4j on

Now Neo4j should be registered as a service that will start automatically when the machine boots. You can check this by using the command below.

[root@oracle-65-x64 tmp]# chkconfig --list | grep neo4j
neo4j          0:off 1:off 2:on 3:on 4:on 5:on 6:off
[root@oracle-65-x64 tmp]#
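When this check is part of a provisioning script, reading the output by eye is not an option. The function below is a minimal sketch that reads chkconfig --list output on standard input and succeeds only when the service is on in runlevels 3 and 5, the multi-user and graphical runlevels on this kind of system. The name starts_at_boot is hypothetical.

```shell
# Succeed if the named service is "on" in runlevels 3 and 5.
# starts_at_boot is a hypothetical helper name; it expects the output
# of "chkconfig --list" on standard input.
# Usage: chkconfig --list | starts_at_boot neo4j
starts_at_boot() {
    service_name="$1"
    awk -v svc="$service_name" '
        $1 == svc {
            found = 1
            for (i = 2; i <= NF; i++) {
                if ($i == "3:on") r3 = 1
                if ($i == "5:on") r5 = 1
            }
        }
        # Exit status 0 only when the service was listed and both runlevels are on.
        END { exit !(found && r3 && r5) }
    '
}
```

A typical use would be: chkconfig --list | starts_at_boot neo4j || echo "neo4j will not start at boot"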

This does not, however, mean your Neo4j instance is running; you will have to start it manually the first time after installation. To check the status of Neo4j on Oracle Linux you can use the command below:

[root@oracle-65-x64 ~]# service neo4j status
neo4j is stopped
[root@oracle-65-x64 ~]#

To start it you can use the below command:

[root@oracle-65-x64 ~]# service neo4j start
[root@oracle-65-x64 ~]# service neo4j status
neo4j (pid  5643) is running...
[root@oracle-65-x64 ~]#

Now you should have a running Neo4j installation on your Oracle Linux instance which is ready to be used. You now also should be able to go to the web interface of Neo4j and start using it.
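To confirm from the command line that the web interface is reachable, you can poll it with curl. The function below is a minimal sketch. It assumes curl is installed and that Neo4j serves HTTP on its default port 7474, which is not stated above and may differ if you changed the connector settings. The name wait_for_neo4j is hypothetical.

```shell
# Poll the Neo4j web interface until it answers or we give up.
# wait_for_neo4j is a hypothetical helper name; port 7474 is assumed
# to be the HTTP port of this installation.
# Usage: wait_for_neo4j [url] [attempts]
wait_for_neo4j() {
    url="${1:-http://localhost:7474/}"
    tries="${2:-30}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # --fail makes curl return non-zero on HTTP errors as well.
        if curl --silent --fail --output /dev/null --max-time 2 "$url"; then
            echo "Neo4j is answering on $url"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "No answer from $url after $tries attempts" >&2
    return 1
}
```

Neo4j can take a little while to come up after service neo4j start, which is why the sketch retries rather than checking once. To test access from another machine, pass that machine's view of the URL, for example the server's hostname instead of localhost.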



Neo4j in the Oracle Cloud
When running Neo4j in the Oracle Cloud, the main installation of Neo4j is as described in the section above. A number of additional things need to be taken into consideration when deploying it within the Oracle Public Cloud.

When deploying Neo4j in the Oracle Cloud you will deploy it using the Oracle Public Cloud Compute Cloud Service. In the Compute Cloud Service you have the option to provision an Oracle Linux machine, and using the above instructions you will have a running machine in the Oracle Cloud within a couple of minutes.

The main points you need to consider concern how to set up your network security within the Oracle Cloud. This also ties into the overall design: who can access Neo4j, which ports should be open and which routes should be allowed.

The way Oracle Cloud works with networks, firewalls and zone configuration is a bit different from how it is represented in a standard environment. However, even though the Oracle Compute Cloud service uses some different terms and different ways of doing things it provides you with exactly the same building blocks as a traditional IT deployment to do proper zone configuration and shield your database and applications from unwanted visitors.

As general advice when deploying virtual machines in the Oracle Public Cloud: plan ahead and ensure you have your entire network and network security model mapped out and configured prior to deploying your first machine.

For the rest, using the Oracle Cloud for a Neo4j installation is exactly the same as running it in your own datacenter, with the exception that you can make use of the flexibility and speed of the Oracle Cloud.
