Johan Louwers - Tech blog: December 2017

Sunday, December 31, 2017

Oracle Linux - start Apache Kafka as service

In a previous post I showcased how to run Apache Kafka on Oracle Linux. That post was intended as an example for testing purposes, and its downside was that you needed to start ZooKeeper and Kafka manually. Adding them to the startup scripting of your Oracle Linux system makes sense. Not only in a test environment but also, and especially, in a production environment you want Kafka to be started when you boot your machine. The below code snippet can be used for this.

Remember that, after you place the script in /etc/init.d, you have to use chkconfig to add it as a service, and you can use the service command to start it for testing.
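A minimal sketch of what such an init script could look like is shown below. It is not a definitive implementation: it assumes Kafka is installed in /opt/kafka (as in the installation post), and the runlevels and priorities in the chkconfig header, the KAFKA_START_DELAY variable and the fixed wait between ZooKeeper and Kafka are all assumptions made for this example.

```shell
#!/bin/bash
# chkconfig: 345 85 15
# description: Starts and stops ZooKeeper and Apache Kafka (illustrative sketch)

# Assumption: Kafka is installed in /opt/kafka; override KAFKA_HOME if not.
KAFKA_HOME="${KAFKA_HOME:-/opt/kafka}"

start() {
    echo "Starting ZooKeeper and Kafka"
    # ZooKeeper has to be up before the Kafka broker can register itself.
    "$KAFKA_HOME/bin/zookeeper-server-start.sh" -daemon "$KAFKA_HOME/config/zookeeper.properties"
    # Crude fixed wait; the default of 5 seconds is an assumption.
    sleep "${KAFKA_START_DELAY:-5}"
    "$KAFKA_HOME/bin/kafka-server-start.sh" -daemon "$KAFKA_HOME/config/server.properties"
}

stop() {
    echo "Stopping Kafka and ZooKeeper"
    # Stop the broker first, then ZooKeeper.
    "$KAFKA_HOME/bin/kafka-server-stop.sh"
    "$KAFKA_HOME/bin/zookeeper-server-stop.sh"
}

case "${1:-}" in
    start)   start ;;
    stop)    stop ;;
    restart) stop; start ;;
    *)       echo "Usage: $0 {start|stop|restart}" ;;
esac
```

Saved as, for example, /etc/init.d/kafka and made executable, the script can be registered with chkconfig --add kafka and started with service kafka start. A production-grade script would also want a status action and proper exit codes.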

Saturday, December 30, 2017

Oracle Linux - Install Apache Kafka

Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is essentially a "massively scalable pub/sub message queue architected as a distributed transaction log," making it highly valuable for enterprise infrastructures to process streaming data. Additionally, Kafka connects to external systems (for data import/export) via Kafka Connect and provides Kafka Streams, a Java stream processing library.

In this blog post we will install Apache Kafka on Oracle Linux. The installation is done in a test setup which is not ready for production environments, but it can very well be used to explore Apache Kafka running on Oracle Linux.

Apache Kafka is also provided as a service from the Oracle Cloud in the form of the Oracle Cloud Event Hub. This provides you with a running Kafka installation that can be used directly from the start. The below video shows the highlights of this service in the Oracle Cloud.

In this example we will not use the Event Hub service from the Oracle Cloud; we will install Kafka from the ground up. This can be done on a local Oracle Linux installation or on an Oracle Linux installation in the Oracle Cloud, making use of the Oracle IaaS components in the Oracle Cloud.

Prepare the system for installation.
In essence, the most important step you need to undertake is to ensure you have Java installed on your machine. The below steps outline how this should be done on Oracle Linux.

You can install the Java OpenJDK using YUM and the standard Oracle Linux repositories.

yum install java-1.8.0-openjdk.x86_64

You should now be able to verify that Java is installed, as shown in the example below.

[root@localhost /]# java -version
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)
[root@localhost /]#

This, however, does not ensure that JAVA_HOME and JRE_HOME are set as environment variables. To make sure they are, add the following two lines to /etc/profile.

export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
export JRE_HOME=/usr/lib/jvm/jre

After you have made the changes to this file, reload the profile by issuing a source /etc/profile command. This will ensure that the JRE_HOME and JAVA_HOME environment variables are loaded in the correct manner.

[root@localhost ~]# source /etc/profile
[root@localhost ~]#
[root@localhost ~]# env | grep jre
JRE_HOME=/usr/lib/jvm/jre
JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
[root@localhost ~]#

Downloading Kafka for installation
Before following the below instructions, it is good practice to check what the latest version is and download the latest stable Apache Kafka version. In our case we download the file kafka_2.11-1.0.0.tgz for the version we want to install in our example installation.

[root@localhost /]# cd /tmp
[root@localhost tmp]# wget http://www-us.apache.org/dist/kafka/1.0.0/kafka_2.11-1.0.0.tgz
--2017-12-27 13:35:51--  http://www-us.apache.org/dist/kafka/1.0.0/kafka_2.11-1.0.0.tgz
Resolving www-us.apache.org (www-us.apache.org)... 140.211.11.105
Connecting to www-us.apache.org (www-us.apache.org)|140.211.11.105|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 49475271 (47M) [application/x-gzip]
Saving to: ‘kafka_2.11-1.0.0.tgz’

100%[======================================>] 49,475,271  2.89MB/s   in 16s 

2017-12-27 13:36:09 (3.01 MB/s) - ‘kafka_2.11-1.0.0.tgz’ saved [49475271/49475271]

[root@localhost tmp]# ls -la *.tgz
-rw-r--r--. 1 root root 49475271 Nov  1 05:39 kafka_2.11-1.0.0.tgz
[root@localhost tmp]#

You can untar the downloaded file with tar -xvf kafka_2.11-1.0.0.tgz and then move it to the location where you want to place Apache Kafka. In our case we want to place Kafka in /opt/kafka, so we undertake the below actions:

[root@localhost tmp]# mkdir /opt/kafka
[root@localhost tmp]#
[root@localhost tmp]# cd /tmp/kafka_2.11-1.0.0
[root@localhost kafka_2.11-1.0.0]# cp -r * /opt/kafka
[root@localhost kafka_2.11-1.0.0]# ls -la /opt/kafka/
total 48
drwxr-xr-x. 6 root root    83 Dec 27 13:39 .
drwxr-xr-x. 4 root root    50 Dec 27 13:39 ..
drwxr-xr-x. 3 root root  4096 Dec 27 13:39 bin
drwxr-xr-x. 2 root root  4096 Dec 27 13:39 config
drwxr-xr-x. 2 root root  4096 Dec 27 13:39 libs
-rw-r--r--. 1 root root 28824 Dec 27 13:39 LICENSE
-rw-r--r--. 1 root root   336 Dec 27 13:39 NOTICE
drwxr-xr-x. 2 root root    43 Dec 27 13:39 site-docs
[root@localhost kafka_2.11-1.0.0]#
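The untar and copy steps can also be combined into a single step. The sketch below is one possible way to do this, assuming GNU tar as shipped with Oracle Linux. The function name install_kafka_tarball is hypothetical and only exists for this example.

```shell
# Hypothetical helper: unpack a Kafka tarball straight into a target directory.
# Usage: install_kafka_tarball <tarball> <target-dir>
install_kafka_tarball() {
    local tarball="$1" target="$2"
    mkdir -p "$target" || return 1
    # --strip-components=1 drops the leading kafka_2.11-1.0.0/ directory,
    # so bin/, config/ and libs/ end up directly under the target directory.
    tar -xzf "$tarball" -C "$target" --strip-components=1
}

# Example call with the paths used in this post. It is left commented out so
# the snippet can be sourced without touching /opt:
# install_kafka_tarball /tmp/kafka_2.11-1.0.0.tgz /opt/kafka
```

The advantage over cp -r * is that nothing is left behind in /tmp and hidden files in the archive, if any, are not skipped.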

Start Apache Kafka
The above steps should have placed Apache Kafka on your Oracle Linux system. Now we have to start it and test that it works. Before we can start Kafka on our Oracle Linux system we first have to ensure we have ZooKeeper up and running. To do so, execute the below command in the /opt/kafka directory.

bin/zookeeper-server-start.sh -daemon config/zookeeper.properties

Depending on the sizing of your machine you might want to change some things in the startup script for Apache Kafka. When, as in my case, you deploy Apache Kafka on an Oracle Linux test machine, you might not have as much memory allocated to the test machine as you might have on a "real" server. The line below, present in the bin/kafka-server-start.sh file, sets the heap size that should be used.

    export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"

In our case we lowered the initial heap size (-Xms) to 128 MB, which is more than adequate for testing purposes but might be far too little when deploying a production system. Note that the maximum heap size (-Xmx) is left at 1 GB. The below is an example of the setting as used for this test:

    export KAFKA_HEAP_OPTS="-Xmx1G -Xms128M"

This should enable you to start Apache Kafka for the first time as a test on Oracle Linux. You can start Apache Kafka from the /opt/kafka directory using the below command:

bin/kafka-server-start.sh config/server.properties
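As an alternative to editing the startup script, the heap settings can also be passed in through the environment, because the stock kafka-server-start.sh only applies its own default when KAFKA_HEAP_OPTS is empty. The below is a small sketch of that approach. The wrapper name start_kafka_small_heap is hypothetical, and the heap values are simply the ones used in this test.

```shell
# Hypothetical wrapper: start the broker with a smaller initial heap without
# editing bin/kafka-server-start.sh.
# Usage: start_kafka_small_heap [kafka-home]   (defaults to /opt/kafka)
start_kafka_small_heap() {
    local kafka_home="${1:-/opt/kafka}"
    # The variable is set for this one command only, so the rest of the
    # shell session is not affected.
    KAFKA_HEAP_OPTS="-Xmx1G -Xms128M" \
        "$kafka_home/bin/kafka-server-start.sh" "$kafka_home/config/server.properties"
}
```

This keeps the files under /opt/kafka identical to the downloaded distribution, which makes a later upgrade easier.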

You should see a trail of messages from the startup routine and, if all has gone right, the last message should be the one shown below. This should be an indication that Kafka is up and running.

INFO [KafkaServer id=0] started (kafka.server.KafkaServer)

Testing Kafka
Now that we have Apache Kafka up and running we could (and should) test whether it is working as expected. Kafka comes with a number of scripts that make testing easier. The below scripts come in useful when starting to test (or debug) Apache Kafka:

bin/kafka-topics.sh
Used for taking actions on topics, for example creating a new topic.

bin/kafka-console-producer.sh
Used for the role of producer of event messages.

bin/kafka-console-consumer.sh
Used for the role of consumer, receiving event messages.

The first step in testing is to ensure we have a topic in Apache Kafka to publish event messages to. For this we can use the kafka-topics.sh script. We will create the topic "test" as shown below:

[vagrant@localhost kafka]$
[vagrant@localhost kafka]$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
Created topic "test".
[vagrant@localhost kafka]$

To ensure the topic is really available in Apache Kafka we list the available topics with the below command:

[vagrant@localhost kafka]$ bin/kafka-topics.sh --list --zookeeper localhost:2181
test
[vagrant@localhost kafka]$

Having a topic in Apache Kafka should enable you to start producing messages as a producer. The below example showcases starting the kafka-console-producer.sh script. This will give you an interactive command line where you can type messages.

[vagrant@localhost kafka]$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
hello 1
hello 2
thisisatest
this is a test
{test 123}

[vagrant@localhost kafka]$

As the whole idea is to produce messages on one side and receive them on the other side, we also have to test the subscriber side of Apache Kafka to see if we can receive the messages on the topic "test" as a subscriber. The below command subscribes to the topic "test". The --from-beginning option indicates that we want to receive not only event messages created from this moment on, but all messages from the beginning (creation) of the topic (as far as they are still available in Apache Kafka).

[vagrant@localhost kafka]$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
hello 1
hello 2
thisisatest
this is a test
{test 123}

As you can see the event messages we created as the producer are received by the consumer (subscriber) on the test topic.
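The same test can be scripted instead of typed interactively. The sketch below pipes one message into the console producer and then checks that the console consumer returns it. The function name kafka_smoke_test and the message format are hypothetical. The --timeout-ms option is used so the consumer exits when no new message arrives within ten seconds, and the broker address localhost:9092 is the same one used above.

```shell
# Hypothetical helper: produce one message and verify it can be consumed.
# Usage: kafka_smoke_test [kafka-home] [topic]
kafka_smoke_test() {
    local kafka_home="${1:-/opt/kafka}" topic="${2:-test}"
    # Include the shell's process id so the message is recognisable.
    local message="smoke-test-$$"

    # The console producer reads messages from stdin, one per line.
    echo "$message" | "$kafka_home/bin/kafka-console-producer.sh" \
        --broker-list localhost:9092 --topic "$topic" || return 1

    # Read the topic from the beginning and look for our message; the
    # function returns 0 only when the message is found.
    "$kafka_home/bin/kafka-console-consumer.sh" \
        --bootstrap-server localhost:9092 --topic "$topic" \
        --from-beginning --timeout-ms 10000 2>/dev/null | grep -q "$message"
}
```

Calling kafka_smoke_test /opt/kafka test && echo "Kafka OK" gives a quick yes or no answer, which can be convenient after a restart of the machine.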

Wednesday, December 27, 2017

Oracle Cloud - The future of retail

Retail in general is changing. Customers are more and more getting into the driver's seat and voicing what they expect from a retailer. The days in which retailers were assured of business based upon the location of their store are over. Customers more and more tend to use online retailers in combination with brick and mortar stores. Ensuring you provide exactly the services customers desire is the right survival tactic for most retailers. Retailers who are unable to make this change commonly find themselves going bankrupt, simply because customers overlook them and do business with other retailers.

Next to providing customers with what they know they want, there is the tactic of understanding the customer in such a way that you can provide them with services they did not know they wanted. Surprising customers with unexpected services and options is vital for survival.

Being able to deliver goods the same day to any location the customer might be or might visit, changing advertising screens based upon the customers that are close to them, dynamically changing pricing throughout the day, analyzing the eye movement of customers to optimize shelving, or deploying robots in stores to help customers: these are all examples of retail innovations that are more than real and are already being used and tested by retailers.

Oracle Cloudday - the future of retail from Johan Louwers

Within this presentation a number of innovations are showcased and aligned with Oracle Cloud technology to support retailers to implement them. Leveraging the Oracle Cloud for retail innovation is something retailers should look into to ensure they are able to stay ahead of competition.

Oracle Cloud - Enterprise Security

Security is more and more becoming a topic which is getting the attention it needs. Enterprises and C-level executives are starting to grasp the importance of security. Legislation that forces enterprises to have the right level of security in place, to safeguard customer information and to ensure the safety of vital infrastructure services, is pushing the topic in the right direction. Even though the topic is starting to get the correct level of attention, the actual state of security at most enterprises is still lagging behind the reality of the threats. The majority of enterprises are still relying on old and outdated mechanisms to secure their vital IT assets and data.

Oracle Cloudday security from Johan Louwers

In this presentation an outline is given of the brutal truth about the state of security at the majority of enterprises. Additionally, an insight is given into the solutions Oracle and the Oracle Cloud provide to help enterprises face the security challenges of today.

Tuesday, December 19, 2017

Oracle Management Cloud - Manage multiple clouds like OPC, AWS & AZURE

There will be almost no enterprise that has all its systems in a single datacenter or with a single cloud provider. With the rise of hybrid cloud strategies and of cloud native and cloud born applications, we see more and more that application footprints become hybrid and span multiple cloud providers, and in some cases both multiple clouds and customer datacenters. There are very good reasons for this and technically there are no issues in achieving this.

However, management and monitoring of systems and services in a distributed environment that is spanning multiple cloud vendors and private datacenters can become extremely difficult if you do not plan this and ensure you have a strategy for it. Planning for it and having a strategy for it can, and this is very advisable, include the use of one single and central management and monitoring application.

When you use the Oracle Cloud Management solution you can, for example, monitor an application as a single application even though it might span over multiple cloud vendors.

The screenshot above shows the automatically generated and updated application topology of an application spanning both the Oracle Public Cloud (OPC) and AWS. In this example we have servers and services that are consumed from both cloud providers; however, monitoring is done from a single solution.

Having the option to manage all your applications, regardless of the location where they reside, is a big benefit for every company. It can improve the stability of your IT footprint and lower the time needed to resolve incidents.

Oracle Linux - Base64 encoding

Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. The term Base64 originates from a specific MIME content transfer encoding. For all clarity, Base64 encoding is NOT encryption and it will not make your message or data secure. However, a lot of good use cases for using a Base64 encoding exist and you will come across them frequently.

Working with Base64 encoded data (files, strings, etc.) while developing scripts in Bash requires you to understand the basics of how to handle it. Under Oracle Linux the most common way is using the base64 command.

As an example, we have a string to which we want to apply base64 encoding:

[vagrant@consul-dc2 tmp]$ echo "this is a clear text" | base64
dGhpcyBpcyBhIGNsZWFyIHRleHQK
[vagrant@consul-dc2 tmp]$

In the above example you will see that the clear text string is transformed into a base64 encoded string.

As an example, we have the base64 encoded string from the previous example and we want to decode this to clear text:

[vagrant@consul-dc2 tmp]$ echo "dGhpcyBpcyBhIGNsZWFyIHRleHQK" | base64 --decode
this is a clear text
[vagrant@consul-dc2 tmp]$

This, in effect, is all you have to do to handle base64 encoding under Oracle Linux.
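One detail worth knowing when you use this in scripts: echo appends a newline, and that newline is encoded together with the string (it is the trailing "K" in the output above). The sketch below uses printf to encode exactly the characters given. The -w 0 option shown at the end is specific to the GNU base64 found on Oracle Linux, which is an assumption about the platform you run this on.

```shell
# printf '%s' does not append a newline, so only the string itself is encoded.
encoded=$(printf '%s' "this is a clear text" | base64)
echo "$encoded"    # prints dGhpcyBpcyBhIGNsZWFyIHRleHQ=

# Decoding returns the original string, without a trailing newline.
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "$decoded"    # prints this is a clear text

# GNU base64 wraps its output at 76 characters by default. -w 0 disables
# wrapping, which is useful when the result has to stay on one line.
long_encoded=$(head -c 100 /dev/zero | base64 -w 0)
```

Whether the newline matters depends on the receiving side. For something like an HTTP basic authentication header it does, as the decoded value would otherwise end in a newline.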

Monday, December 18, 2017

Oracle Linux | Consul ServiceAddress from multiple datacenters

Consul, developed by HashiCorp, is a tool commonly used for service discovery and service registration in microservice based landscapes. When running a distributed landscape spanning multiple datacenters or multiple Oracle Cloud zones you will, most likely, be running multiple Consul datacenter clusters. A Consul datacenter cluster will take care of all service registration requests and all service discovery requests for its specific datacenter. Service discovery can be done based upon a REST API interface or it can be done based upon a DNS resolve mechanism.

The statement that a Consul datacenter cluster only takes care of the services registered in its own datacenter is not fully true. You can connect the Consul clusters in the different datacenters or Oracle Cloud zones together with a concept called WAN gossip. Using WAN gossip the clusters will be aware of each other and you can use your local cluster to query services in the other datacenter.

An example of the construct of a FQDN is shown below. In this example we query a service named testservice in datacenter dc1. A DNS resolve on this FQDN will give you all the IP addresses for servers (or Docker containers) running this service.

testservice.service.dc1.consul

If we query the same Consul cluster without dc1 (testservice.service.consul) we will get exactly the same IPs, as the Consul DNS service defaults the datacenter name to that of the datacenter it is responsible for. However, if we query testservice.service.dc2.consul we will get a list of all IPs for instances of testservice registered in the other datacenter.

In a lot of cases this is a very workable solution and it will cover most situations. It also provides a level of safeguard against cross-dc requests. However, in some cases you would like to have a full list of all IPs for a service without taking into consideration in which DC they reside.

In effect, Consul does not support this out of the box at this moment. If you use the API and not DNS, the below command is the quickest way to get a list of all IP addresses for a service without taking into account in which datacenter they reside.

[vagrant@consul-dc2 testjson]$ { curl -s "http://localhost:8500/v1/catalog/service/testservice?dc=dc1" ; curl -s "http://localhost:8500/v1/catalog/service/testservice?dc=dc2" ; } | jq '.[].ServiceAddress'
"10.0.1.16"
"10.0.1.17"
"10.0.1.15"
"10.0.1.16"
"10.0.2.20"
"10.0.2.15"
"10.0.2.16"
[vagrant@consul-dc2 testjson]$

A similar command could be made using dig and leveraging the DNS way of doing things. However, for some reason it feels like this is a missing option in Consul.

Do note that we also use jq. jq is available from the Oracle Linux YUM repository; however, you will have to enable an additional channel as it is not available in the ones enabled by default.
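The curl command shown above hardcodes the two datacenter names and, as the output shows, can return the same address more than once. A possible refinement is sketched below: it asks the local agent for the list of known datacenters via the /v1/catalog/datacenters endpoint and removes duplicates with sort -u. The function name consul_service_addresses is hypothetical, and the sketch assumes both curl and jq are installed.

```shell
# Hypothetical helper: print the unique ServiceAddress values of a service
# across every datacenter the local Consul agent knows about.
# Usage: consul_service_addresses <service> [consul-url]
consul_service_addresses() {
    local service="$1" consul="${2:-http://localhost:8500}"
    local dc
    # /v1/catalog/datacenters returns a JSON array of datacenter names.
    for dc in $(curl -s "$consul/v1/catalog/datacenters" | jq -r '.[]'); do
        curl -s "$consul/v1/catalog/service/$service?dc=$dc"
    done | jq -r '.[].ServiceAddress' | sort -u
}
```

A call such as consul_service_addresses testservice then prints one address per line, without quotes, which makes the result easy to use in other Bash scripting.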

Saturday, December 09, 2017

Oracle Linux - use tee to write to stdout and file

When writing scripts for Oracle Linux using Bash, a very common scenario is that you want to write something to both the screen and a file. In effect there is nothing that would prevent you from writing one line that writes to the screen and another that writes to a file. However, it is not that practical. You could decide to write a function that takes the line as input and handles both tasks. This will already limit the number of lines of code you might need. However, a more elegant solution can be found in the tee command.

The tee command will read from standard input and write to standard output and files and can take the following flags:

-a, --append : append to the given FILEs, do not overwrite

-i, --ignore-interrupts : ignore interrupt signals

So, in effect, if you want to write to screen and file in one single line of code you can use the tee command to achieve this. An example of this is shown below where we will write "hello world" to both the screen and a file.

[root@docker consul]# echo "hello world" | tee /tmp/world.txt
hello world
[root@docker consul]# cat /tmp/world.txt 
hello world
[root@docker consul]#

The nice thing about the tee command is that this will work for all output which is sent to stdout. This means you can spool almost everything to a file and show it on the screen at the same time.
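The -a flag and the handling of error output are easiest to see in a small example. The sketch below writes to a temporary file created with mktemp, and the path /nonexistent-path-for-example is made up purely to provoke an error message.

```shell
logfile=$(mktemp)    # temporary file, used only for this example

# The first write creates (or overwrites) the file.
echo "first line" | tee "$logfile"

# -a appends to the file instead of overwriting it.
echo "second line" | tee -a "$logfile"

# tee only sees stdout. Redirecting stderr into the pipe with 2>&1 makes
# sure error messages end up on the screen and in the file as well.
ls /nonexistent-path-for-example 2>&1 | tee -a "$logfile"
```

After running this, the file contains the two lines plus the error message from ls, in the same order as they appeared on the screen.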