Wednesday, July 17, 2013

Oracle CEP, Hadoop and Oracle eBS

The internet has always centered around people. Enabling people to find information was one of the first uses of the public internet, next to sharing information via mail and newsgroups. The second phase, also called Web 2.0, brought the sharing culture, driven by the rise of social networks. In my opinion Web 2.0 is not about new technologies; it is about the way people think about and interact with the internet and the social media available on it. Now we see a new trend coming, and this one is almost completely technology driven: the internet of things. The internet of things is about machines using the internet, and especially using its network infrastructure.

In the internet of things phase you will see more and more equipment being connected to the internet. Your phone is the obvious example, but your stereo and your television are most likely already connected as well. Soon your fridge, your microwave, your door locks, your scale and more will follow. Some of those connections will help you organize your life and make things easier; some will also help vendors create better products.

Take the washing machine as an example. A connected washing machine can send you a tweet or another form of message to tell you it is done and you can take your fresh clothing out. However, the connection can also serve a purpose for the engineers building washing machines. For an engineering team it is very valuable to know how many times a week you use the machine, what the average load is and which program you use. Other things might be interesting as well: the average weight placed on top of the machine, and the average temperature, air pressure and humidity in the room where it stands. All this information can help an engineering team build a better machine and spot opportunities for a specific market segment based on a geographical or demographic profile.
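
To make the kind of data concrete, a single reading could be modeled roughly like the class below. This is only a sketch; the field names and units are purely illustrative, not taken from any real product.

```java
// Hypothetical shape of one telemetry message a washing machine might send;
// all field names are invented for illustration.
public class WashReading {
    public String machineId;    // unique serial number of the machine
    public long   timestamp;    // epoch millis when the reading was taken
    public String program;      // selected wash program
    public double loadKg;       // measured load weight in kilograms
    public double topLoadKg;    // weight placed on top of the machine
    public double roomTempC;    // ambient temperature in the room
    public double humidityPct;  // relative humidity in the room
    public double pressureHPa;  // air pressure in the room
}
```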

Building a board and connecting the sensors to measure those things is not the biggest hurdle; even an amateur in electronics can build such a device with ease, for example with a Raspberry Pi and some off-the-shelf sensors from adafruit.com. Connecting it to the internet via the home wifi and having it send valuable information back is not the big issue either. The real challenge comes down to two things.
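
As a rough illustration, a minimal publisher running on the Pi could look like the sketch below. The collector URL and the JSON fields are assumptions made up for this example, not part of any real product.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Minimal sketch of a home-built sensor board pushing one reading to a
// collector endpoint over the home wifi. The endpoint URL and the JSON
// fields are invented for illustration.
public class SensorPublisher {
    public static void main(String[] args) throws Exception {
        String json = "{\"machineId\":\"WM-0001\",\"program\":\"COTTON_60\","
                    + "\"loadKg\":4.2,\"roomTempC\":21.5,\"humidityPct\":55.0}";
        URL url = new URL("http://collector.example.com/readings"); // assumed endpoint
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("POST");
        con.setRequestProperty("Content-Type", "application/json");
        con.setDoOutput(true);
        try (OutputStream os = con.getOutputStream()) {
            os.write(json.getBytes("UTF-8"));
        }
        System.out.println("Collector replied HTTP " + con.getResponseCode());
        con.disconnect();
    }
}
```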

Consider that you are a large enterprise planning to ship millions of products and collect data from them. Your first issue will be storing this information in a sensible way. The second issue is retrieving this information and making it usable for future analysis. Such analysis can serve multiple departments within your company, for example the marketing department, the engineering department and the warranty and claims department.

What interests me especially is the gathering and storing part. To handle a high flow of data, take the correct actions on it and store it properly in your datastore, you need some sort of mechanism in place. The first solution that comes to mind is writing the data line by line to a large file on an HDFS filesystem and then having a MapReduce job chunk it into the correct format. A great addition, however, is to use Oracle CEP to look into the data and take action before you write it to HDFS. Oracle CEP stands for Complex Event Processing; the product was formerly known as WebLogic Event Server.
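
As a minimal sketch of that first solution, the snippet below appends readings line by line to a file on HDFS via the Hadoop FileSystem API. The namenode address and file path are assumptions, and append support must be enabled on the cluster.

```java
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: append one sensor reading as a line to a big file on HDFS,
// ready to be chunked later by a MapReduce job.
public class HdfsReadingWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020"); // assumed cluster address
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/data/washers/readings.log"); // assumed path
        // Appending requires the cluster to have append support enabled.
        FSDataOutputStream out = fs.exists(file) ? fs.append(file) : fs.create(file);
        out.write("WM-0001,1374058800000,COTTON_60,4.2,21.5,55.0\n"
                .getBytes(StandardCharsets.UTF_8));
        out.close();
        fs.close();
    }
}
```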

Oracle CEP is a Java server for the development and deployment of high-performance event driven applications. It is a lightweight Java application container based on Equinox OSGi, with shared services, including the Oracle CEP Service Engine, which provides a rich, declarative environment based on Oracle Continuous Query Language (Oracle CQL) - a query language based on SQL with added constructs that support streaming data - to improve the efficiency and effectiveness of managing business operations. Oracle CEP supports ultra-high throughput and microsecond latency using JRockit Real Time and provides Oracle CEP Visualizer and Oracle CEP IDE for Eclipse developer tooling for a complete real time end-to-end Java Event-Driven Architecture (EDA) development platform.
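
To give an idea of what such a continuous query could look like, below is a hedged sketch of the kind of Oracle CQL one might register with a CQL processor, held as a Java string constant. The stream name, column names and threshold are invented, and the exact window syntax may differ per release; check the CQL reference for your version.

```java
// Hedged sketch of a CQL query: a five-minute sliding window that flags
// machines whose average motor temperature exceeds a threshold. Stream
// and column names are invented for illustration.
public class FaultQueries {
    public static final String DETECT_OVERHEAT =
          "SELECT machineId, AVG(motorTempC) AS avgTemp "
        + "FROM WashStream [RANGE 5 MINUTES] "
        + "GROUP BY machineId "
        + "HAVING AVG(motorTempC) > 90.0";
}
```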

The image below shows a deployment that uses Oracle CEP along with some other parts.


As you can see in the image above, the washing machines report their sensor readings to the Oracle CEP stream adapter, which in turn streams them into the CQL processor. In this setup the CQL processor monitors all incoming sensor readings. Most of the results are passed straight through to HDFS storage, where Hadoop later chops them into usable parts and fills the databases of the different departments. The big differentiator in this setup is that when the CQL processor detects a fault in a washing machine, for example a broken part, it can call a Java event bean, which in turn can send a message to the ERP system that a repair is needed for a specific machine. With Oracle E-Business Suite, for example, we could send a trigger that spawns a service task and assigns a specific service engineer to it. The service department then informs the customer that something is broken and, using the appropriate modules in Oracle E-Business Suite, plans the repair.
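
The sketch below shows the kind of logic such a Java event bean could run when the CQL processor emits a fault event. In a real Oracle CEP application this method would be wired as the event sink of the fault channel; the ERP client here is a hypothetical stand-in, not a real E-Business Suite API.

```java
// Sketch of fault handling: open a service request in the ERP system so
// a service engineer can be scheduled for the broken machine.
public class FaultEventBean {

    /** Hypothetical integration client towards Oracle E-Business Suite. */
    static class ErpServiceClient {
        String createServiceRequest(String machineId, String faultCode) {
            // A real implementation would call an EBS integration service
            // (e.g. SOAP/REST); here we just fabricate a request id.
            return "SR-" + System.currentTimeMillis();
        }
    }

    private final ErpServiceClient erp = new ErpServiceClient();

    // Called for every fault event coming out of the CQL processor.
    public void onFaultEvent(String machineId, String faultCode) {
        String requestId = erp.createServiceRequest(machineId, faultCode);
        System.out.println("Opened service request " + requestId
                + " for machine " + machineId);
    }
}
```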

When the service engineer is done, the machine can report, in the same way all sensor data is sent, that it is operating correctly again, and the service request can be closed automatically.
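
A minimal sketch of that automated closure could look like the class below; the in-memory request store and the closing call are hypothetical stand-ins for a real EBS integration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Counterpart sketch: when healthy readings arrive again for a machine
// with an open service request, close that request automatically.
public class RepairCompletionHandler {
    // machineId -> open service request id, recorded when a fault is raised
    private final Map<String, String> openRequests = new ConcurrentHashMap<>();

    public void onFaultRaised(String machineId, String requestId) {
        openRequests.put(machineId, requestId);
    }

    public void onHealthyReading(String machineId) {
        String requestId = openRequests.remove(machineId);
        if (requestId != null) {
            // A real implementation would call the EBS service request API here.
            System.out.println("Closing service request " + requestId);
        }
    }
}
```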

This example shows that by using machine-to-machine communication and combining different technologies, both the company building the products and its customers can benefit from information being reported directly. By handling the enormous amounts of sensor readings the Hadoop way, we can ensure that every department gets the data it needs in its data warehouse without being overloaded with information. The above is a very simple, very high-level example; building a solution like this in the real world is still a challenge and requires a large number of different skills, but it can help a company and its customers in many ways.
