Sunday, May 27, 2012

Finding objects in Oracle via user_objects

When working with Oracle databases within a large company with multiple customers (departments or external customers) you will be faced with the situation that not all databases are alike. Some databases will have a strict data model and a strict documentation policy. For those databases you will know exactly what is in the data model and you will be able to find all documentation for it. This is normally the case with production databases and the associated D(evelopment), T(est) and A(cceptance) environments. However, when it comes to the "play" environments and the environments used for research and development you are not always that lucky, especially when you are looking into a database that is used by multiple developers to work on small coding projects and to learn new tricks of the trade.

In those cases it is not uncommon that you have to reverse engineer some parts of the code and, from time to time, find lost objects. Someone stating something like "yes, I stored that in a table a year ago and called it something like HELP" is not uncommon. In those cases you will have to start looking for the object, and to do so your best friend is the USER_OBJECTS view in the Oracle database.

The USER_OBJECTS view holds information on all the objects owned by the current user. This will help you find the table you are looking for. Some people like to query CAT directly and do something like:

SELECT
      *
FROM
    CAT
WHERE
     TABLE_NAME LIKE 'HELP'
This however will only give you the table name (HELP) and the table type (TABLE). Secondly, you have limited options to search. You can imagine that the person stating that the table name was HELP might be mistaken, as it is more than a year since he created it. It might very well be that the table name is USERHELP, and it might just as well be that a lot of objects have "HELP" somewhere in their name. I personally think that using USER_OBJECTS gives you just that extra power over CAT to find the correct object quickly.

Below you see a describe of the USER_OBJECTS view:
Name           Null Type          
-------------- ---- ------------- 
OBJECT_NAME         VARCHAR2(128) 
SUBOBJECT_NAME      VARCHAR2(30)  
OBJECT_ID           NUMBER        
DATA_OBJECT_ID      NUMBER        
OBJECT_TYPE         VARCHAR2(19)  
CREATED             DATE          
LAST_DDL_TIME       DATE          
TIMESTAMP           VARCHAR2(19)  
STATUS              VARCHAR2(7)   
TEMPORARY           VARCHAR2(1)   
GENERATED           VARCHAR2(1)   
SECONDARY           VARCHAR2(1)   
NAMESPACE           NUMBER        
EDITION_NAME        VARCHAR2(30)  
It might be wise to give the USER_OBJECTS view a good look and play around with it some more to understand it properly. It is, for example, not limited to tables only; it shows all object types. You can find out what kinds of objects are used within your schema by executing the below query, which gives you a list of the object types present in USER_OBJECTS.

SELECT 
      DISTINCT(object_type)
FROM 
    user_objects 
ORDER BY 
        object_type
Back to the question of the "HELP" table. You know, for example, that you are looking for a table, so you can filter on object_type to show only table objects. Secondly, you know that it most likely has "HELP" in its name, so you can filter for all objects having "HELP" as part of the object_name. Finally, you know it was created eleven or more months ago, so you can use the CREATED date field as a filter. As you can see, this gives you just a few more options than using CAT in your Oracle database.
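Putting those filters together, a query along the lines of the one below should do the trick. Do note that the exact name pattern and the eleven month cut-off are of course just assumptions for this example; adjust them to whatever your colleague remembers.

SELECT
      object_name,
      object_type,
      created,
      status
FROM
    user_objects
WHERE
     object_type = 'TABLE'
AND  object_name LIKE '%HELP%'
AND  created < ADD_MONTHS(SYSDATE, -11)
ORDER BY
        created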

Wednesday, May 23, 2012

Children will be the early adopters of 3D printing


Supermarkets are currently using children and the gathering behavior of humans to lure customers into their shops. By giving the customer a collectable with every $20 of groceries for the children to collect, they create an emotional buy-in. Children collect them, they swap them at school, and you are not considered cool and part of the "inner crowd" when you cannot share in the fun of collecting and swapping. Parents always want their children to fit in, and for this reason they might even change their usual supermarket for another when their children are begging them for the latest collectable toys they get for free.

Looking at the way 3D printing is going, we might see very affordable 3D printers in the near future. You can feed your 3D printer a model and it will print it for you. If we combine this trend with the toys children get, we will see in the near future that children will not get the toy itself but instead a QR code corresponding to a one-time print job for a 3D printer. Once scanned, the model will be downloaded via a one-time download link and printed directly. After that the QR code will become invalid.

Not giving away toys but giving away the model will give supermarkets and other vendors a whole new business model. It will also enable supermarkets to adopt this strategy more widely, because the only investment will be a downloadable 3D model and a QR code printed on a small coupon. The production and shipping of the toys is no longer at the cost of the supermarket but rather of the parents, who will bring the QR code to their children and who will have to pay for the raw material for the 3D printer.

As soon as 3D printing becomes only a little bit more mainstream we will see that companies adopt this and, by doing so, promote the purchase of 3D printers. Even though 3D printing is promising, it is currently in its first stages of development. However, we will see this picked up in the coming years, and not only merchandise will be printed; in the near future the business model of the "make industry" will change and we will see a lot more products which you can download from the net and print yourself.

While this is considered a good deal for some, it will also have an impact on other parts of the market and the industry. The make industry will notice this first in the products which are produced cheaply in low-wage countries. The merchandise industry and the low-end consumer products will see a shift from making them somewhere in China to people downloading and printing them themselves. This has an impact not only on the production part of the chain but also on the logistical part, as the products no longer need to be shipped.

This is something that will reshape the make industry, and we will see a lot of changes in what are currently standard ways of doing business. However, the road to adoption is via children and the merchandise from supermarkets.

Friday, May 18, 2012

Oracle NoSQL configuration

The Oracle NoSQL database is a very simple to deploy NoSQL key-value store which requires almost no setup. However, keep the "almost" part in mind: there are some things that you have to configure. You can configure this via the command line or by deploying it with a correct config.xml file. If you have not configured the NoSQL database and/or did not deploy a correct config.xml file, you will end up with an error message like the one below. What you can see in this error message is that we are trying to start the NoSQL KVStore and that it is looking for the config.xml file but is unable to locate it.


05-05-12 12:20:16:09 CEST INFO [snaService] Starting, configuration file: /home/nosql/kv-1.2.123/config.xml
05-05-12 12:20:16:42 CEST SEVERE [snaService] Failed to start SNA: IOException parsing file: /home/nosql/kv-1.2.123/config.xml: java.io.FileNotFoundException: /home/nosql/kv-1.2.123/config.xml (No such file or directory)
java.lang.IllegalStateException: IOException parsing file: /home/nosql/kv-1.2.123/config.xml: java.io.FileNotFoundException: /home/nosql/kv-1.2.123/config.xml (No such file or directory)
    at oracle.kv.impl.param.LoadParameters.load(LoadParameters.java:181)
    at oracle.kv.impl.param.LoadParameters.getParameters(LoadParameters.java:64)
    at oracle.kv.impl.util.ConfigUtils.getBootstrapParams(ConfigUtils.java:81)
    at oracle.kv.impl.sna.StorageNodeAgent.start(StorageNodeAgent.java:301)
    at oracle.kv.impl.sna.StorageNodeAgentImpl.main(StorageNodeAgentImpl.java:704)
    at oracle.kv.impl.util.KVStoreMain$3.run(KVStoreMain.java:139)
    at oracle.kv.impl.util.KVStoreMain.main(KVStoreMain.java:319)


This means that the issue can be resolved quite easily by deploying a config.xml file. You can create and deploy one using something like vi. The file should look like the one below:

<config version="1">
  <component name="bootstrapParams" type="bootstrapParams">
    <property name="hostingAdmin" value="false" type="BOOLEAN"/>
    <property name="adminHttpPort" value="5001" type="INT"/>
    <property name="storageNodeId" value="0" type="INT"/>
    <property name="hostname" value="nosql0.terminalcultexample.org" type="STRING"/>
    <property name="registryPort" value="5000" type="INT"/>
    <property name="haPortRange" value="5010,5020" type="STRING"/>
  </component>
</config>


If you are not that comfortable with building your own config.xml you can use the makebootconfig command to build the XML file for you, as shown in the example below:

java -jar ./lib/kvstore-1.2.123.jar makebootconfig -root /home/nosql/kv-1.2.123/ -port 5000 -admin 5001 -host nosql0.terminalcultexample.org -harange 5010,5020
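Once the config.xml is in place (handcrafted or generated by makebootconfig) you should be able to start the storage node agent again. Something along the lines of the below command should do it, where the -root directory is simply the directory holding the config.xml; do check the Oracle NoSQL documentation of your release for the exact syntax.

nohup java -jar ./lib/kvstore-1.2.123.jar start -root /home/nosql/kv-1.2.123/ &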

Thursday, April 19, 2012

NoSQL graph database


When talking about big data most people do start to get a general idea of what big data is. The industry and the market are still trying to create a single picture of what big data is. However, this will be very hard, as we will be unable to state that data is big data from a certain volume onwards. Oracle is trying to give some guidelines; other companies are trying to give some guidelines as well. I personally think it will not really be possible to give big data hard borders to be defined by, and I see that as a good thing. However, one of the bad things is that a lot of people are talking about it and only a limited number of them understand the bigger picture and the technological components of it.

If we take, for example, NoSQL, which is a technology component in most big data solutions, people tend to think it is one type of database; however, several companies (and open source teams) build their own "version" of it. NoSQL is not a single type of database; it is a collective name for a large number of database types that share some common ground. The nosqltapes.com project is trying to collect a number of interviews on what NoSQL is. In the below video a high level discussion is shown where graph databases are explained, and why and when it would be good to use a graph database.



Tuesday, April 17, 2012

The legal side of the cloud


Cloud computing is one of the things that is stated to be a game changing revolution in IT. Depending on your definition of cloud computing this can indeed be true. Without any doubt, with any form of cloud computing, people (and companies) are more empowered to start something new at a very low cost. If you had to start an IT project a couple of years ago you were in most cases in need of hardware to develop and run your solution on. If you were trying to start a new business you needed at least some investment in hardware to get your startup going. In both cases it would mean a huge upfront investment. With cloud computing, and namely with cloud hosting, you can now order a number of servers at a relatively low cost with only your credit card. You do not need to buy your own hardware, install it, maintain it and host it. It will simply be done by one of the cloud hosting vendors.

This means that, for example, if you are developing some reporting solutions at a departmental level, you no longer have to go to your IT department; you can simply order one or more servers at a cloud vendor, start developing what you need and start using it. This sounds like a promising move and like a way to start a lot of innovative new projects.

I personally love the idea that you can start a project that easily and that you can order servers that easily without the need for a large upfront investment. I also believe it will help startup companies to really get started and it will help businesses to move away from the sometimes difficult IT domain and help them focus on their day to day business. So in general I am a big fan of cloud computing; however, there is also another side to the story.

When dealing with data you always have to keep the security of your data in mind. If you would like to create an analysis tool for your stock levels, you can put this on a server in the cloud and do your computations there. A couple of things to keep in mind are: how valuable is your data, can you run your business if it is down, how secure is the connection to the cloud and how secure is the solution you will deploy on this cloud server? These are things that are often overlooked. Big cloud vendors like Amazon are not very keen on providing you with an SLA, which means that if they are down, they are down. So you have to think about what will happen if it is not available. You also have to consider how valuable this data is and whether it could leak to the outside world. And one point very often overlooked is how secure the connection is that you use to upload the data to the cloud and to retrieve the computational results.

In the case of stock levels this is not even that hard; however, as soon as you start talking about customer data you have to consider that this is even more confidential. In some countries there are laws that state that you have to protect this data and that you are bound by certain rules and regulations for security. And to take the next step, in some cases the law will state that you simply cannot put it in the cloud that easily.

Most cloud vendors are currently located within the US and are therefore under US law. This means that the US government can demand access to your data without you even knowing it, by making use of the Patriot Act. The Patriot Act is on a collision course with some other laws which might apply to the country where your company is located. If you, for example, are located in Europe, you will have to take the Data Protection Act into consideration. If you have data that has to comply with the Data Protection Act you cannot make use of systems, and cloud solutions, that fall under US law. Most companies think they do not have to comply with the Data Protection Act in Europe; however, you have to comply quite quickly if you have some private data of customers and citizens in your system. When dealing with government data you almost always have to comply with this.

More and more countries are realizing that data placed in the cloud and physically located within the US, or hosted by a company outside the US whose highest legal entity is a US based company, is subject to the Patriot Act. The Patriot Act clearly states that the US government can gain access to this data without informing the owner of the data. To protect vital parts of their infrastructure and to ensure the security and privacy of their citizens, countries are now introducing laws to prevent data from moving outside the EU or even outside the country. Some Scandinavian countries have already stated that government data cannot be placed on servers based in the United States, and recently a political flame war has erupted between the United States and Australia.

"The United States' global trade representative has strongly criticized a perceived preference on the part of large Australian organizations for hosting their data on-shore in Australia, claiming it created a significant trade barrier for U.S. technology firms. A number of U.S. companies had expressed concerns that various departments in the Australian Government, namely the Department of Defence had been sending negative messages about cloud providers based outside the country, implying that 'hosting data overseas, including in the United States, by definition entails greater risk and unduly exposes consumers to their data being scrutinized by foreign governments.' Recently, Acting Victorian Privacy Commissioner Anthony Bendall highlighted some of the privacy concerns with cloud computing, particularly in its use by the local government. He said the main problems were the lack of control over stored data and privacy, in overseas cloud service providers."
You can read more on the current way of thinking in Australia at delimiter.com.au

In my opinion it is good that companies and politicians are thinking about what the cloud can mean for the security of citizens, the privacy of citizens and even the security of countries themselves. Cloud can be a good thing, it is a good thing, and it will help innovation; however, when using a cloud vendor it is good to take some security and privacy points into consideration and not simply deploy your application wherever you like for the lowest price.


For a first impression of how the situation in the world currently is and where your data is the most secure, you can check the Forrester website. Forrester launched an interactive website where you can obtain more information.

Tuesday, March 13, 2012

Solved: Oracle NoSQL java.net.NoRouteToHostException

When you are installing Oracle NoSQL on an Oracle Linux machine you can follow the Oracle guide, which will run you through the simple installation process. There are however some things to keep in mind. One of the steps is to check if your key-value store database is up and running. When you use a default Oracle Linux installation you will most likely succeed when you do a ping to the same host that you are working on. The issue however starts to occur when you try to do a ping to another machine.

The following situation:
nosql0.exampledomain.com -- 192.168.1.80
nosql1.exampledomain.com -- 192.168.1.81

When you are on nosql0 and execute the below command you will have a positive result.

java -jar ./lib/kvstore-1.2.123.jar ping -port 5000 -host nosql0.exampledomain.com

If you are on nosql1 and you execute the below command you will also get a positive result.
java -jar ./lib/kvstore-1.2.123.jar ping -port 5000 -host nosql1.exampledomain.com 

However, if you are on nosql0 and try to ping nosql1 with the below command you will get an error.
java -jar ./lib/kvstore-1.2.123.jar ping -port 5000 -host nosql1.exampledomain.com  

If you use the regular ping command you can ping the other server (if not, you have another network issue) and you can set up an SSH session; however, the error message will state that you do not have a route to the mentioned host. The error message will look something like the one below:

[nosql@nosql0 kv-1.2.123]$ java -jar ./lib/kvstore-1.2.123.jar ping -port 5000 -host nosql1.terminalcultexample.org
Exception in thread "main" java.rmi.ConnectIOException: Exception creating connection to: nosql1.terminalcultexample.org; nested exception is:
        java.net.NoRouteToHostException: No route to host
        at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:632)
        at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
        at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
        at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:340)
        at sun.rmi.registry.RegistryImpl_Stub.list(Unknown Source)
        at oracle.kv.util.Ping.getTopology(Ping.java:332)
        at oracle.kv.util.Ping.main(Ping.java:104)
        at oracle.kv.impl.util.KVStoreMain$8.run(KVStoreMain.java:218)
        at oracle.kv.impl.util.KVStoreMain.main(KVStoreMain.java:319)
Caused by: java.net.NoRouteToHostException: No route to host
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
        at java.net.Socket.connect(Socket.java:546)
        at java.net.Socket.connect(Socket.java:495)
        at java.net.Socket.(Socket.java:392)
        at java.net.Socket.(Socket.java:206)
        at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
        at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:146)
        at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
        ... 8 more
[nosql@nosql0 kv-1.2.123]$

As it turns out, by default Oracle Linux will have iptables configured and this will block your connection to port 5000. You can check your iptables settings by issuing the following command: iptables -L -n


This will give you something like the below:
[root@nosql1 init.d]# iptables -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           state NEW tcp dpt:22
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
[root@nosql1 init.d]#

What you need to do is configure iptables to allow network traffic to port 5000, or disable iptables. Disabling iptables is never a smart move; however, you can opt for it in some cases.
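If you want to keep iptables enabled you can open up just the ports that are needed. The rules below are a minimal sketch for the registry port 5000 used in this example; if you also configured an admin port and a harange (5001 and 5010-5020 in the config.xml example elsewhere on this blog) you will want to open those as well.

# allow the KVStore registry port
iptables -I INPUT -p tcp -m tcp --dport 5000 -j ACCEPT
# allow the admin port and the HA port range if you use them
iptables -I INPUT -p tcp -m tcp --dport 5001 -j ACCEPT
iptables -I INPUT -p tcp -m tcp --dport 5010:5020 -j ACCEPT
# make the rules survive a reboot on Oracle Linux / Red Hat style systems
service iptables save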

Monday, March 12, 2012

Duqu trojan payback for decommissioning old IBM systems


Our society is more and more dependent on computers. Financial transactions are mostly done via computer systems, industries depend on them, and armed forces are mostly blind and without information if the IT systems supporting them stop working. We do have to worry about solar storms knocking out most of our modern day communication channels and systems; however, somewhat closer to earth we also have some things to worry about.

One of the threats we have to worry about, and which comes from our own planet, is the fact that criminals, and not only criminals but also the military and secret services of countries, are working on very advanced technology to knock out systems, break into them or cause other disruption and theft. We are not talking about hackers, as I still have a mindset in which I see hackers as the good guys who play intelligent games that sometimes just reach over the line of the officially legal. In my opinion hackers are still the good guys.

It is the people who turn to the true dark side and sell their craft to wealthy criminal organizations. Some very gifted developers and computer scientists go for the big bucks and do not care about what they develop and with what intention it is developed.

The second group is the group of computer scientists who sell their craft to governments, in the form of working for an army or secret service. For this group it is somewhat more debatable whether they go for the monetary pleasure or not. It is without any doubt that governments are willing to pay top dollar for gifted developers and computer scientists; however, we have to keep in mind that one man's terrorist is another man's freedom fighter. We can state that we agree or disagree with some of the thoughts of other governments, however I do think that this group is less dollar driven.

That a lot of money is paid to developers to develop virus code and tools to cause mayhem is shown again by the people from the Kaspersky security lab. The Kaspersky lab is currently trying to find out how the new Duqu Trojan was developed and how it works. What they have found up until now is what it is doing and how it is communicating. The scary part of this Trojan, however, is that part of it is developed in a language that we do not recognize. It has been tested to see if it was developed in C++, Objective C, Java, Python, Ada, Lua, or any other languages; however, all tests are currently negative.

Developing a new programming language is a very long and costly process and needs very experienced developers. Developing a new programming language needs wealthy backing in the form of a government or a very wealthy criminal syndicate. However, there is another option: that it is not a new language but a very old one. Some people claim that it might be the result of an old IBM compiler used on OS400 SYS38 and the oldest SYS36 systems.

That code looks familiar:
The code your referring to .. the unknown c++ looks like the older IBM compilers found in OS400 SYS38 and the oldest sys36.




The C++ code was used to write the tcp/ip stack for the operating system and all of the communications. The protocols used were the following x.21(async) all modes, Sync SDLC, x.25 Vbiss5 10 15 and 25. CICS. RSR232. This was a very small and powerful communications framework. The IBM system 36 had only 300MB hard drive and one megabyte of memory,the operating system came on diskettes.


This would be very useful in this virus. It can track and monitor all types of communications. It can connect to everything and anything.

Some parts of the current Duqu Framework are "simple" C++ code; however, some parts are written in the unknown code which might be related to the above quote from As400tech (who, looking at his knowledge and his name, would be a very experienced AS400 developer). If this turns out to be true it could mean that the developer of this part of the Trojan is an experienced AS400 developer. As we see companies decommissioning AS400 systems daily, leaving an entire community of AS400 developers behind without a proper job, this could mean a large group of people comes to the market that is potentially very interesting to governments, secret services and criminal syndicates. Whoever stated that AS400 developers were out of the market was apparently wrong.

However, it is only the thought of one person and not necessarily correct. Some people think it is coded in low-level assembly code. This would mean that someone has taken the task upon himself to write all the assembly code by hand instead of using a compiler to build it into machine language. Whoever created the Duqu Trojan (and Stuxnet) must have been a very good programmer or a team of good programmers (in my humble opinion).

You can condemn the writing of such code from an ethical and moral point of view, or you can agree with it; in any case, whatever your point of view on this is, you have to admire the craftsmanship of the developer.

Friday, March 09, 2012

When to use Hadoop

Hadoop is one of the big players in big data and can be seen as one of the main engines running the big data machine. We however still do not have a clear picture of what big data is. We do have some definitions of when we call a lot of data big data; however, giving it a hard number has not been done up until now and will most likely never be done. I already zoomed in on this definition question in the "Map reduce into relation of Big Data and Oracle" post on this blog. A number of key components determine if data is big data, to name them: the volume of the data, the velocity with which the data grows, the variety of sources which add to the volume of the data and the value it can "potentially" hold. These factors can help you decide when data is big data.

Then we have the question of when data (even big data) can still be handled in a standard relational database and by a "standard" approach. There are some guidelines that can help you. Please do note this is a comparison primarily for processing data in a relational database or in Hadoop; it is not about storing data.

                RDBMS                       Hadoop / MapReduce
Data size       Gigabytes                   Petabytes
Access          Interactive and batch       Batch
Structure       Fixed schema                Unstructured schema
Language        SQL                         Procedural (Java, C++, Ruby, etc.)
Integrity       High                        Low
Scaling         Nonlinear                   Linear
Updates         Read and write              Write once, read many times
Latency         Low                         High

Taking this into consideration when you are struggling with the question whether you need a MapReduce approach or an RDBMS approach might make your decision a little easier.

Friday, February 24, 2012

Setup Cloudera Hadoop in combination with Oracle virtualization

One of the things Cloudera is promoting is that they have a very easy to use and easy to start implementation of Apache Hadoop. If you check the Cloudera website you have a download section where you can download CDH3.

"CDH consists of 100% open source Apache Hadoop plus nine other open source projects from the Hadoop ecosystem. CDH is thoroughly tested and certified to integrate with the widest range of operating systems and hardware, databases and data warehouses, and business intelligence and ETL systems."

You can deploy it in several ways, and the easiest one for people who are starting to test with Cloudera and Apache Hadoop is to use one of the pre-created virtual machines. Currently they are available for KVM, VMware and Oracle VirtualBox. Below is a very quick step by step guide on how you can start using the downloaded Cloudera distribution within Oracle VirtualBox. The reason for this: there are some guides for "old" versions of VirtualBox, and when I refer someone to a step by step guide I would like that guide to be accurate.

When you have downloaded the Cloudera distribution you will need to unpack the downloaded .tar.gz file as you normally would and store the resulting .vmdk file (probably named cloudera-demo-vm.vmdk) at the location where you normally save your virtual machines.

Step 1:
Start VirtualBox and click the "new" button to start the creation of a new virtual machine. 

Step 2:
Give your new, to be created, virtual machine a name. In our case this was Cloudera_0. You have to select an operating system and a version. In the screenshot below you see I have selected Debian 64-bit; this however is wrong. It works, however the distribution officially used by Cloudera in this release is CentOS 5.7 64-bit using kernel version 2.6.18-274.17.1.el5.

Step 3:
You have to state the amount of memory. Cloudera claims you can run the system with 1 GB, however they recommend at least 2 GB to be able to start everything properly. In the below screenshot you can see I am using 2048 MB; however, I did double that after playing with the system for some time, as more memory is quite convenient.

Step 4:
Now it is time to select your hard disk. For this you have to select the .vmdk file. Within this file is the complete Cloudera distribution with Apache Hadoop. There is no need to create a new disk.

Step 5:
Now you will see the final results and when you select create your virtual machine will be created.

Step 6:
Your virtual machine is created; when you select the newly created Cloudera virtual machine and start it you will see the system boot, and within no time you will have your first Cloudera instance up and running.
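As a side note, for people who prefer the command line over the wizard, roughly the same steps can be scripted with the VBoxManage tool that ships with VirtualBox. The VM name, memory size and .vmdk path below are just the values used in this walkthrough, so treat it as a sketch rather than the official procedure.

# create and register the VM (RedHat_64 matches the CentOS 5.7 64-bit guest)
VBoxManage createvm --name Cloudera_0 --ostype RedHat_64 --register
# give it the recommended 2048 MB of memory
VBoxManage modifyvm Cloudera_0 --memory 2048
# attach the downloaded Cloudera disk image instead of creating a new disk
VBoxManage storagectl Cloudera_0 --name "IDE Controller" --add ide
VBoxManage storageattach Cloudera_0 --storagectl "IDE Controller" \
    --port 0 --device 0 --type hdd --medium /path/to/cloudera-demo-vm.vmdk
# boot the virtual machine
VBoxManage startvm Cloudera_0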

Thursday, February 23, 2012

Oracle Enterprise Manager patch advisory

A lot of software companies push patch advisories to their customers in the form of a popup telling them that a new patch has been released and that it would be good if they installed it. Oracle traditionally did not do that; however, already since one of the first releases of Oracle Enterprise Manager you can connect your Oracle Enterprise Manager installation to the My Oracle Support website, and there you will automatically get information about which patches are available for you and which you can install.

For good reasons some companies do not allow their Oracle Enterprise Manager to connect to the outside world. You do connect out of your comfortable secure environment, and with every link to the outside world you potentially create a security issue. Even though it is very unlikely, it could potentially be a security threat, and if you are hosting confidential and/or highly valuable data it is your responsibility to guard this in every way possible.

The other side of the coin is that having a proper patch management strategy in place is also a very important part of your security. If you have a large estate of Oracle products it is almost not humanly possible to keep up with all the patches and patch advisories, so you do want an automated patch advisory system. This decision has to be made in your organisation with security as one of the main questions on the table.


Above you can see a screenshot of an 11gR1 patch advisory for a database installation, taken from an Oracle manual.

The Patch Advisor in Enterprise Manager describes critical software patches for your installed Oracle products. To help ensure a secure and reliable configuration, all relevant and current critical patches should be applied.

The Patch Advisor provides support for Remedies. When you select an advisory, you can view the calculated remedies from the context of that Advisory, as well as the affected Oracle homes.

The Patch Advisor also displays a list of available patches and patch sets for your installation, along with the name of the feature that is impacted. You can choose to display only patches for features that are used by your database, or all available patches.

Wednesday, February 22, 2012

LinkedIn buzzwords

Most people have created a resume at some point in their career when they were looking for a new job. The interesting part, and the interesting question when creating a resume or letter of recommendation, is always: how do I stand out from the others? People have been looking at ways to phrase and re-phrase parts of their resume just to be able to stand out. When all the resumes were looked into, the person looking into them could find out (if he bothered to do so) what the most popular phrases (buzzwords) were that were used. This however would require some manual "computation" on paper resumes.

Nowadays most people have a LinkedIn account, so we could potentially do a data mining action on all the digital LinkedIn profiles and find out what the buzzwords of today are when it comes to describing your career and yourself as a professional. LinkedIn has done this in 2010 and now also for 2011, with the following results for the United States:
  
It turns out that the top 10 buzzwords in the United States over 2011 used on LinkedIn were: Creative, Organizational, Effective, Extensive Experience, Track Record, Motivated, Innovative, Problem Solving, Communication Skills and Dynamic.

Also included in the post on the LinkedIn blog is an infographic showing the top buzzwords used on LinkedIn globally.

For all people looking for a job or updating their LinkedIn profile the big question now is: if I use those words, will it help me or not? Will I stand out if I do not use them, or do future employers expect me to use those words?

Oracle Big Data approach

In a previous post I already zoomed in on the way Oracle is thinking about big data. In the post Map reduce into relation of Big Data and Oracle there was an outline of how Oracle defines big data and how they intend to use MapReduce and Hadoop in their approach to handling big data. As you might know, Oracle has launched a big data appliance which integrates and makes use of a couple of the important big data components currently in use. The Oracle Big Data Appliance provides you with an out of the box working solution where the supplier has engineered all the components, like all the other solutions in the Oracle Exa- stack. Or, as Oracle likes to state, "hardware and software engineered to work together".


As you can see in the above diagram, the Oracle Big Data Appliance makes use of some known and important components. The decision was made to run the entire system on Oracle Linux; an option would have been to run it on Solaris, however due to the wide adoption of Oracle Linux and the fact that the majority of Hadoop solutions focus primarily on Linux and not on Solaris it is running on Linux (assumption from my side).

For the rest we see the Oracle NoSQL database as an integrated part of the appliance, which is also not a big surprise as Oracle is pushing its NoSQL solution into the market to gain market share in the NoSQL market. Looking at the Oracle NoSQL solution, they do quite a good job and have launched a good NoSQL product with a lot of potential.

As we are talking about big data, Hadoop is part of this appliance, and this comes as no surprise. What also comes as no surprise, but is very good to see, is the integration in this appliance with the Oracle Loader for Hadoop and the Oracle Data Integrator.

Oracle Loader for Hadoop:
"Oracle Loader for Hadoop is a MapReduce utility to optimize data loading from Hadoop into Oracle Database. Oracle Loader for Hadoop sorts, partitions, and converts data into Oracle Database formats in Hadoop, then loads the converted data into the database.  By preprocessing the data to be loaded as a Hadoop job on a Hadoop cluster, Oracle Loader for Hadoop dramatically reduces the CPU and IO utilization on the database commonly seen when ingesting data from Hadoop. An added benefit of presorting data is faster index creation on the data once in the database."

Oracle Data Integrator:
"Oracle Data Integration provides a fully unified solution for building, deploying, and managing real-time data-centric architectures in an SOA, BI, and data warehouse environment. In addition, it combines all the elements of data integration—real-time data movement, transformation, synchronization, data quality, data management, and data services—to ensure that information is timely, accurate, and consistent across complex systems."

The Big Data Appliance fits into the overall Exa- strategy from Oracle, where they are delivering appliances, and it also fits into the overall big data strategy.


As you can see, a lot of the steps in the acquire and organize stages of the big data approach from Oracle are covered by the Big Data Appliance.

Tuesday, February 21, 2012

State Of The Social Media Agency

The people at socialfresh.com launched their "invest in social" website in 2011 as a listing and search engine for social companies. The hope of socialfresh was, and is, to be able to find all social companies and show what they are working on.

Today there are over 920 social media companies listed in the directory; 555 of those companies are agencies. It is very interesting to see which companies start in this field, who is working on what and how employees of companies are using social. For companies who are looking into ways of interacting more in a social (media) way it can be very interesting to look at other companies and at the companies who provide services in this field.

The below infographic was created by socialfresh to show a breakdown of what they have found since the start of the "invest in social" website.

Thursday, February 16, 2012

The online social side of food

The only reason I am not a big user of Foodspotting is simply that I do not visit restaurants as much as I would want to. That said, I do use the Foodspotting app, as it makes food more social from an online perspective. Foodspotting is becoming the Foursquare for food. I personally think Foursquare missed their target here; they could have kept Foodspotting out of the game, however they failed at it. From a user perspective this is not a negative thing, as Foodspotting is doing a great job.

Foodspotting is one of those startups started by real and true believers in the subject, and that is what you see in the final result: a great passion for developing a really cool and great product.



In the above video you can see an interview done by Robert Scoble, who is talking to the people behind Foodspotting. Foodspotting is one of the companies jumping onto the next OpenGraph from Facebook and who see the potential of these new options coming from Facebook.

Map reduce into relation of Big Data and Oracle

Everyone is talking about big data; we are still trying to define when data becomes big data, and we are just at the doorstep of understanding all the possibilities of what we can do with big data if we apply big analysis to it. Even though this field of (enterprise) IT is quite new, we see a lot of companies who are taking big data very seriously. For example, Oracle is taking this very seriously, as they are seen as the company which should be able to handle large sets of data. Oracle is teaming up with some of the big players in the market; for example, they are teaming up with Cloudera, which is one of the leading players in the Hadoop field.

As the data company, Oracle is spending a lot of time on thinking about big data and building products and solutions to work with big data. Meaning Oracle is trying to answer the question "how did data become big data", or to rephrase that question, "when is data big data". The answer Oracle came up with, and which was promoted by Tom Kyte, appears as this slide in their latest presentation.


Oracle states that big data can be defined based upon four criteria: it should have a certain volume, it should have a certain velocity (speed of data growth), variety (all kinds of sources and forms the data comes in) and value, as in the value the data has or the potential value it can have once you are able to extract the true value from it.

Extracting and unlocking the true value of your big data will take a lot of computing power, and for this you will need a superb compute infrastructure. We have the MapReduce solution, which was developed by Google and published a couple of years ago. In the below slide you can see how the MapReduce compute infrastructure / algorithm works. This is the MapReduce picture used by Tom Kyte during his presentation on big data.

MapReduce is a framework for processing highly distributable problems across huge datasets using a large number of computers (nodes), collectively referred to as a cluster (if all nodes use the same hardware) or a grid (if the nodes use different hardware). Computational processing can occur on data stored either in a filesystem (unstructured) or in a database (structured).

"Map" step: The master node takes the input, partitions it up into smaller sub-problems, and distributes them to worker nodes. A worker node may do this again in turn, leading to a multi-level tree structure. The worker node processes the smaller problem, and passes the answer back to its master node.

"Reduce" step: The master node then collects the answers to all the sub-problems and combines them in some way to form the output – the answer to the problem it was originally trying to solve.

MapReduce allows for distributed processing of the map and reduction operations. Provided each mapping operation is independent of the others, all maps can be performed in parallel – though in practice it is limited by the number of independent data sources and/or the number of CPUs near each source. Similarly, a set of 'reducers' can perform the reduction phase - provided all outputs of the map operation that share the same key are presented to the same reducer at the same time. While this process can often appear inefficient compared to algorithms that are more sequential, MapReduce can be applied to significantly larger datasets than "commodity" servers can handle – a large server farm can use MapReduce to sort a petabyte of data in only a few hours. The parallelism also offers some possibility of recovering from partial failure of servers or storage during the operation: if one mapper or reducer fails, the work can be rescheduled – assuming the input data is still available.
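To make the map and reduce steps a little more concrete, below is a minimal word count sketch using Hadoop streaming, where plain shell commands act as mapper and reducer. The jar location and the HDFS paths are assumptions and will differ per Hadoop distribution, so see it as an illustration of the map and reduce steps rather than a ready to run recipe.

# map step: split every input line into words, one word per output line (the key)
# reduce step: Hadoop sorts and groups by key, so "uniq -c" simply counts each word
hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-*.jar \
    -input /user/demo/books \
    -output /user/demo/wordcount \
    -mapper 'tr -s " " "\n"' \
    -reducer 'uniq -c'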



As Google is the company that came up with MapReduce, it might be good to check what Google has to say when they explain it. In the below video you can see a recording of the Google Developers Day 2008, where Google explains the MapReduce solution they had developed and were using internally.






MapReduce, and Hadoop, which is the primary open source MapReduce solution coming from the Apache foundation, fit the statement "the future of computing is parallelism", which in my opinion is still very valid. In that article we zoomed in more on parallelism; Hadoop and MapReduce talk about parallelism on a more massive scale, however in essence it is still valid and the same.