Red Hat outlined its big data strategy today. The company announced that it will contribute its Storage Hadoop plug-in to the Apache Hadoop open source community as part of that strategy. Red Hat is focusing heavily on enterprise customers' infrastructures and platforms in open hybrid cloud environments.
The Red Hat Storage Hadoop plug-in provides compatibility with Apache Hadoop, the popular big data framework. Ranga Rangachari, VP and GM of Red Hat's storage business unit, said that opening the product to the community will help transform Red Hat Storage into a highly robust, Hadoop-compatible file system for big data. In a webcast, Rangachari said, “The Apache community is very significant. The community is the center of gravity for Hadoop development.”
He went on to explain the company's big data strategy, which targets enterprise customers suited to open hybrid cloud environments. He said the company is building a network-based ecosystem in which enterprise integrator partners deliver its big data products to enterprise customers.
Red Hat is working on a commercial OpenStack cloud distribution and has also built its own OpenShift platform-as-a-service cloud from various open source projects. The company has acquired several existing products along the way, producing a mash-up of acquired and home-grown code. Among those acquisitions was Gluster, which gave Red Hat a clustered file system that runs on x86 servers and can serve cloud compute environments and, eventually, Hadoop MapReduce workloads.
Red Hat plans to tell its customers that they can eventually drop HDFS and move to Red Hat's Storage Server instead. The company argues its solution is more reliable and scalable than HDFS. It also sidesteps the HDFS NameNode problem, since GlusterFS distributes metadata across the cluster rather than relying on a single metadata server.
The Red Hat Storage Server runs on Linux-based x86 servers with SATA/SAS drives, which can be arranged in RAID configurations for drive protection. The clustered file system, GlusterFS, then rides on top of local file systems such as ext3, ext4, and XFS, aggregating them and presenting a single global namespace to the clients that access the cluster.
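To make the aggregation concrete, here is a minimal sketch of how a GlusterFS volume is typically assembled from local file systems and mounted as one namespace. The host names, brick paths, and volume name are hypothetical, and the commands assume the gluster daemon is already running on both servers:

```shell
# Each brick is just a directory on a local file system (e.g. XFS or ext4).
# Aggregate two bricks on two servers into one replicated volume.
gluster volume create bigdata-vol replica 2 \
    server1:/bricks/brick1 server2:/bricks/brick1
gluster volume start bigdata-vol

# A client then mounts the volume and sees a single global namespace,
# regardless of which server actually holds a given file.
mount -t glusterfs server1:/bigdata-vol /mnt/bigdata
```

The point of the design is that the local file systems stay ordinary and inspectable; GlusterFS only layers distribution and replication on top of them.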
Companies looking to virtualize Hadoop and other big data environments can use Red Hat's solutions for added flexibility in the long run. The company is also working on a Hive connector for its JBoss middleware. Hive, the data warehousing system that rides on top of HDFS, lets users run SQL-like queries against data stored in HDFS; GlusterFS, in turn, presents itself to Hadoop as an HDFS-compatible file system.
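As a sketch of what "SQL-like queries" means in practice, the Hive CLI can run HiveQL directly from the shell. The table and column names below are hypothetical, and the example assumes a working Hive installation pointed at the underlying store:

```shell
# HiveQL looks like SQL but compiles down to MapReduce jobs over the
# underlying file system (HDFS, or GlusterFS presenting itself as HDFS).
hive -e "
  SELECT url, COUNT(*) AS hits
  FROM page_views
  GROUP BY url
  ORDER BY hits DESC
  LIMIT 10;
"
```

Because Hive only talks to the Hadoop file system API, a query like this does not need to change when the storage layer beneath it is swapped.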
Red Hat's strategy shows how the enterprise software company is focused on building a mainstream, stable set of tools for corporations. The company is well positioned to make the most of big data technology and to give customers solutions that just work. It will be interesting to see how it executes this strategy in 2013.