Yes, Hadoop is a success: over the last few years it has become the platform for parallel computation in Java.
But that is all it is to me, a leadership by default. I may have misled you: the title, “About the Hadoop success”, while probably true, is not praise for the Hadoop solution but introduces a short critical review of it:
- No serious competitor has emerged so far, hence the leadership by default. Competition is an essential requirement for software evolution, and Hadoop's evolution has wandered. It all started from Google’s MapReduce concept, a small and well-defined software concept, but today none of its users really understands the path it has taken.
- Not production ready? Well, stability and efficiency are still awaited by many. Version 1.0 was released and announced as a big event, but where is the maturity of a 1.0 commercial product?
- An unfriendly ecosystem is another recurring criticism. Looking at the “Hadoop ecosystem map”, one obvious remark pops up: complexity.
- Data management is inefficient. We all agree it’s easier to code queries in Hive than by using MapReduce directly. All Hadoop data management is moving to higher-level languages, i.e., to SQL and SQL-like languages, and Hadoop should push this move forward.
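To make the last point concrete, here is a minimal sketch of a word count written in the map-reduce style. It is plain Java with no Hadoop dependency, so it only imitates the map, shuffle and reduce phases in memory; the class name `WordCountSketch` and its methods are hypothetical names chosen for illustration. In a SQL-like language such as Hive, the same result would be roughly a one-line `GROUP BY` query over an assumed table `words`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory imitation of the map-reduce phases (not the Hadoop API).
// Roughly equivalent SQL-like query, assuming a table words(word):
//   SELECT word, COUNT(*) FROM words GROUP BY word;
public class WordCountSketch {

    // Map phase: emit one (word, 1) pair per word found in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by their key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: run the three phases in sequence over the input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            counts.put(group.getKey(), reduce(group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {hadoop=3, hive=2, pig=1}
        System.out.println(run(List.of("hadoop hive pig", "hive hadoop", "hadoop")));
    }
}
```

Even in this toy form, three hand-written phases and a driver are needed for what a declarative query expresses in one statement, which is the gap the higher-level languages aim to close.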
In summary, the Gartner Group has formulated the well-known “hype cycle” to describe the evolution of a new technology from inception onward.
Hadoop is currently promoted as the “best thing since sliced bread” by its advocates.
We hope that its shortcomings can be fixed in time for it to live up to its promise.
- How did it all start: huge data on the web!
- Nutch built to crawl this web data
- Huge data had to be saved: HDFS was born!
- How to use this data?
- MapReduce framework built for coding and running analytics (Java, or any language through streaming/pipes)
- How to import unstructured data (web logs, click streams): FUSE, WebDAV, Chukwa, Flume, Scribe
- HIHO and Sqoop for loading data into HDFS: RDBMSs can join the Hadoop bandwagon!
- High-level interfaces required over low-level MapReduce programming: Pig, Hive, Jaql
- BI tools with advanced UI reporting (drill-down, etc.): Intellicus
- Workflow tools over MapReduce processes and high-level languages
- Monitor and manage Hadoop, run jobs/Hive, view HDFS (high-level view): Hue, Karmasphere, Eclipse plugin, Cacti, Ganglia
- Support frameworks: Avro (serialization), ZooKeeper (coordination)
- More high-level interfaces/uses: Mahout, Elastic MapReduce
- OLTP also possible: HBase