Cloudera packages the Apache stack as a single integrated distribution that includes Hadoop, Sqoop, Pig, Hive, HBase, ZooKeeper, Oozie, Hue, Flume, and Whirr. The Cloudera VirtualBox demo gives you the whole platform, configured and ready to experiment with, in less than 5 minutes.
Making it easy for users to experiment with these tools increases the chances for adoption.
Data design is fundamental, and NoSQL databases break with the normal forms of relational database normalization, so you need to review and explore the recommended ways to model your data with your NoSQL solution.
10gen’s Richard Kreuter offers a nice presentation on MongoDB schema design, “Schema Design Basics: White Board Session”.
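As a rough illustration of what breaking with normalization can mean in a document store, the sketch below folds a normalized pair of record sets into one document with embedded children. It uses plain Python dictionaries only; the post and comment fields and the embed_comments helper are hypothetical and are not taken from the presentation.

```python
# Normalized (relational style): posts and comments are kept apart
# and linked by a key, so reading a post with its comments needs a join.
posts = [{"post_id": 1, "title": "Bike for sale"}]
comments = [
    {"comment_id": 10, "post_id": 1, "text": "Still available?"},
    {"comment_id": 11, "post_id": 1, "text": "What size is the frame?"},
]


def embed_comments(post, all_comments):
    """Return a copy of the post with its comments embedded (hypothetical helper)."""
    doc = dict(post)  # copy so the normalized record is left untouched
    doc["comments"] = [
        {"text": c["text"]}
        for c in all_comments
        if c["post_id"] == post["post_id"]
    ]
    return doc


# Denormalized (document style): one read returns the post and its comments.
post_document = embed_comments(posts[0], comments)
print(post_document)
```

Whether embedding or referencing is the better choice depends on how the data is read and updated, which is exactly the kind of trade-off worth reviewing before committing to a schema.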
Jonathan Ellis on Cassandra, DataStax
Apache Cassandra project chair Jonathan Ellis talks about the Dynamo-Bigtable hybrid, from its origins at Facebook to the creation of support company DataStax (known as Riptano at the time). He also details the design tradeoffs and internals, and gives his take on high-profile Cassandra deployment stories.
MongoDB is now live at Craigslist, the popular classifieds and job posting community that serves 570 cities in 50 countries, where the NoSQL data store is being used to archive billions of records.
Every post in the history of the site was previously held in a large MySQL cluster. Craigslist had a variety of database needs moving forward, ranging from adding new machines without downtime to routing around dead machines without clients failing, so the development team decided to initiate a major migration to a NoSQL solution. MongoDB was the solution they chose.
Here are some basic numbers about the Craigslist MongoDB cluster from Jeremy Zawodny, one of the site’s software engineers:
We’re sizing the install for around 5 billion documents. That’s from the initial 2 billion document import we need to do plus room to grow for a few years to come. Average document size is right around 2KB. (Five billion 2KB documents is 10TB of data.) We’re getting our feet wet with MongoDB so this particular task isn’t high throughput or growing in unpredictable ways.
We can put data into MongoDB faster than we can get it out of MySQL during the migration.
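The sizing arithmetic in the quote can be checked with a short sketch. It assumes decimal units (1 KB = 1000 bytes, 1 TB = 10**12 bytes) and counts raw document data only; indexes, replication, and storage overhead are not included, and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumption: decimal units, 1 KB = 1000 bytes and 1 TB = 10**12 bytes.
AVG_DOC_BYTES = 2 * 1000              # "right around 2KB"
INITIAL_DOCUMENTS = 2_000_000_000     # initial import
TARGET_DOCUMENTS = 5_000_000_000      # sized capacity


def raw_data_terabytes(documents, avg_doc_bytes):
    """Raw data size in decimal terabytes, excluding indexes and replicas."""
    return documents * avg_doc_bytes / 10**12


initial_tb = raw_data_terabytes(INITIAL_DOCUMENTS, AVG_DOC_BYTES)
target_tb = raw_data_terabytes(TARGET_DOCUMENTS, AVG_DOC_BYTES)
print(f"initial import: {initial_tb:.0f} TB, full capacity: {target_tb:.0f} TB")
# prints "initial import: 4 TB, full capacity: 10 TB"
```

The 10 TB result matches the figure in the quote; the 4 TB for the initial import follows from the same assumptions.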
In this video, Zawodny explains the evolution of data storage at Craigslist, how MongoDB will fit into the future of the site’s infrastructure, and why Craigslist chose MongoDB over other data stores.
Recent history and current state of NoSQL: video discussion with @fastip founder
NoSQL Tapes Vol. 9: Benjamin Black on NoSQL, Cloud Computing & Fast IP
Benjamin Black shares his thoughts on NoSQL and cloud computing, and sheds some light on his new company, FASTIP.