New Search App in Hue 2.4

Hue 2.4 brings search to Hadoop. In this version you can search across Hadoop data just as you would run keyword searches with Google or Yahoo!. In addition, a wizard lets you tweak the result snippets and tailor the search experience to your needs.

The new Hue Search app uses the regular Solr API under the hood, yet adds a long list of UI features that make searching data stored in Hadoop a breeze. It also integrates with the other Hue apps, such as File Browser, so you can look at the index file in a few clicks.
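
Since the app sits on top of the regular Solr API, a keyword search ultimately becomes a request to a Solr select handler. The sketch below only builds such a request URL; it does not show how Hue itself calls Solr. The host, the collection name `twitter_demo`, and the field name `text` are assumptions made for illustration, while `q`, `rows`, `wt`, `hl`, and `hl.fl` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_solr_select_url(base_url, collection, keywords, rows=10, highlight_field=None):
    """Build a URL for Solr's standard /select request handler."""
    params = {"q": keywords, "rows": rows, "wt": "json"}
    if highlight_field:
        # Ask Solr to return highlighted snippets for the given field.
        params["hl"] = "true"
        params["hl.fl"] = highlight_field
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


# Hypothetical host, collection, and field names.
url = build_solr_select_url(
    "http://localhost:8983/solr", "twitter_demo", "hadoop search", highlight_field="text"
)
print(url)
```

Highlighting is the Solr feature closest to the "result snippets" mentioned above, which is why the sketch includes it as an option.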

Here’s a video demoing queries and results customization. The demo is based on Twitter Streaming data collected with Apache Flume and indexed in real time:

 

More information: http://cloudera.github.io/hue/

dotnetConf – Applied NoSQL in .NET

Live video from dotnetConf

Perhaps you’ve heard about the next generation of databases, roughly classified as NoSQL databases? These databases are generally much better than RDBMSs at scaling, performance, and ease of development (e.g., in NoSQL the object-relational impedance mismatch usually disappears). Unfortunately, many talks on NoSQL are very academic and general. Not this one. This session will introduce the ideas around the so-called NoSQL movement, and we’ll learn how to leverage MongoDB (a popular open source NoSQL database) to build .NET applications using LINQ as the data access language. We’ll build out a .NET application using LINQ and MongoDB in a series of interactive demos using Visual Studio 2012 and C#.

 

TV coverage of the internet in 1995

Processing data with Drake

Introducing ‘Drake’, a “Make for Data”

We call this tool Drake, and today we are excited to share Drake with the world, as an open source project. It is written in Clojure.

Drake is a text-based command-line data workflow tool that organizes command execution around data and its dependencies. Data processing steps are defined along with their inputs and outputs. Drake automatically resolves dependencies and provides a rich set of options for controlling the workflow. It supports multiple inputs and outputs and has built-in HDFS support.
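
To make the idea of resolving dependencies from declared inputs and outputs concrete, here is a minimal Python sketch. It is not Drake's syntax or implementation (Drake is written in Clojure and has its own workflow file format); the step names and file names are hypothetical. Each step declares what it reads and writes, and the run order is derived from those declarations.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical steps: each one lists the files it reads and the files it writes.
steps = {
    "report": {"inputs": ["clean.csv", "stats.json"], "outputs": ["report.txt"]},
    "stats": {"inputs": ["clean.csv"], "outputs": ["stats.json"]},
    "clean": {"inputs": ["raw.csv"], "outputs": ["clean.csv"]},
}


def execution_order(steps):
    """Order steps so that every step runs after the steps producing its inputs."""
    # Map each output file to the step that produces it.
    producer = {out: name for name, step in steps.items() for out in step["outputs"]}
    # A step depends on whichever steps produce its inputs; files with no
    # producer (such as raw.csv) are treated as already existing.
    graph = {
        name: {producer[path] for path in step["inputs"] if path in producer}
        for name, step in steps.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = execution_order(steps)
print(order)
```

A real tool in this style would also compare timestamps so that only steps with stale or missing outputs are rerun; the sketch leaves that out.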

We use Drake at Factual on various internal projects. It serves as a primary way to define, run, and manage data workflows. Some core benefits we’ve seen:
    • Non-programmers can run Drake and fully manage a workflow
    • Encourages repeatability of the overall data building process
    • Encourages consistent organization (e.g., where supporting scripts live, and how they’re run)
    • Precise control over steps (for more effective testing, debugging, etc.)
    • Unifies different tools in a single workflow (shell commands, Ruby, Python, Clojure, pushing data to production, etc.)

Drake official blog: 

http://blog.factual.com/introducing-drake-a-kind-of-make-for-data

 

RethinkDB screencast on ReQL query language

Watch a quick overview of the ReQL query language, sharding, replication, and more:

http://www.rethinkdb.com/docs/guides/quickstart/

Nine Databases in 45 Minutes

From http://www.datanami.com/datanami/2012-12-04/nine_databases_in_45_minutes.html

 

The following video provides a crash course in nine key databases: Postgres, CouchDB, MarkLogic, Riak, VoltDB, MongoDB, Neo4j, HBase and Redis. All in just 45 minutes.

Miles Pomeroy, Chad Maughan, and Jonathan Geddes run through the databases at five minutes each, hence the title, “9 Databases in 45 minutes.” The video proceeds in efficient and occasionally amusing fashion, with a countdown clock and a gong keeping the presentations terse.

TaDaWeb is automating what you do every day on the Internet

TaDaweb is automating what you do every day on the Internet: http://www.tadaweb.com/

 

 

Made In Luxembourg:

 

Introduction to graph databases by Neo4j

Great introduction to graph databases provided by Neo4j:

This video demonstrates how graph databases fit within the NoSQL space, and where they are most appropriately used. In this session you will learn:

  • Overview of NoSQL
  • Why graphs matter
  • Overview of Neo4j
  • Use cases for graph databases

What is Hadoop in 3 minutes video

What is Hadoop with Rafael Coss, “Manager Big Data Enablement”

 

 

About JSON genesis

Video from IEEE Computing Conversations

Interview with Douglas Crockford about the development of JavaScript Object Notation (JSON)

 

  • Crockford is likeably humble about the origins of JSON. Rather than claiming he invented JSON, he says he discovered it:
“I don’t claim to have invented it, because it already existed in nature. I just saw it, recognized the value of it, gave it a name, and a description, and showed its benefits. I don’t claim to be the only person to have discovered it.”

 

  • Crockford tried very hard to strip unnecessary stuff from JSON so it stood a better chance of being language independent. When confronted with pushback about JSON not being a “standard”, Crockford registered json.org, put up a specification that documented the data format, and declared it a standard.

 

  • Crockford wanted something that made his life easier. He needed JSON when building an application in which a client written in JavaScript had to communicate with a server written in Java. He wanted a format whose data serialization matched the data structures available in both programming language environments.
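
The last point, that the serialized form lines up with data structures most languages already have, can be illustrated with a small round trip. Crockford's case involved JavaScript and Java; Python is used here only as a convenient stand-in, and the field names are made up for the example.

```python
import json

# Hypothetical message built from ordinary dicts, lists, strings, numbers,
# booleans, and None: the same kinds of values JSON itself defines.
message = {
    "user": "alice",
    "tags": ["nosql", "json"],
    "retweets": 3,
    "verified": False,
    "location": None,
}

wire_text = json.dumps(message)   # serialize to JSON text for the wire
restored = json.loads(wire_text)  # parse it back on the receiving side

print(wire_text)
print(restored == message)
```

Nothing needs a custom mapping layer here: objects become dicts, arrays become lists, and `null`, `true`, and `false` map to `None`, `True`, and `False`.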