Wrangler: the smartest ETL and much more

Wrangler is an interactive tool for data cleaning and transformation. It is smart ETL software that lets you spend less time formatting and more time analyzing your data. Take the time to watch the video below; it's really worth it.

Official website: http://vis.stanford.edu/wrangler/

The InfoWorld Technology of the Year 2013

InfoWorld just published its Technology of the Year Award winners, and several well-known NoSQL solutions are among them:

  • Apache Hadoop
  • Apache Cassandra
  • Couchbase Server

http://www.infoworld.com/slideshow/80986/infoworlds-2013-technology-of-the-year-award-winners-210419#slide1

HRider 1.0.1 has been released

h-rider is a UI application that provides an easier way to view or manipulate data stored in HBase™, the distributed database that supports structured data storage for large tables.

https://github.com/NiceSystems/hrider

 

Aerospike NoSQL database

Aerospike provides a NoSQL database that runs on SSDs, with indices held in DRAM. Aerospike was previously known as Citrusleaf and is headquartered in Mountain View. The company claims that, with its in-SSD database, “it can get 500,000 transactions answered per second on a $2,000 server or 1 million transactions per second on a $5,000 machine.”

Tapad, an online ad broker, runs a 5-node Aerospike cluster with each server node having six 120GB SSDs. Reads and writes are spread across these SSDs and not one has had to be replaced in 18 months of operation. The operational stats are impressive. It manages more than 150 billion ad impressions a month sent to 2 billion devices, with up to 50,000 queries/sec per server node, “reaching 150,000 ads per second during peak activity.” The total data volume is 3.6TB and still growing. The system spreads work across nodes by monitoring their latency.
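
As a quick sanity check on these figures, the short Python sketch below multiplies out the quoted cluster size and converts the monthly impression count into an average rate. The 30-day month is an assumption, and the source does not say whether the 3.6TB figure refers to raw SSD capacity or to stored data; the arithmetic only shows that the raw capacity comes to the same number.

```python
# Figures quoted in the paragraph above.
NODES = 5
SSDS_PER_NODE = 6
SSD_CAPACITY_GB = 120
IMPRESSIONS_PER_MONTH = 150e9

# Assumption (not from the source): a 30-day month.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60

# Raw SSD capacity of the whole cluster, in decimal terabytes.
raw_capacity_tb = NODES * SSDS_PER_NODE * SSD_CAPACITY_GB / 1000

# Average impression rate implied by the monthly total.
avg_impressions_per_sec = IMPRESSIONS_PER_MONTH / SECONDS_PER_MONTH

print(f"raw SSD capacity: {raw_capacity_tb:.1f} TB")
print(f"average impressions per second: {avg_impressions_per_sec:,.0f}")
```

Under these assumptions the average works out to a little under 58,000 impressions per second, which sits comfortably below the quoted peak of 150,000 ads per second.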

 

Ever heard of Cloudant?

Cloudant was founded in Cambridge, Massachusetts in 2008 by three MIT physicists who at the time were moving multi-petabyte data sets around from the Large Hadron Collider. Frustrated by the available tools for managing and analyzing Big Data in their research, the founders built a distributed, fault-tolerant, globally scalable data layer on top of Apache CouchDB.

The service has grown since then. The team now manages and serves mobile and web app data on behalf of thousands of developers and hundreds of customers to their users around the world.

 

The Cloudant Data Layer collects, stores, analyzes and distributes application data across a global network of secure, high-performance data centers, delivering low-latency, non-stop data access to users no matter where they’re located.

Features

Cloudant enables advanced features like full-text search, replication, off-line computing, mobile sync, geo-location, and federated analytics. A RESTful API and support for standards like JSON and MapReduce make Cloudant easy to use; there are never schema or data migrations to slow you down.

GLOBAL DATA NETWORK

  • Global network of servers
  • Built-in replication and syncing
  • Pushes data closer to users
  • Built-in failover/disaster recovery

NOSQL

  • Schemaless development for JSON & other doc types
  • MapReduce for indexing & data access
  • Replication of apps, indexes & data
  • RESTful API

DATA LAYER AS A SERVICE

FAULT TOLERANCE

  • Clustering in a ring (a la Amazon Dynamo)
  • Built-in distributed Erlang
  • Masterless
  • Code sent to data, node-local, data-parallel
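
The "clustering in a ring" idea can be illustrated with a toy consistent-hash ring in Python. This is only a sketch of the general Dynamo-style technique, not Cloudant's implementation: the `HashRing` class, the node names, the use of MD5, and the replica count of 3 are all assumptions made for illustration, and real systems typically add virtual nodes to even out the load.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the ring using MD5 (an illustrative choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring: a key belongs to the next nodes clockwise."""

    def __init__(self, nodes):
        # Sort nodes by their position on the ring.
        self._ring = sorted((ring_position(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def preference_list(self, key: str, replicas: int = 3):
        """Return the `replicas` distinct nodes that follow the key on the ring."""
        start = bisect.bisect_right(self._positions, ring_position(key))
        count = min(replicas, len(self._ring))
        # Wrap around the end of the sorted list to close the ring.
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


# Hypothetical node names; any node can compute the same answer (masterless).
ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
owners = ring.preference_list("user:42")
print(owners)
```

Because every node can compute the same preference list from the key alone, no master is needed to route a request.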

BUILT-IN ANALYTICS

  • Push OLAP-style workflows into the database
  • Based on incremental, chainable MapReduce
  • Multi-language support
  • Attachment analytics

DISTRIBUTED, FULL-TEXT SEARCH

  • Based on Lucene libraries
  • Support for custom indexers

Postgres 9.2 available at heroku.com

Effective immediately, Heroku is moving Postgres 9.2 into GA, which will become the new default shortly after. Postgres 9.2 is full of simplifications and new features that will make your life better, including expressive new datatypes, new tools for getting deep insights into your database’s performance, and even some simple user interface improvements. Oh, and it’s much, much faster for the most common kind of write performance pattern we see in our fleet.

You can request a version 9.2 database from the command line like this:

heroku addons:add heroku-postgresql:dev --version=9.2

Let’s dig in a bit further with the new features this version brings.

Visibility

Visibility into your data has long been a problem for many application developers. With the pg_stat_statements extension, Postgres 9.2 can show you, for each query:

  • How often a query is run
  • How much time is spent running the query
  • How much data is returned

You can turn on tracking with CREATE EXTENSION pg_stat_statements; then run the query below to list your ten most frequently run queries:

SELECT calls, total_time, rows, query FROM pg_stat_statements ORDER BY calls DESC LIMIT 10;

JSON Support

Developers are always looking for more extensibility and power when working with and storing their data. Earlier this year we announced our support for hstore, a powerful key/value store within Postgres, which you can easily use within Rails, Django, and Java Spring.

With Postgres 9.2 there’s even more robust support for NoSQL within your SQL database, thanks to Andrew Dunstan, in the form of JSON. With the JSON datatype, your data is validated as well-formed JSON before it can be committed.
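
To make "validated before it can be committed" concrete, here is a rough Python analogy that uses the standard `json` module as a stand-in for the database's check. The `is_valid_json` helper is hypothetical, and Python's parser is not identical to the Postgres one (for example, Python accepts `NaN` by default), so treat this as an illustration of the idea only.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON, mimicking a reject-on-insert check."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


# Well-formed JSON would be accepted.
print(is_valid_json('{"name": "widget", "tags": ["a", "b"]}'))

# Single-quoted strings are not valid JSON, so this would be rejected.
print(is_valid_json("{'name': 'widget'}"))
```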

Range Type Support

The range datatype, thanks to Jeff Davis, is another example of powerful data flexibility. A range value lives in a single column and consists of a from value and a to value. A range can hold timestamps, alphanumeric values, or numbers, and can have constraints placed on it to enforce common range conditions.

For example, this schema ensures that in creating a class schedule we can’t have two classes at the same time:

CREATE TABLE schedule (class int, during tsrange);
ALTER TABLE schedule ADD EXCLUDE USING gist (during WITH &&);

Attempting to insert two overlapping classes then raises an error:

INSERT INTO schedule VALUES (3, '[2012-09-24 13:00, 2012-09-24 13:50)');
INSERT INTO schedule VALUES (1108, '[2012-09-24 13:30, 2012-09-24 14:00)');
ERROR: conflicting key value violates exclusion constraint "schedule_during_excl"
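
The overlap test that the exclusion constraint applies can be sketched in a few lines of Python. This is only an illustration of the half-open range logic for the two classes in the example; the `overlaps` helper and the `back_to_back` slot are made up for the sketch, and it ignores details such as empty or unbounded ranges.

```python
from datetime import datetime


def overlaps(a_start, a_end, b_start, b_end) -> bool:
    """True if half-open ranges [a_start, a_end) and [b_start, b_end) share an instant."""
    return a_start < b_end and b_start < a_end


def at(text: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM' timestamp."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


class_3 = (at("2012-09-24 13:00"), at("2012-09-24 13:50"))
class_1108 = (at("2012-09-24 13:30"), at("2012-09-24 14:00"))

# The two classes from the example share 13:30 to 13:50, so they conflict.
print(overlaps(*class_3, *class_1108))  # True

# A hypothetical class starting exactly at 13:50 does not conflict,
# because the upper bound of a half-open range is excluded.
back_to_back = (at("2012-09-24 13:50"), at("2012-09-24 14:30"))
print(overlaps(*class_3, *back_to_back))  # False
```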

Performance

Of course, any new release of a database wouldn’t be complete without some focus on performance. Postgres 9.2, as expected, has delivered here in a big way, including up to 4X improvements in speed on read queries and up to 20X improvements on data warehousing queries. In particular, index-only scans can make queries much faster because they can often return results from the index alone, without reading the table itself.

Nine Databases in 45 Minutes

From http://www.datanami.com/datanami/2012-12-04/nine_databases_in_45_minutes.html

 

The following video provides a crash course in nine key databases: Postgres, CouchDB, MarkLogic, Riak, VoltDB, MongoDB, Neo4j, HBase and Redis. All in just 45 minutes.

Miles Pomeroy, Chad Maughan, and Jonathan Geddes spend five minutes on each database, hence the title, “9 Databases in 45 minutes.” The video proceeds in efficient and occasionally amusing fashion, with a countdown clock and a gong keeping the presentations terse.

Develop with Pleasure!

http://blogs.jetbrains.com/idea/2012/12/intellij-idea-12-is-available-for-download/

 

IntelliJ IDEA 12 is Available for Download

Revealing the Darker Side of Productive Coding

A few weeks ago we finished the Early Access Program for the upcoming release of IntelliJ IDEA 12. We would like to thank all of you who evaluated the preview builds and submitted your feedback. We really appreciate support from the community, watching closely every new feature we announced and providing us with comments and bug reports. It would be absolutely impossible to do what we did without your contributions!

Today we are excited to announce that IntelliJ IDEA 12, the next major version of our flagship Java IDE, is finally released and available for download.

As usual, it is difficult to list all the new features in the release. Every single day we try to not only add something new, but also rethink existing features to make them even more useful for your productivity and usability. So let me highlight the most exciting features awaiting you in IntelliJ IDEA 12.

New User Interface and Darcula Theme

The newest release of IntelliJ IDEA comes with a redesigned user interface, along with a new stylish dark look and feel called Darcula. The new interface is meant to be cleaner and more functional. A lot of people find a dark look and feel much less distracting. Now that we’ve added it, you can focus more on the code and less on the IDE.

The new dark look and feel is fully customizable, so you can create your own dark themes, supported natively by every component of the IDE.

Brand New Compiler Mode

In addition to the interface, IntelliJ IDEA 12 introduces a completely new approach to compiling the project, which is now much faster and provides a better user experience. We have rebuilt it from the ground up to move the compiler to a separate process. Now the project can be compiled automatically in the background on every change you make, so you can run it almost instantly at any time.

For more details about the new compiler mode see this blog post.

Java 8

Another important feature is support for Java 8, the next generation of the Java platform, announced by Oracle this year. IntelliJ IDEA 12 embraces the cutting-edge version of the language and provides code assistance for the new syntax, such as lambda expressions, method references, and default methods. Now you can try the new features of JDK 8 in your projects.

Android UI Designer

Over the last year Android has become the fastest-growing mobile platform. Ever since IntelliJ IDEA introduced support for Android in its free and open-source Community Edition, we’ve worked hard to make it better with each new release. Finally, IntelliJ IDEA 12 comes with a well-crafted UI designer, one of the most anticipated features in this release.

Read more details about the new UI designer and enjoy a demo in our blog.

Spring Frameworks Support

IntelliJ IDEA 12 comes with significantly improved support for Spring. The new update includes much better performance, simultaneous support for XML and annotation-based configurations in the same project, an enhanced dependency diagram (with drag-and-drop support) and of course code assistance for even more frameworks, such as Integration, Web Flow, MVC, Security, Batch and others.

Play 2.0 Support for Java and Scala

One more remarkable feature many people have been waiting for is support for the newest version of the Play framework. IntelliJ IDEA 12 enables you to create, run and debug Play 2.0 applications easily using both Java and Scala languages, with advanced code assistance, including templates support, formatter, refactorings and many other features.

Database Development Tools

While IntelliJ IDEA is frequently called the most intelligent Java IDE, it also provides powerful database tools and support for SQL. The new release reveals more exciting features for developers who use databases in their projects.

With IntelliJ IDEA 12, you not only have intelligent code assistance for SQL, but can also design your database right from the IDE. As databases are part of most projects today, it is time for us to help developers work with them more productively.

Other important features introduced in IntelliJ IDEA 12 include:

  • Intelligent code formatting
  • Better management tools for J2EE application servers, with support for the Cloud Foundry and CloudBees cloud platforms
  • Support for Drools Expert with advanced code assistance
  • Cucumber for JVM support

To see the full list of new features in IntelliJ IDEA 12 and to download the edition of your choice, please visit our website.

“Develop with Pleasure!”

Scalaris 0.5.0 codename "Saperda scalaris"

Scalaris 0.5.0 (codename “Saperda scalaris”) has been released. https://code.google.com/p/scalaris/

Scalaris is a scalable, transactional, distributed key-value store. It can be used for building scalable Web 2.0 services.

Scalaris uses a structured overlay with a non-blocking Paxos commit protocol for transaction processing with strong consistency over replicas. Scalaris is implemented in Erlang.

NGDATA publishes Big Data whitepaper

Consumer-centric companies such as banks have more data about their consumers but relatively little intelligence about them. The world is increasingly interconnected, instrumented and intelligent, and in this new world the velocity, volume, and variety of data being created is unprecedented. As the amount of data created about a consumer grows, the percentage of that data that banks and retailers can process is falling fast.

In this whitepaper, you will learn:

  • New revenue opportunities that banks can realize by embracing Big Data
  • Challenges banks are facing in getting a single view of the consumer
  • Key banking use cases: Mobile Wallet and Fraud Detection
  • How interactive Big Data management can help you change the game

Download this whitepaper to learn how banks can leverage Big Data to transform their business, know their customers better, realize new revenue opportunities, and detect fraud.

The NGDATA Team