Mesos 0.13.0 released

Mesos 0.13.0 has been released. It fixes many bugs and includes the following improvements:

  • [MESOS-46] – Refactor MasterTest to use fixture
  • [MESOS-134] – Add Python documentation
  • [MESOS-140] – Unrecognized command line args should fail the process
  • [MESOS-242] – Add more tests to Dominant Share Allocator
  • [MESOS-305] – Inform the frameworks / slaves about a master failover
  • [MESOS-346] – Improve OSX configure output when deprecated headers are present.
  • [MESOS-360] – Mesos jar should be built for java 6
  • [MESOS-409] – Master detector code should stat nodes before attempting to create
  • [MESOS-472] – Separate ResourceStatistics::cpu_time into ResourceStatistics::cpu_user_time and ResourceStatistics::cpu_system_time.
  • [MESOS-493] – Expose version information in http endpoints
  • [MESOS-503] – Master should log LOST messages sent to the framework
  • [MESOS-526] – Change slave command line flag from ‘safe’ to ‘strict’
  • [MESOS-602] – Allow Mesos native library to be loaded from an absolute path
  • [MESOS-603] – Add support for better test output in newer versions of autotools

Download the most recent stable release: 0.13.0. (Release Notes)


Cassandra 2.0, just big

In five years, Apache Cassandra has grown into one of the most widely used NoSQL databases in the world and serves as the backbone for some of today’s most popular applications, including Facebook, Netflix, and Twitter.


This newest version, Cassandra 2.0, just announced, includes multiple new features. Perhaps the biggest of them is the claim that “Cassandra 2.0 makes it easier than ever for developers to migrate from relational databases and become productive quickly.”

New features and improvements include:

  • Lightweight transactions, which ensure operation linearizability similar to the serializable isolation level offered by relational databases, preventing conflicts during concurrent requests (see the sketch after this list)
  • Triggers, which enable pushing performance-critical code close to the data it deals with, and simplify integration with event-driven frameworks like Storm
  • CQL enhancements such as cursors and improved index support
  • Improved compaction, keeping read performance from deteriorating under heavy write load
  • Eager retries to avoid query timeouts by sending redundant requests to other replicas if too much time elapses on the original request
  • Custom Thrift server implementation based on LMAX Disruptor that achieves lower message processing latencies and better throughput with flexible buffer allocation strategies
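
To make the lightweight transactions bullet concrete, here is a minimal sketch using the DataStax Python driver; the contact point, keyspace, and table are assumptions for illustration, not part of the announcement. CQL's IF NOT EXISTS clause turns the insert into a compare-and-set:

    from cassandra.cluster import Cluster  # DataStax Python driver

    cluster = Cluster(['127.0.0.1'])   # assumed local node
    session = cluster.connect('demo')  # hypothetical keyspace

    # Conditional insert: only applied if no row with this key exists yet.
    rows = list(session.execute(
        "INSERT INTO users (username, email) "
        "VALUES ('jsmith', 'jsmith@example.com') IF NOT EXISTS"
    ))

    # The returned row carries an '[applied]' flag; if another client won
    # the race, it is False and the existing values come back instead.
    print(rows[0])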

Official website

Official announcement

Download

Choosing a shard key

Choosing a shard key can be difficult, and the factors involved largely depend on your use case.

In fact, there is no such thing as a perfect shard key; there are design tradeoffs inherent in every decision. This presentation goes through those tradeoffs, as well as the different types of shard keys available in MongoDB, such as hashed and compound shard keys.
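
As a concrete illustration of the hashed option, the sketch below shards a collection on a hashed key with pymongo; the database, collection, and field names are hypothetical, and a running mongos router is assumed:

    from pymongo import MongoClient

    client = MongoClient('localhost', 27017)  # connect to a mongos router

    # Enable sharding on the (hypothetical) database, then shard the
    # collection on a hashed key to spread writes evenly across shards.
    client.admin.command('enableSharding', 'appdb')
    client.admin.command('shardCollection', 'appdb.users',
                         key={'user_id': 'hashed'})

A hashed key trades efficient range queries for uniform write distribution; a compound key such as {'customer_id': 1, 'ts': 1} instead keeps related documents together on the same shard.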

Mongo-Hadoop Adapter 1.1

The Mongo-Hadoop Adapter 1.1 has been released. It makes it easy to use MongoDB databases, or MongoDB backup files in .bson format, as the input source or output destination for Hadoop Map/Reduce jobs. By inspecting the data and computing input splits, Hadoop can process the data in parallel so that very large datasets can be processed quickly.

The Mongo-Hadoop adapter also includes support for Pig and Hive, which allow sophisticated MapReduce workflows to be executed just by writing simple scripts.

  • Pig is a high-level scripting language for data analysis and building map/reduce workflows
  • Hive is a SQL-like language for ad-hoc queries and analysis of data sets on Hadoop-compatible file systems.

Hadoop streaming is also supported, so map/reduce functions can be written in any language besides Java. Right now the Mongo-Hadoop adapter supports streaming in Ruby, Node.js and Python.
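
For instance, a Python streaming mapper is just a small script built on the adapter's pymongo_hadoop helper; the 'author' field below is a hypothetical example:

    #!/usr/bin/env python
    # Streaming mapper: pymongo_hadoop feeds BSON documents in via stdin.
    from pymongo_hadoop import BSONMapper

    def mapper(documents):
        # Emit one count per (hypothetical) 'author' field; a matching
        # reducer would sum the counts for each key.
        for doc in documents:
            yield {'_id': doc['author'], 'count': 1}

    BSONMapper(mapper)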

How it Works

The Mongo-Hadoop adapter works as follows:

  • The adapter examines the MongoDB Collection and calculates a set of splits from the data
  • Each of the splits gets assigned to a node in the Hadoop cluster
  • In parallel, Hadoop nodes pull data for their splits from MongoDB (or BSON) and process them locally
  • Hadoop merges results and streams output back to MongoDB or BSON


FieldDB is available

FieldDB is a free, modular, open source project developed collectively by field linguists and software developers to make an expandable, user-friendly app which can be used to collect, search, and share data, both online and offline. It is fundamentally an app written in 100% JavaScript which runs entirely client side, backed by a NoSQL database (currently CouchDB and its offline browser wrapper, PouchDB alpha). It connects to a number of webservices in order to allow users to perform tasks which require the internet/cloud (i.e., syncing data between devices and users, sharing data publicly, and running CPU-intensive processes to analyze/extract/search audio/video/text). While the app was designed for “field linguists,” it can be used by anyone collecting text data, or collecting highly structured data where the fields on each data point require encryption or customization from user to user, and where the schema of the data is expected to evolve over the course of data collection while in the “field.”

FieldDB beta was officially launched in English and Spanish on August 1st, 2012 in Patzun, Guatemala as an app for field linguists.

More information about FieldDB is available here: https://github.com/OpenSourceFieldlinguistics/FieldDB

OrientDB 1.5 released

OrientDB 1.5 has been released. It fixes a bunch of issues and brings the following new features and enhancements:

All the issues: https://github.com/orientechnologies/orientdb/issues?milestone=5&page=1&state=closed

  • New PLOCAL (Paginated Local) storage engine. In comparison with LOCAL it’s more durable (no usage of MMAP) and supports better concurrency on parallel transactions. To migrate your database to PLOCAL follow this guide: migrate-from-local-storage-engine-to-plocal
  • New Hash Index type with better performance on lookups; it does not support ranges (see the sketch after this list)
  • New “transactional” SQL command to execute commands inside a transaction. This is useful with the “create edge” SQL command to avoid corrupting the graph
  • Import now migrates RIDs, allowing a database to be imported into a different one from the original
  • “Breadth first” strategy added on traversing (Java and SQL APIs)
  • Server can limit the maximum number of live connections (to prevent DoS)
  • Fetch plan support in SQL statements and in the binary protocol for synchronous commands too
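
As a quick illustration of the new hash index mentioned above, the sketch below creates one through OrientDB's HTTP command endpoint (POST /command/<db>/sql); the database name, class, and default credentials are assumptions:

    import requests

    # Send an SQL command over OrientDB's REST interface on port 2480.
    # The 'demo' database and root credentials are assumptions.
    resp = requests.post(
        'http://localhost:2480/command/demo/sql',
        data='CREATE INDEX Users.email UNIQUE_HASH_INDEX',
        auth=('root', 'root'),
    )
    print(resp.status_code, resp.text)

Hash indexes give constant-time point lookups, but, as noted above, they cannot serve range queries.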

Upgrade note: https://github.com/orientechnologies/orientdb/wiki/Upgrade

Download link: https://github.com/orientechnologies/orientdb/releases/download/1.5/orientdb-graphed-1.5.0.zip

Neo4j 1.9.2 has been released

Neo4j 1.9.2 is now available.


  • Optimized IO performance on Windows
  • Improved procedure for setting up networking in HA clusters
  • Several fixes to the REST API

Neo4j 1.9.2 is available immediately and is an easy upgrade from any other 1.9.x version.

You can download it from the neo4j.org website.

Cassandra 2.0.0-beta1 has been released

The latest development release, 2.0.0-beta1, is now available for download.

Full list of changes:

  • Removed on-heap row cache (CASSANDRA-5348)
  • use nanotime consistently for node-local timeouts (CASSANDRA-5581)
  • Avoid unnecessary second pass on name-based queries (CASSANDRA-5577)
  • Experimental triggers (CASSANDRA-1311)
  • JEMalloc support for off-heap allocation (CASSANDRA-3997)
  • Single-pass compaction (CASSANDRA-4180)
  • Removed token range bisection (CASSANDRA-5518)
  • Removed compatibility with pre-1.2.5 sstables and network messages (CASSANDRA-5511)
  • removed PBSPredictor (CASSANDRA-5455)
  • CAS support (CASSANDRA-5062, 5441, 5442, 5443, 5619, 5667)
  • Leveled compaction performs size-tiered compactions in L0 (CASSANDRA-5371, 5439)
  • Add yaml network topology snitch for mixed ec2/other envs (CASSANDRA-5339)
  • Log when a node is down longer than the hint window (CASSANDRA-4554)
  • Optimize tombstone creation for ExpiringColumns (CASSANDRA-4917)
  • Improve LeveledScanner work estimation (CASSANDRA-5250, 5407)
  • Replace compaction lock with runWithCompactionsDisabled (CASSANDRA-3430)
  • Change Message IDs to ints (CASSANDRA-5307)
  • Move sstable level information into the Stats component, removing the need for a separate Manifest file (CASSANDRA-4872)
  • avoid serializing to byte[] on commitlog append (CASSANDRA-5199)
  • make index_interval configurable per columnfamily (CASSANDRA-3961, CASSANDRA-5650)
  • add default_time_to_live (CASSANDRA-3974)
  • add memtable_flush_period_in_ms (CASSANDRA-4237)
  • replace supercolumns internally by composites (CASSANDRA-3237, 5123)
  • upgrade thrift to 0.9.0 (CASSANDRA-3719)
  • drop unnecessary keyspace parameter from user-defined compaction API (CASSANDRA-5139)
  • more robust solution to incomplete compactions + counters (CASSANDRA-5151)
  • Change order of directory searching for c*.in.sh (CASSANDRA-3983)
  • Add tool to reset SSTable compaction level for LCS (CASSANDRA-5271)
  • Allow custom configuration loader (CASSANDRA-5045)
  • Remove memory emergency pressure valve logic (CASSANDRA-3534)
  • Reduce request latency with eager retry (CASSANDRA-4705)
  • cqlsh: Remove ASSUME command (CASSANDRA-5331)
  • Rebuild BF when loading sstables if bloom_filter_fp_chance has changed since compaction (CASSANDRA-5015)
  • remove row-level bloom filters (CASSANDRA-4885)
  • Change Kernel Page Cache skipping into row preheating (disabled by default) (CASSANDRA-4937)
  • Improve repair by deciding on a gcBefore before sending out TreeRequests (CASSANDRA-4932)
  • Add an official way to disable compactions (CASSANDRA-5074)
  • Reenable ALTER TABLE DROP with new semantics (CASSANDRA-3919)
  • Add binary protocol versioning (CASSANDRA-5436)
  • Swap THshaServer for TThreadedSelectorServer (CASSANDRA-5530)
  • Add alias support to SELECT statement (CASSANDRA-5075)
  • Don’t create empty RowMutations in CommitLogReplayer (CASSANDRA-5541)
  • Use range tombstones when dropping cfs/columns from schema (CASSANDRA-5579)
  • cqlsh: drop CQL2/CQL3-beta support (CASSANDRA-5585)
  • Track max/min column names in sstables to be able to optimize slice queries (CASSANDRA-5514, CASSANDRA-5595, CASSANDRA-5600)
  • Binary protocol: allow batching already prepared statements (CASSANDRA-4693)
  • Allow preparing timestamp, ttl and limit in CQL3 queries (CASSANDRA-4450)
  • Support native link w/o JNA in Java7 (CASSANDRA-3734)
  • Use SASL authentication in binary protocol v2 (CASSANDRA-5545)
  • Replace Thrift HsHa with LMAX Disruptor based implementation (CASSANDRA-5582)
  • cqlsh: Add row count to SELECT output (CASSANDRA-5636)
  • Include a timestamp with all read commands to determine column expiration (CASSANDRA-5149)
  • Streaming 2.0 (CASSANDRA-5286, 5699)
  • Conditional create/drop ks/table/index statements in CQL3 (CASSANDRA-2737)
  • more pre-table creation property validation (CASSANDRA-5693)
  • Redesign repair messages (CASSANDRA-5426)
  • Fix ALTER RENAME post-5125 (CASSANDRA-5702)
  • Disallow renaming a 2ndary indexed column (CASSANDRA-5705)
  • Rename Table to Keyspace (CASSANDRA-5613)
  • Ensure changing column_index_size_in_kb on different nodes doesn’t corrupt the sstable (CASSANDRA-5454)
  • Move resultset type information into prepare, not execute (CASSANDRA-5649)
  • Auto paging in binary protocol (CASSANDRA-4415, 5714)
  • Don’t tie client side use of AbstractType to JDBC (CASSANDRA-4495)
  • Adds new TimestampType to replace DateType (CASSANDRA-5723, CASSANDRA-5729)


RethinkDB 1.7 has been released

RethinkDB 1.7 has been released and is available for download.

This release includes the following features and improvements:

  • Tools for CSV and JSON import and export
  • Support for hot backup and restore
  • ReQL support for atomic set and get operations (see the sketch below)
  • A powerful new syntax for handling nested documents
  • Greater than 10x performance improvement on document inserts
  • Native binaries for CentOS / RHEL
  • A number of small ReQL improvements

See the full list of over 30 bug fixes, features, and enhancements.
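
The atomic set and get item maps to an optional flag on ReQL write operations. Here is a rough sketch with the Python driver; the table, document, and the return_vals flag reflect one reading of the 1.7 API, so treat the details as assumptions:

    import rethinkdb as r

    conn = r.connect('localhost', 28015)

    # Atomically increment a counter and get the resulting document back
    # in the same round trip via return_vals.
    result = r.db('test').table('counters').get('pageviews').update(
        {'count': r.row['count'] + 1},
        return_vals=True,
    ).run(conn)

    print(result['new_val'])  # the document as it looks after the update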

New Search App in Hue 2.4

Hue 2.4 unleashes the power of Hadoop: in this version you can search across Hadoop data just as you would run keyword searches with Google or Yahoo! In addition, a wizard lets you tweak the result snippets and tailor the search experience to your needs.

The new Hue Search app uses the regular Solr API under the hood, yet adds a remarkable list of UI features that make searching over data stored in Hadoop a breeze. It integrates with the other Hue apps, like File Browser, for looking at the index files in a few clicks.
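
Because the app builds on a standard Solr endpoint, the same kind of query can be reproduced with a plain HTTP request; the host, port, and 'tweets' collection below are assumptions for illustration:

    import requests

    # Query a Solr collection directly, the same API the Hue app uses.
    resp = requests.get(
        'http://localhost:8983/solr/tweets/select',
        params={'q': 'hadoop', 'wt': 'json', 'rows': 10},
    )
    for doc in resp.json()['response']['docs']:
        print(doc.get('id'))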

Here’s a video demoing queries and results customization. The demo is based on Twitter Streaming data collected with Apache Flume and indexed in real time.


More information: http://cloudera.github.io/hue/