Postgres Outperforms MongoDB and Ushers in New Developer Reality

According to EnterpriseDB’s recent benchmark, Postgres Outperforms MongoDB and Ushers in New Developer Reality

Potgres would outperform MongoDB performance but also the MongoDB data size requirement would be outperformed by by approx. 25%

EDB found that Postgres outperforms MongoDB in selecting, loading and inserting complex document data in key workloads involving 50 million records:

  • Ingestion of high volumes of data was approximately 2.1 times faster in Postgres
  • MongoDB consumed 33% more the disk space
  • Data inserts took almost 3 times longer in MongoDB
  • Data selection took more than 2.5 times longer in MongoDB than in Postgres

Find the full article here

The benchmark tools is available on GitHub: https://github.com/EnterpriseDB/pg_nosql_benchmark

Big Data top paying skills

 According to kdnuggets the Big Data related skills led the list of top paying technical skills (six-figure salaries) in 2013.

The study focus on  technology professionals in the U.S. who enjoyed raises over the last year(2013).

Average U.S. tech salaries increased nearly three percent to $87,811 in 2013, up from $85,619 the previous year.Technology professionals understand they can easily find ways to grow their career in 2014, with two-thirds of respondents (65%) confident in finding a new, better position. That overwhelming confidence matched with declining salary satisfaction (54%, down from 57%) will keep tech-powered companies on edge about their retention strategies.

Companies are willing to pay hefty amounts to professionals with Big Data skills.


According to a report released on Jan 29, 2014 an average salary for a professional having knowledge and experience in programming language R was $115,531 in year 2013. 

Other Big Data oriented skills such as NoSQL, MapReduce, Cassandra, Pig, Hadoop, MongoDB are among top 10 paying skills. 

 

Source: kdnuggets

MongoDB 2.6 released

MongoDB 2.6 has been released with new majors features as primary target, but it also improve performance.

Performance improvements:

  • efficient use of network resources
  • oplog processing is 75% faster
  • classes of scan, sort, $in and $all performance are significantly improved
  • bulk operators for writes improve updates by as much as 5x.

Features improvements:

  • Text Search Integration
  • Insert and Update Improvements
  • A new write protocol integrates write operations with write concerns(The protocol also provides improved support for bulk operations)
  • A new authorization model that provides the ability to create custom User-Defined Roles and the ability to specify user privileges at a collection-level granularity.

Full release note

Dex, the Index Bot for MongoDB

Dex, the Index Bot

Dex is a MongoDB performance tuning tool that compares queries to the available indexes in the queried collection(s) and generates index suggestions based on simple heuristics. Currently you must provide a connection URI for your database.

Dex uses the URI you provide as a helpful way to determine when an index is recommended. Dex does not take existing indexes into account when actually constructing its ideal recommendation.

Currently, Dex only recommends complete indexes, not partial indexes. Dex ignores partial indexes that may be used by the query in favor of a better index, if one is not found. Dex recommends partially-ordered indexes according to a rule of thumb:

Your index field order should first answer:

  1. Equivalent value checks
  2. Sort clauses
  3. Range value checks ($in, $nin, $lt/gt, $lte/gte, etc.)

Note that your data cardinality may warrant a different order than the suggested indexes.

https://github.com/mongolab/dex

Choosing a Shard key

Choosing a shard key can be difficult, and the factors involved largely depend on your use case.

In fact, there is no such thing as a perfect shard key; there are design tradeoffs inherent in every decision. This presentation goes through those tradeoffs, as well as the different types of shard keys available in MongoDB, such as hashed and compound shard keys

Mongo-Hadoop Adapter 1.1

The Mongo-Hadoop Adapter 1.1 have been released, it makes  easy to use Mongo databases, or mongoDB backup files in .bson format, as the input source or output destination for Hadoop Map/Reduce jobs. By inspecting the data and computing input splits, Hadoop can process the data in parallel so that very large datasets can be processed quickly.

The Mongo-Hadoop adapter also includes support for Pig and Hive, which allow very sophisticated MapReduce workflows to be executed just by writing very simple scripts.

  • Pig is a high-level scripting language for data analysis and building map/reduce workflows
  • Hive is a SQL-like language for ad-hoc queries and analysis of data sets on Hadoop-compatible file systems.

Hadoop streaming is also supported, so map/reduce functions can be written in any language besides Java. Right now the Mongo-Hadoop adapter supports streaming in Ruby, Node.js and Python.

How it Works

How the Hadoop Adapter works

  • The adapter examines the MongoDB Collection and calculates a set of splits from the data
  • Each of the splits gets assigned to a node in Hadoop cluster
  • In parallel, Hadoop nodes pull data for their splits from MongoDB (or BSON) and process them locally
  • Hadoop merges results and streams output back to MongoDB or BSON

 

dotnetConf – Applied NoSQL in .NET

Live video from the dotnetConf

Perhaps you’ve heard about the next generation of databases roughly classified as NoSQL databases? These databases are generally much better than RDBMS at scaling, performance, and ease-of-development (e.g. in NoSQL the object-relational impedance mismatch usually disappears). Unfortunately, many talks on NoSQL are very academic and general. Not this one. This session will introduce the ideas around the so-called NoSQL movement, and we’ll learn how to leverage MongoDB (a popular open source NoSQL db) to build .NET applications using LINQ as the data access language. We’ll build out a .NET application using LINQ and MongoDB in a series of interactive demos using Visual Studio 2012 and C#.

 

Most popular data management systems

According to the DB-Engine ranking dsds

 

April 2013
Rank Last Month DBMS Database Model Score Changes
1. 1. Oracle  Relational DBMS 1560.59 +27.20
2. 3. MySQL  Relational DBMS 1342.45 +47.24
3. 2. Microsoft SQL Server  Relational DBMS 1278.15 -40.21
4. 4. PostgreSQL  Relational DBMS 174.09 -3.07
5. 5. Microsoft Access  Relational DBMS 161.40 -8.77
6. 6. DB2  Relational DBMS 155.02 -4.31
7. 7. MongoDB  Document store 129.75 +5.52
8. 9. SQLite  Relational DBMS 88.94 +5.68
9. 8. Sybase  Relational DBMS 80.16 -5.25
10. 10. Solr  Search engine 46.15 +2.99
11. Teradata  Relational DBMS 44.93
12. 11. Cassandra  Wide column store 38.57 +2.21
13. 12. Redis  Key-value store 35.58 +3.15
14. 13. Memcached  Key-value store 24.80 -0.17
15. 14. Informix  Relational DBMS 24.00 +0.10
16. 15. HBase  Wide column store 21.84 +1.40
17. 16. CouchDB  Document store 18.72 +0.42
18. 17. Firebird  Relational DBMS 12.24 -1.54
19. Netezza  Relational DBMS 11.14
20. 18. Sphinx  Search engine 9.55 +0.09
21. 19. Neo4j  Graph DBMS 8.34 +0.90
22. 21. Elasticsearch  Search engine 8.31 +1.56
23. 22. Riak  Key-value store 7.20 +1.10

MongoDB 2.4 has been released

MongoDB 2.4 has been released and includes new features and enhancements as follow:

  • Hash-Based Sharding
  • Capped Arrays
  • Text Search (Beta)
  • Geospatial Enhancements
  • Faster Counts
  • Aggregation Framework Improvements
  • Role-Based Privileges
  • Working Set Analyzer
  • Improved Replication

 

Release notes

Downloads

MongoDB 2.4.0-rc3 has been released

MongoDB 2.4.0-rc3 has been released, the 2.4 branch includes:

Downloads: http://www.mongodb.org/downloads

Change Log: https://jira.mongodb.org/browse/SERVER/fixforversion/12393