Most popular data management systems

According to the DB-Engines ranking:


April 2013
Rank  Last Month  DBMS                  Database Model     Score    Change
1.    1.          Oracle                Relational DBMS    1560.59  +27.20
2.    3.          MySQL                 Relational DBMS    1342.45  +47.24
3.    2.          Microsoft SQL Server  Relational DBMS    1278.15  -40.21
4.    4.          PostgreSQL            Relational DBMS     174.09   -3.07
5.    5.          Microsoft Access      Relational DBMS     161.40   -8.77
6.    6.          DB2                   Relational DBMS     155.02   -4.31
7.    7.          MongoDB               Document store      129.75   +5.52
8.    9.          SQLite                Relational DBMS      88.94   +5.68
9.    8.          Sybase                Relational DBMS      80.16   -5.25
10.   10.         Solr                  Search engine        46.15   +2.99
11.               Teradata              Relational DBMS      44.93
12.   11.         Cassandra             Wide column store    38.57   +2.21
13.   12.         Redis                 Key-value store      35.58   +3.15
14.   13.         Memcached             Key-value store      24.80   -0.17
15.   14.         Informix              Relational DBMS      24.00   +0.10
16.   15.         HBase                 Wide column store    21.84   +1.40
17.   16.         CouchDB               Document store       18.72   +0.42
18.   17.         Firebird              Relational DBMS      12.24   -1.54
19.               Netezza               Relational DBMS      11.14
20.   18.         Sphinx                Search engine         9.55   +0.09
21.   19.         Neo4j                 Graph DBMS            8.34   +0.90
22.   21.         Elasticsearch         Search engine         8.31   +1.56
23.   22.         Riak                  Key-value store       7.20   +1.10

Nine Databases in 45 Minutes

From http://www.datanami.com/datanami/2012-12-04/nine_databases_in_45_minutes.html


The following video provides a crash course in nine key databases:  Postgres, CouchDB, MarkLogic, Riak, VoltDB, MongoDB, Neo4j, HBase and Redis. All in just 45 minutes.

Miles Pomeroy, Chad Maughan, and Jonathan Geddes run through each database in five minutes each, thus the title, “9 Databases in 45 minutes.” The video proceeds in efficient and occasionally amusing fashion, where a countdown clock and a gong keep the presentations terse.

Couchbase Server 1.8.1 has been released

Couchbase Server 1.8.1 is now available.

  • Rebalance improvements in 1.8.1 optimize the movement of data when adding and removing the same number of nodes by moving data directly between the nodes being swapped. This lets customers proactively manage their application during all kinds of upgrades and maintenance while reducing the impact on the rest of the cluster.
  • Management of the memory allocated within the system has been improved. This increases stability of the system as a whole. In addition to this, a few new memory usage statistics have been added to our monitoring console.
  • Several significant bugs related to rebalance and cluster stability have been fixed. The detailed release notes can be found here.

Download page

Manual page

CouchDB 1.2.0 has been released

Big news for Apache CouchDB: version 1.2.0 has been released and is now available for download.

You can grab your copy here:

http://couchdb.apache.org/

Windows packages are now available. Grab them at the same download link.

This release also coincides with a revamped project homepage!

This is a big release with lots of updates. Please also note that this release contains breaking changes.

These release notes are based on the NEWS file.

Performance

  • Added a native JSON parser

    Performance critical portions of the JSON parser are now implemented in C. This improves latency and throughput for all database and view operations. We are using the fabulous yajl library.

  • Optional file compression (database and view index files)

    This feature is enabled by default.

    All storage operations for databases and views are now passed through Google’s snappy compressor. The result is simple: since less data has to be transferred to and from disk and through CPU and RAM, all database and view accesses are now faster and on-disk files are smaller. Compression can be switched to gzip (with a configurable compression level) or disabled entirely.
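
    As a rough sketch of how the codec could be switched at runtime (the file_compression key in the couchdb section comes from default.ini; the host, port and chosen level are assumptions to adapt):

      import json
      import requests

      BASE = "http://127.0.0.1:5984"  # assumes a local CouchDB 1.2 node with admin access

      # Inspect the current codec (snappy is the default in 1.2.0).
      print(requests.get(BASE + "/_config/couchdb/file_compression").json())

      # Switch to gzip/deflate at level 6, or pass "none" to disable compression.
      # The _config API expects the value as a JSON-encoded string.
      requests.put(BASE + "/_config/couchdb/file_compression", data=json.dumps("deflate_6"))

    Newly written data should pick up the new codec; existing files are only rewritten with it on their next compaction.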

  • Several performance improvements, especially regarding database writes and view indexing

    Combined with the two preceding improvements, we made some less obvious algorithmic improvements that take the Erlang runtime system into account when writing data to databases and view index files. The net result is much improved performance for most common operations including building views.

    The JIRA ticket (COUCHDB-976) has more information.

  • Performance improvements for the built-in changes feed filters _doc_ids and _design

Security

The security system got a major overhaul making it way more secure to run CouchDB as a public database server for CouchApps. Unfortunately we had to break a bit of backwards compatibility with this, but we think it is well worth the trouble.

  • Documents in the _users database can no longer be read by everyone

    Documents in the _users database can now only be read by the respective authenticated user and administrators. Before, all docs were world-readable, including their password hashes and salts.

  • Confidential information in the _replication database can no longer be read by everyone

    Similar to documents in the _users database, documents in the _replicator database now get passwords and OAuth tokens stripped when read by a user that is not the creator of the replication or an administrator.

  • Password hashes are now calculated by CouchDB instead of the client

    Previously, CouchDB relied on the client to hash and salt the user’s password. Now, it accepts plain text passwords and hashes them before they are committed to disk, following traditional best practices.
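
    A minimal sketch of creating a user under the new scheme (the _users database and the org.couchdb.user: ID prefix follow CouchDB’s user-document convention; the host, user name and password are placeholders):

      import requests

      BASE = "http://127.0.0.1:5984"

      # The password is sent in plain text; CouchDB 1.2 salts and hashes it
      # itself before the document is written to disk.
      user_doc = {
          "_id": "org.couchdb.user:alice",
          "name": "alice",
          "type": "user",
          "roles": [],
          "password": "s3cret",
      }
      resp = requests.put(BASE + "/_users/org.couchdb.user:alice", json=user_doc)
      print(resp.status_code, resp.json())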

  • Allow persistent authentication cookies

    Cookie-based authentication can now keep a user logged in across a browser restart.
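
    A hedged sketch of turning this on through the configuration API (the allow_persistent_cookies key in the couch_httpd_auth section is the setting this refers to in the .ini files; verify against your default.ini):

      import json
      import requests

      BASE = "http://127.0.0.1:5984"

      # Session cookies survive a browser restart once this is enabled.
      requests.put(BASE + "/_config/couch_httpd_auth/allow_persistent_cookies",
                   data=json.dumps("true"))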

  • OAuth secrets can now be stored in the users system database

    This is better for managing large numbers of users and tokens than the old, clumsy way of storing OAuth tokens in the configuration system.

  • Updated bundled erlang_oauth library to the latest version

    The Erlang library that handles OAuth authentication has been updated to the latest version.

Build System

  • cURL is no longer required to build CouchDB as it is only required by the command line JavaScript test runner

    This makes building CouchDB on certain platforms easier.

HTTP API

  • Added a data_size property to database and view group information URIs

    With this you can now calculate how much actual data is stored in a database file or view index file and compare it with the file size that is already being reported. The difference is CouchDB-specific overhead, most of which can be reclaimed during compaction. This is used to power the automatic compaction feature (see below).
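
    For example, a small sketch of reading the new field and estimating how much of a database file could be reclaimed by compaction (the database name is a placeholder; disk_size is the file size already being reported):

      import requests

      BASE = "http://127.0.0.1:5984"

      info = requests.get(BASE + "/mydb").json()
      data_size = info["data_size"]   # bytes of live data
      disk_size = info["disk_size"]   # size of the file on disk

      overhead = disk_size - data_size
      print("reclaimable overhead: %d bytes (%.1f%%)" % (overhead, 100.0 * overhead / disk_size))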

  • Added optional field since_seq to replication objects/documents

    This allows you to start a replication from a certain database update sequence instead of from the start.
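
    A minimal sketch of a replication request using the new field (database names and the sequence number are placeholders):

      import requests

      BASE = "http://127.0.0.1:5984"

      # Replicate starting from update sequence 1000 instead of from the beginning.
      resp = requests.post(BASE + "/_replicate", json={
          "source": "http://example.com:5984/source_db",
          "target": "target_db",
          "since_seq": 1000,
      })
      print(resp.json())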

  • The _active_tasks API now exposes more granular fields for each task type

    The replication and compaction tasks, for example, report their progress in the task info.
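
    A small sketch of polling the endpoint and printing per-task details (type and progress are common to most task types; the other fields vary by type):

      import requests

      BASE = "http://127.0.0.1:5984"

      for task in requests.get(BASE + "/_active_tasks").json():
          # 'progress' is a percentage; replication and compaction tasks add
          # their own type-specific fields on top of these.
          print(task.get("type"), task.get("database"), task.get("progress"))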

  • Added built-in changes feed filter _view

    With this you can use a view’s map function as a changes filter instead of duplicating it in a separate filter function.
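
    Usage might look like the following sketch, where the view parameter names a map function in a design document (the database, design document and view names are placeholders):

      import requests

      BASE = "http://127.0.0.1:5984"

      # Only changes for documents emitted by the by_type view in _design/app
      # show up on the feed.
      resp = requests.get(BASE + "/mydb/_changes",
                          params={"filter": "_view", "view": "app/by_type"})
      print(resp.json())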

Core Storage

  • Added support for automatic compaction

    This feature is disabled by default, but it can be enabled in the configuration page in Futon or the .ini files.

    Compaction is a regular maintenance task for CouchDB. This can now be automated based on multiple variables:

    • A threshold for the ratio between a file’s actual data (data_size) and its size on disk (say 70%)
    • A time window specified in hours and minutes (e.g. 01:00-05:00)

    Compaction is cancelled if it runs past the end of the time window. Compaction for views and databases can be set to run in parallel, but that is only useful for setups where the database directory and the view directory are on different disks.

    In addition, if there’s not enough space (2 × data_size) on the disk to complete a compaction, an error is logged and the compaction is not started.
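
    As a hedged sketch, a default rule combining both triggers could be set through the configuration API (the compactions section and the Erlang-style rule string follow the commented examples in 1.2’s default.ini; verify the exact syntax against your installation):

      import json
      import requests

      BASE = "http://127.0.0.1:5984"

      # Compact any database whose file is less than 70% live data (and its views
      # at 60%), but only between 01:00 and 05:00, compacting views in parallel.
      rule = ('[{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, '
              '{from, "01:00"}, {to, "05:00"}, {parallel_view_compaction, true}]')
      requests.put(BASE + "/_config/compactions/_default", data=json.dumps(rule))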

Replicator

  • A new replicator implementation that offers more performance and configuration options

    The replicator has been rewritten from scratch. The new implementation is more reliable, faster and has more configuration options than the previous implementation. If you have had any issues with replication in previous releases, we strongly recommend giving 1.2.0 a spin.

    Configuration options include:

    • Number of worker processes
    • Batch size per worker
    • Maximum number of HTTP connections
    • Number of connection retries

    See default.ini for the full list of options and their default values.

    This allows you to fine-tune replication behaviour tailored to your environment. A spotty mobile network connection can benefit from a single worker process and small batch sizes to reliably, albeit slowly, synchronise data. A full-duplex 10GigE server-to-server connection on a LAN can benefit from more workers and higher batch sizes. The exact values depend on your particular setup and we recommend some experimentation before settling on a set of values.
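
    A hedged sketch of tuning a single replication for a fast LAN link (the option names mirror the entries in default.ini’s replicator section, assuming they can also be set per replication request; the values and database names are illustrative):

      import requests

      BASE = "http://127.0.0.1:5984"

      resp = requests.post(BASE + "/_replicate", json={
          "source": "http://example.com:5984/source_db",
          "target": "target_db",
          "continuous": True,
          "worker_processes": 8,        # number of worker processes
          "worker_batch_size": 1000,    # documents per batch, per worker
          "http_connections": 40,       # maximum number of HTTP connections
          "retries_per_request": 5,     # connection retries before giving up
      })
      print(resp.json())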

Futon

  • Futon’s Status screen (active tasks) now displays two new task status fields: Started on and Updated on
  • Simpler replication cancellation

    Running replications can now be cancelled with a single click.

Log System

  • Log correct stack trace in all cases

    In certain error cases, CouchDB would return a stack trace from the log system itself and hide the real error. Now CouchDB always returns the correct error.

  • Improvements to log messages for file-related errors

    CouchDB requires correct permissions for a number of files. Error messages related to file permission errors were not always obvious and are now improved.

Various Bugfixes

  • Fixed old index file descriptor leaks after a view cleanup
  • Fixed the _changes feed heartbeat option when combined with a filter; the bug affected continuous pull replications that use a filter
  • Fix use of OAuth with VHosts and URL rewriting
  • The requested_path property of query server request objects now has the path requested by clients before VHosts and rewriting
  • Fixed incorrect reduce query results when using pagination parameters
  • Made icu_driver work with Erlang R15B and later
  • Improvements to the build system and etap test suite
  • Avoid invalidating view indexes when running out of file descriptors

Breaking Changes

This release contains breaking changes:

http://wiki.apache.org/couchdb/Breaking_changes

It is very important that you understand these changes before you upgrade.

Couchbase Survey Shows Accelerated Adoption of NoSQL in 2012

Couchbase today announced the results of an industry survey conducted in December that shows growing adoption of NoSQL in 2012. According to the survey, the majority of the more than 1,300 respondents will fund NoSQL projects in the coming year, saying the technology is becoming more important or critical to their company’s daily operations. Respondents also indicated that the lack of flexibility/rigid schemas associated with relational technology was a primary driver toward NoSQL adoption.

You can read the results of the survey here, as well as some surprises from the survey on this page.

NoSQL 2012 Survey Highlights

Key data points from the Couchbase NoSQL survey include:

  • Nearly half of the more than 1,300 respondents indicated they have funded NoSQL projects in the first half of this year. In companies with more than 250 developers, nearly 70% will fund NoSQL projects over the course of 2012.
  • 49% cited rigid schemas as the primary driver for their migration from relational to NoSQL database technology. Lack of scalability and high latency/low performance also ranked highly among the reasons given for migrating to NoSQL  (see chart below for more details).
  • 40% overall say that NoSQL is very important or critical to their daily operations, with another 37% indicating it is becoming more important.

[Chart: reasons cited for migrating from relational to NoSQL]

Surprises from the Survey

Language mix. A common theme in the results was what one could interpret as the “mainstreaming” of NoSQL database technology. The languages being used to build applications atop NoSQL database technology, while they include a variety of more progressive choices, are dominated by the mundane: Java and C#. And while we’ve had a lot of anecdotal interest in a pure C driver for Couchbase (which we now have, by the way), only 2.1% of the respondents indicated it was the “most widely used” language for application development in their environment, behind Java, C#, PHP, Ruby, Python and Perl (in order).

Schema management is the #1 pain driving NoSQL adoption. So I’ll admit that I wasn’t actually surprised by this one, because I’d already been surprised by it earlier. Two years ago if you had asked me what the biggest need we were addressing was, I would have said it was the need for a “scale-out” solution at the data layer versus the “scale-up” nature of the relational model. That users wanted a database that scaled like their application tier – just throw more cheap servers behind a load balancer as capacity needs increase. While that is still clearly important, the survey results confirmed what I’d been hearing (to my initial surprise) from users: the flexibility to store whatever you want in the database and to change your mind, without the requirement to declare or manage a schema, is more important.

The Time For NoSQL Is Now

“The Time For NoSQL Is Now” is the conclusion of a recent article by Andrew C. Oliver, who has been working with Oracle since 1998 and who once thought Oracle was magical.


From his long experience he also shares the following conclusions:

  • A NoSQL solution requires less transformation, which brings:
    • better system performance
    • more automatic scalability
    • fewer bugs
  • In the end, applications almost always bottleneck on the DB
  • I truly think this is one technology that isn’t just marketing


You can read the full article here:

http://osintegrators.com/node/76


Submitted by acoliver on Wed, 01/25/2012 – 11:08

In 1998 I had my first hardcore introduction to Oracle as well as HP/UX. I’d been working with SQL Server. I thought Oracle was magical. Unlike SQL Server there were so many knobs to turn! I could balance the crap out of the table spaces and rollback segments and temp segments. Nothing would ever have to be contending for IO on the same disk again (if only someone would buy enough disks)! HP/UX oddly led me to Linux. I had a graphical workstation and a Windows machine on my desk. The graphical workstation rarely remained in working order. A co-worker who had quit Red Hat, (before the IPO!) to join me at Ericsson USA, introduced me to Red Hat Linux 5.0. The great thing about it was that the X-Windows implementation allowed me to connect to the HP/UX box’s X-Windows with no problems. Eventually, I came to prefer Unix and Linux to Windows because they never seemed to crash and on relatively old hardware they performed better.

At some point, due to turnover, I became the final person around who knew how to care and feed Oracle. I knew what all the mystical ORA- numbers meant. Oracle was far better/faster for reporting than SQL Server. However, in normal business applications, not only was SQL Server cheaper but it didn’t need constant attention, care and feeding. Over the next 10 years, I watched the industry homogenize. Sure, SQL Server is around and far better than the one I used, but every large company I’ve worked with has an Oracle site license. Even the “Microsoft Shop” tends to have Oracle around for the big stuff.

I’ve made hundreds of thousands if not a million dollars tuning Oracle, caring for Oracle and scaling Oracle. I’ve written ETL processes, I’ve tuned queries, I’ve done PL/SQL procedures that call into other systems and even into caches to squeak a little more performance. I’ve scaled systems with millions of users on Oracle. The trouble with Oracle is that it is both frequently misused and that it really isn’t very scalable. I’m not saying you can’t make it scale, just saying it is expensive to do (the license costs are the tip of the iceberg) and that it is difficult work to do so.

The problem is not transactions so much as the relational model combined with the process model. You can work around these with flatter table structures and by using different instances for different data (think of the lines for voting A-N, O-R, S-Z), but that is work! Oracle RAC sorta makes things better on the read side, but with all new headaches and often with worse write performance. You can add a cache like Gemfire or Infinispan, and indeed these are good solutions for the interim where you can’t fix the Oracle. However, fundamentally you’re trying to fix a broken foundation with struts and support beams. You can shift the weight, but really you’re not addressing the actual problem.

NoSQL databases, especially document databases like MongoDB or Couch, structure the data the way your system uses it. They shard the data automatically, they store your actual JSON structure without all of the transformation like Hibernate or JPA. The less transformation the better the performance of the system, the more automatic the scalability concerns, the fewer bugs! I don’t usually jump on bandwagons: the new language of the week (all compile to the same thing, none have significant productivity gains when you have a team size of more than say 2), whooo touchpads (I want a keyboard), filesystem of the day (I kept ext3 until I switched to an SSD drive), however, this particular bandwagon truly is a game-changer and completely necessary for cloud/Internet scale applications.

Yes SQL databases are completely fine for a departmental IT system. While Open Software Integrators does a lot of varied work with applications that don’t need to handle 8 million users, I don’t personally work on those applications. Most of the systems I work on are at least dealing with thousands to hundreds of thousands and in some cases millions of users. In the end, applications almost always bottleneck on the DB and most of what I do for customers is actually DB consulting in the end. The transition to NoSQL databases will take time. We still don’t have TOAD, Crystal Reports, query language standardization and other essential tools needed for mass adoption. There will be missteps (i.e. I may need a different type of database for reporting than for my operational system), but I truly think this is one technology that isn’t just marketing.


The Future of CouchDB, by Damien Katz

Damien Katz, CTO and creator at Couchbase, has just published on his blog his view of the future of CouchDB, answering the question “What’s the future of CouchDB?”

  • The future is Couchbase
  • More and more of the core database in C/C++
  • Bringing forth UnQL

A very interesting post which deserves to be read in full at this address.

TouchDB 1.0 is out

TouchDB is a lightweight CouchDB-compatible database engine suitable for embedding into mobile apps. Think of it this way: If CouchDB is MySQL, then TouchDB is SQLite.

By “CouchDB-compatible” I mean that it can replicate with CouchDB and Couchbase Server, and that its data model and high-level design are “Couch-like” enough to make it familiar to CouchDB/Couchbase developers. Its API will not be identical and it may not support some CouchDB features (like user accounts) that aren’t useful in mobile apps. Its implementation is not based on CouchDB’s (it’s not even written in Erlang.) It does support replication to and from CouchDB.

By “suitable for embedding into mobile apps“, I mean that it meets the following requirements:

  • Small code size; ideally less than 256kbytes. (Code size is important to mobile apps, which are often downloaded over cell networks.)
  • Quick startup time on relatively-slow CPUs; ideally 100ms or less.
  • Low memory usage with typical mobile data-sets. The expectation is the number of documents will not be huge, although there may be sizable multimedia attachments.
  • “Good enough” performance with these CPUs and data-sets.

And by “mobile apps” I’m focusing on iOS and Android, although there’s no reason we couldn’t extend this to other platforms like Windows Phone. And it’s not limited to mobile OSs — the initial Objective-C implementation runs on Mac OS as well.

Google Insights: the NoSQL fights

Google Insights provides Web Search Interest for the following NoSQL solutions: cassandra, redis, mongodb, hadoop, couchdb.

No clear leader emerges from these insights, available here.

NoSQL skills around the world