Postgres Outperforms MongoDB and Ushers in New Developer Reality

According to EnterpriseDB’s recent benchmark, Postgres outperforms MongoDB and ushers in a new developer reality.

Postgres not only outperforms MongoDB on speed; its data size requirement is also approximately 25% smaller.

EDB found that Postgres outperforms MongoDB in selecting, loading and inserting complex document data in key workloads involving 50 million records:

  • Ingestion of high volumes of data was approximately 2.1 times faster in Postgres
  • MongoDB consumed 33% more disk space
  • Data inserts took almost 3 times longer in MongoDB
  • Data selection took more than 2.5 times longer in MongoDB than in Postgres
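
The "approximately 25%" and "33% more disk space" figures can look inconsistent at first sight. A minimal sketch of the arithmetic, assuming both numbers describe the same on-disk measurement (the benchmark summary above does not say so explicitly):

```python
# Assumption: both figures describe the same on-disk measurement.
mongo_over_postgres = 1.33  # MongoDB used 33% more disk space than Postgres

# Postgres size expressed as a fraction of the MongoDB size.
postgres_fraction = 1 / mongo_over_postgres
postgres_saving_pct = (1 - postgres_fraction) * 100

print(f"Postgres needs about {postgres_saving_pct:.0f}% less space than MongoDB")
```

Under that assumption the two figures are the same result seen from each side.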

Find the full article here

The benchmark tool is available on GitHub: https://github.com/EnterpriseDB/pg_nosql_benchmark

IBM’s new Power8 chip technology unveiled

IBM Unveils Power8 Chip As Open Hardware. Google and other OpenPower Foundation partners have expressed interest in IBM’s Power8 chip designs and server motherboard specs, since Power8 has been designed with some specific big-data handling characteristics. It is, for example, an eight-threaded processor, meaning each of the 12 cores in a CPU will coordinate the processing of eight sets of instructions at a time, for a total of 96 processes. Here a “process” is to be understood as a set of related instructions making up a discrete process within a program. By designating sections of an application that can run as a process and coordinating the results, a chip can accomplish more work than a single-threaded chip.

By licensing technology to partners, IBM is borrowing a tactic used by ARM in the market for chips used in smartphones and tablets. But the company faces an uphill battle.

More information:

http://openpowerfoundation.org/

http://bits.blogs.nytimes.com/

Switch your databases to SSD Storage

http://highscalability.com/blog/2012/12/10/switch-your-databases-to-flash-storage-now-or-youre-doing-it.html

 

Switch Your Databases To Flash Storage. Now. Or You’re Doing It Wrong.

The economics of flash memory are staggering. If you’re not using SSD, you are doing it wrong. 

Some small applications fit entirely in memory – less than 100GB – great for in-memory solutions… If you have a dataset under 10TB, and you’re still using rotational drives, you’re doing it wrong.

With The Right Database, Your Bottleneck Is The Network Driver, Not Flash

Networks are measured in bandwidth (throughput), but if your access patterns are random and low latency is required, each request is an individual network packet. Even with the improvements in Linux network processing, we find an individual core is capable of resolving about 100,000 packets per second through the Linux core.

100,000 packets per second aligns well with the capability of flash storage at about 20,000 to 50,000 per device, and adding 4 to 10 devices fits well in current chassis. RAM is faster – in Aerospike, we can easily go past 5,000,000 TPS in main memory if we remove the network bottleneck through batching – but for most applications, batching can’t be cleanly applied.

This bottleneck still exists with high-bandwidth networks, since the bottleneck is the processing of network interrupts. As multi-queue network cards become more prevalent (not available today on many cloud servers, such as the Amazon High I/O Instances), this bottleneck will ease – and don’t think switching to UDP will help. Our experiments show TCP is 40% more efficient than UDP for small transaction use cases.

 

The Top Myths Of Flash

1. Flash is too expensive.

Flash is 10x more expensive than rotational disk. However, you’ll make up the few thousand dollars you’re spending simply by saving the cost of the meetings to discuss the schema optimizations you’ll need to try to keep your database together. Flash goes so fast that you’ll spend less time agonizing about optimizations.

2. I don’t know which flash drives are good.

Aerospike can help. We have developed and open-sourced a tool (Aerospike Certification Tool) that benchmarks drives for real-time use cases, and we’re providing our measurements for old drives. You can run these benchmarks yourself, and see which drives are best for real-time use.

3. They wear out and lose my data.

Wear patterns and flash are an issue, although rotational drives fail too. There are several answers. When a flash drive fails, you can still read the data. With a clustered database and multiple copies of the data, you gain reliability – a server level of RAID. As drives fail, you replace them. Importantly, new flash technology is available every year with higher durability, such as this year’s Intel S3700, which claims each drive can be rewritten 10 times a day for 5 years before failure. Next year may bring another generation of reliability. With a clustered solution, simply upgrade drives on machines while the cluster is online.
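
To put the "10 times a day for 5 years" claim in perspective, here is a rough sketch of the arithmetic. The drive capacity is a hypothetical value chosen for illustration, not a figure from the article:

```python
DRIVE_WRITES_PER_DAY = 10  # from the text: claimed full rewrites per day
YEARS = 5                  # from the text: claimed lifetime
CAPACITY_GB = 400          # hypothetical drive size, for illustration only

# Number of times the whole drive can be rewritten over its claimed lifetime.
full_drive_writes = DRIVE_WRITES_PER_DAY * 365 * YEARS

# Total volume written, in petabytes (decimal units: 1 PB = 1,000,000 GB).
total_written_pb = full_drive_writes * CAPACITY_GB / 1_000_000

print(f"{full_drive_writes} full-drive writes, {total_written_pb:.1f} PB written")
```

Comparing that total with your own daily write volume gives a first estimate of how long a drive would last under your workload.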

4. I need the speed of in-memory.

Many NoSQL databases will tell you that the only path to speed is in-memory. While in-memory is faster, a database optimized for flash using the techniques below can provide millions of transactions per second with latencies under a millisecond.

Techniques For Flash Optimization

Many projects work with main memory because the developers don’t know how to unleash flash’s performance. Relational databases only speed up 2x or 3x when put on a storage layer that supports 20x more I/Os. Following are three programming techniques to significantly improve performance with flash.

1. Go parallel with multiple threads and/or AIO

Different SSD drives have different controller architectures, but in every case there are multiple controllers and multiple memory banks—or the logical equivalent. Unlike a rotational drive, the core underlying technology is parallel.

You can benchmark the amount of parallelism where particular flash devices perform optimally with ACT, but we find the sweet spot is north of 8 parallel requests, and south of 64 parallel requests per device. Make sure your code can cheaply queue hundreds, if not thousands, of outstanding requests to the storage tier. If you are using your language’s asynchronous event mechanism (such as a co-routine dispatch system), make sure it is efficiently mapping to an OS primitive like asynchronous I/O, not spawning threads and waiting.
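
A minimal sketch of keeping several requests in flight against one device, using a thread pool rather than AIO. The block size, the queue depth and the helper names are assumptions made for this example; it reads an ordinary file through the file system on a Unix-like OS, so it only illustrates the request pattern, not raw-device performance:

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from typing import List

BLOCK_SIZE = 4096   # hypothetical read size
QUEUE_DEPTH = 16    # between the 8 and 64 parallel requests cited above


def read_block(fd: int, block_index: int) -> bytes:
    # os.pread takes an explicit offset, so threads can share one descriptor
    # without racing on a common file position.
    return os.pread(fd, BLOCK_SIZE, block_index * BLOCK_SIZE)


def parallel_read(path: str, block_indexes: List[int]) -> List[bytes]:
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=QUEUE_DEPTH) as pool:
            # map() returns results in the same order as block_indexes.
            return list(pool.map(lambda i: read_block(fd, i), block_indexes))
    finally:
        os.close(fd)


def make_demo_file(num_blocks: int) -> str:
    # Block i is filled with the byte value i % 256 so reads are easy to verify.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        for i in range(num_blocks):
            f.write(bytes([i % 256]) * BLOCK_SIZE)
        return f.name
```

For example, `parallel_read(make_demo_file(32), [5, 0, 31, 7])` returns the four requested blocks in request order. Tuning `QUEUE_DEPTH` per device is exactly what a benchmark such as ACT is meant to inform.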

2. Don’t use someone else’s file system

File systems are very generic beasts. As databases, with their own key-value syntax and interfaces, they have been optimized for a particular use, such as multiple names for one object and hierarchical names. The POSIX file system interface supplies only one consistency guarantee. To run at the speed of flash, you have to remove the bottleneck of existing file systems.

Cassandra 1.0 unleashed high performance improvements

According to the latest performance blog post from the Cassandra developer center, the 1.0 release of Cassandra delivered the following performance improvements:

  • Read performance increased by 400%!
  • Write performance increased by 40%
  • Network/non-local operations are 15% faster

 

The full article is available here:

http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-performance

Solving your performance issue when dealing with big data

  1. Foresee and understand your performance issue

When dealing with big data, you will face performance problems with even the simplest and most basic operations as soon as the processing requires the whole data set to be analyzed.

This is the case, for instance, when:

  • You aggregate data to deliver summary statistics: actions such as “count”, “min”, “avg”, etc.
  • You need to sort your data

With this in mind, you can quickly and easily anticipate issues and start thinking about how to solve the problem.

  2. Solving the performance issue using technical tools

Compression is often a key solution to many performance issues. It costs CPU time, but CPUs are currently much faster than disk and network I/O, so compression speeds up disk access and data transfer over the network, and eventually lets you keep the reduced data in memory.
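
As a rough illustration of the trade-off, the sketch below compresses a repetitive payload with Python's standard zlib module. The payload is a made-up stand-in for real data, and real data sets will compress less well:

```python
import zlib

# Hypothetical repetitive log-style payload, standing in for real data.
raw = b"2011-06-01 12:00:00 INFO request served in 12 ms\n" * 10_000

# Compression level 1 favours speed over ratio, which suits the
# "cheap CPU, expensive I/O" argument made above.
compressed = zlib.compress(raw, 1)
ratio = len(raw) / len(compressed)

# The round trip is lossless.
restored = zlib.decompress(compressed)

print(f"{len(raw)} bytes -> {len(compressed)} bytes (about {ratio:.0f}x smaller)")
```

Fewer bytes to move means fewer disk reads and network packets, at the price of some CPU time on each side.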

Statistics can often speed up your algorithm and are not necessarily complex: maintaining value ranges (min, max) or value distributions can shorten your processing path.
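
One way to use such ranges, sketched here under the assumption that the data is stored in blocks and that a (min, max) pair is recorded for each block at load time. The `Block` type and the function names are hypothetical:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    values: List[int]
    lo: int  # smallest value in the block, recorded at load time
    hi: int  # largest value in the block, recorded at load time


def make_block(values: List[int]) -> Block:
    return Block(values=values, lo=min(values), hi=max(values))


def count_in_range(blocks: List[Block], low: int, high: int) -> Tuple[int, int]:
    """Count values in [low, high]; return (count, blocks actually scanned)."""
    count = scanned = 0
    for block in blocks:
        if block.hi < low or block.lo > high:
            continue  # the (min, max) range proves no value can match
        scanned += 1
        count += sum(low <= v <= high for v in block.values)
    return count, scanned


# Ten blocks holding 0..99, 100..199, and so on up to 900..999.
blocks = [make_block(list(range(s, s + 100))) for s in range(0, 1000, 100)]
count, scanned = count_in_range(blocks, 250, 260)
print(count, scanned)
```

Here only one of the ten blocks has to be read to answer the query; the other nine are skipped on the strength of their statistics alone.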

Caching: deterministic results can be cached when they come from processes that are independent of the data, or that depend on data which rarely changes and which you can assume won’t change while your process runs.
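
A small sketch of that idea using `functools.lru_cache` from the Python standard library. The lookup function and its reference table are hypothetical stand-ins for a costly query on data that rarely changes:

```python
from functools import lru_cache

calls = 0  # counts how often the expensive path really runs


@lru_cache(maxsize=None)
def country_for_prefix(prefix: str) -> str:
    """Hypothetical slow lookup on reference data that rarely changes."""
    global calls
    calls += 1
    table = {"33": "FR", "44": "GB", "49": "DE"}  # stand-in for a costly query
    return table.get(prefix, "??")


records = ["33", "44", "33", "33", "49", "44"]
countries = [country_for_prefix(p) for p in records]

print(countries, calls)
```

Six records are resolved with only three real lookups. The cache is only safe as long as the assumption holds that the reference data does not change during the run.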

Avoid data type conversions, because they always consume resources.

Balance the load: parallelize the processing and use MapReduce :)
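
The shape of a map/reduce computation can be sketched in a few lines on a single machine. This is only an illustration of the pattern with made-up input and hypothetical function names: a real job would distribute the chunks across processes or a cluster (for example with Hadoop) rather than across threads:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(lines):
    """Map step: count words within one chunk, independently of the others."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial results."""
    return left + right


def word_count(lines, chunk_size=2, workers=4):
    # Split the input into chunks that can be processed independently.
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(reduce_counts, partials, Counter())


lines = ["big data", "big problems", "data data", "small data"]
totals = word_count(lines)
print(totals)
```

Because each chunk is mapped without knowledge of the others, the load can be balanced simply by handing chunks to whichever worker is free.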

 

  3. Solving the performance issue by giving up or resigning

We tend to refuse such an approach, but sometimes it is a good exercise to step back and review why we do the things we do.

Can I approximate without significantly altering the result?

Can I use a representative data sample instead of the whole data set?

At the very least, do not rule out this way of thinking: solving an easier problem or looking for an approximate result may bring you very close to the solution.
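
A quick sketch of the sampling question, using synthetic data generated for the example (the distribution, sizes and seed are all assumptions). It compares the mean of a 1% random sample with the mean of the whole data set:

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical full data set: 200,000 response times in milliseconds.
population = [random.gauss(200, 30) for _ in range(200_000)]

# A uniform random sample of 1% of the data.
sample = random.sample(population, 2_000)

exact_mean = statistics.fmean(population)
approx_mean = statistics.fmean(sample)
relative_error = abs(approx_mean - exact_mean) / exact_mean

print(f"exact={exact_mean:.2f} approx={approx_mean:.2f} error={relative_error:.2%}")
```

Whether such an error is acceptable depends on the question being asked, and a uniform random sample is only representative if the data has no structure that the sampling ignores.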

 

Think Stats: Probability and Statistics for Programmers

Most processing against large datasets aims to calculate statistics. The book “Think Stats: Probability and Statistics for Programmers” is available as an open PDF under the terms of the Creative Commons Attribution-NonCommercial 3.0 license.

 

Download thinkstats



				

Google reloaded, at least trying to

Google, the company that brought a revolution to the web, seems to be facing a critical time.

The truth, based on facts, is that the company currently faces multiple problems, such as:

  • The stock is falling, breaking the extraordinary trend that was part of the Google myth
  • There is criticism from the financial community since Larry Page took over the company
  • FTC antitrust probes are back
  • Apple has been reborn under Google’s reign
  • They failed to get “social” and now need to fight hard with Facebook (now first in display advertising in the US)
  • They have missed with Google TV so far, but this fight is still open
  • Rolling out the people widget feature on Gmail is taking longer than expected (still not completed)
  • Technical issues, see Google: At scale everything breaks

 

Google knows it is in trouble, which is already a big step, and is trying to react. This reaction is interesting because it tries to bring back the keys to the company’s success:

  • Larry Page’s return means the initial company spirit (technical startup) is back, and the engineers should be backed up and strengthened in their choices
  • Restoring technical leadership: GFS, the Google File System at the very basis of their architecture, is being reviewed and enhanced. Google Panda is being updated to version 2.2, with an improved algorithm.

 

Performance: the Google approach

Performance has a real meaning at Google; we all know that, and the demonstration is made every day.

Let’s take a moment to learn from the Google approach in this matter.

Let’s start with a technical session by Brett Slatkin (Google), which introduces techniques you can use to improve your application’s performance:

http://sites.google.com/site/io/building-scalable-web-applications-with-google-app-engine

 

Once you have seen this video, you will understand that things are done differently at Google (in case you didn’t know).

Every engineer should also keep a list of performance figures in mind (figures from the Google App Engine perspective, but still generally applicable). Jeffrey Dean (Google) is responsible for this ‘Numbers Everyone Should Know’ list:

  • L1 cache reference 0.5 ns
  • Branch mispredict 5 ns
  • L2 cache reference 7 ns
  • Mutex lock/unlock 100 ns
  • Main memory reference 100 ns
  • Compress 1K bytes with Zippy 10,000 ns
  • Send 2K bytes over 1 Gbps network 20,000 ns
  • Read 1 MB sequentially from memory 250,000 ns
  • Round trip within same datacenter 500,000 ns
  • Disk seek 10,000,000 ns
  • Read 1 MB sequentially from network 10,000,000 ns
  • Read 1 MB sequentially from disk 30,000,000 ns
  • Send packet CA->Netherlands->CA 150,000,000 ns

 

The obvious things:

  • The scale of those differences: they are huge
  • Memory is the target for high performance, then the network, which is faster than disks
  • Writes are 40 times more expensive than reads
  • Compression must be cheap and applied prior to network transfer
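
The scale of the differences is easier to feel as ratios. The sketch below only uses figures copied from the ‘Numbers Everyone Should Know’ list; the dictionary keys and the helper name are choices made for this example. The list has no entry for write cost, so that point is not covered here:

```python
# Figures copied from the list above, in nanoseconds.
LATENCY_NS = {
    "main memory reference": 100,
    "read 1 MB sequentially from memory": 250_000,
    "disk seek": 10_000_000,
    "read 1 MB sequentially from network": 10_000_000,
    "read 1 MB sequentially from disk": 30_000_000,
}


def ratio(slow: str, fast: str) -> float:
    """How many times longer the slow operation takes than the fast one."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


disk_vs_network = ratio("read 1 MB sequentially from disk",
                        "read 1 MB sequentially from network")
network_vs_memory = ratio("read 1 MB sequentially from network",
                          "read 1 MB sequentially from memory")
seek_vs_memory_ref = ratio("disk seek", "main memory reference")

print(disk_vs_network, network_vs_memory, seek_vs_memory_ref)
```

Reading 1 MB from the network is three times faster than reading it from disk, but forty times slower than reading it from memory, and a single disk seek costs as much as one hundred thousand main memory references.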

Gratuitous Hadoop: Stress Testing on the Cheap

Hadoop Streaming is a powerful tool and there is a nice illustration on how cool it can be:

http://devblog.factual.com/hadoop-streaming-stress-testing

 

Facebook – Predictability is more important to scalability than performance

Quick reminder that predictability is more important to scalability than performance.

Facebook has recently explained their strategy in this matter, and gave a very interesting MySQL Tech Talk.