The economics of flash memory are staggering. If you’re not using SSD, you are doing it wrong.
Some small applications fit entirely in memory – less than 100GB – and are great candidates for in-memory solutions. But if you have a dataset under 10TB and you're still using rotational drives, you're doing it wrong.
Networks are measured in bandwidth (throughput), but if your access patterns are random and low latency is required, each request is an individual network packet. Even with the improvements in Linux network processing, we find an individual core is capable of resolving about 100,000 packets per second through the Linux network stack.
100,000 packets per second aligns well with the capability of flash storage, at about 20,000 to 50,000 operations per device, and adding 4 to 10 devices fits well in current chassis. RAM is faster – in Aerospike, we can easily go past 5,000,000 TPS in main memory if we remove the network bottleneck through batching – but for most applications, batching can't be cleanly applied.
This bottleneck still exists with high-bandwidth networks, since the bottleneck is the processing of network interrupts. As multi-queue network cards become more prevalent (not available today on many cloud servers, such as the Amazon High I/O Instances), this bottleneck will ease – and don’t think switching to UDP will help. Our experiments show TCP is 40% more efficient than UDP for small transaction use cases.
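The batching mentioned above works because the per-request cost is dominated by packet processing, not payload size, so resolving many keys per round trip amortizes that fixed cost. A minimal sketch of the idea (the `FakeServer` class and its `multi_get` method are invented for illustration; this is not Aerospike's client API):

```python
class FakeServer:
    """Stands in for a remote store; counts round trips (packets)."""

    def __init__(self, data):
        self.data = data
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1              # one packet per key
        return self.data.get(key)

    def multi_get(self, keys):
        self.round_trips += 1              # one round trip for the whole batch
        return {k: self.data.get(k) for k in keys}


server = FakeServer({f"k{i}": i for i in range(100)})
keys = [f"k{i}" for i in range(100)]

one_by_one = [server.get(k) for k in keys]   # 100 round trips
after_singles = server.round_trips
batched = server.multi_get(keys)             # 1 round trip
print(after_singles, server.round_trips - after_singles)  # 100 1
```

Same answers either way; the batched path just pays the per-packet interrupt cost once instead of a hundred times. The catch, as noted, is that most request streams don't arrive conveniently pre-grouped.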
1. Flash is too expensive.
Flash is 10x more expensive than rotational disk. However, you’ll make up the few thousand dollars you’re spending simply by saving the cost of the meetings to discuss the schema optimizations you’ll need to try to keep your database together. Flash goes so fast that you’ll spend less time agonizing about optimizations.
2. I don’t know which flash drives are good.
Aerospike can help. We have developed and open-sourced a tool (Aerospike Certification Tool) that benchmarks drives for real-time use cases, and we're providing our measurements for older drives. You can run these benchmarks yourself and see which drives are best for real-time use.
3. They wear out and lose my data.
Flash wear is a real issue, although rotational drives fail too. There are several answers. When a flash drive fails, you can still read the data. With a clustered database and multiple copies of the data, you gain reliability – a server-level RAID. As drives fail, you replace them. Importantly, new flash technology arrives every year with higher durability, such as this year's Intel S3700, which claims each drive can be rewritten 10 times a day for 5 years before failure. Next year may bring another generation of reliability. With a clustered solution, you simply upgrade drives on individual machines while the cluster stays online.
4. I need the speed of in-memory.
Many NoSQL databases will tell you that the only path to speed is in-memory. While in-memory is faster, a database optimized for flash using the techniques below can provide millions of transactions per second with latencies under a millisecond.
Many projects work with main memory because the developers don’t know how to unleash flash’s performance. Relational databases only speed up 2x or 3x when put on a storage layer that supports 20x more I/Os. Following are three programming techniques to significantly improve performance with flash.
1. Go parallel with multiple threads and/or AIO
Different SSD drives have different controller architectures, but in every case there are multiple controllers and multiple memory banks—or the logical equivalent. Unlike a rotational drive, the core underlying technology is parallel.
You can benchmark the amount of parallelism where particular flash devices perform optimally with ACT, but we find the sweet spot is north of 8 parallel requests, and south of 64 parallel requests per device. Make sure your code can cheaply queue hundreds, if not thousands, of outstanding requests to the storage tier. If you are using your language’s asynchronous event mechanism (such as a co-routine dispatch system), make sure it is efficiently mapping to an OS primitive like asynchronous I/O, not spawning threads and waiting.
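One portable way to keep dozens of requests in flight is a thread pool issuing positional reads with `os.pread`, which takes an explicit offset and so lets many threads share one descriptor without fighting over a file position. A sketch under assumed parameters (the 4KB record size and the helper names are made up for illustration; a Linux deployment would more likely use native AIO or io_uring):

```python
import os
from concurrent.futures import ThreadPoolExecutor

RECORD_SIZE = 4096  # assumed fixed record size, aligned to a flash page


def read_record(fd, record_no):
    # os.pread takes an explicit offset, so there is no shared file
    # position and many threads can keep reads in flight at once.
    return os.pread(fd, RECORD_SIZE, record_no * RECORD_SIZE)


def parallel_read(path, record_nos, depth=32):
    # depth in the 8-64 range matches the sweet spot described above.
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=depth) as pool:
            return list(pool.map(lambda n: read_record(fd, n), record_nos))
    finally:
        os.close(fd)
```

The point is the queue depth, not the threading library: however you dispatch, the device should see tens of outstanding requests, and your language's async layer should bottom out in a real OS primitive rather than a hidden thread-per-request.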
2. Don’t use someone else’s file system
File systems are very generic beasts. They have been optimized for uses a database doesn't need – multiple names for one object, hierarchical naming – while databases bring their own key-value syntax and interfaces, and the POSIX file system interface supplies only a single, limited consistency guarantee. To run at the speed of flash, you have to remove the bottleneck of existing file systems.
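One way to remove that bottleneck is to skip file-system naming entirely and treat storage as a flat array of fixed-size slots, addressing each record by hashing its key straight to a byte offset. The sketch below is illustrative only (the slot size, `put`/`get` names, and the single backing file standing in for a raw device are all assumptions; a real engine would open the device with O_DIRECT and handle collisions, variable sizes, and defragmentation):

```python
import os
import zlib

SLOT_SIZE = 512    # assumed fixed slot per record
NUM_SLOTS = 1024   # assumed device capacity in slots


def slot_offset(key: bytes) -> int:
    # Hash the key directly to an offset: no directories, no file
    # names, no per-write metadata updates.
    return (zlib.crc32(key) % NUM_SLOTS) * SLOT_SIZE


def put(fd, key: bytes, value: bytes) -> None:
    assert len(value) <= SLOT_SIZE
    os.pwrite(fd, value.ljust(SLOT_SIZE, b"\0"), slot_offset(key))


def get(fd, key: bytes) -> bytes:
    return os.pread(fd, SLOT_SIZE, slot_offset(key)).rstrip(b"\0")
```

Every operation is one positional read or write at a computed offset; the kernel's directory lookups, inode updates, and page-cache heuristics never enter the picture.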
According to the latest performance post on the Cassandra developer center blog, the 1.0 release of Cassandra unleashed the following performance improvements:
The full article is available here:
When dealing with big data, you will face performance problems with even the most simple and basic operations as soon as the processing requires the whole dataset to be analyzed.
This is the case, for instance, when:
With this in mind, you can anticipate issues early and start thinking about how to solve the problem.
Compression is often a key solution to performance issues: it spends CPU, which remains far faster than disk I/O and network I/O, so compression speeds up disk access and data transfer over the network, and can even let you keep the reduced data in memory.
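The CPU-for-I/O trade is easy to see with the standard library's `zlib` on repetitive data (the log-like payload below is invented for illustration):

```python
import zlib

# Repetitive data, as log files and column stores tend to be.
payload = b"2024-01-01 GET /index.html 200\n" * 2000

compressed = zlib.compress(payload, level=6)
ratio = len(compressed) / len(payload)

# Far fewer bytes now cross the disk or the network, for a modest
# CPU cost, and the round trip is lossless.
assert zlib.decompress(compressed) == payload
print(f"{len(payload)} -> {len(compressed)} bytes ({ratio:.1%})")
```

The more repetitive the data, the better the trade; on already-compressed or random payloads the CPU cost buys you nothing, so measure on your own data.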
Statistics can often be applied to speed up your algorithms, and they are not necessarily complex: maintaining a value range (min, max) or a value distribution might shorten your processing resolution path.
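A concrete form of the (min, max) idea is a per-block summary that lets a range scan skip whole blocks without reading their rows (a hypothetical sketch; the block layout and function names are assumptions, but the principle is the same one column stores use for zone maps):

```python
def build_blocks(values, block_size=4):
    # Store values in blocks, each carrying a (min, max) summary.
    blocks = []
    for i in range(0, len(values), block_size):
        chunk = values[i:i + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return blocks


def range_query(blocks, lo, hi):
    hits, scanned = [], 0
    for b in blocks:
        if b["max"] < lo or b["min"] > hi:
            continue                 # pruned: the summary alone rules it out
        scanned += 1
        hits.extend(v for v in b["rows"] if lo <= v <= hi)
    return hits, scanned


blocks = build_blocks(list(range(100)), block_size=10)
hits, scanned = range_query(blocks, 25, 34)
print(len(hits), scanned)  # 10 2  -- only 2 of 10 blocks were read
```

Two integers per block turned a full scan into a read of 20% of the data; the win grows with how well the data is clustered on the queried column.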
Caching applies when deterministic results are produced by a process independent of the data, or based on data that rarely changes and which you can assume won't change during your processing time.
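When the lookup is deterministic over data assumed stable for the run, Python's `functools.lru_cache` is the one-line version of this (the IP-prefix lookup below is a made-up stand-in for an expensive call):

```python
from functools import lru_cache

calls = {"n": 0}


@lru_cache(maxsize=None)
def country_of(ip_prefix: str) -> str:
    # Stands in for an expensive, deterministic lookup over data we
    # assume does not change while the job runs.
    calls["n"] += 1
    return {"81.2": "UK", "203.0": "AU"}.get(ip_prefix, "??")


for _ in range(1000):
    country_of("81.2")

print(calls["n"])  # 1: the expensive lookup ran once; 999 calls hit the cache
```

The assumption in the text is the important part: if the underlying data can change mid-run, you need an invalidation story before you can cache at all.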
Avoid data type conversions, because they always consume resources.
Balance the load, parallelize the processing, and use map-reduce :)
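The map-reduce shape is worth internalizing even without a cluster: the map step must treat each chunk independently, and the reduce step must merge partial results in any order, which is exactly what lets the work spread across machines. A minimal word-count sketch of the model:

```python
from collections import Counter
from functools import reduce


def map_chunk(chunk: str) -> Counter:
    # Map: each chunk is processed independently, so every chunk
    # could run on a different node.
    return Counter(chunk.split())


def reduce_counts(a: Counter, b: Counter) -> Counter:
    # Reduce: partial results merge pairwise, in any order.
    return a + b


chunks = ["to be or not to be", "to see or not to see"]
total = reduce(reduce_counts, map(map_chunk, chunks), Counter())
print(total["to"])  # 4
```

Swap the built-in `map` for a pool of workers (or a Hadoop cluster) and neither function changes: that independence is the whole point.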
We tend to resist such approaches, but sometimes it is a good exercise to step back and review why we do the things we do.
Can I approximate without significantly altering the result?
Can I use a representative sample of the data instead of the whole dataset?
At the very least, do not refuse to think this way: ask whether solving an easier problem or accepting an approximate result might bring you very close to the solution.
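Sampling is the simplest version of both questions at once: estimate a statistic from a small random sample instead of scanning everything (a sketch on synthetic data; the dataset and the 1% sample size are invented for illustration):

```python
import random

# Synthetic "full dataset": a million values cycling through 0..999.
population = [float(i % 1000) for i in range(1_000_000)]
true_mean = sum(population) / len(population)          # exactly 499.5

rng = random.Random(42)                                # fixed seed for repeatability
sample = rng.sample(population, 10_000)                # 1% of the data
estimate = sum(sample) / len(sample)

# The sample mean lands within a fraction of a percent of the exact
# answer, at 1% of the scanning cost.
assert abs(estimate - true_mean) < 25
```

Whether the error bound is acceptable is a business question, not a technical one, which is exactly why it pays to ask it out loud before building the exact pipeline.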
Most processing against large datasets ultimately computes statistics. The book “Think Stats: Probability and Statistics for Programmers” is an open PDF available under the terms of the Creative Commons Attribution-NonCommercial 3.0 license.
Google, the company that brought a revolution to the web, seems to be facing a critical time.
The truth, based on facts, is that the company currently faces multiple problems, such as:
Google knows it is in trouble – and that in itself is big – and is trying to react. The reaction is interesting, because it tries to bring back the company's original keys to success:
Performance has a real meaning at Google; we all know that, and the demonstration is made every day.
Let’s take a moment to learn from the Google approach in this matter.
Let’s first start with a technical session by Brett Slatkin (Google); it introduces techniques you can use to improve your application’s performance:
Once you have seen this video, you understand that things go differently at Google (in case you didn’t know).
Yet every engineer has a list of performance figures to keep in mind (figures from the Google App Engine perspective, but still generally applicable). Jeffrey Dean (Google) is responsible for this ‘Numbers Everyone Should Know’ list:
The obvious things:
Hadoop Streaming is a powerful tool, and there is a nice illustration of how cool it can be:
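What makes Hadoop Streaming cool is that a mapper and reducer are just programs reading lines on stdin and writing `key<TAB>value` lines on stdout, so you can dry-run the exact same logic locally as `cat input | mapper | sort | reducer`. A classic word-count pair sketched as plain functions (the wiring into two stdin/stdout scripts is left out for brevity):

```python
from itertools import groupby


def mapper(lines):
    # Emit "word\t1" for every word; Hadoop sorts these lines by key
    # between the map and reduce phases (locally, `sort` does it).
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_lines):
    # Input arrives grouped by key, exactly as sort leaves it.
    pairs = (line.rsplit("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(v) for _, v in group)}"


# Local dry run of the pipeline Hadoop would execute across a cluster:
result = list(reducer(sorted(mapper(["to be or not to be"]))))
```

Packaged as two small executables, the same pair runs unchanged under the Streaming jar on a real cluster; the only contract is lines in, tab-separated lines out.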
Quick reminder that predictability is more important to scalability than performance.
Facebook has recently explained their strategy in this matter in a very interesting MySQL Tech Talk.