Challenging the reliability of distributed data systems

There are two basic tasks that any computer system needs to accomplish:

  • storage
  • computation

Distributed systems let you solve, with multiple computers, the same problem you could solve on a single computer, usually because the problem no longer fits on a single computer.

Distributed systems need to partition data or state across many machines to scale. Adding machines increases the probability that some machine will fail, so these systems typically keep replicas or some other form of redundancy to tolerate failures.
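To put a number on "adding machines increases the probability that some machine will fail", here is a minimal sketch. It assumes every machine fails independently with the same probability over some period; the function name and the figures are illustrative, not measurements.

```python
def prob_any_failure(n_machines: int, p_fail: float) -> float:
    """Probability that at least one of n machines fails.

    Assumes each machine fails independently with probability p_fail
    over the period considered (an assumption made for illustration).
    """
    # All machines survive with probability (1 - p)^n; the complement
    # is the probability that at least one of them fails.
    return 1.0 - (1.0 - p_fail) ** n_machines


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(f"{n:>5} machines: {prob_any_failure(n, 0.01):.3f}")
```

With a 1% chance of failure per machine, a single machine rarely fails, while a fleet of a few hundred machines almost certainly sees at least one failure.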

Where is the flaw in such reasoning?

It is the assumption that failures are independent. If you pick up pieces of identical hardware, run them on the same network gear and power systems, have the same people run, manage and configure them, and run the same (buggy) software on all of them, it would be incredibly unlikely that the failures on these machines would be independent of one another in the probabilistic sense that motivates a lot of distributed infrastructure. If you see a bug on one machine, the same bug is on all the machines. When you push bad config, it is usually game over no matter how many machines you push it to.
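The effect of a shared cause can be sketched with a deliberately simple model. It assumes each replica fails independently with probability p_fail, and that a single common event (a bad config push, a shared software bug) takes out every replica at once with probability p_common. Both parameters and the function name are hypothetical.

```python
def prob_all_replicas_lost(replicas: int, p_fail: float, p_common: float = 0.0) -> float:
    """Probability of losing every replica at the same time.

    Simplified model (an assumption, not a measurement):
      - with probability p_common, a shared cause kills all replicas;
      - otherwise each replica fails independently with probability p_fail.
    """
    independent_loss = p_fail ** replicas
    return p_common + (1.0 - p_common) * independent_loss


if __name__ == "__main__":
    for r in (1, 2, 3, 5):
        naive = prob_all_replicas_lost(r, 0.01)
        correlated = prob_all_replicas_lost(r, 0.01, p_common=0.001)
        print(f"{r} replicas: independent={naive:.2e}  with common cause={correlated:.2e}")
```

Under the independence assumption, each extra replica divides the risk by a hundred. With a common cause in the picture, the risk never drops below p_common, however many replicas you add.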

Storage market

We have long been in a trend of ever cheaper storage capacity. The price per GB kept falling until the industry crisis of 2011-2012 (flooding in Thailand). Even if the trend continues, it seems that something has changed.

1) The market is no longer seeking cheaper storage capacity but faster storage. SSD technologies will change the storage market just as multi-core changed the CPU market, when power and heat issues forced chip makers in the early 2000s to move from single to multiple cores so that CPU design could keep pace with Moore’s Law.

2) Even the demand for storage capacity for personal files has eased. People no longer need to download their movies and music and store them locally. Instead they can subscribe to legal offerings such as Netflix or Spotify, or use services like Google Drive or Dropbox. Not only cheaper but also simpler to use, these services have changed the market for a long time to come.


1956, IBM’s hard drive ….10 MB



The recent trend illustrated

[Figures: cost per GB; historical vs. actual cost]



Parallel programming


What's going to change in the next 10 years?

“What’s going to change in the next 10 years?” is a very interesting question but a very common one.

A more important question is ‘What’s not going to change in the next 10 years?’ because you can build a business strategy around the things that are stable in time.

Amazon’s CEO, Jeff Bezos.

In the retail business for instance, we know that customers want low prices, and I know that’s going to be true 10 years from now.

They want fast delivery; they want vast selection. It’s impossible to imagine a future 10 years from now where a customer comes up and says, ‘Jeff I love Amazon; I just wish the prices were a little higher,’ [or] ‘I love Amazon; I just wish you’d deliver a little more slowly.’ Impossible. And so the effort we put into those things, spinning those things up, we know the energy we put into it today will still be paying off dividends for our customers 10 years from now. When you have something that you know is true, even over the long term, you can afford to put a lot of energy into it.”



Those principles applied to our IT world:

  • users want fast and reliable results
  • operators want a stable and secure platform
  • developers want to focus on architecture and algorithms, not on debugging the user interface
  • managers want investment and maintenance costs to go down

Definition: Availability

Availability = uptime / (uptime + downtime)

Availability from a technical perspective is mostly about being fault tolerant. Because the probability of a failure occurring increases with the number of components, the system should be able to compensate so as to not become less reliable as the number of components increases.
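A minimal sketch of how availability behaves as components are added, using the formula above. The two composition rules assume that components fail independently, which is exactly the assumption to be wary of in practice; the function names are illustrative.

```python
import math
from typing import Iterable


def availability(uptime: float, downtime: float) -> float:
    """Availability = uptime / (uptime + downtime), both in the same unit."""
    return uptime / (uptime + downtime)


def serial_availability(parts: Iterable[float]) -> float:
    """The system needs ALL components up (assumes independent failures)."""
    return math.prod(parts)


def redundant_availability(parts: Iterable[float]) -> float:
    """The system needs ANY ONE component up (assumes independent failures)."""
    return 1.0 - math.prod(1.0 - a for a in parts)


if __name__ == "__main__":
    print(availability(uptime=99, downtime=1))
    # Ten 99% components that must all be up: availability drops.
    print(round(serial_availability([0.99] * 10), 4))
    # Two 99% components where either one is enough: availability rises.
    print(round(redundant_availability([0.99, 0.99]), 4))
```

Chaining components lowers availability, while redundancy raises it. That is the compensation a fault-tolerant design relies on as the number of components grows.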

For example, the availability rate of a given service over an entire year translates into the following:

Availability %               How much downtime is allowed per year?
90% (“one nine”)             More than a month
99% (“two nines”)            Less than 4 days
99.9% (“three nines”)        Less than 9 hours
99.99% (“four nines”)        Less than an hour
99.999% (“five nines”)       ~ 5 minutes
99.9999% (“six nines”)       ~ 31 seconds
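The figures above can be recomputed with a short sketch. It assumes a 365-day year; the function name is illustrative.

```python
from datetime import timedelta

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a non-leap year


def allowed_downtime(availability_percent: float) -> timedelta:
    """Downtime budget per year for a given availability percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return timedelta(seconds=SECONDS_PER_YEAR * unavailable_fraction)


if __name__ == "__main__":
    for pct in (90, 99, 99.9, 99.99, 99.999, 99.9999):
        print(f"{pct}% -> {allowed_downtime(pct)}")
```

Each additional nine divides the yearly downtime budget by ten, which is why the last rows leave room for little more than a single restart.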

How users and programmers see each other


The Evolution of data storage


Happy 7th birthday MySQL bug #20786

[29 Jun 2006 22:31] Erik Kay reported the bug …..


[21 Jul 2011 1:27] John Swapceinski
I'm planning to throw a 5 year birthday party for this bug at the end of the month.
Who wants to come?  Anyone from MySQL/Sun/Oracle?  There will be cake.

[28 Jun 17:25] Ada Pascal
And now there has been cake:
Happy 7th birthday MySQL bug #20786!

Wikipedia Recent Changes Map

When an unregistered user edits Wikipedia, he or she is identified by his or her IP address. These IP addresses are translated to users’ approximate geographic location. Edits by registered users do not have associated IP information, so the map actually represents only a small portion of the total edit activity on Wikipedia.

Built using the Wikimedia RecentChanges IRC feed, broadcast through wikimon. Source available on GitHub.

Built by Stephen LaPorte and Mahmoud Hashemi.

Google will soon be worth more than Apple

Google is now worth more than Microsoft and will soon be worth more than Apple

GOOG:US  858.800 USD  13.080  1.55%

Google Glass is getting hyped and trashed at the same time, and it’s not even here yet. Meanwhile, Android’s marketplace dominance and Google’s nicely executed moves into mobile ads are contributing to the valuation. And of course, Microsoft is suffering as its tablet/smartphone offerings flounder and the PC business that it dominates shrinks.

It’s interesting to think through which company could be the next to reach the quarter-trillion valuation mark, which both Google and Microsoft recently shot past. Could Oracle, Cisco and Intel reach that plateau in the far future (or perhaps not so far, as in later this year)? They have each been worth that much before.