Stability Improvements in the Hypertable 0.9.5.0 pre-release

Details on stability improvements in the Hypertable 0.9.5.0 pre-release have been posted on the blog: http://blog.hypertable.com/

We recently announced the Hypertable 0.9.5.0 pre-release.  Even though we’ve labelled it a “pre” release, it is one of the biggest and most important Hypertable releases to date.  Among other things, it includes a complete re-write of the Master to fix some known stability problems.  It represents a significant amount of work, as can be seen from the following code change statistics:

  • 512 files changed
  • 30,633 line insertions
  • 14,354 line deletions

The following describes problems that existed in prior releases and how they were solved, and highlights other stability improvements included in the 0.9.5.0 pre-release.

Duplicate range load. In prior releases, when a Range Server decided to give up a range (e.g. after a split), it would inform the Master by calling the Master::move_range() method and then record the move in its meta log (RSML).  Unfortunately, this logic contained a race condition.  If the range server called Master::move_range() but died before it got a chance to record the move in the RSML, and then the Master was stopped (e.g. a sysadmin restart of the system), all record of the move was lost.  When the RangeServer came back up, it would re-attempt to move the range, causing it to get loaded by two different range servers.  With the introduction of the Master MetaLog (MML) and a two-phase Master::move_range() operation, this problem has been resolved.
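
To make the two-phase fix concrete, here is a minimal, self-contained sketch of the idea, assuming a simplified in-process model; apart from Master::move_range(), the class and method names are hypothetical and the real RSML/MML handling is far more involved.  The key point is that the Master persists the move to the MML before anything else depends on it, so a retried move is recognized as a duplicate even across crashes and restarts.

```cpp
// Minimal single-process sketch of a two-phase move_range.
// All names other than Master::move_range() are hypothetical; the real
// Hypertable RPC, RSML, and MML formats are considerably more involved.
#include <iostream>
#include <set>
#include <string>

// Stand-in for the Master MetaLog (MML): a durable record of moves in flight.
class MasterMetaLog {
  std::set<std::string> pending_moves_;
public:
  void record_move_started(const std::string &range) { pending_moves_.insert(range); }
  void record_move_complete(const std::string &range) { pending_moves_.erase(range); }
  bool move_in_progress(const std::string &range) const {
    return pending_moves_.count(range) > 0;
  }
};

class Master {
  MasterMetaLog mml_;
public:
  // Phase 1: the range server announces the move; the Master persists it
  // to the MML *before* replying, so the intent survives a restart.
  void move_range(const std::string &range) {
    if (mml_.move_in_progress(range)) {
      std::cout << "duplicate move_range(" << range << ") ignored\n";
      return;                      // already being relocated; don't load it twice
    }
    mml_.record_move_started(range);
    std::cout << "move of " << range << " recorded in MML\n";
  }
  // Phase 2: once the destination server has loaded the range, the move
  // is marked complete and removed from the MML.
  void relinquish_acknowledge(const std::string &range) {
    mml_.record_move_complete(range);
    std::cout << "move of " << range << " complete\n";
  }
};

int main() {
  Master master;
  master.move_range("Table[foo..bar]");   // range server gives up the range
  // Range server crashes before writing its RSML entry, comes back up,
  // and re-attempts the move; the MML entry turns the retry into a no-op.
  master.move_range("Table[foo..bar]");
  master.relinquish_acknowledge("Table[foo..bar]");
  return 0;
}
```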

Overlapping ranges. In prior releases, the Master would ask a range server to load a range by calling the RangeServer::load_range() method and would rely on the ALREADY_LOADED response code to handle situations where the acknowledgement was lost (e.g. the range server or Master died at an inopportune moment) and the RangeServer::load_range() call was re-issued.  This logic also contained a race condition.  When a range was loaded and the acknowledgement was lost, the loaded range could split before the Master re-attempted to load the range.  When the RangeServer::load_range() call was re-issued, the RangeServer happily loaded the range again because it no longer contained the range in its live set (due to the split).  With the introduction of a two-phase load range operation, this problem has been resolved.
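
The sketch below illustrates what a two-phase load buys us, again under a simplified, hypothetical model (only RangeServer::load_range() and the ALREADY_LOADED code come from the description above; everything else is illustrative): the range server keeps a staging record for a loaded range until the Master acknowledges it, so a re-issued load_range() is detected as a duplicate even if the range has since split.

```cpp
// Hypothetical two-phase load_range sketch; names other than
// RangeServer::load_range() and ALREADY_LOADED are illustrative only.
#include <iostream>
#include <set>
#include <string>

enum class LoadResult { OK, ALREADY_LOADED };

class RangeServer {
  std::set<std::string> live_ranges_;     // ranges serving traffic
  std::set<std::string> staged_ranges_;   // loaded but not yet acknowledged
public:
  // Phase 1: load the range, but remember it as "staged" until the Master
  // acknowledges.  A retry is detected even if the live range later splits.
  LoadResult load_range(const std::string &range) {
    if (live_ranges_.count(range) || staged_ranges_.count(range))
      return LoadResult::ALREADY_LOADED;
    staged_ranges_.insert(range);
    live_ranges_.insert(range);
    return LoadResult::OK;
  }
  // Phase 2: the Master confirms it has recorded the load; only now does
  // the staging record go away.
  void acknowledge_load(const std::string &range) { staged_ranges_.erase(range); }
  // A split replaces the live range with two children, but the staging
  // record for the original range is untouched.
  void split(const std::string &range, const std::string &lo, const std::string &hi) {
    live_ranges_.erase(range);
    live_ranges_.insert(lo);
    live_ranges_.insert(hi);
  }
};

int main() {
  RangeServer rs;
  rs.load_range("Table[a..m]");            // Master's acknowledgement is lost
  rs.split("Table[a..m]", "Table[a..g]", "Table[g..m]");
  // Master retries: with a one-phase load this would load the range twice;
  // the staged record turns the retry into ALREADY_LOADED instead.
  if (rs.load_range("Table[a..m]") == LoadResult::ALREADY_LOADED)
    std::cout << "retry correctly reported as ALREADY_LOADED\n";
  rs.acknowledge_load("Table[a..m]");
  return 0;
}
```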

Lost updates (multiple access groups). This was a bug in how the system decided to remove commit log fragments for tables with multiple access groups.  The system computes an “earliest cached revision” value for each access group and only removes commit log fragments that contain cells whose revision number is less than that value.  In prior releases, the earliest cached revision of the last access group was mistakenly used for all of the access groups.  In certain situations, this caused commit log fragments to be removed prematurely, which resulted in data loss on system restart.  This bug has been fixed in the current release.
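
As a simplified illustration of the corrected behavior (the identifiers below are hypothetical rather than Hypertable’s actual internals), fragment removal has to be driven by the minimum earliest cached revision across all of a range’s access groups, not by whichever access group happens to be processed last:

```cpp
// Hypothetical sketch of commit log pruning for a table with multiple
// access groups; identifiers are illustrative, not Hypertable's own.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>

struct AccessGroup {
  int64_t earliest_cached_revision;  // oldest revision still only in memory
};

struct LogFragment {
  int64_t latest_revision;           // newest cell revision in the fragment
};

// A fragment may be removed only if every access group has already persisted
// everything it contains, i.e. its newest revision is older than the
// *minimum* earliest cached revision.  The pre-0.9.5.0 bug effectively used
// the last access group's value here, which could be too large.
int64_t prune_threshold(const std::vector<AccessGroup> &groups) {
  int64_t threshold = INT64_MAX;
  for (const auto &ag : groups)
    threshold = std::min(threshold, ag.earliest_cached_revision);
  return threshold;
}

int main() {
  std::vector<AccessGroup> groups = {{100}, {250}};   // AG0 lags behind AG1
  std::vector<LogFragment> fragments = {{90}, {180}, {240}};
  int64_t threshold = prune_threshold(groups);
  for (const auto &frag : fragments)
    std::cout << "fragment up to revision " << frag.latest_revision
              << (frag.latest_revision < threshold ? " removable\n" : " must be kept\n");
  // With the buggy threshold (250, from the last access group), the fragments
  // up to 180 and 240 would have been removed even though AG0 still needed them.
  return 0;
}
```
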
Transparent Master Failover. In prior releases, if a client issued a request to the Master and the Master failed before delivering the results, the request would fail.  With the introduction of the Master MetaLog and a two-phase request sequence, a Master failover can occur mid-request and the request will complete successfully on the new Master, in a way that is completely transparent to the requesting client.
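
The following sketch captures the essence of that two-phase request sequence under a deliberately simplified, hypothetical model (no RPC, leader election, or real persistence; the names are illustrative): the operation is recorded in the MML before execution and the client retrieves the result by operation id, so a replacement Master that replays the same MML can still complete the request and answer the client.

```cpp
// Hypothetical sketch of a failover-transparent request: the operation is
// persisted (here, just shared) before execution, so a new Master built from
// the same MML can finish it and hand back the result by id.
#include <cstdint>
#include <iostream>
#include <map>
#include <string>

struct MasterMetaLog {
  std::map<int64_t, std::string> operations;   // id -> pending operation
};

class Master {
  MasterMetaLog &mml_;
  std::map<int64_t, std::string> results_;
  int64_t next_id_ = 1;
public:
  explicit Master(MasterMetaLog &mml) : mml_(mml) {}
  // Phase 1: register the operation durably and hand the client an id.
  int64_t submit(const std::string &op) {
    int64_t id = next_id_++;
    mml_.operations[id] = op;
    return id;
  }
  // Replay pending operations after failover and record their results.
  void replay() {
    for (auto &entry : mml_.operations)
      results_[entry.first] = "done: " + entry.second;
  }
  // Phase 2: the client fetches the result by id, from whichever Master
  // is currently active.
  std::string fetch_result(int64_t id) { return results_[id]; }
};

int main() {
  MasterMetaLog mml;                   // stands in for the durable MML
  Master original(mml);
  int64_t id = original.submit("CREATE TABLE foo");
  // The original Master dies before replying; a new Master reads the same
  // MML, replays the operation, and the client's fetch still succeeds.
  Master replacement(mml);
  replacement.replay();
  std::cout << replacement.fetch_result(id) << "\n";
  return 0;
}
```
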
Cloudera’s CDH3 Hadoop Release. A big source of stability issues in the past has been HDFS.  The well-known “sync” issue has been the biggest source of trouble for Hypertable, causing critical log files to effectively disappear, resulting in data loss or, worse, leaving the system in an inconsistent and inoperable state.  The CDH3 Hadoop release from Cloudera includes a number of patches to the 0.20.2 Apache release that appear to have solved the sync problem.  We tested it with Hypertable throughout the beta period and have found it to be stable.  The current Hypertable release is built against CDH3 and we recommend it for all Hypertable deployments.

Stability is our #1 priority, followed closely by performance and scalability.  Hypertable has been in development since early 2007, and the feedback we’ve gotten over the years from our production deployments and open source community has helped Hypertable stabilize and become a much more mature product.  We’re aggressively working towards the 1.0 release and we look forward to seeing Hypertable become the infrastructure of choice for solving big data problems.