Hypertable 0.9.5.3 has been released

Hypertable 0.9.5.3 has been released, and can be downloaded from http://www.hypertable.com/download/

In this new version, you will find the following changes:

  • Added support for OFFSET, CELL_OFFSET
  • Fixed bug in TableInfo
  • Fixed bug in Monitoring class where the last and current server sets were incorrectly compared.
  • Added support for TIME_ORDER DESC and HyperAppHelper::create_cell_unique for the PHP microblog sample
  • Added support for chronological timestamps and unique values
  • Andy’s Hyperspace performance improvements
  • Added support for .tar.bz2 to install_pkg Capfile task
  • Changed Mac package name to include OSX version number
  • Fixed missing library dependency on Mac OSX Lion
  • Added OFFLOAD algorithm to basic load balancer, to move ranges of a list of servers.
  • Updated performance test with latest HBase and Zookeeper
  • Added regression test for issue #719
  • Fixed issue #720: LOAD DATA INTO FILE now skips empty values
  • Fixed CELL_LIMIT_CF prob in ThriftBroker; Updated documentation
  • Prefetch schemas from Hyperspace to speed up local recovery in RangeServer
  • Fixed issue #718 – cpp thrift client is now installed in /lib/cpp

New Features in Hypertable 0.9.5.3

0.9.5.3 is a patch release, but it also includes a few new features that are worth mentioning.

Pagination with OFFSET and CELL_OFFSET

With this release you can specify an OFFSET or CELL_OFFSET option to a SELECT clause to skip a number of rows or cells in the query. This is often used in combination with LIMIT and CELL_LIMIT to implement pagination, i.e. when you only display the first 20 results of a query in a web page and then let users navigate to the next or previous pages.

SELECT * FROM table OFFSET 40 LIMIT 20;

The OFFSET and CELL_OFFSET options are also available for the C++ API (ScanSpecBuilder::set_row_offset and ScanSpecBuilder::set_cell_offset) and the Thrift APIs.

Chronological Timestamps

By default, Hypertable sorts Timestamps in reverse chronological order (newest on top). Many users have expressed wishes to reverse this order, therefore we added a new column family option TIME_ORDER DESC. A SELECT of a column with this option will always return the oldest values of each cell. See below how you can use this behavior to create unique user IDs or to simulate “AUTO_INCREMENT” fields.

CREATE TABLE test (cf1 TIME_ORDER DESC);

For completeness’ sake we also added TIME_ORDER ASC which specifies the default behavior.

Unique Values (for a scalable “AUTO_INCREMENT”)

A common usage scenario is to create unique IDs, i.e. for users signing up to a web page, for items being stored in your catalogue etc. Traditional SQL databases have options like “AUTO_INCREMENT” which automatically assign a new ID to a row whenever you insert one. Usually they use a counter which is then incremented. Such a counter would be inefficient in a distributed environment since it always has to be synchronized with the other nodes. Therefore we came up with a new design, one that scales and does not require synchronization.

The first step is to create a column for the unique values. A unique value can never be overwritten once it was created, therefore we use TIME_ORDER DESC in combination with MAX_VERSIONS 1 (because we’re not interested in other values than the oldest one).

CREATE TABLE user_profiles (user_ids TIME_ORDER DESC MAX_VERSIONS 1);

The actual insert operation consists of two parts: first insert a unique key; second verify that it was really inserted (to make sure that no other node in the cluster has inserted an identical key in the meantime).

The following HQL tries to create a unique User ID for user “alice”.

INSERT INTO user_profiles VALUES (“alice”, “user_ids”, “random_unique_id”);
SELECT user_ids FROM user_profiles WHERE ROW = “alice”;
# now verify that the SELECT returned cell value “random_unique_id”

For convenience we added a new HQL function GUID() which creates globally unique IDs and which can be used to create row keys or cell values:

INSERT INTO user_profiles VALUES (“alice”, “user_ids”, GUID());

For even more convenience we created a new helper library (HyperAppHelper) which can create GUIDs and insert unique values. The functions are declared in HyperAppHelper/Unique.h. These functions are also exported to the Thrift interface. Our PHP microblogging sample, which implements much of Twitter’s functionality, uses this function when new users sign up.

Leave a Reply

You must be logged in to post a comment.