Neo4j 1.9.2 has been released

Neo4j version 1.9.2 is now available, with the following changes:


  • Optimized IO performance on Windows
  • Improved procedure for setting up networking for HA clusters
  • Some fixes to the REST API

Neo4j 1.9.2 is available immediately and is an easy upgrade from any other 1.9.x version.

You can download it from the web site.

Neo4j introduces labels in 2.0

A big new feature in Neo4j 2.0 is node labels and real, automatic indexes. Here you can quickly get an update on this extension of the property graph model.

You can watch the presentation here.

With the new node-label feature you can assign any number of types from your domain to a node. Imagine labels like Person, Location, Product, Project, User etc. Adding, querying and removing labels are supported in all Neo4j APIs: Cypher, the Java API and the REST API (Batch-Inserter support is in the works).

Neo4j 2.0 also adds a last missing piece of Cypher functionality. The new labels make it possible to provide label-based indexes that the database handles automatically. After an index is created, all existing nodes with the label and properties are added to it behind the scenes; once that task completes, the index is updated transactionally.

Cypher uses these indexes to perform index-based lookups on the label and properties that are part of the index. That happens either automatically for simple expressions or with an explicit index hint.
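
The index life cycle described above (back-fill existing nodes, then keep the index current on every write) can be pictured with a toy model. This is only a sketch in plain Java, not Neo4j code: the class LabelIndexSketch and all of its methods are hypothetical, it models a single label with a single string property, and it assumes each node ID is added once.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: a toy model of one label/property index, not Neo4j code.
final class LabelIndexSketch {
    // Node ID -> value of the indexed property, for nodes carrying the label.
    private final Map<Long, String> labelledNodes = new HashMap<>();
    // Property value -> node IDs; stays null until the index is created.
    private Map<String, Set<Long>> index;

    // Assumes each node ID is added only once.
    void addNode(long id, String value) {
        labelledNodes.put(id, value);
        if (index != null) put(id, value); // index exists: keep it current
    }

    // Creating the index back-fills every node that already has the label.
    void createIndex() {
        index = new HashMap<>();
        for (Map.Entry<Long, String> e : labelledNodes.entrySet()) {
            put(e.getKey(), e.getValue());
        }
    }

    // Uses the index when present, otherwise scans all labelled nodes.
    Set<Long> find(String value) {
        if (index != null) {
            Set<Long> hit = index.get(value);
            return hit == null ? Collections.<Long>emptySet() : hit;
        }
        Set<Long> result = new HashSet<>();
        for (Map.Entry<Long, String> e : labelledNodes.entrySet()) {
            if (e.getValue().equals(value)) result.add(e.getKey());
        }
        return result;
    }

    private void put(long id, String value) {
        Set<Long> ids = index.get(value);
        if (ids == null) {
            ids = new HashSet<>();
            index.put(value, ids);
        }
        ids.add(id);
    }
}
```

Lookups return the same answer before and after createIndex(); only the cost changes, which is the point of letting the database choose the index automatically.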


Neo4j Preview 2.0.0-M01 is available for download

Neo4j 1.9.M03 has been released

Neo4j 1.9.M03 has been released and an official bulletin has been posted on the Neo4j blog.

Changes and improvements include:



We fixed an issue with relationship type IDs, so everyone can now use the full 16-bit range, i.e. 65535 different relationship types. We also worked on the Java7/OpenJDK support (so far compilation and tests).

High Availability

For the new HA mode we improved the logging, added more tests and made it more robust on startup and when clients leave the cluster in order to stabilize the component for Neo4j 1.9.GA.


In this milestone, the code for the Tinkerpop integration into the Neo4j Server and its Web-Interface and the Gremlin Plugin has been refactored and centralized. This makes all Tinkerpop dependencies reside only inside that plugin. So it will be possible to support two versions of the Gremlin plugin, one for the 1.5 release and another plugin for the current 2.2.


Cypher got a lot of internal refactoring, especially around the internal handling of start- and updating clauses and general parse-result representation and management.

Our Cypher code has seen some bug fixes, namely around queries that returned non-existent paths, substring length limits and updating properties in bulk with SET. We also removed the now obsolete Cypher Plugin, which has been deprecated for two releases.


Install available from

Neo4j 1.9.M02 has been released

Neo4j Milestone Release 1.9.M02 is out and available for download at:

While the new changes might not be visible at first glance, let’s look into Neo4j’s engine room to see what has changed.

Everyone’s most beloved query language, Cypher, has matured a lot thanks to Jake and Andres’ incredible work. They have made query execution much faster, for most use-cases, while utilizing less memory. The lazy execution of queries has sneaked away lately, so Andres caught it and put it back in. That means you can run queries with potentially infinitely large result sets without exhausting memory. Especially when streaming results (no aggregation and ordering) it will use only a tiny fraction of your memory. The very frequent construct ORDER BY … LIMIT … now benefits from a better top-n-select algorithm. These latest improvements are closing the performance gap to the core-API even more. We’ve also glimpsed a new internal SPI, that will allow Cypher to run even faster in the future.
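
To illustrate why ORDER BY … LIMIT … benefits from a top-n selection, here is a minimal sketch in plain Java. It is not Cypher's implementation: TopNSelect and topN are hypothetical names, and a bounded heap is just one common way to do top-n selection. The point is that rows are consumed lazily from an iterator and at most n of them are held in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative only: TopNSelect is a hypothetical helper, not a Neo4j class.
final class TopNSelect {
    // Returns the n smallest rows in ascending order,
    // keeping at most n rows in memory at any time.
    static <T> List<T> topN(Iterator<T> rows, int n, Comparator<T> order) {
        if (n <= 0) return new ArrayList<T>();
        // Max-heap on the requested order: the head is the worst row kept so far.
        PriorityQueue<T> heap = new PriorityQueue<T>(n, order.reversed());
        while (rows.hasNext()) {
            T row = rows.next();
            if (heap.size() < n) {
                heap.add(row);
            } else if (order.compare(row, heap.peek()) < 0) {
                heap.poll();   // drop the current worst row
                heap.add(row); // keep the better one
            }
        }
        List<T> result = new ArrayList<T>(heap);
        Collections.sort(result, order);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(42, 17, 99, 23, 8, 61);
        // Comparable in spirit to: ... ORDER BY age LIMIT 3
        System.out.println(topN(ages.iterator(), 3, Comparator.<Integer>naturalOrder()));
        // prints [8, 17, 23]
    }
}
```

A full sort would need every row in memory first; the bounded heap lets the result set stream through.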

For top speed, please use query parameters everywhere. It helps a lot, even though the now configurable query cache size (e.g. query_cache_size=1000) allows a larger number of queries to be cached.
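
Why parameters matter for a query cache can be shown with a small sketch. It assumes the cache is keyed by the query text, which is the usual design but is an assumption here; PlanCache, planFor and the placeholder "plan" strings are hypothetical and do not reflect Neo4j's internals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: PlanCache is a hypothetical stand-in, not Neo4j's query cache.
final class PlanCache {
    private final Map<String, String> plans;
    int misses = 0; // how many times a query text had to be "compiled"

    PlanCache(final int maxSize) {
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        this.plans = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxSize;
            }
        };
    }

    // Returns the cached plan for this exact query text, compiling on a miss.
    String planFor(String queryText) {
        String plan = plans.get(queryText);
        if (plan == null) {
            misses++;
            plan = "plan(" + queryText + ")"; // placeholder for parsing and planning
            plans.put(queryText, plan);
        }
        return plan;
    }

    public static void main(String[] args) {
        PlanCache literal = new PlanCache(1000);
        PlanCache parameterized = new PlanCache(1000);
        for (int id = 0; id < 5; id++) {
            // Literal values make every query text unique: one compilation each.
            literal.planFor("START n=node(" + id + ") RETURN n");
            // With a parameter the text is identical, so the plan is reused.
            parameterized.planFor("START n=node({id}) RETURN n");
        }
        System.out.println(literal.misses + " vs " + parameterized.misses); // prints 5 vs 1
    }
}
```

A larger cache only postpones eviction for literal-heavy workloads; parameters remove the duplicate entries in the first place.
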
As for some eye-candy, we provide you with a slicker version of the Neo4j console which now features interactive jQuery data result tables to allow in-browser filtering, searching and paging.

Our shiny new High Availability Cluster can now be upgraded seamlessly from an existing Zookeeper based setup to the new infrastructure that runs on a Paxos (coordinator-free) implementation, so you can upgrade your test clusters without any downtime. If you want to know more, please check out the HA documentation.
For a better HA experience, we have added a new extension providing current cluster information that a load balancer can act on to update its routing. For the curious, there is also much more JMX monitoring information about the cluster available in the web interface.

In the depths of the Kernel, several performance improvements have been applied resulting in a better overall performance of Neo4j.

A nice feature we created due to a user request is the OrderByTypeExpander that keeps the provided order of relationship-types AND directions during the traversal.

To be better safe than sorry, we have sandboxed the JavaScript traversals that are exposed via the Server REST API to be more secure.

Now, go ahead and give your project some Neo4j love. Test it with the latest and greatest Neo4j release so far and tell us how we did. Download it from the new Neo4j website that we created for your convenience and learning experience. And while you’re at it, please explore the new site and provide us with feedback on its helpfulness.

Nine Databases in 45 Minutes



The following video provides a crash course in nine key databases:  Postgres, CouchDB, MarkLogic, Riak, VoltDB, MongoDB, Neo4j, HBase and Redis. All in just 45 minutes.

Miles Pomeroy, Chad Maughan, and Jonathan Geddes run through each database in five minutes each, thus the title, “9 Databases in 45 minutes.” The video proceeds in efficient and occasionally amusing fashion, where a countdown clock and a gong keep the presentations terse.

ACID transactions in the NoSQL world

Generally speaking, NoSQL solutions have lighter-weight transactional semantics than relational databases.

Actually, some of them do support ACID, or ACID-like, transactions:

Neo4j 1.8 Release Candidate 1 has been released

Neo4j 1.8 Release Candidate 1 has been released, bringing performance and Cypher improvements and a bunch of nice changes, as noted in the change log:

  • Removed contention around allocating and moving persistence windows so that a thread won’t need to await another thread doing this refresh, instead just continue knowing that the windows will be optimally placed in a very near future.
  • Removed contention around translating a key (as String) into the ID by using copy-on-write map instead of a concurrent hash map. Used in property key as well as relationship type translation.
  • Fix for Node/Relationship#getPropertyValues() sometimes returning null values from the iterator.
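
The copy-on-write idea behind the key-to-ID change in the list above can be sketched as follows. This is a generic illustration of the technique, not the Neo4j implementation: KeyIdRegistry, idOf and getOrCreate are hypothetical names, and allocating dense IDs from the map size is an assumption made to keep the example short.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: KeyIdRegistry is a hypothetical name, not Neo4j's class.
final class KeyIdRegistry {
    // Readers see an immutable snapshot; volatile publishes each new snapshot.
    private volatile Map<String, Integer> snapshot = Collections.emptyMap();

    // Lock-free read path: a plain lookup in the current snapshot.
    Integer idOf(String key) {
        return snapshot.get(key);
    }

    // Rare write path: copy the map, add the entry, then swap the reference.
    synchronized int getOrCreate(String key) {
        Integer existing = snapshot.get(key);
        if (existing != null) return existing;
        Map<String, Integer> copy = new HashMap<String, Integer>(snapshot);
        int id = copy.size(); // next free ID, assuming IDs are dense from 0
        copy.put(key, id);
        snapshot = Collections.unmodifiableMap(copy);
        return id;
    }
}
```

The trade-off suits data like property keys and relationship types: new entries are rare, so the cost of copying is paid seldom, while the very frequent reads take no lock at all.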


  • Upgraded Jackson JAXRS to version 1.9.7
  • Keeping the Cypher execution engine between calls makes it possible to re-use execution plans
  • Added User-Agent header tracking to UDC to determine REST driver usage


  • Removed the /../ literal for regular expressions. Now a normal string literal is used instead
  • Concatenation handles types better
  • Changed how the graph-matching module is used, to make it safe for concurrent use
  • Better type error messages
  • Renamed iterable to collection
  • Fixed #795: so that WITH keeps parameters also for empty aggregation results
  • Fixed #772: Creating nodes/relationships goes before SET, so SET can run on already created elements
  • Added error when using != instead of <>
  • Fixed #787: Issue when comparing array properties with literal collections
  • Fixed #751: Better error message for some type errors
  • Fixed #818: Problem where properties could only have scalar values (they can be arrays as well)
  • Fixed #834: Gives relevant syntax exception when properties are entered in MATCH

Get Neo4j 1.8.RC1

Neo4j 1.8.RC1 is available for:

90% of the existing data was created in the last two years

And a few more facts about BigData:

  • Wal-Mart handles more than a million customer transactions every hour.
  • Facebook hosts more than 50 billion photos.
  • Google has set up thousands of servers in huge warehouses to process searches.
  • 90% of the data that exists today was created within the last two years.


It is a pattern of growth driven by such rapid and relentless trends as the rise of social networks, video and the Web.  Particularly for organizations struggling to keep on top of their most critical missions, providing visibility into, and actionable business intelligence out of the explosive surge in data, has created unprecedented challenges.

That’s because big data causes big problems for companies, as well as for our economy and national security. Look no further than the financial crisis. Near the end of 2008, when the global financial system stood at the brink of collapse, the CEO of a global banking giant during a conference call with analysts was repeatedly asked to quantify the volume of mortgage-backed security holdings on the bank’s books. Despite the bank’s having spent a whopping $37 billion on IT operations over the previous 10 years, his best response was a sheepish: “I don’t have that information.”

Had regulators and big banks been able to accurately assess their exposure to subprime lending, we might have dampened the recession and saved the housing market from its biggest fall in 30 years.

Introduction to graph databases by Neo4j

Great introduction to graph databases provided by Neo4j:

This video demonstrates how graph databases fit within the NOSQL space, and where they are most appropriately used. In this session you will learn:

  • Overview of NOSQL
  • Why graphs matter
  • Overview of Neo4j
  • Use cases for graph databases

Neo4j 1.8.M04 has been released


Neo4j 1.8.M04 is now available:

This new version offers a few new ways to help you find happy paths. To query a graph you use a traversal, which identifies paths of nodes and relationships. This release updates the capabilities of Neo4j’s core Traversal Framework and introduces new ways to use paths in Cypher.


Graph Sherpa Mattias Persson

Mattias Persson works throughout the Neo4j code base, but is particularly well acquainted with the Traversal Framework, a core component of the Neo4j landscape. He’s agreed to guide us on a traversal tour:
AK: So, what exactly is a Traversal?
MP: I would say: from one or more given nodes in your graph, you move around to other nodes via their connected relationships in search of your answer. The traversal can be controlled in different ways, for example which relationships to traverse at any given position, ordering and so on. The general outcome is a list of paths from which the relevant information can be extracted.
AK: And the Traversal Framework, then. Is it just for describing a Traversal?
MP: Sure, it’s for describing where the traversal should go, and it also provides the implementation that executes the traversal itself.
AK: Can you give an example, like how would I find the friends of my friends?
MP: So here the starting point is you, the node representing you. And you’d tell the traversal to follow KNOWS relationships or similar down to depth 2. Also every friend of friend should only be returned once (such uniqueness is by default). So in embedded code:

Iterable<Node> friendsOfFriends = traversal()
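
For readers without a Neo4j instance at hand, the friends-of-friends idea can be reproduced on a plain in-memory graph. This sketch does not use the Traversal Framework: FriendsOfFriends and atDepth are hypothetical names, and the KNOWS relationships are assumed to be stored as adjacency lists that list each friendship in both directions. Each node is visited only once, mirroring the default uniqueness Mattias mentions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: a library-free stand-in for a depth-limited traversal.
final class FriendsOfFriends {
    // Breadth-first walk; returns the nodes first reached at exactly the given depth.
    static Set<String> atDepth(Map<String, List<String>> knows, String start, int depth) {
        Set<String> visited = new HashSet<>();
        visited.add(start);
        Set<String> frontier = new LinkedHashSet<>();
        frontier.add(start);
        for (int d = 0; d < depth; d++) {
            Set<String> next = new LinkedHashSet<>();
            for (String node : frontier) {
                List<String> neighbours = knows.get(node);
                if (neighbours == null) continue;
                for (String n : neighbours) {
                    // visited.add returns false for nodes seen before,
                    // so every node is returned at most once.
                    if (visited.add(n)) next.add(n);
                }
            }
            frontier = next;
        }
        return frontier;
    }

    public static void main(String[] args) {
        Map<String, List<String>> knows = new HashMap<>();
        knows.put("me", Arrays.asList("alice", "bob"));
        knows.put("alice", Arrays.asList("me", "carol", "bob"));
        knows.put("bob", Arrays.asList("me", "carol", "dave"));
        System.out.println(atDepth(knows, "me", 2)); // prints [carol, dave]
    }
}
```
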

AK: OK, interesting. It’s such a different way of querying, though. For people who are new to Traversals, what’s your advice for how to ‘get it’?
MP: Look at traversals as local: instead of having your entire database and querying globally by matching values, you start at a known point, where your relationships become your index and lead you to what you’re looking for. So you describe how the traversal will behave, where it should go and not go, and you receive callbacks about relevant data, as per your description.
AK: And what are the benefits of the new update to the Traversal Framework?
MP: There are some additions here. One is bidirectional traversals, which is essentially like describing two traversals, one from each side (meaning one or more given start nodes); where they collide in the middle, they produce results in the form of paths. In most scenarios where you know both the start and end node(s), a bidirectional traversal will get you your answer with far fewer relationships traversed, i.e. a faster traversal. The reason is that the number of relationships that need to be traversed at each depth increases exponentially, so traversing half the depth from each side cuts down on that growth. The “all paths” and “all simple paths” implementations in the graph-algo collection use bidirectional traversals now. Dijkstra and A* will probably move over to that as well, and it’s essentially just a small change in your traversal description to make it bidirectional.
There’s also an addition to the “expander”, i.e. the one responsible for deciding which relationships to follow given a position in the traversal. Previously it could only make decisions based on the node for the current position, but now it can view the whole path leading up to the current position.
Also some minor things, like being able to get metadata about the traversal (number of relationships visited and so forth) and more convenience methods on the Path interface.
AK: Nice. That’s a lot of good stuff. How will REST users be able to take advantage of these new capabilities?
MP: Well, you can soon expect Cypher to optimize queries that can take advantage of it. That’s the usual thing, just keep writing queries and we’ll keep making them faster.
AK: Thanks so much Mattias for all the hard work.
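
The bidirectional idea from the interview can be sketched without Neo4j as a two-sided breadth-first search. This is an assumption-laden illustration, not the Traversal Framework API: BidirectionalSearch and shortestPathLength are hypothetical names, the graph is an undirected adjacency map, and only the length of a shortest path is returned rather than the paths themselves.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: two breadth-first searches that meet in the middle.
final class BidirectionalSearch {
    // Returns the number of relationships on a shortest path, or -1 if none exists.
    static int shortestPathLength(Map<String, List<String>> graph, String start, String end) {
        if (start.equals(end)) return 0;
        Map<String, Integer> fromStart = new HashMap<>(); // node -> depth from start
        Map<String, Integer> fromEnd = new HashMap<>();   // node -> depth from end
        fromStart.put(start, 0);
        fromEnd.put(end, 0);
        Set<String> startFrontier = Collections.singleton(start);
        Set<String> endFrontier = Collections.singleton(end);
        boolean expandStartSide = true;
        while (!startFrontier.isEmpty() && !endFrontier.isEmpty()) {
            Set<String> frontier = expandStartSide ? startFrontier : endFrontier;
            Map<String, Integer> mine = expandStartSide ? fromStart : fromEnd;
            Map<String, Integer> other = expandStartSide ? fromEnd : fromStart;
            Set<String> next = new HashSet<>();
            int best = -1;
            for (String node : frontier) {
                int depth = mine.get(node);
                List<String> neighbours = graph.get(node);
                if (neighbours == null) continue;
                for (String n : neighbours) {
                    Integer otherDepth = other.get(n);
                    if (otherDepth != null) {
                        // The two sides collide: this edge joins them into a path.
                        int length = depth + 1 + otherDepth;
                        if (best < 0 || length < best) best = length;
                    }
                    if (!mine.containsKey(n)) {
                        mine.put(n, depth + 1);
                        next.add(n);
                    }
                }
            }
            if (best >= 0) return best;
            if (expandStartSide) startFrontier = next; else endFrontier = next;
            expandStartSide = !expandStartSide; // alternate sides one level at a time
        }
        return -1;
    }
}
```

Because each side only explores about half the depth, the exponential growth Mattias describes applies to a much smaller exponent on both sides.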

Paths as Expressions

In Cypher, much of the work in a statement involves working with paths. Now, paths themselves can be treated as expressions. This is most immediately explained with a simple example. Prior to 1.8.M04, you could capture a path with an identifier like this:

START n=node(...), m=node(...)
    match p=n-->()<--m
    return collect(p) as allPaths

With paths as expressions, that can be re-written as:

START n=node(...), m=node(...)
    return n-->()<--m as allPaths

Simply return the path that you want. There are, of course, many more fun things that can be done with this, which we’ll leave to explore another time. Because the best thing to do right now is…

Official blog post