An Initial Look at Graph Databases

I am working on a project in which large tree-like data structures will be central. I am pretty sure a relational database would work, but while prototyping to see how such structures would be handled, I started thinking about NoSQL. I freely admit I was motivated at least in part by a desire to work with something more exotic than an old-fashioned RDB. (Of course, I also did not want to choose a NoSQL technology just because it is trendy.) At some point I must have googled "NoSQL graphs", and the results led me to graph databases. I experimented with both Neo4j and ArangoDB, and it now looks like I am going with the latter.

I then thought about using both a relational DB and a graph DB: Arango for the tree-like data and an RDB for more familiar things like customers (companies, in this case) and users. (The complex graphs I envisioned could be stored in an RDB, but how would they be traversed and manipulated?) Of course, the prospect of having two DBs that would have to be kept in sync did not thrill me, and I wondered whether Arango could be used for all things database in the project. I found examples where graph DBs were indeed used to store this frequently encountered kind of data, which is normally put in a relational DB without much thought. It is pretty natural to visualize such data as a tree. A graph DB can obviously represent more complex graphs too, and there are many examples where, say, the relationships and responsibilities of employees within a company are best presented as a graph. So why not use a database designed specifically to handle such graphs?
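To make the traversal question concrete, here is a minimal sketch of one common relational approach: store each node with a pointer to its parent and walk the tree with a recursive query. The table name `node`, its columns, the sample rows, and the helper `subtree` are all made up for illustration; the sketch uses Python's built-in `sqlite3` module and is not taken from my project.

```python
import sqlite3

# Hypothetical tree: each row points at its parent (None for the root).
rows = [
    (1, None, "root"),
    (2, 1, "a"),
    (3, 1, "b"),
    (4, 2, "a1"),
    (5, 2, "a2"),
    (6, 4, "a1x"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
)
conn.executemany("INSERT INTO node VALUES (?, ?, ?)", rows)


def subtree(conn, root_id):
    """Return (name, depth) for root_id and all of its descendants."""
    query = """
        WITH RECURSIVE sub(id, name, depth) AS (
            -- anchor: the starting node
            SELECT id, name, 0 FROM node WHERE id = ?
            UNION ALL
            -- recursive step: one join against the table per level
            SELECT n.id, n.name, sub.depth + 1
            FROM node AS n JOIN sub ON n.parent_id = sub.id
        )
        SELECT name, depth FROM sub ORDER BY depth, name
    """
    return conn.execute(query, (root_id,)).fetchall()


print(subtree(conn, 2))
```

So traversal in an RDB is possible, but every level of the tree costs another join, and anything beyond a plain tree (multiple parents, typed relationships, variable-depth path queries) adds more tables and more query machinery. That is the kind of work a graph DB is meant to take on natively.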

I have only been looking at graph databases for a relatively short time, but I suggest that there may be many classes of problems for which this technology is not just an alternative to relational databases but so compelling a choice that migrating an existing relational database to a graph DB might be worth considering. I can say without reservation that people lucky enough to be involved in new projects should take a good look at a tool like Arango. Java developers who have used Spring Data will find that both Arango and Neo4j have extended Spring so that database interaction can be driven by Spring Data annotations tailored for graph DBs. I suspect that analogous tools exist for languages other than Java. (Arango in fact provides more examples in JavaScript than in Java, although Java seems to be fully supported.)

Some Links:

  1. An Arango Spring-Data Project for GoT fans: My early investigation suggests that this Spring Data extension may not be adequate in all cases; perhaps further work on it will increase its utility.
  2. Basic Java Access for Arango
  3. Comparing RDB and GDB: The article mentions the idea that a GDB should support "graphy" queries in the data store itself, a feature used extensively in my project.
  4. Discusses Index-free Adjacency: An optimization offered by GDBs (the article is from Neo4j, although Arango and other GDBs have it too).
  5. Arango's Article About Index-free Adjacency: Points out that this approach is not useful for a distributed GDB. I still need to find out how things work in the distributed case.
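
As I understand the idea behind index-free adjacency from the last two links, each node holds direct references to its neighbors, so following an edge does not require a lookup in a separate index. The toy sketch below contrasts the two styles in memory. It is only an illustration under that assumption: the `Node` class, the sample org chart, and both helper functions are hypothetical, and this is not how Neo4j or Arango actually store data.

```python
from collections import defaultdict


class Node:
    """A node that keeps direct references to its neighbors."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []  # references to other Node objects


def reachable_direct(start):
    """Traverse by following object references; no index is consulted."""
    seen = {start.name}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in node.neighbors:  # the hop is a pointer dereference
            if nxt.name not in seen:
                seen.add(nxt.name)
                stack.append(nxt)
    return seen


def reachable_via_index(start_name, edge_index):
    """Traverse by looking every hop up in an index keyed by source name."""
    seen = {start_name}
    stack = [start_name]
    while stack:
        name = stack.pop()
        for nxt in edge_index.get(name, []):  # the hop is an index lookup
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Hypothetical org chart used for both representations.
edges = [("ceo", "cto"), ("ceo", "cfo"), ("cto", "dev1"), ("cto", "dev2")]

# Representation 1: nodes wired together directly.
nodes = {name: Node(name) for pair in edges for name in pair}
for src, dst in edges:
    nodes[src].neighbors.append(nodes[dst])

# Representation 2: an edge list plus an index built over it.
edge_index = defaultdict(list)
for src, dst in edges:
    edge_index[src].append(dst)

print(sorted(reachable_direct(nodes["cto"])))
print(sorted(reachable_via_index("cto", edge_index)))
```

Both functions visit the same nodes; the difference is only in what each hop costs. In a single process the direct references are cheap, but they are memory pointers, which hints at why the approach is harder to carry over to a distributed store.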
