A Sneak Peek at MarkLogic 9

I've had a chance to be a part of the Early Access program for MarkLogic 9, and it's given me an opportunity to both see and test where the company is heading over the next year or so. These are still early impressions (and as I was unable to make it to the MarkLogic World conference this year, they haven't been filtered by MarkLogic's usual media blitz), but so far, I like what I see.

There's no question that MarkLogic's focus was semantics in version 7 and JavaScript and bitemporality in version 8. It can be argued that MarkLogic 8 was almost an interim release, cleaning up some features they'd introduced earlier and consolidating the platform. MarkLogic 9, on the other hand, looks like a pretty significant shift in direction.

The Emergence of Entity Services

There is no question that MarkLogic's DNA has long been that of a document database. However, increasingly their customers have been asking them to step up their game and provide far more data-centric services. This looks to be the release where MarkLogic will nail that requirement, hard.

Entity Services represents a way of modeling data that's near and dear to an ontologist's heart - represent critical objects in your system as entities, not just tables, with each entity having the potential to have more complex relationships than most SQL systems are typically used to handle. What they are proposing is a conceptual model of development, where you can talk about a Book, an Account, a Widget, a Manager, each as their own distinct entity, but with associated class information that makes working with these much easier for both the developer and the analyst.

This approach fits most contemporary applications far better than the typical relational database approach does - entities are more flexible, the models are easier to change, and the focus works far better with JSON and XML than is true for most relational systems (and even for some NoSQL systems that seem to be staking their future on becoming more relational, at the cost of the very flexibility that makes them attractive).

At the same time, entity-oriented systems, when designed with this in mind, can also duplicate a more traditional SQL database. This is actually very important, because SQL still has a huge legacy footprint. For that reason, one of the principal goals of the Entity Services approach is to support several different ways of querying and updating data, regardless of the internal representation. MarkLogic is not only creating its own Optic API for providing insight into the data at a modeling level, but is also revamping their SQL interfaces to take advantage of this.

Micro-Security

Internally, much of what's happening with Entity Services involves the Triple Store and Semantics (not everything, but quite a bit). This isn't really that surprising - a relational database store can be represented via triples quite easily, to the extent that if you rewrite the SQL implementation, it can appear (nearly) indistinguishable from one.
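
To make the row-to-triples idea concrete, here is a minimal sketch in plain JavaScript. It is not MarkLogic's implementation: the helper names, the `http://example.org/` prefix, and the `book` table are all assumptions made up for illustration. Each non-key column of a row becomes one subject-predicate-object triple, and the row can be rebuilt from those triples, which is roughly what a SQL layer sitting on top of a triple store has to do.

```javascript
// Hypothetical helper: turn one relational row into subject-predicate-object triples.
// The IRI prefix and the table/column names are illustrative assumptions.
const rowToTriples = (table, keyColumn, row, base = "http://example.org/") => {
  const subject = `${base}${table}/${row[keyColumn]}`;
  return Object.entries(row)
    .filter(([column]) => column !== keyColumn)
    .map(([column, value]) => ({
      subject,
      predicate: `${base}${table}#${column}`,
      object: value,
    }));
};

// Hypothetical inverse: rebuild the row from its triples.
const triplesToRow = (keyColumn, triples) => {
  const row = {};
  for (const { subject, predicate, object } of triples) {
    // The key value is the last path segment of the subject IRI.
    row[keyColumn] = subject.slice(subject.lastIndexOf("/") + 1);
    // The column name is the fragment after '#' in the predicate IRI.
    row[predicate.slice(predicate.lastIndexOf("#") + 1)] = object;
  }
  return row;
};

const bookRow = { id: "b42", title: "Dune", author: "Frank Herbert" };
const bookTriples = rowToTriples("book", "id", bookRow);
console.log(bookTriples);
console.log(triplesToRow("id", bookTriples));
```

The round trip only works this simply because the sketch assumes string values and one row per subject; a real mapping would also need to carry datatypes and handle nulls.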

Yet at the same time, with Entities, you get considerably more for your money - the ability to query the model as well as the data within the same query, a much reduced need to build cross-reference tables, and perhaps most significantly, the ability to apply security at the field level, rather than at the database level.

Data is sensitive, but some data is more sensitive than others. MarkLogic has increasingly become the database of choice for areas such as Health Insurance and Electronic Health Records. Yet its security, while superb, has not been specifically oriented towards securing field level information (for what it's worth, this has been a hard problem even for relational databases).

The combination of documents (whether XML or JSON) and semantics has opened up that particular domain by making it possible both to encrypt appropriate content at the field level (by keeping the metadata necessary for encryption in secured triple stores) and to provide redaction services when that data gets exported. This means that with ML 9, MarkLogic can manage personal health information (PHI) and personal privacy information (PPI) at a level that almost no other database on the planet can match.
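
As a rough illustration of what redaction on export means at the field level, here is a small sketch in plain JavaScript. It does not use any MarkLogic API: the `fieldPolicy` map, the role names, and the `redact` function are hypothetical, and the policy is a simple in-memory lookup rather than anything stored in a secured triple store.

```javascript
// Hypothetical policy: which roles may read each sensitive field.
// Field names and role names are illustrative, not MarkLogic configuration.
const fieldPolicy = new Map([
  ["ssn", new Set(["billing"])],
  ["diagnosis", new Set(["clinician"])],
]);

// Return a copy of the record in which every field the caller's roles
// are not allowed to read is replaced by a placeholder.
const redact = (record, roles, policy = fieldPolicy) =>
  Object.fromEntries(
    Object.entries(record).map(([field, value]) => {
      const allowed = policy.get(field);
      // Fields with no policy entry are treated as public.
      const visible = !allowed || roles.some((role) => allowed.has(role));
      return [field, visible ? value : "[REDACTED]"];
    })
  );

const patient = { name: "Jane Doe", ssn: "000-00-0000", diagnosis: "asthma" };
const clinicianView = redact(patient, ["clinician"]);
console.log(clinicianView);
```

The point of the sketch is that the decision is made per field rather than per document or per database, so two users exporting the same record can legitimately see different things.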

Improved Cluster Management with Ops Center

I love working with MarkLogic, but when it comes down to managing clusters within the database, the love turns into dislike very quickly. Cluster management has evolved over time, but in that time it has also brought with it a lot of cruft that makes building a well-balanced cluster something of a nightmare.

The release of ML9 will bring with it a redesign of the cluster management tools in order to better consolidate the relevant functionality, make it easier to track changes with dashboards, and finally make adding new data stores, from SSDs to Hadoop instances, much easier.

The new Ops Center not only brings everything together, it does something that MarkLogic should have done years ago - builds a dashboard for tracking log data. Building MarkLogic applications can require a fair amount of debugging, and having to scroll through thousands of log messages to find a single statement is just not an efficient use of time. This one feature alone will likely make MarkLogic developers chomp at the bit to upgrade :-)

Now With ECMAScript 6!

I've been quietly falling in love with ES6/ES2015, the latest flavor of JavaScript, for a while now. ES6 has been rewriting the rules of JavaScript with everything from classes (which I expected to hate, but have instead become a fan of) and the introduction of maps and sets through to arrow notation, template literals, iterators and generators, promises and more.

MarkLogic has been upgrading their JavaScript engine to take advantage of these new features, and this in turn will have a significant impact in one area especially - legibility. In my experience, ES6 code is considerably more legible than earlier versions of JavaScript, is easier to maintain, and provides for a level of organization that has tended to get buried deep by layers upon layers of imports, much of it redundant.

This in turn means less code (and cleaner code) for working with MarkLogic in JavaScript mode, better control of asynchronous processes and more natural data flows.
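
For readers who haven't worked with ES6 yet, here is a short sketch that touches several of the features mentioned above: a class, a Map, a generator, a template literal, arrow functions and a promise. It is ordinary JavaScript that runs in any ES6-capable engine; the `Catalog` class and its contents are invented for the example and have nothing to do with MarkLogic's server-side APIs.

```javascript
// A small class with a Map-backed store; all names are illustrative only.
class Catalog {
  constructor() {
    this.books = new Map();
  }

  add(id, title) {
    this.books.set(id, title);
    return this; // returning the instance allows chained calls
  }

  // Generator method: lazily yields one formatted line per book.
  *lines() {
    for (const [id, title] of this.books) {
      yield `${id}: ${title}`; // template literal
    }
  }
}

const catalog = new Catalog().add("b1", "Dune").add("b2", "Emma");
const catalogLines = [...catalog.lines()]; // spread consumes the generator

// Arrow functions plus a promise: resolve asynchronously with upper-cased lines.
const shout = (items) => Promise.resolve(items.map((s) => s.toUpperCase()));
shout(catalogLines).then((result) => console.log(result));
```

The same logic in ES5 would need a constructor function with prototype assignments, a plain object standing in for the map, manual string concatenation and a callback, which is where most of the legibility gain comes from.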

Summary

Overall, I think there are some very interesting take-aways from what I've seen thus far. Entity Services could be a big deal - I've been arguing for years that an Entity or Conceptual model is a prerequisite for building true enterprise-level data integration solutions - but the implementation is also a bit of a gamble for MarkLogic. It's one, though, that they will have to make to move into the big leagues. I think that field level security will prove quite attractive to a number of health care, insurance, and financial services companies (I've built ad-hoc systems with MarkLogic 8 that use many of the same principles, but getting it out of the box is a big win). Finally, ES6 integration will help both with their internal APIs and with winning over many developers who are moving away from Java and towards JavaScript as their core language.

The one thing that I would have liked to see more of was an expanded effort in the analytics space (beyond the SQL upgrades, which are welcome). An efficient and intuitive analytics pipeline system (perhaps, albeit I say this reluctantly, along the lines of either CPF or, better, XProc) would make sense for doing some serious data science work. Building in a more robust machine learning system similarly seems like a natural extension to what they are doing now with their entity systems - triggered SPARQL inferencing could go a long way there.

If past trends are any indication, it's unlikely that MarkLogic 9 will surface until Spring 2017, so it should be interesting to see how the application develops between now and then.

Kurt Cagle is the founder of Semantical, LLC. He has been designing and building MarkLogic applications for more than a decade, and is available for consulting or hire.

Mark Lawson

Technical Consultant

8 years ago

I've always thought that one of the great features of the OO movement was designing into real-world "objects"; Entities seem a similar, perhaps more powerful vision and sounds like a fine idea.

Max Dunn

Silicon Publishing CEO | Pioneer in XML and Adobe InDesign Server automation. Online editing platform with unrivaled extensibility, scalability, and quality of output.

8 years ago

ES6, nice... Wish any of our clients could afford MarkLogic.
