Is MongoDB Really the Answer to Everything?
Omer Jacoby
Director at Davies Consulting | Helping financial institutions combat financial crime using innovative technological solutions
I remember a time around 5 years ago. Many starry-eyed believers looked at HP Vertica with a hopeful gaze. Column-store databases seemed to be the new "it".
However, the apostles walked the desert only to discover the promised land barren and scorched.
Where is Vertica now? I haven't heard of any solutions using it. In fact, Vertica is currently ranked 27th in the DB-Engines ranking index, comfortably placed between Informix and Firebird (has anyone heard of those? I certainly haven't. Maybe I'm just getting old.)
MongoDB, on the other hand, is ranked 5th, right after the traditional giants: Oracle, MySQL, Microsoft SQL Server and PostgreSQL. Why has MongoDB succeeded where others failed?
Many attribute MongoDB's success to the fact that it's very performant, and doesn't require heavy-duty hardware. I think the answer is much simpler than that.
MongoDB is simple.
It's simple to install, and simple to use. With web applications being the new rulers of the software world, a database which stores information in the exact same fashion as it is being accessed (BSON / JSON), and can be queried using the same language used in both front and back-end components (assuming you're using Node.js), is a huge benefit. It doesn't require going through the nightmarish process of installing a Hadoop cluster and isn't riddled with confusing modules and add-ins.
Nowadays, when I envision a small to medium scale project, MongoDB is my first choice for data persistence.
However, moving away from traditional technologies sometimes comes at a price.
I discovered this when I tried to implement a seemingly simple piece of logic: on my website, I wanted items to display in a quasi-random order that stays consistent across multiple pages.
In short, every time someone accesses the website, they see the items in a random order - but if they scroll to the next page, they do not get any of the items that were already displayed on the first page. Conceptually, this can be achieved with a deterministic pseudo-random ordering of the items: the same seed always produces the same order, so each page is just a slice of one fixed sequence.
This logic, I discovered, was impossible to implement in MongoDB short of using a MapReduce function. Using standard queries I was unable to sort by a function or transformation of an existing field; I was only able to sort by the stored attributes themselves. A simple problem which could have been solved in Oracle or MSSQL with a seedable hash function became a major one that consumed a few days of my time, until I finally caved in and used a seedable implementation of the Fisher-Yates (Knuth) shuffle in Node.js.
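To make the idea concrete, here is a minimal sketch of what a seedable Fisher-Yates shuffle with paging could look like in Node.js. It is not the code from my project: the generator (Mulberry32, a small well-known non-cryptographic PRNG) and the helper names `seededShuffle` and `getPage` are assumptions chosen purely for illustration.

```javascript
// Mulberry32: a small seedable 32-bit PRNG (illustrative choice, not
// cryptographically secure). Returns a function yielding floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates (Knuth) shuffle driven by the seeded generator.
// Works on a copy, so the input array is left untouched.
function seededShuffle(items, seed) {
  const rand = mulberry32(seed);
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1)); // 0 <= j <= i
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Hypothetical paging helper: the same seed always yields the same order,
// so consecutive pages never overlap. `page` is zero-based.
function getPage(items, seed, page, pageSize) {
  const start = page * pageSize;
  return seededShuffle(items, seed).slice(start, start + pageSize);
}
```

In a setup like this, the seed would be picked once per visit (for example, stored with the session, which is again an assumption) and reused for every page request, so each visitor gets their own order but never sees the same item twice while paging.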
This approach would make any seasoned software architect cringe. The "data transformation at source" best practice is completely thrown out the window in favor of functionality. I could have used MapReduce - but I was, in fact, not convinced that it was the performance-superior approach!
Is MongoDB still the best solution for my project? The answer is a definite yes. The time I gained by using MongoDB allowed me to further expand my project and streamline the development process. However, I would still like to see further improvements to MongoDB, mostly around querying and aggregation. I believe that with these few additions, MongoDB really could become the ultimate data persistence solution for most modern software projects.
(I like to keep my work and personal projects separate, which is why I haven't linked my website here. But, if you'd like to see it, message me and I'll send you a link.)
I am currently based in Toronto, Canada and advise on Compliance systems implementation for financial institutions in areas such as AML, Market Surveillance and KYC on behalf of Matrix-IFS (a global financial services company).
If your organization is in need of such services please contact me on [email protected] and we can discuss cutting costs and improving coverage for your business.