As readers know, I am long on MongoDB, having made a purchase recommendation in Nov 2019 with a 5-year holding period. Growth will be driven by the company’s rapidly expanding product offering in an enormous addressable market. The database product is developer-friendly and is a popular choice for start-ups and new software projects. The pace of enterprise migrations is increasing as companies revisit their legacy database infrastructure and as MongoDB expands into adjacent use cases, like search and data warehousing.
I recently came across an article and a podcast that reinforced the growing opportunity for “NoSQL” solutions relative to relational databases. I will share the salient points and provide some analysis of what these trends mean for MongoDB’s growth trajectory.
To lay a foundation in terminology, NoSQL databases are non-relational, meaning that they have flexible schemas (the schema being the way the data’s structure is defined). This is opposed to a “SQL” or relational database, in which the schema is defined up front, usually as normalized tables (akin to spreadsheets), with the objective of minimizing the amount of data that is repeated. NoSQL database structures tend to be de-normalized, meaning that a lot of data is repeated, but it is organized into groupings that are more useful for the software application. MongoDB is a document-oriented database, which is the primary category of NoSQL databases (XML and graph databases are other categories of NoSQL).
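To make the normalized versus de-normalized distinction concrete, here is a minimal sketch in plain Python. The customer, product and order data are hypothetical and exist only for illustration; no real database is involved. The relational layout stores each fact once and links rows by ID, while the document layout repeats the product details inside the order so the application can read everything in one piece.

```python
# Hypothetical data for illustration only; none of these names come from the article.

# Relational style: each fact is stored once, in its own "table", linked by IDs.
customers = {1: {"name": "Ada"}}
products = {
    10: {"title": "Keyboard", "price": 45.0},
    11: {"title": "Mouse", "price": 20.0},
}
order_lines = [
    {"order_id": 100, "customer_id": 1, "product_id": 10, "qty": 1},
    {"order_id": 100, "customer_id": 1, "product_id": 11, "qty": 2},
]


def build_order_document(order_id):
    """Join the normalized tables into one de-normalized order document."""
    lines = [line for line in order_lines if line["order_id"] == order_id]
    customer = customers[lines[0]["customer_id"]]
    return {
        "_id": order_id,
        "customer": {"name": customer["name"]},
        # Product title and price are copied into the order (repeated data).
        "items": [
            {
                "title": products[line["product_id"]]["title"],
                "price": products[line["product_id"]]["price"],
                "qty": line["qty"],
            }
            for line in lines
        ],
    }


order_doc = build_order_document(100)
```

The document form duplicates product data across orders, but the application gets the whole order with a single read and no joins.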
Morgan Stanley “New Stack” Motion
In June 2019, Morgan Stanley published an article titled “Is New Stack the Future of Business Software?” The thesis is that custom software development is being recognized as a competitive differentiator for all enterprises and has risen to board-level discussion (much like cybersecurity did over the last 5-10 years). As with anything viewed as strategic, there is a tendency to optimize it. This has led to new expectations for speed of delivery and for software solutions that can accommodate rapid change.
This need to improve—and improve faster—has pushed the issue of software development outside the realm of the IT floor. “Much like cybersecurity over the past decade, we believe software development is emerging as a board-level discussion,” says Keith Weiss, head of Morgan Stanley’s U.S. software research team. “Business leaders increasingly recognize the close tie between how quickly a company can bring new software to market and their level of differentiation and competitiveness.”
In response, more companies are adopting new concepts such as agile development and DevOps—what Weiss and his team call “New Stack” software development methodologies—that drive higher developer productivity, more complete automation and, ultimately, faster software development.
Old Stack technologies typically incorporated one or two updates a year and were built with a tightly integrated structure that limited the ability to make changes—because everything in the structure typically had to be changed at once.
In contrast, New Stack approaches, such as Agile development, DevOps and so-called microservices architectures, have modular structures that allow for rapid iterative change that can incorporate customer feedback more quickly.
Morgan Stanley Article
Putting agile development methodologies and DevOps aside, the inclusion of microservices architectures and the emphasis on rapid change are important. Many legacy enterprises built their original application stacks on relational databases. Due to switching costs and inertia, there hasn’t been a sufficient driver to consider other data storage options. However, newer expectations for stand-alone applications, like mobile apps, and the isolation of infrastructure through microservices allow teams to select a new database implementation that matches the use cases of a particular app or service. This bodes well for non-relational (NoSQL) solutions, as they are either better suited to certain types of microservices (event data, user sessions, semi-structured product definitions, etc.) or are easy to connect to a mobile app by exposing a JSON-based API.
MongoDB is the leading document-based database solution, which is the largest category of NoSQL-style solutions. In our prior MongoDB analysis, we explained this in depth. As software teams in enterprises have the opportunity to rethink their data storage choices on a more granular level, MongoDB will benefit. MongoDB doesn’t have to win every selection, and shouldn’t, as there are cases in which another technology, like a graph database, might be a better choice. But if MongoDB wins a significant number of these evaluations, it should result in large revenue upside over the next several years, as the total addressable market for database solutions is enormous.
The article goes on to estimate that investment in “New Stack” solutions should reach $48 billion by 2022. I am not sure how they derived this number, but we can assume it is driven by general digital transformation spending estimates and the associated software investments. Regardless, it is significant.
Morgan Stanley also breaks down the spending categories and projects that NoSQL database solutions will represent the largest segment. “The Morgan Stanley team believes that the biggest piece of the pie will go to so-called NoSQL Databases, which allow developers to build applications quickly, that is, without the rigidity inherent in previous database management systems.”
With current-year MongoDB (MDB) revenue guidance targeted at about $408M, growth to some percentage of the $13.3 billion projected for the NoSQL segment by 2022 would be meaningful. Even 10% of this spend would be roughly $1.3 billion, or more than 3x current-year revenue guidance.
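The arithmetic behind that comparison can be sketched as follows. The two dollar figures are the ones quoted above; the 5% and 20% share scenarios are my own hypothetical additions for comparison, not estimates from Morgan Stanley or MongoDB.

```python
# Figures quoted in the text above.
current_revenue_guidance = 408e6  # ~$408M current-year MDB revenue guidance
nosql_spend_2022 = 13.3e9         # $13.3B spend projection for 2022


def implied_revenue_multiple(share):
    """Revenue multiple if MongoDB captured `share` of the projected spend."""
    return share * nosql_spend_2022 / current_revenue_guidance


# 10% is the scenario from the text; 5% and 20% are hypothetical comparisons.
for share in (0.05, 0.10, 0.20):
    captured = share * nosql_spend_2022
    multiple = implied_revenue_multiple(share)
    print(f"{share:.0%} share: ${captured / 1e9:.2f}B, {multiple:.1f}x guidance")
```

At a 10% share, the captured spend is $1.33 billion, which works out to about 3.3 times the $408M guidance.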
NoSQL Optimization Podcast on SED
Software Engineering Daily is one of my favorite podcasts about trends in modern software development. On yesterday’s show, they interviewed Rick Houlihan, who is a leader within Amazon Web Services. Rick’s role is to work with internal and external database teams to optimize Amazon products and database infrastructure for their needs. The podcast focused on the history of SQL / NoSQL solutions and applicability of both for modern software solutions.
A lot of the podcast dove into the performance and scalability of both database types. This was approached from the historical perception that NoSQL solutions don’t scale, or that they suffer from performance issues when applied to application development. NoSQL databases are generally easier for developers to implement, as the schema more naturally mirrors the object model used in the application code and/or simplifies translation into a JSON response to support a REST API for mobile app consumption. But the assumption has been that this flexibility comes at a significant performance cost.
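As a small illustration of the JSON point, here is a sketch using only the Python standard library. The product document and the `to_api_response` helper are hypothetical. Because the stored document already has the nested shape of the API payload, producing the response is a single serialization step. Real document stores may return extra types (identifiers or dates, for example) that need converting first, which this sketch ignores.

```python
import json

# Hypothetical stored document for a product (illustrative fields only).
product_doc = {
    "sku": "KB-100",
    "title": "Keyboard",
    "tags": ["usb", "mechanical"],
    "dimensions": {"width_cm": 44, "depth_cm": 13},
}


def to_api_response(doc):
    """Serialize a stored document as the body of a JSON API response."""
    return json.dumps(doc, sort_keys=True)


body = to_api_response(product_doc)
```

With a normalized relational schema, the same response would usually require joining several tables and assembling the nested structure in application code first.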
Rick took the position that NoSQL databases can perform as well as or better than a SQL solution if the data is structured for the query patterns of the application being built. This is the case in most OLTP applications, where the app interacts with the database through a fairly finite set of read and write actions. Rick gave the example of a shopping cart on an e-commerce site. The site has a limited number of data interactions to support: add item to cart, list cart contents, remove item from cart, etc. For this case, the NoSQL database structure could be set up in a way that is optimized for this query pattern. Specifically, a document store (MongoDB is one) would be well-suited. Rick cited other generalized examples where this would apply as well, like user session data and event logs.
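The shopping cart case can be sketched as follows. This is a toy in-memory stand-in, not MongoDB or DynamoDB code, and the function names and SKUs are hypothetical. The point is that each of the three operations touches exactly one cart document, looked up by its key, so the structure matches the query pattern.

```python
# In-memory stand-in for a document collection keyed by cart ID (hypothetical).
carts = {}


def add_item(cart_id, sku, qty=1):
    """Add an item to the cart, creating the cart document if needed."""
    cart = carts.setdefault(cart_id, {"_id": cart_id, "items": {}})
    cart["items"][sku] = cart["items"].get(sku, 0) + qty


def list_items(cart_id):
    """Return a copy of the cart contents with a single lookup by key."""
    return dict(carts.get(cart_id, {"items": {}})["items"])


def remove_item(cart_id, sku):
    """Remove an item from the cart if it is present."""
    carts.get(cart_id, {"items": {}})["items"].pop(sku, None)


add_item("cart-1", "KB-100")
add_item("cart-1", "MS-200", qty=2)
remove_item("cart-1", "KB-100")
```

No operation needs to scan or join across carts, which is why a known, finite set of access patterns suits a document layout.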
SQL solutions, on the other hand, are necessary if the query patterns are less deterministic, like for ad hoc analytics support. In this use case, data analysts might run all types of data queries, joining multiple data types and applying various sort/grouping operators. For these use cases, the structure of a relational database will perform better. This is generally the world of data warehousing and OLAP queries.
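For contrast, here is a minimal sketch of an ad hoc analytical query using Python's built-in `sqlite3` module. The tables and values are hypothetical. The question being asked (revenue by region) was not anticipated when the schema was designed, yet the normalized tables can answer it with a join and a grouping.

```python
import sqlite3

# Hypothetical normalized tables, held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'EU'), (2, 'US'), (3, 'US');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 2, 30.0), (3, 3, 45.0), (4, 2, 25.0);
""")

# An ad hoc question nobody planned for: revenue by region, largest first.
rows = conn.execute("""
    SELECT c.region, SUM(o.total) AS revenue
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    GROUP BY c.region
    ORDER BY revenue DESC
""").fetchall()
```

An analyst could just as easily group by customer or filter by order size without changing how the data is stored.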
I realize there is more nuance to database selection and hard-core engineers will probably debate some of this. What was encouraging to me, though, was how strong an advocate Rick was for NoSQL solutions and the breadth of data storage use cases for which he felt they are appropriate. He drew on his personal history at Amazon as well, describing a project he led about 5 years ago to migrate Amazon databases off of an Oracle installation and into a set of NoSQL systems.
Rick did briefly compare some scalability capabilities of MongoDB versus DynamoDB, specifically around how MongoDB handles sharding. He feels that DynamoDB’s solution is superior, as MongoDB will eventually bog down as more shards are needed, because MongoDB creates shards more deliberately. But he also acknowledged that DynamoDB’s approach of supporting more dynamic sharding impacts retrieval speed.
Regardless, my takeaway from this discussion was a broader appreciation for the number of application use cases that a NoSQL database can serve. Among document databases, MongoDB is the most popular.
Going Forward
I realize these are two specific examples and are somewhat technical, but I feel it is important to understand the fundamental drivers of a software solution’s growth in the marketplace in order to make a long-term investing evaluation of it. In this case, I think that the report from Morgan Stanley and the perspective of a leader from AWS provide more justification for the momentum behind the NoSQL movement. The likely beneficiaries are the companies that provide NoSQL solutions. MongoDB (MDB) is a leader in this space.