On July 19th, MongoDB 6.0 was officially released. A number of the features were introduced during MongoDB World in June and are now available for customers to activate. This version continues MongoDB’s vision of providing a single data platform for developers to build modern software applications. It adds new capabilities to the platform that address additional application workloads, reducing the overhead of relying on point solutions for some data access patterns. The release also adds capabilities that improve security, ease of use and accessibility to new user types.

MongoDB’s end goal is for development teams to fold more data storage and processing use cases into their data platform, reducing costs and gaining productivity. Disparate point solutions for each data storage pattern create overhead in licensing costs, vendor relationships, training and coding interfaces. By consolidating all transactional application workloads onto MongoDB’s platform (or as many as possible), development teams reduce this overhead and simplify their operations. MongoDB is gradually chipping away at the need for multiple point solutions, as they leverage the flexibility of the document model to expand into adjacent data storage patterns.

As I discussed previously, in an environment of constrained IT budgets, opportunities for this workload consolidation should be well-received. Granted, I don’t expect that to kick off a slew of new migration projects (which can introduce costs themselves), but I do think it adds more emphasis to the question “Can’t MongoDB just do that?” for new application workloads. Supporting open source databases and home-grown data storage solutions requires dedicated staff. Those engineers can be redeployed to new revenue-generating application work, instead of maintaining a data processing capability that MongoDB offers from its cloud-based Atlas service.

Let’s review the major new features and capabilities included in the 6.0 release:

Time Series

I think MongoDB’s continued focus on fleshing out support for time series data storage patterns aligns well with the rapid emergence of this data format. Previously relegated to telemetry data from system logs, time series data creation is exploding, driven primarily by connected devices, asset tracking, industrial IoT and real-time events. With version 6.0, MongoDB is adding support for secondary and compound indexes on any field of a time series collection. This enables geo-indexing (retrieval of data by location, handy for fleets of IoT devices) and improved read performance.
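As a rough illustration of these index types, here is a minimal sketch assuming the PyMongo driver. The collection name `vehicle_readings` and the fields `ts`, `meta.vehicleId` and `location` are hypothetical and do not come from MongoDB's announcement. The helpers only build the option and key specifications, and `apply_to` shows where a PyMongo database handle would consume them.

```python
# Sketch only: collection and field names below are hypothetical.

def timeseries_options(time_field="ts", meta_field="meta"):
    """Options for creating a time series collection."""
    return {
        "timeField": time_field,   # timestamp of each measurement
        "metaField": meta_field,   # per-source metadata (e.g. vehicle id)
        "granularity": "seconds",
    }


def index_specs():
    """Key specifications for a compound index and a geospatial index."""
    return {
        # Compound secondary index: per-vehicle lookups, newest first.
        "by_vehicle_and_time": [("meta.vehicleId", 1), ("ts", -1)],
        # 2dsphere index on a GeoJSON field for retrieval by location.
        "by_location": [("location", "2dsphere")],
    }


def apply_to(db, name="vehicle_readings"):
    """Create the collection and its indexes on a PyMongo-style database handle."""
    coll = db.create_collection(name, timeseries=timeseries_options())
    for index_name, keys in index_specs().items():
        coll.create_index(keys, name=index_name)
    return coll
```

With a live deployment, `apply_to(MongoClient(uri)["fleet"])` would be the entry point, where `uri` and the `fleet` database name are placeholders.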

MongoDB Investor Session, June 2022

Version 6.0 also adds support for last point queries, which can quickly retrieve the most recent entry from a unique source. These round out the features added in prior 5.x releases, including time series collections, visualization support, window functions, native sharding, archive support, columnar compression and gap-filling. At this point, MongoDB offers a fairly feature-complete solution for time series data processing, allowing the platform to absorb these data workloads from other point solutions.
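To make the last point query concrete, the sketch below builds one common form of it as an aggregation pipeline: sort by source and descending time, then keep the first document per source. The field names `meta.sensorId`, `ts` and `temperature` are assumptions for illustration only.

```python
# Sketch only: field names are hypothetical defaults.

def last_point_pipeline(source_field="meta.sensorId",
                        time_field="ts",
                        value_field="temperature"):
    """Aggregation pipeline returning the most recent reading per source."""
    return [
        # Order by source, then newest first (dict order is preserved).
        {"$sort": {source_field: 1, time_field: -1}},
        # The first document in each group is the latest one for that source.
        {"$group": {
            "_id": f"${source_field}",
            "ts": {"$first": f"${time_field}"},
            "value": {"$first": f"${value_field}"},
        }},
    ]
```

A PyMongo collection would run this with `collection.aggregate(last_point_pipeline())`. A compound index on the same two sort fields is likely what lets the server answer the query quickly, though that is my assumption rather than something stated in the release material.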

Atlas Search

Search represents another common application workload in which MongoDB continues to chip away at any argument for a development team to maintain a separate tier of servers for search use cases. MongoDB’s search implementation is based on Apache Lucene, which is the same open source search engine at the core of Elasticsearch and Solr. Integrating Lucene into the MongoDB platform makes a lot of sense, as search indexes typically provide a fast method to perform data retrieval on an array of documents. This enables use cases in full-text search, faceted search (e-commerce product catalogs) and a variety of new data retrieval workloads for fraud detection, customer insights and order management.

MongoDB Investor Session, June 2022

While Atlas Search was first introduced in June 2020, the feature support initially was focused on full-text search. Over the following two years, the team added incremental improvements with support for synonyms and function scores. The version 6.0 release brings major new capabilities to MongoDB Atlas Search. The big addition is full support for search facets, which enable the collection, indexing and fast retrieval of metadata associated with documents. The common case is e-commerce product metadata, like size, color, price, etc. This metadata can be extended to many sophisticated “search” use cases outside of e-commerce, like the examples listed above. Version 6.0 also brings cross-collection searching, stored source fields and embedded documents in arrays.

As an example use case, leading real estate company Keller Williams uses Atlas Search to enable consumers and agents to search for properties on their web site KW.com. Leveraging the underlying MongoDB application data platform with integrated Atlas Search, the search engine supports full text searches by property address and faceted search by different parameters like price, year built, bedrooms, square footage, etc. Prior to the availability of faceted search on the MongoDB platform, this workload would typically be addressed by Elasticsearch or Solr. This provides another example of a separate data storage installation that could be folded into the MongoDB platform.
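A property search like this could be sketched as an Atlas Search facet query. The example below builds a `$searchMeta` stage that counts matching listings by bedroom count and price band. The field names, bucket boundaries and index name are all hypothetical, and the sketch assumes the search index maps `bedrooms` and `price` as facetable number fields. It says nothing about how Keller Williams actually implemented their search.

```python
# Sketch only: field names, boundaries and index name are hypothetical.

def property_facet_stage(query_text, index_name="default"):
    """Build a $searchMeta stage with number facets for a property search."""
    return {
        "$searchMeta": {
            "index": index_name,
            "facet": {
                # Full-text match on the listing address.
                "operator": {"text": {"query": query_text, "path": "address"}},
                "facets": {
                    # Buckets [1,2), [2,3), [3,4), [4,5); the rest go to "other".
                    "bedroomsFacet": {
                        "type": "number",
                        "path": "bedrooms",
                        "boundaries": [1, 2, 3, 4, 5],
                        "default": "other",
                    },
                    # Price bands; values outside the range go to "other".
                    "priceFacet": {
                        "type": "number",
                        "path": "price",
                        "boundaries": [0, 250000, 500000, 1000000],
                        "default": "other",
                    },
                },
            },
        }
    }
```

On an Atlas cluster, `listings.aggregate([property_facet_stage("main street")])` would return the bucket counts that drive the filter sidebar, where `listings` is a placeholder collection handle.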



In-app Analytics

MongoDB is not trying to become a data warehouse, but they recognize that an increasing amount of data processing is being allocated to drive features on applications that involve an analytical function. Examples extend to personalization, process optimization, preventative maintenance and risk assessment. A simple example is the prompt we often see on e-commerce sites highlighting that a product only has a few items left in inventory or recommendations for “products like this”. Previously, these analytical features would be calculated through a job in a data warehouse, with results pushed to a transactional database for the application to query. MongoDB can now short-circuit that process, by calculating the analytics summary and serving it within the same database.

MongoDB Blog Post

MongoDB has built many capabilities into the platform that enable robust in-app analytics. These include a flexible data model, a framework to aggregate and query data within time windows, support for long-running queries that generate a snapshot view and workload isolation. The last item prevents analytics load from affecting the performance of operational database transactions. With version 6.0, users get improvements to the $lookup aggregation stage and full support for sharding of lookup data. They are also wrapping up work on a new feature for column store indexes, which will dramatically speed up analytical queries, allowing MongoDB to handle more complex and broader calculations in real-time.
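As a sketch of what an in-app analytics query might look like, the pipeline below computes a rolling count of units sold per product over a time window and then joins in product details with `$lookup`. The `orders` and `products` collections and the fields `productId`, `orderDate` and `quantity` are hypothetical names chosen for illustration.

```python
# Sketch only: collection and field names are hypothetical.

def in_app_analytics_pipeline(days=7):
    """Rolling units sold per product, joined with product details."""
    return [
        # Time-window aggregation: sum quantity over the trailing `days` days.
        {"$setWindowFields": {
            "partitionBy": "$productId",
            "sortBy": {"orderDate": 1},
            "output": {
                "unitsInWindow": {
                    "$sum": "$quantity",
                    "window": {"range": [-days, 0], "unit": "day"},
                }
            },
        }},
        # Join each order with its product document.
        {"$lookup": {
            "from": "products",
            "localField": "productId",
            "foreignField": "_id",
            "as": "product",
        }},
    ]
```

An application could run `orders.aggregate(in_app_analytics_pipeline())` directly against the operational data and use the result to drive a prompt such as "selling fast".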

Event-driven Architecture Support

MongoDB introduced the concept of change streams to allow applications to listen for changes in a MongoDB dataset, without having to query the database continuously for an updated value. A change stream represents an API over the operations log that allows applications to subscribe to data changes in a collection. This capability makes it easy to support event-driven application architectures, which are designed to perform an action based on a trigger like a change in a data value or some other external event. These applications expect to react to data changes in near real-time. Change streams facilitate those event notifications, without requiring additional middleware (like Kafka) to generate and distribute those events.

MongoDB version 6.0 enhances change streams with new functionality that addresses a wider range of use cases. First, users can retrieve the before and after state of an entire document when it is updated. Second, the scope of change streams is expanding beyond data manipulation events to also include data definition events. This is useful for operators who need to track when an index or an entire collection definition is changed. Finally, MongoDB rolled out performance improvements for filtering and transforming notifications by optimizing the position they take in the change stream pipeline.
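The before and after document states can be sketched as follows, assuming a recent PyMongo driver whose `watch` method accepts the `full_document_before_change` option. The helper names are my own, and `db` stands for any PyMongo-style database handle.

```python
# Sketch only: helper names are hypothetical; `db` is a PyMongo-style handle.

def enable_pre_and_post_images(db, collection_name):
    """Ask the server to retain document images for change streams."""
    return db.command({
        "collMod": collection_name,
        "changeStreamPreAndPostImages": {"enabled": True},
    })


def watch_options():
    """Keyword arguments for watch() that request both document states."""
    return {
        "full_document": "whenAvailable",                # state after the change
        "full_document_before_change": "whenAvailable",  # state before the change
    }


def before_and_after(event):
    """Extract the (before, after) document pair from a change event."""
    return event.get("fullDocumentBeforeChange"), event.get("fullDocument")
```

In use, a loop over `collection.watch(**watch_options())` would pass each event to `before_and_after` to compare the two versions of the document.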

As enterprises are moving to more real-time data use cases to optimize their businesses, change streams provide a useful mechanism for applications to “listen” for certain changes to the data and then react appropriately. A simple example might be an inventory management system for an e-commerce application. A change stream could be set on the count for each product and generate an event when the value drops below a certain threshold. This could trigger a re-order job. Like the other feature changes in MongoDB 6.0, change streams reduce the dependency on an external data streaming system to collect these data values, trigger on a change and then distribute that event to listening applications. With MongoDB 6.0, event-driven applications could subscribe to the change stream directly.
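The inventory example could be sketched like this. The `count` field, the threshold value and the shape of the re-order request are assumptions for illustration. The filter runs on the server as part of the change stream pipeline, and the generator turns matching events into re-order requests.

```python
# Sketch only: the `count` field and the request shape are hypothetical.

def low_stock_pipeline(threshold):
    """Server-side filter: only updates that leave count below the threshold."""
    return [{"$match": {
        "operationType": "update",
        "fullDocument.count": {"$lt": threshold},
    }}]


def reorder_requests(events, threshold):
    """Yield a re-order request for each event whose count is below threshold."""
    for event in events:
        doc = event.get("fullDocument") or {}
        # Events without a count are skipped (the default is not below threshold).
        if doc.get("count", threshold) < threshold:
            yield {"productId": doc["_id"], "count": doc["count"]}
```

With PyMongo, `products.watch(low_stock_pipeline(10), full_document="updateLookup")` would supply the events, and each item from `reorder_requests(stream, 10)` could be handed to the re-order job. Here `products` is a placeholder collection handle.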

Other Changes

Besides these major additions in data storage types, there were a number of improvements to supporting platform infrastructure.

  • Atlas SQL Interface. This provides data analysts, who are used to working with SQL, with the ability to query and analyze Atlas data using popular SQL-based tools, like Tableau, Looker and PowerBI. MongoDB is also working on generic JDBC and ODBC SQL drivers.
  • Serverless. MongoDB Atlas Serverless instances are now generally available. Like other serverless database offerings, MongoDB Serverless provides development teams with the ability to access MongoDB data storage without having to worry about configuration or ongoing capacity management. Serverless is hosted on all three cloud vendors and provides tiered pricing. This approach can reduce costs for applications with variable usage patterns.
  • Cluster-to-Cluster Sync. This heavily requested feature makes it easy to keep multiple MongoDB clusters in sync. The capability continuously synchronizes data between clusters across multiple environments, like Atlas, private cloud and on-prem. It can also be applied to specialized instances, like development, test, staging and dedicated analytics clusters.
  • Queryable Encryption. This new data security capability allows customers to encrypt sensitive data from the client side, store it as fully randomized encrypted data on the database and run expressive queries on the encrypted data. With the introduction of Queryable Encryption, MongoDB claims to be the only database provider that allows customers to execute expressive queries, such as equality, range, prefix, suffix and substring on fully randomized encrypted data. In the 6.0 release, the feature is available in preview with equality queries supported, and the other query types are planned for later releases. This is a huge advantage for organizations that use expressive queries and need to keep the underlying data encrypted.
  • Encrypted Audit Logs. MongoDB 6.0 adds compression and encryption to audit logs before they are written to disk, to protect the integrity and confidentiality of the event data. Customers can utilize their own key management system for the encryption.

As you can see, MongoDB packed a lot into the version 6.0 release. The additional support for four different data storage patterns (time series, search, in-app analytics and real-time events) should allow development teams to apply their existing MongoDB installations to address new application workloads. Other platform enhancements in 6.0 make MongoDB accessible to more teams, easier to manage and better protected from data breaches. All of this should help drive further adoption of the platform by enterprise development teams looking to increase productivity and reduce vendor sprawl.

NOTE: This article does not represent investment advice and is solely the author’s opinion for managing his own investment portfolio. Readers are expected to perform their own due diligence before making investment decisions. Please see the Disclaimer for more detail.