Investing analysis of the software companies that power next generation digital businesses

Cloudflare Platform Week Recap – Data

(Revised May 22nd. Added comparison to hyperscaler solutions at end)

Cloudflare held Platform Week from May 9th – 16th. The event focused on enabling developers to build rich applications using the Cloudflare platform. Having eagerly awaited another innovation week, I wasn't disappointed. Platform Week was packed with more product improvements and new offerings than previous innovation weeks, once again demonstrating Cloudflare's accelerating pace of product development. In fact, because the week included so many announcements, I am breaking my coverage into several blog posts.

First up is data. In order to build a modern Internet application, developers need access to multiple mechanisms for data storage and processing. These include caches, databases, object stores, event logs and message queues. With the new products introduced during Platform Week, Cloudflare now checks all these boxes.

What is most amazing is how rapidly Cloudflare’s data offerings have progressed. It was less than two years ago that the second data storage solution, Durable Objects, was introduced to supplement their key-value store. Now, we have several more capabilities available, with hints of others on the way. Several years ago, I don’t think any analyst would have predicted that Cloudflare could one day generate as much revenue from data services as from their network-based products. That now seems to be a probable outcome as monetized data products proliferate.

With the addition of each new data processing solution, Cloudflare removes another objection to developer adoption of Workers. Feature parity with other cloud-hosted development environments allows Cloudflare’s inherent architectural advantages to shine through. Applications hosted on the Cloudflare platform respond to requests from every one of Cloudflare’s 270+ locations in parallel. This is because every data center runs code for every application – there are no designated regions or availability zones.

Before the recent progression of data storage options, use cases for Workers were limited to those requiring little state management. As Cloudflare’s data storage and processing capabilities expand, developers can consider hosting any kind of application on Cloudflare. At minimum, the most latency sensitive portions would benefit from the scalability, responsiveness, localization and security of Cloudflare’s fully distributed runtime.

For an increasing number of use cases, Cloudflare's architectural advantages outweigh the benefits of running all application services centrally. While many applications will still be well-served by a centralized architecture, that is no longer the default. Delivering some application services or experiences from a globally distributed runtime will improve performance. Data services should goose demand by opening up the Cloudflare Workers platform to many more use cases.

We are already seeing the tip of the iceberg. On the Q1 earnings call, the CEO referenced a couple of customer wins for Workers. A large social network (likely Meta) agreed to a $3M engagement over 5 years to utilize Workers for code authentication within their messaging product. This represents a high scale use case that benefits from the responsiveness and high throughput provided by Cloudflare’s distributed architecture. A large Australian software company (likely Atlassian) signed a preliminary $145k deal for Workers and Durable Objects to power a collaboration feature. Their utilization is expected to grow over time.

Cloudflare Investor Day Presentation, May 12, 2022

As Cloudflare’s data storage capabilities grow, we can expect more customer wins like these with larger financial commitments. At Investor Day during Platform Week, the CFO shared that 15% of paid customers in 2021 utilized the Workers product. With these additional data capabilities, penetration will likely expand further. Let’s dive into the announcements of data product offerings that emerged from Platform Week.

R2 Open Beta

While not a new reveal, this product will have significant impact. R2 is Cloudflare’s object store, modeled after Amazon’s S3 service. It was introduced in closed beta as part of Birthday Week in September 2021. As part of the Q4 earnings call, the CEO mentioned that R2 had over 9,000 customers in the closed beta, including some “incredible logos”, representing hundreds of petabytes of data.

With the open beta announcement, any user can get access to the product. Self-service users can activate R2 in their user dashboard. Enterprise accounts can work with their customer service rep to activate it. The leadership team has indicated that they expect R2 to move to GA status by Q4 of this year.

With the open beta announcement, Cloudflare reiterated pricing for R2. Fees will accrue based on total volume of data stored per month and the number of operations performed on it. These operations are divided into two categories, which are generally determined by whether the action is a data read or write operation. Writes are more expensive, as expected. In no case is the customer charged for egress (moving data out of the Cloudflare bucket). Overall costs run about 10-30% cheaper than the list price for Amazon S3.

  • Storage is priced at $0.015 / GB, per month
  • Class A operations (create, write) cost $4.50 / million
  • Class B operations (read) cost $0.36 / million

Cloudflare also introduced a free usage tier, in order to enable companies to begin using R2 without accruing costs. This allows for 10 GB of data storage, 1M class A operations and 10M class B operations. These thresholds are applied monthly. There is a qualifier in the blog post that “while in the open beta phase, R2 usage over the free tier will be billed.” I suppose this implies that Cloudflare may generate some revenue from R2 during the open beta, if a customer starts to make heavy use of the product.
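To make the pricing model concrete, here is a back-of-envelope calculator using the published beta rates and free-tier thresholds. This is my own sketch of the arithmetic; actual billing mechanics may differ.

```python
# Rough sketch of R2's published beta pricing, net of the free tier.
# Rates and thresholds are from Cloudflare's announcement; actual
# billing mechanics may differ.

FREE_STORAGE_GB = 10
FREE_CLASS_A = 1_000_000       # create/write operations per month
FREE_CLASS_B = 10_000_000      # read operations per month

def r2_monthly_cost(storage_gb, class_a_ops, class_b_ops):
    """Estimate a monthly R2 bill in dollars."""
    storage = max(storage_gb - FREE_STORAGE_GB, 0) * 0.015          # $/GB-month
    writes = max(class_a_ops - FREE_CLASS_A, 0) / 1_000_000 * 4.50  # $/million
    reads = max(class_b_ops - FREE_CLASS_B, 0) / 1_000_000 * 0.36   # $/million
    return storage + writes + reads

# A workload of 1 TB stored with 5M writes and 50M reads per month:
print(round(r2_monthly_cost(1_000, 5_000_000, 50_000_000), 2))
```

Note that with no egress fees, the bill is driven entirely by storage volume and operation counts, which makes costs easy to forecast.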

While Cloudflare has made great progress with the R2 product, they still have work to do to bring it to suitable performance and feature completeness. This is the purpose of the beta period. Load on R2 will be capped at 1,000 read and 100 write operations per second per bucket. Cloudflare intends to raise these limits as they get more experience with the system. These are already pretty high, but wouldn’t be suitable for the largest enterprises.

One of the powerful capabilities of R2 will be that the user doesn’t need to designate the geographic region where the data will be located. Cloudflare’s goal is to make R2 data available to users anywhere on the globe without pinning it permanently to a particular location. In the near term, though, most of the data is located in North America and incurs latency when requested from other locations. Cloudflare is adding storage to more global locations and developing the ability to cache data closer to the geographies where it is being requested for better performance.

Finally, they are working on additional features. First is a TTL (Time to Live), which would allow older content to be purged from a bucket after a set time limit. This is useful for reducing storage costs. They also want to support public buckets, which would exist separately from a Worker instance. Lastly, they will add pre-signed URLs for access control and tighter integration with Cloudflare's cache to facilitate faster delivery to multiple geographic points.

All in all, I think R2 will be a valuable addition to the Cloudflare platform. Initial feedback from enterprise customers is that they see R2 as an opportunity to distribute data across multiple cloud providers and to easily share data with partners. Combined with the new Pub/Sub product (described below), we can start to see the opportunity for Cloudflare to power a system for data distribution. In my first look at their Q1 earnings report, I performed some rough calculations of R2's financial impact. Given the usage mentioned on the Q4 earnings call (hundreds of petabytes of data), I estimated that R2 could generate more than $30M a year in incremental revenue. That would likely hit scale in 2023, though we may see a smaller benefit in Q4 2022, following GA.
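That estimate is easy to sanity-check from the storage rate alone. A quick back-of-envelope, where the 200 PB data volume is my own illustrative assumption within the "hundreds of petabytes" disclosed:

```python
# Back-of-envelope: annual R2 storage revenue at the published rate of
# $0.015/GB-month. "Hundreds of petabytes" is the only disclosed figure;
# 200 PB is my own illustrative assumption (decimal units, 1 PB = 1M GB).
PRICE_PER_GB_MONTH = 0.015
petabytes = 200
gigabytes = petabytes * 1_000_000
annual_storage_revenue = gigabytes * PRICE_PER_GB_MONTH * 12
print(f"${annual_storage_revenue / 1e6:.0f}M per year")  # → $36M per year
```

And that excludes operation fees on reads and writes, which would add to the total.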

D1 SQL Database

This was probably the highlight of the innovation week. Cloudflare announced a new SQL database product. It will be available as a beta for select customers to start using in June. To indicate interest, customers can complete a short sign-up form. They haven’t announced pricing, but indicated that D1 would be priced similar to R2 and other data products, based on total storage and number of operations by type.

D1 is built on top of SQLite, which is a popular open-source SQL database engine. Technically, SQLite is an embedded database, which means it is implemented as a software library that developers can import into their own application. In this context, it is often used to provide a database function within mobile devices, browsers and stand-alone hardware. Unlike MySQL or PostgreSQL, SQLite doesn’t run a database server. This implementation works perfectly for Cloudflare, which effectively provides the runtime that can be invoked as a Worker process on request. Data for SQLite is stored directly in files on disk, which also aligns well with Cloudflare’s architecture. Durable Objects appear to be the backing data storage mechanism.

Using the SQLite implementation on top of Workers and Durable Objects, D1 will function like any relational database. It is ACID-compliant, meaning it supports transactions (which are important to ensure data integrity). It contains all the standard features of SQL, including tables, indexes, triggers, views and the full range of SQL commands. It is compact, fast, well-tested and extensible. This represents a solid choice for Cloudflare, without the encumbrance of a restrictive license. I think it also accelerated time to market, versus trying to build a relational database from scratch.
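Because D1 is built on SQLite, the capabilities described above can be demonstrated with the stock SQLite engine itself. This sketch uses Python's bundled sqlite3 module as a stand-in (D1's actual API hasn't been published) to show tables, an index and an atomic transaction:

```python
import sqlite3

# Stand-in for D1 using the stock SQLite engine via Python's sqlite3
# module. D1's actual API hasn't been published; the SQL is standard.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
db.execute("CREATE INDEX idx_customer ON orders (customer)")

# Transactions keep multi-statement writes atomic (the ACID guarantee).
with db:
    db.execute("INSERT INTO orders (customer, total) VALUES (?, ?)", ("ACME", 125.50))
    db.execute("INSERT INTO orders (customer, total) VALUES (?, ?)", ("ACME", 74.50))

(count, total), = db.execute(
    "SELECT COUNT(*), SUM(total) FROM orders WHERE customer = ?", ("ACME",))
print(count, total)  # → 2 200.0
```

The same query patterns would run against a D1 binding from Worker code, with Cloudflare handling where the underlying data actually lives.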

Combined with Cloudflare's fully distributed architecture, some interesting advantages emerge for applications that use this SQL database to serve a global audience. Because the data set is backed by Durable Objects, D1 automatically gets the benefit of having a single reference copy of the data for writes, with a caching layer to distribute reads. D1 will also handle read replication automatically, maintaining copies of the database near clusters of users and managing the distribution of updates to all copies. This removes the overhead for developers or DevOps teams of managing replication for high scale applications.

D1 will support batching of queries, which will provide higher throughput when the Worker invoking the database request is geographically separated from the database instance. Along these lines, developers will be able to tag code that is specific to database operations. That code will then be run on a Worker close to the database instance, while the code initially responding to the user’s request is maintained in close proximity to that user. Finally, database back-ups will be performed automatically and stored in R2. Database restores can be performed from the Admin console.

All in all, D1 represents an important addition to the application developer's toolset. As part of the announcement, Cloudflare provided a sample application implemented on Workers with D1, modeled after "Northwind Traders". Northwind Traders is a sample database that Microsoft has used for many years to demonstrate a basic relational database application. You can see from Cloudflare's implementation that D1 provides the typical data storage and querying patterns that you would expect for any e-commerce web application.

Cloudflare once again demonstrated the composability of their platform in building D1. The underlying data storage primitives make use of Durable Objects and R2 handles back-ups. These primitives accelerated the development of the product, by allowing Cloudflare’s engineers to re-use their own building blocks in the implementation of D1. This composability is a big part of the reason that Cloudflare’s product development cadence is able to accelerate.

Pub/Sub – Programmable Messaging

While D1 got most of the publicity, I think that Cloudflare's announcement of a new publish-subscribe messaging service is equally important. This provides the first step of a framework for intelligent routing of data across Cloudflare's network. The publish-subscribe model allows two entities to exchange data formatted as messages. Messages can represent any succinct data packet, like an event (a customer purchased something) or a sensor data value (temperature is 85 degrees). Publishers create the message packets and subscribers consume them. In a publish-subscribe system, there can be many publishers and many subscribers, and neither needs to know about the other. The pub-sub system handles all the data collection and distribution. To allow for selective subscription to only certain types of data, messages can be tagged with a "topic". Subscribers can then choose to receive messages for one or more topics only.

Cloudflare’s implementation of publish-subscribe is based on MQTT, which is a commonly used industry standard generally associated with IoT data distribution. It has been embedded into millions of devices already. Common use cases include distribution of sensor readings, telemetry, financial transactions and mobile notifications. The protocol is flexible, allowing developers to make trade-offs in reliability, topic hierarchy and persistence specific to their use-case.

MQTT Web Site, Sample Implementation

MQTT has been implemented in a number of industries, generally for collecting data from many devices and distributing it to central systems for aggregation, decision making and control. Common uses are in manufacturing, transportation, energy, smart home and logistics. These are often bundled into the label IoT.
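Central to MQTT's flexibility is its slash-delimited topic hierarchy. Subscribers filter messages using the `+` (single level) and `#` (all remaining levels) wildcards defined in the MQTT specification. A minimal sketch of that matching logic (my own illustration, not Cloudflare's broker code):

```python
def topic_matches(filter_str, topic):
    """Check an MQTT topic against a subscription filter, supporting
    '+' (exactly one level) and '#' (this level and everything below)."""
    f_parts = filter_str.split("/")
    t_parts = topic.split("/")
    for i, f in enumerate(f_parts):
        if f == "#":                      # matches all remaining levels
            return True
        if i >= len(t_parts):
            return False
        if f != "+" and f != t_parts[i]:  # '+' matches any single level
            return False
    return len(f_parts) == len(t_parts)

print(topic_matches("factory/+/temperature", "factory/line1/temperature"))  # True
print(topic_matches("factory/#", "factory/line1/pressure"))                 # True
print(topic_matches("factory/+/temperature", "factory/line1/pressure"))     # False
```

This hierarchy is what lets a fleet of devices publish into a shared namespace while each subscriber receives only the slice it cares about.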

What’s interesting about Cloudflare’s implementation is that they will allow developers to write code to run in Workers that can filter, modify or aggregate messages as they are published to the broker, but before they are distributed to subscribers. Because Worker code can be invoked at any of Cloudflare’s data centers throughout the world, this “pre-processing” provides a major advantage over other centralized publish-subscribe systems. Those systems have to ship all messages to a central set of brokers for message processing. That approach requires the full message packet to cross the network before it can be filtered, combined or discarded, generating a lot of bandwidth consumption for a noisy system. This local pre-processing advantage is inherent to Cloudflare’s “run anywhere” architecture.
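To make the pre-processing idea concrete, here is the kind of filter-and-aggregate logic a developer might run against messages before the broker distributes them. This is my own illustration in Python; Cloudflare's actual hooks run as Worker code and the API hasn't been fully published.

```python
# Illustration of edge pre-processing: drop readings that barely changed,
# then forward a compact aggregate instead of every raw message. This
# mimics logic a Worker could apply at the broker; not Cloudflare's API.

def preprocess(readings, threshold=1.0):
    """Keep only readings that moved at least `threshold` from the last
    kept value, then return a small summary message for subscribers."""
    kept, last = [], None
    for value in readings:
        if last is None or abs(value - last) >= threshold:
            kept.append(value)
            last = value
    return {"count": len(kept), "avg": sum(kept) / len(kept)}

# Ten raw sensor readings collapse into one small message:
print(preprocess([20.0, 20.1, 20.2, 22.5, 22.6, 25.0, 25.1, 25.0, 27.9, 28.0]))
```

Running this at the data center nearest the sensors means only the summary crosses the network, which is exactly the bandwidth advantage described above.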

Cloudflare’s Pub/Sub offering is being introduced as a private beta. To gain access, interested customers can complete a short sign-up form. The Cloudflare Pub/Sub product team will review applications and start inviting developers to use the product in June. They intend to publish pricing at some point during the beta period.

I don’t want to get too far ahead of myself, but I think Pub/Sub will represent an important capability for Cloudflare and unlock many use cases in edge compute and data distribution. This is particularly grounded in the emerging space of industrial IoT and the need to collect and route large amounts of data packets between smart devices. Given Cloudflare’s distributed runtime, their implementation will have significant advantages over other approaches that are built on top of centralized infrastructure.

As the Pub/Sub offering evolves, it may also start to encroach on other systems for real-time data distribution, including those based on Apache Kafka (e.g. Confluent). Granted, Cloudflare's implementation with MQTT represents a subset of functionality in a full-featured distributed event store, but Cloudflare has demonstrated the ability to iterate quickly from a first product incarnation to a broader product set (as they are doing with their other data offerings).

Worker Analytics – Time Series Data

Another announcement that could become very interesting over time was the Workers Analytics Engine. On the surface, this appears to be a simple data store for telemetry data. Internal teams at Cloudflare have been using it to collect metrics for Workers performance and R2 utilization. However, the underlying implementation is more properly a time series database. This combines cleanly with the new Pub/Sub service to provide a destination for IoT sensor data.

Developers can create an Analytics Engine instance in their environment and then populate it through Workers code. Data can be written in the form of “data points”, which consist of labels and metrics (which are just numbers). Each data point also has a timestamp associated with it. Once written, data can be read through a rich SQL API. This allows for retrieving metrics data based on combinations of labels. The data can also be graphed by popular time series visualization tools like Grafana.
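The data model is straightforward: each data point is a set of labels, numeric metrics and a timestamp, queried back out with SQL-style aggregations. A sketch of that shape in plain Python (the Analytics Engine's actual write and SQL APIs are Cloudflare's; the names below are my own illustration):

```python
from collections import defaultdict

# Sketch of the Analytics Engine data model: labels + numeric metrics +
# a timestamp per data point. Plain-Python stand-in for illustration.
points = [
    {"labels": {"sensor": "line1"}, "metrics": {"temp": 20.5}, "ts": 1000},
    {"labels": {"sensor": "line1"}, "metrics": {"temp": 21.5}, "ts": 1060},
    {"labels": {"sensor": "line2"}, "metrics": {"temp": 30.0}, "ts": 1010},
]

# Equivalent of: SELECT sensor, AVG(temp) FROM points GROUP BY sensor
totals = defaultdict(lambda: [0.0, 0])
for p in points:
    key = p["labels"]["sensor"]
    totals[key][0] += p["metrics"]["temp"]
    totals[key][1] += 1
averages = {k: s / n for k, (s, n) in totals.items()}
print(averages)  # → {'line1': 21.0, 'line2': 30.0}
```

Grouping and aggregating by label over time windows is the core query pattern for any time series store, which is why tools like Grafana plug in so naturally.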

The Workers Analytics Engine is being introduced as a closed beta, for which customers can request access. We don’t have pricing for this service, but I suspect it will be similar to other data stores. What I think is interesting about this new product offering is that it addresses another common data storage workload for applications. There are other commercial implementations of time series databases, like InfluxDB which is managed by InfluxData.

Combined with Cloudflare’s other data offerings, they now offer a key-value store, object storage, a relational database and a time series database. Those provide developers with a lot of options. A time series database in particular serves IoT workloads. When combined with the new Pub/Sub service and Cloudflare’s distributed edge network, it should provide a powerful backplane for emerging use cases around industrial IoT and the coordination of fleets of smart devices.

Investor Take-aways

I think these new capabilities for data storage and processing take Cloudflare to the next level of application use cases. A full-featured database and object store provide developers with the tools that they expect from other cloud-based application hosting platforms. It significantly expands the surface area of applications that could be hosted on Cloudflare’s platform, going beyond the handful that were highlighted during the Q1 earnings call.

While Cloudflare’s platform wouldn’t completely replace centralized application hosting offerings from the hyperscalers, their distributed architecture allows developers to consider the parts of their applications that would benefit from the responsiveness of a globally distributed, serverless runtime. As service-oriented architectures proliferate through the adoption of micro-services, it becomes easier for developers to run some of these services on the edge, while keeping others in the central data center.

For example, an IoT data collection service would work well on Cloudflare’s platform, particularly if those devices were distributed across a broad geographic area. Device data could be sent to collection points running on Workers within each of Cloudflare’s data centers. Those instances could perform some initial pre-processing of the data and then use the Pub/Sub service to ship the data packets to centralized subscribers for permanent storage and deeper analysis. Smaller datasets could also be staged and queried in the new time series database offered as part of the Workers Analytics Service.

Cloudflare Investor Day Presentation, May 12, 2022

As part of Investor Day, Cloudflare leadership shared that 15% of all customers were paying for the Workers product in 2021. Additionally, as of Q1, about 450k developers had used the Workers platform since 2017. Cloudflare’s CEO mentioned that their internal goal is to more than double that value this year to 1M developers. That would represent a significant ramp in adoption.

As more developers use the Workers platform, they will consume more Workers resources. With Cloudflare’s new products announced at Platform Week, developers have several more options for data storage and distribution. These will each be monetized products, with a pricing model based on usage. If Cloudflare achieves the planned growth in adoption of Workers and data products start getting stuffed with application data, we could see material revenue contribution. In past earnings calls, the leadership team highlighted several 5-figure and even 6-figure customer deals that utilize Workers on existing data storage options like Durable Objects.

With an expanded data product suite, we should see the frequency and size of these Workers deals increase significantly in 2023. This growth will be combined with Cloudflare’s expansion in network services and Zero Trust, which each had 10% customer penetration in 2021. These adjacent product categories will provide Cloudflare with the customer spend expansion momentum to sustain their high levels of revenue growth going forward.

I also appreciate that these new products are driving increases to Cloudflare’s CapEx spend and capping free cash flow margin improvement in the near term. I discussed this as part of my first look at the Q1 earnings report. Due to several factors, like their distributed architecture, high server utilization and “all services run everywhere” model, I actually think this CapEx spend will turn out to be very efficient. Zscaler’s cash generation provides a reasonable proxy for expected leverage, at least in the sphere of network security services. Cloudflare should be able to maintain similar efficiencies as they grow their compute and data service offerings.

(Addendum – May 22)

I have received a few offline questions and a comment about how Cloudflare’s new data storage and processing solutions are different from and potentially better than those offered by the hyperscalers. For comparison, we can look at what is available from Lambda@Edge within AWS. For those not familiar with it, Lambda@Edge provides an AWS Lambda runtime that can be invoked from Amazon CloudFront locations (CloudFront is Amazon’s CDN). This allows developers to write scripts in JavaScript and Python that can be run in a serverless manner like AWS Lambda. However, there are a number of restrictions for Lambda@Edge that limit the scalability and resource access, as compared to regular Lambda. The concurrency thresholds and expected responsiveness for high throughput loads are much lower than what Workers could handle.

Even using CloudFront for Lambda@Edge imposes some geographic limitations. CloudFront has edge locations near about 90 cities globally (versus 270+ for Cloudflare). Also, any sophisticated functionality or state management for a Lambda@Edge function means it will be run from the closest CloudFront regional edge cache. There are only 13 of these globally and they are generally co-located within an AWS region.

The biggest restriction as it relates to data storage and processing has to do with access to any external resources, like data storage (DynamoDB), data stream processing (Kinesis) or object storage (S3). The key language is that Lambda@Edge cannot support "configuration of your Lambda function to access resources inside your VPC." Resources inside your VPC (virtual private cloud) refers to any other AWS services that would normally be accessible locally to the runtime. Those resources are only available from the origin AWS region, of which there are 25 globally. These resources can be duplicated across multiple regions for broader global coverage, but the developer has to explicitly select the regions where they expect their users to be located. To get full global coverage, the developer could activate all 25 regions, but that would incur substantial additional costs.

Also, the developer is responsible for determining which regional resource to utilize based on the entry point from CloudFront / Lambda@Edge. This can become very cumbersome for a truly global audience. DynamoDB can be used as a key-value data store with data replicated to different regions through global tables, but again, the developer has to select which regions should be used and which tables to replicate. This blog post from AWS summarizes the different methods to access external data stores from Lambda@Edge and highlights the limitations mentioned above.

AWS Blog Post, “Leveraging external data in Lambda@Edge”

The diagram above, taken from the blog post, illustrates the recommended method of accessing a DynamoDB database from a Lambda@Edge function. This is routed back to AWS Lambda in the closest region and then directed to the DynamoDB global table that is available in that region. The developer has to designate which regions to use in these cases and manage the replication explicitly. So, it is possible to reference an external database, data stream or object store (S3) from a Lambda@Edge function, but it incurs the network overhead of routing back to an origin region.

For these reasons, Lambda@Edge is best suited for functionality that requires limited state management. There are techniques that can be used to cache some state locally in the edge location, like global variables or JSON files, but these wouldn’t provide the convenience or capabilities of a full-fledged relational database, time series data store, object store or Publish-Subscribe system. Also, this functionality generally gets invoked in the regional edge cache, which is more constrained geographically.

With Cloudflare’s architecture, these data storage and processing services are available from all edge locations in parallel. The developer doesn’t need to consider where the database, object store or stream processing brokers will be located. They simply reference that resource from their Worker code and Cloudflare’s platform handles the rest, including data replication and access optimization automatically. This will result in much lower latency and higher throughput than trying to run complex, data resource intensive workloads on Lambda@Edge.

Similarly, for data streaming, without a performant local data store, more sophisticated pre-processing use cases couldn’t be handled easily by Lambda@Edge combined with Kinesis. For example, if the developer wanted to queue a set of sensor data messages and then perform some sort of aggregation, filtering or combining with other data sets, that would only be possible on Cloudflare’s implementation. With Lambda@Edge, only pre-processing scoped to each message body individually would be possible. Additionally, with the Pub/Sub broker service running in each of Cloudflare’s edge data centers, routing of messages between data centers would be possible, versus sending everything back to origin and then out to the edge. This would be applicable in point-to-point IoT use cases, versus those where all sensor data is shipped back to a central processing location to make a decision.

NOTE: This article does not represent investment advice and is solely the author’s opinion for managing his own investment portfolio. Readers are expected to perform their own due diligence before making investment decisions. Please see the Disclaimer for more detail.

13 Comments

  1. hhh

    Thanks Peter for another great article.

    Regarding Pub/Sub advantage of cloudflare –

    “Because Worker code can be invoked at any of Cloudflare’s data centers throughout the world, this “pre-processing” provides a major advantage over other centralized publish-subscribe systems.”

    Could this not be done by Lambda@Edge in the case of, say, AWS?

    • poffringa

      Hi – thanks for the feedback. Based on this question and a few others I received, I provided an Addendum to the blog post above to address the differences and advantages that Cloudflare’s architecture provides over addressing similar use cases on Lambda@Edge.

  2. Priya

    Thanks Peter. With Cloudflare moving swiftly into the data space, will this impact MDB’s growth prospects since many of the new applications & use cases can benefit from the decentralized architecture of NET? They were able to quickly overtake ZS in Zero Trust even though ZS had a multiple year head start.

    • poffringa

      Currently, I think MongoDB and Cloudflare's data offerings have a good amount of distance between them. MongoDB is still largely a centralized database solution, versus distributed/edge for Cloudflare. Also, Cloudflare announced a partnership with MongoDB in November 2021.

  3. Michael Orwin

    Thanks for the article! Has demand for software engineers weakened at all? I’ve read that Meta, Uber and Snap are reducing hiring. I don’t know how much of that relates to software engineers, or if many other businesses are likely to be cutting back on hiring them, or if it would have much impact on the companies you write about.

    • poffringa

      Hi – I have read reports of companies, like those you reference, starting to slow down hiring. Also, I suspect many start-ups are doing the same. These are obviously macro related, versus a long term reflection of the demand for digital transformation. I see a few positive and negative effects.

      – If companies are cutting back on engineering hiring, they are likely looking for other ways to save money. This could temporarily reduce demand for some infrastructure services or put new projects on hold. It's hard to predict the impact. We could see a slowdown in spending growth or outright cuts, while spend in some areas may be maintained or even increase slightly.
      – If engineering hiring slows down, or even turns into lay-offs, this will reduce wage pressure and employee churn. Companies that are still hiring will attract top talent. Datadog, for example, hired through the Covid pandemic and benefited from strong engineers leaving other organizations.
      – If some VC backed start-ups fold, it reduces competition for the more established public companies with large amounts of cash.

      • Michael Orwin

        Thanks!

  4. HC

    Thanks for the great article as always.
    One thing perplexing me about Cloudflare’s data storage at the edge is I’m not sure replicating the data to all locations (pops) is necessary for the majority of use cases. I believe customers except for very large enterprises rarely need a global presence at a Cloudflare scale and at the same time require very low latency.
    For most Cloudflare customers, data replicas in most of the locations wouldn’t be accessed at all. It’s a waste of storage even though Cloudflare may be able to optimize it e.g. using cold storage.
    What do you think?

    • HC

      Overall I think Cloudflare’s edge compute/storage strategy makes a lot of sense. They are providing lightweight versions of infra software on the edge to compete with hyperscalers’ centralized heavyweight infra software, e.g. SQLite based D1 vs. AWS Aurora, MQTT based Pub/Sub vs. AWS Kinesis.

    • poffringa

      I agree that there are a subset of application use cases which would benefit from having logic and data readily available at the edge. These are generally latency sensitive, like the example with Meta and authentication. I can also think of some use cases where near real-time state management between multiple (finite) parties would be a benefit. An easy example is a multi-user collaboration or messaging feature, like shared document editing or group chat. However, as you point out, in those cases, the state would only be replicated to the locations in proximity to the set of users actively collaborating. After the collaboration session ended, the local copies would likely be purged after a decay period. I don’t see a reason to replicate all data to all data centers in all cases.

  5. Michael Orwin

    I’ve read a little about data meshes. I think it’s like pub/sub plus this: people responsible for a domain, often transactional (like checkout data) are also made responsible for publishing data, typically for analytics subscribers. Apparently with data meshes you don’t need data engineers in the middle to set up data pipelines with extract, load, transform (in whatever order). Is Cloudflare’s pub/sub good for that, or maybe too targeted to IoT to be used throughout an organisation? (As a non-techie I hope I’m making some kind of sense.)

    • poffringa

      Cloudflare’s new Pub/Sub offering could become a data pipelining tool that helps facilitate the movement of data through a data mesh to different domains. Initially, it is being focused on messaging to facilitate IoT type use cases, but data pipelines could be supported in the future, particularly given that Workers could apply logic (transformation) as the data transited.

      • Michael Orwin

        Thanks!