Corvus

Market analysis

Analysis

Positioning

A five-archetype market (hyperscalers, independent data clouds, enterprise-software incumbents, specialty platforms, open-format substrate) is consolidating around an Iceberg-based open lakehouse architecture, with a mid-teens CAGR through 2030. Snowflake and Databricks lead the independent tier; Microsoft Fabric is the most credible bundled challenger; SAP is the most active 2026 consolidator.

Competitors

SWOT

Strengths
  • Massive secular tailwinds from generative-AI workloads: GenAI demand requires governed, queryable, semantic data infrastructure (lakehouses, vector stores, semantic layers), which is the DIAI market's exact product. The Semantic Layer Summit 2026 framed the semantic layer as 'critical infrastructure for enterprise AI.'
  • Consumption-based pricing aligns vendor revenue with usage: Pay-per-query, credit, and token monetization produces high net-revenue-retention dynamics, as evidenced by Snowflake's +26-34% YoY product revenue growth and Databricks's reported ~65% ARR growth.
  • Deep, network-effect-driven ecosystem around Iceberg + Spark + Kafka: Open-format and open-engine standards make the market interoperable end-to-end, lowering integration cost and raising platform-level demand.
Weaknesses
  • Independents depend on the same three hyperscalers they compete with: Both Snowflake and Databricks run on AWS, Azure, and GCP; the hyperscalers control the underlying compute economics and increasingly ship bundled DIAI offerings (Fabric, BigQuery, Redshift).
  • Pricing complexity and thin transparency in consumption models: Enterprise buyers regularly cite hard-to-forecast spend, which compresses willingness to commit and motivates multi-vendor strategies.
  • Multi-cloud governance and data movement remain operationally heavy: The DIAI promise of a single semantic substrate is undercut in practice by network egress, identity, and lineage discontinuities across clouds. This is the gap that semantic-layer and Iceberg adoption is trying to close.
Opportunities
  • Agentic-AI workloads: Microsoft Fabric and SAP's Dremio + Prior Labs framing both target 'data platform for AI agents'; this is the next significant wallet expansion beyond classic BI.
  • Unstructured-data analytics + semantic-layer monetization: Enterprises hold vastly more unstructured data than structured data; vendors that ship governed access (via embeddings + semantic models) capture new spend without displacing existing BI budgets.
  • Vertical / regulated-industry expansion (healthcare, ESG, finance): Adjacent verticalized analytics markets (Healthcare Analytics, ESG Data Analytics, Clinical Analytics) are themselves forecast at 19-31% CAGR through 2030, broadening the DIAI demand base.
Threats
  • Open-source disintermediation via Iceberg + open engines: Iceberg makes storage a commodity any engine can read; ClickHouse and similar open-source-led real-time engines compete on cost and speed without the enterprise tax.
  • Hyperscaler bundling (Microsoft Fabric / BigQuery / Redshift): Fabric's anti-Snowflake/Databricks positioning at Build 2026 indicates explicit hyperscaler intent to win bundled-distribution share at independents' expense.
  • Macroeconomic slowdown could compress consumption growth: Consumption-priced revenue is more procyclical than seat-based subscription revenue; a recession or sustained IT-spend contraction would visibly slow growth even with structural tailwinds intact.

Porter's Five Forces

Threat of New Entry: moderate

Capital and sales-channel costs to reach enterprise buyers are high, but Iceberg removes the proprietary-format moat that used to gate the market. Well-funded entrants (ClickHouse, Dremio pre-acquisition, vector-DB vendors) can carve out credible niches in real-time, lakehouse, or AI-data subsegments, but few will displace incumbents at platform scope.

Supplier Power: moderate

Compute and storage supply is concentrated in three hyperscalers (AWS, Azure, GCP), which gives them structural pricing power over independents. However, independents can play the hyperscalers off against each other (Snowflake and Databricks both run on all three), and Iceberg lowers data-portability cost.

Competitive Rivalry: high

Three hyperscalers, two large independents, multiple incumbent ERP vendors, and specialty challengers (ClickHouse, MongoDB, Confluent) are all converging on overlapping AI-data-platform positioning. Microsoft Fabric's explicit anti-Snowflake/Databricks framing at Build 2026 illustrates active head-to-head competition.

Buyer Power: moderate

Enterprise buyers can demand consumption-based pricing, multi-cloud portability, and discounts on large commitments. Switching cost is non-trivial, but Iceberg + dbt + open BI reduce lock-in. Buyers are price-sensitive when consumption surprises hit the budget.

Threat of Substitution: moderate

Open-source / Iceberg-based stacks (Dremio pre-SAP, ClickHouse, Starburst, self-managed Spark on object storage) substitute for proprietary platforms at lower direct cost but with a higher operating burden. Substitutes are credible for cost-sensitive or technically mature buyers, and less so for governance-heavy enterprises.