        Ventana Research Analyst Perspectives

        Dremio Enables Self-Service Analytics for the Data Lakehouse



        I previously wrote about the potential for rapid adoption of the data lakehouse concept as enterprises combine the benefits of data lakes based on low-cost cloud object storage with the structured data processing functionality normally associated with data warehousing. By layering support for table formats, metadata management, transactional updates and deletes, and query engine and data orchestration functionality on top of low-cost storage of both structured and unstructured data, the data lakehouse enables enterprises not only to store and process data from multiple applications, but also to make that data available for analysis by multiple users in multiple departments for many purposes, including business intelligence and artificial intelligence. Vendors such as Dremio have added capabilities to the core concept to better equip enterprises to rely on data lakehouses for self-service analytics and AI.

        Dremio was founded in 2015 to build a business around the Apache Arrow in-memory columnar data format, which was developed to enable high-performance analysis of large volumes of data. Apache Arrow underpins the company’s SQL Query Engine, which is designed to deliver high-performance BI and interactive analytics directly on the data stored in a data lake or other data platforms across cloud, on-premises or hybrid environments. The SQL Query Engine is one of three core components of Dremio’s Unified Lakehouse Platform, alongside governed self-service analytics and data lakehouse management based on Dremio’s data catalog for the Apache Iceberg table format. In combination, these capabilities are designed to enable enterprises to connect and govern data in on-premises and cloud data lakes as well as other data sources across the database estate and make it available to data analysts and business users to access and analyze on a self-service basis.
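        The columnar format is what makes Arrow well suited to analytics: each column is stored contiguously in memory, so filters and aggregations can be vectorized. The short Python sketch below illustrates that general idea using the open-source pyarrow library; the table, column names and values are hypothetical and it does not depict Dremio’s own APIs.

import pyarrow as pa
import pyarrow.compute as pc

# Build an in-memory columnar table; each column is stored contiguously,
# which enables vectorized, cache-friendly analytical operations.
orders = pa.table({
    "region": pa.array(["east", "west", "east", "north"]),
    "amount": pa.array([120.0, 75.5, 310.0, 42.25]),
})

# Vectorized filter and aggregation operate on whole columns at once.
east_orders = orders.filter(pc.equal(orders["region"], "east"))
total_east = pc.sum(east_orders["amount"]).as_py()
print(f"East region revenue: {total_east}")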

        With customers in a variety of industries, including financial services, healthcare, retail, manufacturing and consumer packaged goods, Dremio has raised more than $400 million in funding from the likes of Adams Street Partners, Cisco Investments, Insight Partners, Lightspeed Venture Partners, Norwest Venture Partners and Sapphire Ventures. Most recently, Dremio raised a $160 million Series E funding round in January 2022, which valued the company at over $2 billion.

        It is common for enterprises to create data lake environments to persist structured and unstructured data in object storage, either on-premises or in the cloud. More than one-half (53%) of participants in Ventana Research’s Analytics and Data Benchmark Research currently use object stores in their analytics efforts, and an additional 18% plan to do so within the next two years. Data lakes provide a relatively low-cost alternative to data persistence in traditional full-stack data warehouse environments, which combine compute and storage. Data warehousing providers have responded by adapting their products to scale compute and storage independently, deploying on or alongside a data lake.

        Meanwhile, data lakehouse vendors have integrated the functionality associated with data warehousing into the data lake itself. This includes distributed SQL query engines; support for atomic, consistent, isolated and durable transactions; updates and deletes; concurrency control; metadata management; data indexing; data caching; schema enforcement and evolution; query acceleration; semantic models; data governance; version control; access control and auditing.
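        To make that list more concrete, the sketch below shows a few of these data warehouse-style operations (schema enforcement, transactional updates and deletes, and snapshot-based version control) applied to an Apache Iceberg table using Apache Spark. It is a minimal illustration under assumed settings: the catalog name, warehouse path, table and snapshot ID are placeholders, and it does not depict Dremio’s own configuration.

from pyspark.sql import SparkSession

# Hypothetical Spark session with the Apache Iceberg extensions enabled.
# The catalog name ("lake") and warehouse path are placeholders.
spark = (
    SparkSession.builder
    .appName("lakehouse-sketch")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", "s3a://example-bucket/warehouse")
    .getOrCreate()
)

# Schema enforcement: the table carries a declared schema over object storage.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lake.sales.orders (
        order_id BIGINT, region STRING, amount DOUBLE
    ) USING iceberg
""")

# Transactional updates and deletes applied directly to data lake files.
spark.sql("UPDATE lake.sales.orders SET amount = amount * 1.1 WHERE region = 'east'")
spark.sql("DELETE FROM lake.sales.orders WHERE amount < 0")

# Version control: query an earlier table snapshot (the snapshot ID is a placeholder).
spark.sql("SELECT * FROM lake.sales.orders VERSION AS OF 1234567890123").show()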

        Dremio’s Unified Lakehouse Platform is available as software for deployment on-premises and in the cloud as well as a cloud service. It is made up of three core sets of capabilities addressing SQL query processing and acceleration, lakehouse management and unified analytics. For SQL query processing and acceleration, the platform’s SQL Query Engine enables the processing and transformation of data in cloud data lakes as well as federated querying of metastores and databases on-premises and in the cloud. SQL Query Engine enables users to create virtual tables, known as Views, from the source data for query acceleration. It also offers pre-computed data summaries, known as Reflections, that accelerate complex aggregations and other operations, as well as a Columnar Cloud Cache for in-memory data processing.
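        Because Reflections are applied transparently by the query planner, client tools simply query a View and benefit from whatever acceleration sits behind it. Dremio exposes its query engine over Apache Arrow Flight, and the Python sketch below shows that general access pattern using pyarrow; the host, port, credentials and view name are placeholders, not a definitive integration.

import pyarrow.flight as flight

# Hypothetical connection details; Dremio exposes an Arrow Flight endpoint,
# but the host, port and credentials here are placeholders.
client = flight.FlightClient("grpc+tcp://dremio.example.com:32010")

# Basic authentication returns a bearer token to attach to subsequent calls.
token = client.authenticate_basic_token("analyst_user", "analyst_password")
options = flight.FlightCallOptions(headers=[token])

# Ask the engine to plan the query against a (placeholder) View, then fetch
# the results as Arrow record batches.
sql = "SELECT region, SUM(amount) AS revenue FROM sales.orders_view GROUP BY region"
info = client.get_flight_info(flight.FlightDescriptor.for_command(sql), options)
reader = client.do_get(info.endpoints[0].ticket, options)
print(reader.read_all().to_pandas())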

        Dremio’s lakehouse management capabilities provide a data catalog based on the Apache Iceberg table format that can be accessed using SQL Query Engine as well as other query engines such as Apache Spark or Apache Flink. The lakehouse management capabilities include automated optimization of Apache Iceberg tables, centralized data governance, and Git-like data branching and version control with isolated and consistent data transformations based on Dremio’s Nessie open-source project. The unified analytics capabilities use Dremio’s Universal Semantic Layer to provide self-service access to discover data in the data catalog and rely on the SQL Query Engine to accelerate analysis for data and business analysts using a variety of analytics tools and applications from the likes of Alteryx, Domo, Google Cloud, IBM, Microsoft, MicroStrategy, Qlik, SAP and Salesforce’s Tableau. I assert that through 2027, almost all enterprises using data catalog products will increase business user access, facilitating self-service data discovery and accelerating data intelligence and democratization initiatives. Dremio has also added generative AI-based capabilities to lower the barriers to accessing and working with data, including auto-generated descriptions and labeling as well as the conversion of natural language questions to SQL queries.
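        The Git-like branching model is easiest to see as a workflow: changes are made on an isolated branch, validated, and then merged so they become visible to consumers atomically. The sketch below approximates that workflow using the open-source Project Nessie SQL extensions for Apache Spark; the catalog and branch names are assumptions, the exact statements may differ by version, and this is not Dremio’s own syntax.

from pyspark.sql import SparkSession

# Hypothetical Spark session assumed to be configured with a Nessie catalog
# named "nessie"; catalog configuration is omitted for brevity.
spark = SparkSession.builder.appName("nessie-branching-sketch").getOrCreate()

# Create an isolated branch for a change set; production ("main") is untouched.
spark.sql("CREATE BRANCH IF NOT EXISTS etl_update IN nessie")
spark.sql("USE REFERENCE etl_update IN nessie")

# Transform data on the branch; readers on main still see the previous state.
spark.sql("UPDATE nessie.sales.orders SET amount = amount * 1.1 WHERE region = 'east'")

# Once validated, merge the branch so the changes become visible atomically.
spark.sql("MERGE BRANCH etl_update INTO main IN nessie")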

        While Dremio has always offered features and functionality of value to data engineers, the recent addition of lakehouse management capabilities enables the company to articulate a larger value proposition for technology decision-makers that addresses the advantages of self-service analytics and AI. I anticipate further investment in generative AI capabilities, such as vector search and automated semantic data modeling. I recommend that any organization considering the data lakehouse approach include Dremio’s Unified Lakehouse Platform in its evaluation to take advantage of its combination of query acceleration and data management.

        Regards,

        Matt Aslett

        Authors:

        Matt Aslett
        Director of Research, Analytics and Data

        Matt Aslett leads the software research and advisory for Analytics and Data at Ventana Research, now part of ISG, covering software that improves the utilization and value of information. His focus areas of expertise and market coverage include analytics, data intelligence, data operations, data platforms, and streaming and events.

