        Ventana Research Analyst Perspectives

        A Data Pantry Speeds Development of Machine Learning Models



        A few years ago – somewhat tongue in cheek – I began using the term “data pantry” to describe a type of data store that’s part of a business application platform, created for a specific set of users and use cases. It’s a data pantry because, unlike a general-purpose data store such as a data warehouse, everything the user needs is readily available and easily accessible, with labels that are immediately recognized and understood.

        A data pantry is consistent with the data mesh architecture concept. It is especially suited to business software that describes itself as a platform because, typically, platforms are designed to work with any number of other applications or data sources, using application programming interfaces to automate the integration of processes and data. Eliminating the need for manual integration of data is important because our Analytics and Data Benchmark Research reveals that individuals spend a considerable portion of their time preparing data for analysis and reviewing it for quality and consistency issues. A data pantry makes most of that preparation unnecessary.

        Increasingly, business applications, especially those involved in planning and analytics, describe themselves as platforms. In the older sense of the word, a platform was a foundation upon which additional functionality was developed. Platforms currently in use are instead, in effect, suspended across multiple computing systems to facilitate process or data interactions between those systems. One example is any form of forecasting that uses historical data from multiple, disparate sources to better inform projections and present historical results. A business planning platform, for instance, might draw data from human capital management, disparate enterprise resource planning, supply chain and customer relationship management systems to support what Ventana Research calls integrated business planning.

        A data pantry makes it possible for analysts and business users to immediately access all necessary data gathered from multiple sources for analysis and reporting, without the need to repeatedly perform data extraction, enrichment or transformation steps. It delivers all of the information needed in a consistent and useful form and format. Data pantries are becoming increasingly common in business applications as software vendors adopt a platform approach to their architecture, although so far no one else is using this term. For example, in 2018, Ventana Research gave Workiva an Innovation Award for its Wdata offering because this method of data aggregation is especially useful for reporting corporate data from the wide range of systems necessary for, say, statutory reporting to securities regulators or other regulatory bodies. Having a broader set of accessible data is also useful for creating richer and more insightful analyses and for communicating information and insights across an organization.
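To make the idea concrete, here is a minimal Python sketch of the "stock once, use many times" pattern described above. It is an illustration under stated assumptions, not a description of any vendor's implementation: the `DataPantry` class, the `LABELS` mapping and the sample ERP and CRM rows are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical raw extracts; field names differ by source system.
ERP_ROWS = [
    {"cust_no": "C001", "rev_amt": "1200.50", "period": "2023-01"},
    {"cust_no": "C002", "rev_amt": "830.00", "period": "2023-01"},
]
CRM_ROWS = [
    {"CustomerId": "C001", "OpenOpportunities": 3},
    {"CustomerId": "C002", "OpenOpportunities": 1},
]

# Hypothetical mapping from each source's field names to labels
# that business users recognize immediately.
LABELS = {
    "erp": {"cust_no": "customer_id", "rev_amt": "revenue", "period": "period"},
    "crm": {"CustomerId": "customer_id", "OpenOpportunities": "open_opportunities"},
}


@dataclass
class DataPantry:
    """Holds already-integrated records, keyed by customer."""

    shelves: dict = field(default_factory=dict)

    def stock(self, source, rows):
        """Relabel and convert rows from one source, then merge them in."""
        mapping = LABELS[source]
        for row in rows:
            record = {mapping[name]: value for name, value in row.items()}
            if "revenue" in record:
                record["revenue"] = float(record["revenue"])
            self.shelves.setdefault(record["customer_id"], {}).update(record)

    def get(self, customer_id):
        """Return a copy of the combined record; no re-extraction needed."""
        return dict(self.shelves[customer_id])


pantry = DataPantry()
pantry.stock("erp", ERP_ROWS)
pantry.stock("crm", CRM_ROWS)
```

Once the pantry is stocked, a call such as `pantry.get("C001")` returns one record that combines revenue from the ERP extract with the opportunity count from the CRM extract, under consistent labels.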

        More recently, it struck me that another compelling use case for a data pantry is to support machine learning necessary for training artificial intelligence capabilities that are part of a business application. This is likely to be the case where the authoritative source or sources of data necessary for training the system using machine learning exist both inside and outside of the application. The need to support machine learning will increase significantly over the next three years because I assert that by 2025, almost all vendors of software designed for finance organizations will have incorporated some AI capabilities to reduce workloads and improve performance. This will especially be the case for planning and predictive analytics purposes.

        In this respect, a data pantry has properties similar to what data scientists call a “feature store.” ML uses “features” (statisticians prefer the term “variables”) to build, train and adjust models capable of making accurate predictions based on past experience. A feature is a distinct and measurable characteristic of a phenomenon, such as a factor that drives demand for a specific product or that predicts the price sensitivity of a buyer. Typically, a feature store ingests raw data taken from data warehouses, streaming data sources, applications and other data sources, and then transforms the data to make it usable for discovering and testing inferences as well as for training the system.

        Feature stores and data pantries continually access primary sources to select, extract, clean and transform data related to the features or variables used in modeling. They are necessary where heterogeneous data sets are used in machine learning because the data almost always must be transformed into a structure and format that facilitates creating and updating models. Staging the data in feature stores and data pantries minimizes time lags that can occur when data moves back and forth between systems, especially when those data movements require some form of transformation. And minimizing lags is essential to creating practical value: the time available for machine learning tasks such as responding to queries, generating forecasts or performing full system learning cycles is often measured in seconds or minutes.
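The staging idea can be sketched in a few lines of Python. This is a simplified illustration, assuming hypothetical sales data and a hypothetical `FeatureStage` class rather than any real feature store product: raw rows are cleaned and transformed once per refresh, and every later lookup reads the staged result instead of repeating the transformation.

```python
from statistics import mean

# Hypothetical raw history, as it might arrive from separate systems.
RAW_SALES = [
    {"product": "widget", "units": "10", "price": "4.00"},
    {"product": "widget", "units": "14", "price": "3.50"},
    {"product": "gadget", "units": "5", "price": None},  # missing price
    {"product": "gadget", "units": "7", "price": "9.00"},
]


def clean(rows):
    """Drop rows with missing values and convert strings to numbers."""
    cleaned = []
    for row in rows:
        if row["units"] is None or row["price"] is None:
            continue
        cleaned.append(
            {
                "product": row["product"],
                "units": int(row["units"]),
                "price": float(row["price"]),
            }
        )
    return cleaned


def build_features(rows):
    """Aggregate cleaned rows into per-product features."""
    by_product = {}
    for row in rows:
        by_product.setdefault(row["product"], []).append(row)
    return {
        product: {
            "avg_units": mean(r["units"] for r in group),
            "avg_price": mean(r["price"] for r in group),
        }
        for product, group in by_product.items()
    }


class FeatureStage:
    """Keeps transformed features ready so lookups skip the transformation."""

    def __init__(self):
        self._features = {}
        self.refresh_count = 0

    def refresh(self, raw_rows):
        """Run the select, clean and transform steps once per refresh."""
        self._features = build_features(clean(raw_rows))
        self.refresh_count += 1

    def lookup(self, product):
        """Serve staged features without touching the raw sources."""
        return self._features[product]


stage = FeatureStage()
stage.refresh(RAW_SALES)
widget = stage.lookup("widget")
gadget = stage.lookup("gadget")
```

Both lookups are served from the staged features after a single refresh, which is the property that keeps lags low when a model needs the same features repeatedly.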

        Data pantries are similar to feature stores, but they are a distinct construct, also designed to support a wide range of analytical, business intelligence and reporting tasks. And, because they are created for a specific domain, the challenges data scientists typically encounter in feature stores – including data source integration, modeling constraints or inflexible data structures – are far less of an issue.

        I recommend that all vendors of business software, especially those with applications using data from multiple authoritative sources and particularly those that incorporate AI capabilities, have a “data pantry” as part of the application architecture. I also recommend that organizations become familiar with the benefits of this type of integrated data structure and make it part of the selection criteria when evaluating software vendors’ offerings and roadmaps.

        Regards,

        Robert Kugel

        Authors:

        Robert Kugel
        Executive Director, Business Research

        Robert Kugel leads business software research for Ventana Research, now part of ISG. His team covers technology and applications spanning front- and back-office enterprise functions, and he personally runs the Office of Finance area of expertise. Rob is a CFA charter holder and a published author and thought leader on integrated business planning (IBP).

        Our Analyst Perspective Policy

        • Ventana Research’s Analyst Perspectives are fact-based analysis and guidance on business, industry and technology vendor trends. Each Analyst Perspective presents the view of the analyst who is an established subject matter expert on new developments, business and technology trends, findings from our research, or best practice insights.

          Each is prepared in accordance with Ventana Research’s strict standards for accuracy and objectivity and reviewed to ensure it delivers reliable and actionable insights. It is reviewed and edited by research management and is approved by the Chief Research Officer; no individual or organization outside of Ventana Research reviews any Analyst Perspective before it is published. If you have any issue with an Analyst Perspective, please email it to ChiefResearchOfficer@ventanaresearch.com
