Ventana Research Analyst Perspectives

Data Orchestration Automates and Accelerates Analytics and AI

Written by Matt Aslett | Apr 24, 2024 10:00:00 AM

I recently wrote about the development, testing and deployment of data pipelines as a fundamental accelerator of data-driven strategies. As I explained in the 2023 Data Orchestration Buyers Guide, today’s analytics environments require agile data pipelines that can traverse multiple data-processing locations and evolve with business needs.

Given the increasing complexity of evolving data sources and requirements, it’s essential to automate and coordinate the creation, scheduling and monitoring of data pipelines. This is the realm of data orchestration, which enables the flow of data across the organization via capabilities for pipeline monitoring, pipeline management and workflow management.

Traditional approaches to data management are rooted in point-to-point batch data processing, whereby data is extracted from its source, transformed for a specific purpose and loaded into a target environment for analysis. These approaches are unsuitable for the demands of today’s analytics environments, which require sequential or parallel execution of a complete set of tasks via data pipelines, typically based on directed acyclic graphs that represent the relationships and dependencies between the tasks. I assert that by 2027, more than one-half of enterprises will adopt data orchestration technologies to automate and coordinate data workflows and increase efficiency and agility in data and analytics projects.
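To make the directed acyclic graph concept concrete, the following minimal Python sketch declares a small pipeline as tasks with explicit dependencies and runs them in dependency order. The task names and functions are hypothetical placeholders for illustration, not a reference to any specific orchestration product.

```python
from graphlib import TopologicalSorter  # Python standard library (3.9+)

# Hypothetical tasks; in a real pipeline each would call out to source,
# transformation and target systems.
def extract_orders():
    print("extract orders from source")

def extract_customers():
    print("extract customers from source")

def join_and_enrich():
    print("join and enrich the extracted data")

def load_warehouse():
    print("load results into the target warehouse")

# The DAG: each task is mapped to the set of tasks it depends on.
DEPENDENCIES = {
    extract_orders: set(),
    extract_customers: set(),
    join_and_enrich: {extract_orders, extract_customers},
    load_warehouse: {join_and_enrich},
}

# Run the tasks in an order that respects the dependencies; the two
# independent extract tasks could equally be scheduled in parallel.
for task in TopologicalSorter(DEPENDENCIES).static_order():
    task()
```

Dedicated orchestration tools add scheduling, retries, monitoring and parallel execution on top of this basic dependency model.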

At the highest level of abstraction, data orchestration covers three key capabilities: collection (including data ingestion, preparation and cleansing), transformation (as well as integration and enrichment) and activation (making the results available to compute engines, analytics and data science tools or operational applications). Whether stand-alone or embedded in larger data-engineering platforms, data orchestration has the potential to drive improved efficiency and agility in data and analytics projects. Data orchestration addresses one of the most significant impediments to generating value from data: participants in Ventana Research’s Analytics and Data Benchmark Research cite preparing data for analysis and reviewing data for quality and consistency issues as the two most time-consuming tasks in analyzing data.
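As a simple sketch of how the three capabilities fit together, the code below chains collection, transformation and activation over an illustrative list of records. The sample data and function names are assumptions for illustration only; a real pipeline would read from and write to external systems.

```python
# Hypothetical records as they might arrive from a source system.
RAW = [
    {"id": 1, "amount": "100.5", "region": "emea "},
    {"id": 2, "amount": None,    "region": "AMER"},
]

def collect(raw):
    """Collection: ingest, prepare and cleanse (here, drop incomplete records)."""
    return [r for r in raw if r["amount"] is not None]

def transform(records):
    """Transformation: integrate and enrich (normalize types and region codes)."""
    return [
        {**r, "amount": float(r["amount"]), "region": r["region"].strip().upper()}
        for r in records
    ]

def activate(records):
    """Activation: make results available to downstream engines or applications."""
    # In practice this might write to a warehouse table, feature store or API.
    print(records)

activate(transform(collect(RAW)))
```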

The development and orchestration of agile data pipelines is an important aspect of Data Operations (DataOps), an overall approach in which data engineering professionals apply agile development, DevOps and lean manufacturing practices to automate data monitoring and continuously deliver data into operational and analytical processes. Data orchestration is also integral to the development and delivery of applications driven by artificial intelligence and generative AI.

Almost one-half (49%) of participants in ISG’s 2023 Application Development and Maintenance Study expect to AI-enable applications by embedding AI and ML models into current applications and processes. Data orchestration automates and accelerates the flow of data from multiple sources, combining existing applications and data platforms with the output of large language models and vector databases. It complements MLOps, which addresses the collection of artifacts and the orchestration of processes necessary to deploy and maintain AI/ML models.
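As a hedged sketch of what such a pipeline step might look like, the code below keeps a vector store in sync with application data for use by an LLM-backed application. The names load_source_records, embed_text and VectorStore are hypothetical stand-ins, not real library APIs; a production pipeline would call an embedding service and a vector database in their place.

```python
from typing import Callable

def load_source_records() -> list[dict]:
    """Pull records from an existing application or data platform (stubbed here)."""
    return [{"id": "doc-1", "text": "Quarterly revenue grew in EMEA."}]

def embed_text(text: str) -> list[float]:
    """Produce an embedding vector; a real pipeline would call an embedding model."""
    return [float(len(text))]  # stand-in vector for illustration only

class VectorStore:
    """Minimal in-memory stand-in for a vector database."""
    def __init__(self):
        self._rows: dict[str, list[float]] = {}

    def upsert(self, key: str, vector: list[float]) -> None:
        self._rows[key] = vector

def refresh_embeddings(store: VectorStore,
                       loader: Callable[[], list[dict]] = load_source_records) -> None:
    """A pipeline task that keeps the vector store in sync with source data."""
    for record in loader():
        store.upsert(record["id"], embed_text(record["text"]))

refresh_embeddings(VectorStore())
```

In an orchestrated pipeline, a task like refresh_embeddings would be scheduled and monitored alongside the conventional extract, transform and load steps.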

Agile and collaborative practices were a core component of the Capabilities criteria we used to assess data pipeline tools in the 2023 Data Orchestration Buyers Guide. We also assessed the functionality required to support pipeline monitoring, pipeline management and workflow management, as well as integration with the wider ecosystem of DevOps, data management, DataOps, business intelligence and AI tools and applications.

The orchestration of data pipelines is just one aspect of improving the use of data within an enterprise. In addition to the development, testing and deployment of data pipelines, DataOps also encompasses data observability, which I will explore in greater detail in a forthcoming Analyst Perspective. Nevertheless, I recommend that all enterprises explore how the orchestration of data pipelines can improve data-driven decision-making as part of a broader evaluation of the people, process, information and technology improvements required to deliver it.

Regards,

Matt Aslett