Databricks is a data engineering and analytics cloud platform built on top of Apache Spark that processes and transforms huge volumes of data and offers data exploration capabilities through machine learning models. It can enable data engineers, data scientists, analysts and other workers to process big data and unify analytics through a single interface. The platform supports streaming data, SQL queries, graph processing and machine learning. It also offers a collaborative user interface — workspace — where workers can create data pipelines in multiple languages — including Python, R, Scala, and SQL — and train and prototype machine learning models.
The technology industry throws around a lot of similar terms with different meanings as well as entirely different terms with similar meanings. In this post, I don’t want to debate the meanings and origins of different terms; rather, I’d like to highlight a technology weapon that you should have in your data management arsenal. We currently refer to this technology as data virtualization. Other similar terms you may have heard include data fabric, data mesh and [data] federation. I’ll briefly discuss these terms and how I see them being used, but ultimately, I’d like to share with you some research that shows why data virtualization can be valuable, regardless of what you call it.
Data governance is a hot topic these days. In fact, we are conducting benchmark research on the subject here. With increasing regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), organizations face more external oversight of their data governance practices. The risk of significant fines associated with these and other regulations, coupled with organizations’ internal compliance requirements, has brought more attention to data governance practices. We anticipate through 2023, three-quarters of Chief Data Officers’ primary concerns will be governing the privacy and security of their organization’s data.
Collibra is a data governance software company that offers tools for metadata management and data cataloging. The software enables organizations to find data quickly, identify its source and assure its integrity. Line-of-business workers can use it to create, review and update the organization's policies on different data assets. Collibra’s software uses a microservice architecture and open application programming interfaces to connect to various data ecosystems. Its data intelligence cloud platform can automatically classify data from various sources such as online transaction processing databases, master repositories and Excel files without moving the data, so the information assets stay protected.
The annual Ventana Research Digital Innovation Awards showcase advances in the productivity and potential of business applications, as well as technology that contributes significantly to the improved processes and performance of an organization. Our goal is to recognize technology and vendors that have introduced noteworthy digital innovations to advance business and IT.