
A few months ago, I wrote an article on the four pillars of big data analytics. One of those pillars is discovery analytics, where visual analytics and data discovery combine to meet business and analyst needs. My colleague Mark Smith subsequently clarified the four types of discovery analytics: visual discovery, data discovery, information discovery and event discovery. Now I want to follow up with a discussion of three trends that our research has uncovered in this space. (To reference how I’m using these four discovery terms, please refer to Mark’s post.)

The most prominent of these trends is that conversations about visual discovery are beginning to include data discovery, and vendors are developing and delivering such tool sets today. It is well known that while big data profiling and the ability to visualize data give us a broader capacity for understanding, there are limitations that can be addressed only through data mining and techniques such as clustering and anomaly detection. Such approaches are needed to overcome statistical interpretation challenges such as Simpson’s paradox. In this context, we see a number of tools with different architectural approaches tackling this obstacle. For example, Information Builders, Datameer, BIRT Analytics and IBM’s new SPSS Analytic Catalyst tool all incorporate user-driven data mining directly with visual analysis. That is, they combine data mining technology with visual discovery for enhanced capability and more usability. Our research on predictive analytics shows that integrating predictive analytics into the existing architecture is the most pressing challenge (for 55% of organizations). Integrating data mining directly into the visual discovery process is one way to overcome this challenge.
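To make Simpson’s paradox concrete, here is a minimal Python sketch. The figures, the offer names (A, B) and the segment names (small, large) are all invented for illustration and do not come from our research. The sketch shows how an aggregated chart can point one way while every underlying segment points the other.

```python
from collections import defaultdict

# Hypothetical data: (offer, customer segment, conversions, customers reached).
ROWS = [
    ("A", "small", 90, 100),
    ("A", "large", 300, 1000),
    ("B", "small", 800, 1000),
    ("B", "large", 20, 100),
]


def rates_by_segment(rows):
    """Conversion rate for each (offer, segment) pair."""
    return {(offer, seg): won / total for offer, seg, won, total in rows}


def pooled_rates(rows):
    """Conversion rate per offer with the segments lumped together."""
    won_by_offer = defaultdict(int)
    total_by_offer = defaultdict(int)
    for offer, _seg, won, total in rows:
        won_by_offer[offer] += won
        total_by_offer[offer] += total
    return {o: won_by_offer[o] / total_by_offer[o] for o in won_by_offer}


def is_simpsons_paradox(rows, a="A", b="B"):
    """True when `a` beats `b` in every segment yet loses once pooled."""
    seg_rates = rates_by_segment(rows)
    pooled = pooled_rates(rows)
    segments = {seg for _offer, seg in seg_rates}
    a_wins_each = all(seg_rates[(a, s)] > seg_rates[(b, s)] for s in segments)
    return a_wins_each and pooled[a] < pooled[b]


if __name__ == "__main__":
    print(rates_by_segment(ROWS))
    print(pooled_rates(ROWS))
    print(is_simpsons_paradox(ROWS))
```

In this made-up data, offer A converts better within each segment, but offer B looks better in the pooled totals because it was shown mostly to the high-converting segment. A purely visual view of the totals would hide that, which is why segment-aware techniques matter.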

The second trend is renewed focus on information discovery (i.e., search), especially among large enterprises with widely distributed systems as well as the big data vendors serving this market. IBM acquired Vivisimo and is incorporating the technology into its PureSystems and big data platform. Microsoft recently previewed its big data information discovery tool, Data Explorer. Oracle acquired Endeca and has made it a key component of its big data strategy. SAP added search to its latest Lumira platform. LucidWorks, an independent information discovery vendor that provides enterprise support for open source Lucene/Solr, adds search as an API and has received significant adoption. There are different levels of search, from documents to social media data to machine data, but I won’t drill into these here. Regardless of the type of search, in today’s era of distributed computing, in which there’s a need to explore a variety of data sources, information discovery is increasingly important.

The third trend in discovery analytics is a move to more embeddable system architectures. In parallel with the move to the cloud, architectures are becoming more service-oriented, and the interfaces are hardened in such a way that they can integrate more readily with other systems. For example, the visual discovery market was born on the client desktop with Qlik and Tableau, quickly moved to server-based apps and is now moving to the cloud. Embeddable tools such as D3, which is essentially a visualization-as-a-service offering, allow vendors such as Datameer to include an open source library of visualizations in their products. Lucene/Solr represents a similar embedded technology in the information discovery space.

The broad trend we’re seeing is toward REST-based architectures that promote a looser coupling of applications and therefore require less custom integration. This move runs in parallel with the decline in Internet Explorer, the rise of new browsers and the ability to render content using JavaScript Object Notation (JSON). This trend suggests a future for discovery analysis embedded in application tools (including, but not limited to, business intelligence). The environment is still fragmented and in its early stage. Instead of one cloud, we have a lot of little clouds. For the vendor community, which is building more platform-oriented applications that can work in an embeddable manner, a tough question is whether to go after the on-premises market or the cloud market. I think that each vendor will have to make its own decision on how to support customer needs within its own business model constraints.
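As a rough illustration of that loose coupling, the sketch below uses only the Python standard library. The endpoint path `/api/top-sources`, the payload shape and the values in it are hypothetical and are not taken from any vendor's product. The point is that the embedding application depends only on a URL and a JSON contract, not on the internals of the discovery service.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical payload a discovery service might expose.
PAYLOAD = {"sources": [{"name": "machine data", "rank": 1}]}


class DiscoveryHandler(BaseHTTPRequestHandler):
    """Toy discovery service that answers one GET route with JSON."""

    def do_GET(self):
        if self.path == "/api/top-sources":
            body = json.dumps(PAYLOAD).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        # Silence the default per-request logging for this demo.
        pass


def fetch_top_sources(host, port):
    """Embedding application: knows only the route and the JSON shape."""
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/api/top-sources")
        response = conn.getresponse()
        return json.loads(response.read().decode("utf-8"))
    finally:
        conn.close()


def demo():
    """Start the toy service on a free local port, call it, shut it down."""
    server = HTTPServer(("127.0.0.1", 0), DiscoveryHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_top_sources("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Either side could be swapped out, for example moving the service from an on-premises host to a cloud host, without changing the other side as long as the route and JSON shape stay the same.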


Tony Cosentino

VP and Research Director

Splunk’s innovative ability to access and use machine data for targeted operational insights can help improve IT and enhance business operational efficiency. Its work to capitalize on big data was part of my last analysis, while my colleague Tony Cosentino looked at its focus on search and operational analytics. Splunk also was a recipient of the 2012 Ventana Research Technology Innovation Award for IT Performance for Splunk Enterprise.

The latest Splunk release, version 5, advances its ability to provide operational intelligence to organizations by using data both from existing IT systems and in Hadoop. A new SDK offering supports Java, JavaScript, PHP and Python as part of its API. It can help developers get at data from social media and online applications and combine it with machine-generated data, which our research found to be the top data source according to 62 percent of organizations, followed by application data (53%) and historical data warehouse sources (43%).

Splunk also now provides a cloud computing offering called Splunk Storm that helps companies take advantage of machine data in the cloud or on-premises. It lets users quickly create projects and analyze machine data with charts that can be shared with others. I had a chance to go through the new offering and found it simple and quick for analyzing data and presenting analytics based on events, IP addresses and other machine data. Our latest benchmark research into operational intelligence found that activity or event monitoring is a top need in 62 percent of organizations; Splunk Storm can address this through its search-and-analyze approach.
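For readers less familiar with machine data, the sketch below shows the general idea of searching events and summarizing them by IP address. It is plain Python, not Splunk's search language or SDK, and the log lines, field names and helper functions are hypothetical examples (the IP addresses come from ranges reserved for documentation).

```python
import re
from collections import Counter

# Hypothetical web server log lines in a common access-log layout.
SAMPLE_LOG = """\
192.0.2.10 - - [01/Jan/2013:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 512
192.0.2.10 - - [01/Jan/2013:10:00:02 +0000] "GET /login HTTP/1.1" 500 128
198.51.100.7 - - [01/Jan/2013:10:00:03 +0000] "POST /login HTTP/1.1" 200 64
192.0.2.10 - - [01/Jan/2013:10:00:04 +0000] "GET /cart HTTP/1.1" 404 0
"""

LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+)$'
)


def parse_events(text):
    """Turn raw log text into a list of field dictionaries (events)."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events


def search(events, **criteria):
    """Keep events whose fields equal every given criterion."""
    return [e for e in events
            if all(e.get(k) == v for k, v in criteria.items())]


def count_by(events, field):
    """Count events per distinct value of one field."""
    return Counter(e[field] for e in events)


if __name__ == "__main__":
    events = parse_events(SAMPLE_LOG)
    print(count_by(events, "ip").most_common())
    print(search(events, status="500"))
```

A product in this space does the same kind of work at far larger scale, without requiring the fields to be modeled in a data store beforehand.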

Splunk has made the pricing of the new offering simple. You select the storage volume you need in the cloud and get the pricing quickly. You can sign up for free and work with the product on data up to 1GB. Organizations can also analyze cloud data with the on-premises version of the product, but for many who need to quickly assess data without the hassle of using internal resources and systems, the software-as-a-service version is an easy way to get started. This is important, as the top barriers for operational intelligence are lack of resources (41%) and no budget (40%), and the on-demand approach is now preferred in 21 percent of organizations, which allows Splunk to address an expanding opportunity.

With its on-premises product and its latest cloud computing offering, Splunk provides customers a good array of options. The company is moving quickly to add alerting, along with APIs for integrating with other offerings, to upcoming Splunk Storm releases. Splunk has little competition when it comes to combining machine data with other data from business and IT to help organizations in cloud and enterprise approaches. It lets businesses harvest existing sources without having to establish a specialized data store first. Organizations should take a look at what Splunk is providing and how it can help address a class of operational and analytic needs across IT and business data.


Mark Smith
CEO & Chief Research Officer
