By Kumar Srivastava
August 16, 2013 08:30 AM EDT
When talking about Big Data, most people talk about numbers: speed of processing and how many terabytes and petabytes the platform can handle. But deriving deep insights with the potential to change business growth trajectories relies not just on quantities, processing power and speed, but also on three key "ilities": portability, usability and quality of the data.
Portability, usability, and quality converge to define how well the processing power of the Big Data platform can be harnessed to deliver consistent, high-quality, dependable and predictable enterprise-grade insights.
Portability: Ability to transport data and insights in and out of the system
Usability: Ability to use the system to hypothesize, collaborate, analyze, and ultimately to derive insights from data
Quality: Ability to produce highly reliable and trustworthy insights from the system
Portability is measured by how easily data sources (or providers) as well as data and analytics consumers (the primary "actors" in a Big Data system) can send data to, and consume data from, the system.
Data sources can be internal systems or data sets, external data, data providers, or the apps and APIs that generate your data. A measure of high portability is how easily data providers and producers can send data to your Big Data system as well as how effortlessly they can connect to the enterprise data system to deliver context.
Analytics consumers are the business users and developers who examine the data to uncover patterns. Consumers expect to be able to inspect their raw, intermediate or output data, not only to define and design analyses but also to visualize and interpret results. A measure of high portability for data consumers is easy access - both manual and programmatic - to raw, intermediate, and processed data. Highly portable systems enable consumers to readily trigger analytical jobs and receive notifications when data or insights are available for consumption.
The usability of a Big Data system is the largest contributor to the perceived and actual value of that system. That's why enterprises need to consider whether their Big Data analytics investment provides functionality that not only generates useful insights but is also easy to use.
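As a rough illustration of "trigger a job and be notified when results are ready", the sketch below uses a hypothetical in-memory `JobNotifier` class. The class, its methods, and the job name are assumptions made for this example, not part of any particular Big Data platform; a real system would deliver notifications through a queue, webhook, or email rather than a direct callback.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Callback signature assumed for this sketch: (job_name, result) -> None
Callback = Callable[[str, dict], None]


class JobNotifier:
    """Hypothetical registry: consumers subscribe to a job and are called back when it finishes."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, job_name: str, callback: Callback) -> None:
        # Register a consumer that wants to hear when job_name produces output.
        self._subscribers[job_name].append(callback)

    def run(self, job_name: str, job: Callable[[], dict]) -> dict:
        # Trigger the analytical job, then notify every subscriber with its result.
        result = job()
        for callback in self._subscribers[job_name]:
            callback(job_name, result)
        return result


# Usage: a consumer subscribes, then triggers the job programmatically.
received = []
notifier = JobNotifier()
notifier.subscribe("daily_signups", lambda name, result: received.append((name, result)))
notifier.run("daily_signups", lambda: {"rows": 3})
```

After the call to `run`, the `received` list holds one entry pairing the job name with its output, which is the point at which a consumer would start its own analysis.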
Business users need an easy way to:
- Request analytics insights
- Explore data and generate hypotheses
- Self-serve and generate insights
- Collaborate with data scientists, developers, and business users
- Track and integrate insights into business critical systems, data apps, and strategic planning processes
Developers and data scientists need an easy way to:
- Define analytical jobs
- Collect, prepare, preprocess, and cleanse data for analysis
- Add context to their data sets
- Understand how, when, and where the data was created, how to interpret it, and who created it
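One way to make the last two items concrete is to attach a small provenance record to every data set. The sketch below is only an assumption about what such a record might contain; the `Provenance` fields and the `with_provenance` helper are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    """Hypothetical context record answering who, when, where, and how."""
    created_by: str        # who created the data
    created_at: datetime   # when it was created
    source_system: str     # where it came from
    method: str            # how it was produced
    interpretation: str    # how a reader should interpret each record


def with_provenance(records, provenance):
    # Bundle the records with a serializable copy of their provenance.
    meta = asdict(provenance)
    meta["created_at"] = provenance.created_at.isoformat()
    return {"provenance": meta, "records": list(records)}


# Usage with made-up values:
page_views = with_provenance(
    [{"page": "/home"}],
    Provenance(
        created_by="etl-team",
        created_at=datetime(2013, 8, 16, 12, 30, tzinfo=timezone.utc),
        source_system="web-logs",
        method="hourly batch export",
        interpretation="one row per page view",
    ),
)
```

Keeping the context next to the data means anyone who later picks up the data set can see how to read it without tracking down its author.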
The quality of a Big Data system is dependent on the quality of input data streams, data processing jobs, and output delivery systems.
Input Quality: As the number, diversity, frequency, and format of data channel sources explode, it is critical that enterprise-grade Big Data platforms track the quality and consistency of data sources. This also informs downstream alerts to consumers about changes in quality, volume, velocity, or the configuration of their data stream systems.
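A minimal sketch of this kind of input tracking, assuming each incoming batch is a list of dictionaries and that a per-source baseline has been agreed in advance. The `Baseline` fields, the thresholds, and the `check_batch` function are hypothetical and only illustrate the idea of turning quality and volume changes into alerts for downstream consumers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Baseline:
    """Hypothetical expectations for one data source."""
    expected_rows: int             # typical batch size
    max_null_rate: float           # tolerated share of missing values
    volume_tolerance: float = 0.5  # allowed fractional deviation in batch size


def check_batch(
    rows: List[Dict[str, Optional[str]]],
    required_field: str,
    baseline: Baseline,
) -> List[str]:
    """Return human-readable alerts; an empty list means the batch looks normal."""
    alerts = []
    n = len(rows)

    # Volume check: compare batch size with the expected size.
    deviation = abs(n - baseline.expected_rows) / max(baseline.expected_rows, 1)
    if deviation > baseline.volume_tolerance:
        alerts.append(f"volume: got {n} rows, expected about {baseline.expected_rows}")

    # Quality check: share of rows missing the required field.
    nulls = sum(1 for row in rows if row.get(required_field) is None)
    null_rate = nulls / n if n else 1.0
    if null_rate > baseline.max_null_rate:
        alerts.append(f"quality: {null_rate:.0%} of rows are missing '{required_field}'")

    return alerts


# Usage with made-up data: a small batch where half the rows lack a user_id.
baseline = Baseline(expected_rows=10, max_null_rate=0.1)
batch = [{"user_id": "a"}, {"user_id": None}, {"user_id": "b"}, {}]
alerts = check_batch(batch, "user_id", baseline)
```

Here the batch is both smaller than expected and missing too many values, so `alerts` contains one volume message and one quality message that could be forwarded to the stream's consumers.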
Analytical Job Quality: A Big Data system should track and notify users about the quality of the jobs (such as MapReduce or event processing jobs) that process incoming data sets to produce intermediate or output data sets.
Output Quality: Quality checks on the outputs from Big Data systems ensure that transactional systems, users, and apps offer dependable, high-quality insights to their end users. The output from Big Data systems needs to be analyzed for delivery predictability, statistical significance, and access according to the constraints of the transactional system.
Though we've explored how portability, usability, and quality separately influence the consistency, quality, dependability, and predictability of your data systems, remember that it's the combination of the "ilities" that determines whether your Big Data system will deliver actionable enterprise-grade insights.
This piece is the first in a three-part series on how businesses can squeeze maximum business value out of their Big Data analysis.