DataOps in Analytics

Access to real-time, actionable insights can mean the difference between success and failure in business. Failure to get clean, relevant data to the right systems and teams can result in poor decisions or missed market opportunities.

The Challenge

At its heart, data analytics is about separating the signal (insight) from the noise (data). Modern businesses are continually challenged to rapidly find insights that help them understand their business and unlock innovation to drive competitive differentiation. Data analysts are tasked with divining these business truths from an ocean of data, in real time. But achieving this goal becomes increasingly difficult as data increases in size and complexity, even as demand for quicker insights mounts.

For analytics-driven enterprises struggling with outdated data, dirty data, and insufficient data environments for creating and testing models, DataOps provides an approach for delivering the right data, to the right place, at the right time.

DataOps Success Patterns

People

  1. Create cross-functional teams: Analytics should not be the responsibility of one particular person or function. Cross-functional teams should be oriented around business goals, and maintain a holistic view of data independent of their function.
  2. Understand the full data lifecycle: Both data consumers and data operators must understand the lifecycle of their data, from sourcing, aggregation, and cleansing to analysis, visualization, and action.

Process

  1. Set a review cadence: As a team, routinely review the data lifecycle to identify and eliminate inefficiencies.
  2. Consider downstream impact: When the structure of source data changes, such as with a new application release, ensure that all downstream consumers are part of the review and release process to minimize disruption.
  3. Automate: Seek to replace human data controls with automated ones. For example, instead of protecting data by manually executing a checklist, test and validate via automated controls.
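As an illustration of the "Automate" pattern above, the following is a minimal sketch of a manual checklist expressed as automated checks that can run on every data delivery. The field names (customer_id, amount), the Check class, and the validate function are hypothetical and chosen only for this example; real rules would come from the team's own data contracts.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Row = Dict[str, object]  # one record, e.g. a parsed CSV row or query result


@dataclass
class Check:
    """A named rule that a single record either passes or fails."""
    name: str
    passes: Callable[[Row], bool]


def validate(rows: Iterable[Row], checks: List[Check]) -> List[Tuple[int, str]]:
    """Return (row_index, check_name) for every check that a row fails."""
    failures = []
    for index, row in enumerate(rows):
        for check in checks:
            if not check.passes(row):
                failures.append((index, check.name))
    return failures


# Hypothetical rules standing in for items on a manual checklist.
CHECKS = [
    Check("customer_id is present", lambda r: bool(r.get("customer_id"))),
    Check(
        "amount is non-negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]

if __name__ == "__main__":
    sample = [
        {"customer_id": "C1", "amount": 10.0},
        {"customer_id": "", "amount": -5},
    ]
    for index, name in validate(sample, CHECKS):
        print(f"row {index} failed: {name}")
```

Because the rules are plain data, they can be reviewed, versioned, and run in a pipeline in the same way as application tests.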

Technology

  1. Version all the things: Although version control is usually associated with code, forward-thinking organizations are also applying it to their data, security controls, and even analytics reports and models.
  2. Toolchain optimization: Focus on tools that enable repeatability, self-service, scale, and automation. For example, implement a solution that enables self-service data requests and fulfillment across the enterprise, with proper role-based access and governance controls.
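One simple way to start versioning data, as suggested in "Version all the things" above, is to derive a fingerprint from a dataset's content so that any change produces a new version identifier. The sketch below assumes the dataset fits in memory as a list of JSON-serializable records; the function name dataset_version and the 12-character length are illustrative choices, not part of any particular tool.

```python
import hashlib
import json
from typing import Dict, List


def dataset_version(rows: List[Dict[str, object]]) -> str:
    """Return a short, deterministic fingerprint of a list of records.

    Key order inside a record does not affect the result, but row order
    does: reordering rows is treated as a change to the dataset.
    """
    # Serialize with sorted keys and fixed separators for a canonical form.
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return digest[:12]  # shortened for readability (an illustrative choice)


if __name__ == "__main__":
    v1 = dataset_version([{"region": "EMEA", "sales": 100}])
    v2 = dataset_version([{"region": "EMEA", "sales": 101}])
    print("version before change:", v1)
    print("version after change: ", v2)
    print("changed:", v1 != v2)
```

Storing this identifier alongside a report or model records which version of the data it was built from.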

Additional information about applying DataOps to analytics can be found at http://dataopsmanifesto.org

Data Operators and Consumers

Data friction exists between two groups of people: Data Operators and Data Consumers. Here are some examples of each in data analytics.

Data Operators

  • Database Team
  • Storage Team
  • Server Team
  • Information Security
  • Data Architects
  • Data Engineers

Data Consumers

  • Analysts
  • Data Scientists
  • Business Dashboard Owners