DataOps in the SDLC

Developers and testers require access to test data at multiple stages. DataOps can be applied to the software development lifecycle (SDLC) to increase release speed and improve application quality.

The Challenge

Most organizations are adopting continuous integration/continuous delivery (CI/CD) and DevOps practices in their SDLC. This typically includes codifying and automating infrastructure, application, and code rollouts. Provisioning data, however, is often left to a manual or bespoke process, from request all the way to delivery.

This usually leads to one of two compensating approaches: 1) Data Operators take hours, days, or even weeks to provision, restore, or refresh data for a waiting CI/CD pipeline, or 2) Data Consumers (developers or testers) generate their own data to meet the demand. The former delays project timelines. The latter invites quality and consistency issues that cause delays later in the pipeline. Furthermore, data model states and test data are rarely version-controlled or tested.

Efficient DataOps eliminates the data inconsistencies and delays that foster a toxic culture across Dev, QA/Test, and Ops. By aligning operators and consumers and bringing automation to data, DataOps helps organizations realize the promise of automated SDLC pipelines.

DataOps Success Patterns

People

  1. Understand the data lifecycle: Developers, testers, and operations teams must understand the full lifecycle of their data, from sourcing through cleansing, subsetting, securing, and provisioning to retirement.
  2. Align Data Consumers and Operators: Agree on shared goals and objectives, e.g. increasing release cadence, reducing data-related defects, or increasing test frequency.

Process

  1. Adopt an “automation-first” policy for data: Manual data processes should be permitted only by documented exception.
  2. Eliminate manual security controls: Automation initiatives must encompass key data security processes to not only improve speed and productivity, but also mitigate risk.
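One way to picture an automated security control is a masking step that runs inside the pipeline, followed by a check that fails the build if sensitive values slip through. The following is a minimal sketch under stated assumptions, not a prescribed implementation: the field names in SENSITIVE_FIELDS, the "masked_" prefix, the salt, and the function names are all hypothetical, and salted hashing stands in for whatever masking technique an organization actually uses.

```python
import hashlib

# Hypothetical set of sensitive field names. A real policy would more likely
# come from a data catalog or the security team than from a hard-coded set.
SENSITIVE_FIELDS = {"email", "ssn"}


def mask_value(value: str, salt: str = "demo-salt") -> str:
    """Replace a value with a short, deterministic token (illustrative only)."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "masked_" + digest[:12]


def mask_record(record: dict) -> dict:
    """Return a copy of the record with every sensitive field masked."""
    return {
        key: mask_value(str(val)) if key in SENSITIVE_FIELDS else val
        for key, val in record.items()
    }


def assert_masked(records: list) -> None:
    """Raise ValueError if any sensitive field is still unmasked."""
    for row in records:
        for name in SENSITIVE_FIELDS.intersection(row):
            if not str(row[name]).startswith("masked_"):
                raise ValueError(f"unmasked sensitive field: {name}")
```

In a pipeline, mask_record would run when data is provisioned and assert_masked would run as a gate before the data reaches a development or test environment, so that no manual review is needed for the common case.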

Technology

  1. Avoid point solutions: Embrace platform solutions that have demonstrated interoperability with your existing SDLC toolchain, can be automated through APIs, and support multiple data sources and clouds.
  2. Empower teams with self-service: Implement solutions that enable self-service data request and fulfillment while providing robust test data management and provisioning capabilities.
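To illustrate what self-service request and fulfillment could look like when it is driven through an API rather than a ticket queue, here is a small sketch. Everything in it is an assumption made for illustration: the DataRequest and Provisioner classes, the APPROVED_SOURCES catalog, and the returned handle format do not correspond to any particular product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical catalog of datasets that Data Operators have pre-approved.
APPROVED_SOURCES = {"orders_masked", "customers_masked"}


@dataclass
class DataRequest:
    requester: str
    source: str
    environment: str  # e.g. "dev" or "qa"


@dataclass
class Provisioner:
    audit_log: list = field(default_factory=list)

    def fulfill(self, request: DataRequest) -> str:
        """Approve or reject a request by policy and record the decision."""
        approved = request.source in APPROVED_SOURCES
        self.audit_log.append({
            "requester": request.requester,
            "source": request.source,
            "approved": approved,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        if not approved:
            raise PermissionError(f"{request.source} is not an approved source")
        # A real platform would clone or virtualize the dataset here; this
        # sketch only returns an illustrative handle.
        return f"{request.environment}-{request.source}-{len(self.audit_log)}"
```

A Data Consumer or a CI/CD job would call Provisioner().fulfill(DataRequest("dev1", "orders_masked", "qa")) and receive a handle immediately, while operators keep control through the approved catalog and the audit log.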

Data Operators and Consumers

Data friction exists between two groups of people: Data Operators and Data Consumers. Here are some examples of each in the software development lifecycle:

Data Operators

  • Database Team
  • Storage Team
  • Server Team
  • Information Security

Data Consumers

  • Developers
  • Testers
  • QA