Organizations are turning increasingly to Agile for IT project implementation. Business and technology leaders understand the potential benefits of Agile, but they don’t always realize how challenging it can be to apply Agile principles across different kinds of projects—especially data analytics projects.
When I say “Agile,” I am referring to the Scrum methodology, as opposed to other Agile variations such as AUP, FDD and XP. Businesses favor Agile development because the basic framework is built on the principle of work breakdown: when requirements are broken down to the most granular level, they become faster and simpler to develop, test and deploy. The second most important factor is getting business value from every sprint. Finally, Agile gives teams the flexibility to correct course or make changes.
Bringing Agile to big data
Agile is well suited for traditional application development projects, but data-intensive projects (such as data warehousing, business intelligence and big data) face particular challenges across the entire project life cycle, from requirements to user acceptance testing (UAT). These include:
Requirements: Agile requirements are captured under the structure of Themes > Epics > User stories, in that order. Each user story allows a user to accomplish a task or gain some business value. This poses a challenge for business analysts trying to articulate data integration and report or dashboard requirements, because much of what needs to be done may not fit the Agile definition of a user story. So how can they define the requirements for data quality, table creation or data ingestion?
Development: Development in Scrum begins on day one of the sprint. For a BI or analytics project, certain prerequisites must be accomplished before the development team can start work:
- Source system analysis
- Database or data mart design, both high-level design (HLD) and low-level design (LLD)
- Infrastructure setup
- Data availability
Testing: Each sprint runs like a mini waterfall model. In an app development scenario, we would have development, testing, UAT and integration. But in the case of reporting, the sequence changes: development, testing, report development, report testing, UAT and deployment. This adds complications to an already tight timeline. Even more important, these activities need to happen in a certain order. Not everything can take place in parallel, even though parallel work is another premise of Agile.
Tips for applying Agile to data projects
Over the past few years, Mindtree has executed data and analytics projects in Agile. Success relies on understanding what the customer and vendor know about Agile principles and how they want to adopt those principles in practice, not just on understanding the theory behind them.
We’ve learned the following lessons from our Agile endeavors in data-intensive projects:
Educate and strategize: All stakeholders must understand and appreciate the fundamentals of Agile and be open to the fact that, as with every other process, this one requires customization. One example is sprint duration. Depending on user stories and effort estimation, we could plan for a three- to four-week sprint instead of a two-week sprint. The effort required for a BI project ideally warrants a longer sprint.
During a recent customer requirements workshop, we were creating a product backlog for the analytics platform. Because of the data components involved, the product owner was unsure how to capture user stories that were technology-driven rather than business-driven. We agreed on a structured way to track these stories, which enabled us to deliver additional business value and complete the necessary technical work.
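The exact tracking scheme is specific to each engagement, but one possible shape for it can be sketched in a few lines of Python. The sketch below is only an illustration under assumptions: the class names, the "business" and "technical" labels, the story titles and the point values are all hypothetical. It keeps technical stories inside the same Themes > Epics > User stories hierarchy and simply tags them, so the effort they consume stays visible next to the business-facing work.

```python
from dataclasses import dataclass, field


@dataclass
class UserStory:
    title: str
    kind: str    # hypothetical label: "business" or "technical"
    points: int  # hypothetical effort estimate in story points


@dataclass
class Epic:
    name: str
    stories: list[UserStory] = field(default_factory=list)


@dataclass
class Theme:
    name: str
    epics: list[Epic] = field(default_factory=list)


def points_by_kind(theme: Theme) -> dict[str, int]:
    """Sum story points per story kind across every epic in a theme."""
    totals: dict[str, int] = {}
    for epic in theme.epics:
        for story in epic.stories:
            totals[story.kind] = totals.get(story.kind, 0) + story.points
    return totals


# Illustrative backlog: one business story and the technical stories it depends on.
theme = Theme(
    name="Sales analytics",
    epics=[
        Epic(
            name="Regional sales dashboard",
            stories=[
                UserStory("View monthly sales by region", "business", 5),
                UserStory("Ingest daily sales feed", "technical", 8),
                UserStory("Create sales fact table", "technical", 3),
                UserStory("Add data quality checks on the feed", "technical", 3),
            ],
        )
    ],
)

totals = points_by_kind(theme)
print(totals)
```

Grouping the technical stories under the epic they enable keeps the link to business value explicit, which is the point of tracking them in a structured way rather than as untracked side work.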
Sprint zero planning: Scrum teams should have all prerequisite activities and tasks planned and executed in order to begin development in sprint one. This will help teams avoid any roadblocks in the first few sprints, which is when the customer will be looking to validate the approach and capability of the Scrum team. Typically, these activities include:
- Infrastructure setup
- Environment setup
- Tool and software access for team members
- Data architecture
- HLD and LLD implemented for the data model
- Product backlog with prioritized user stories
- Sufficient number of sprint-ready user stories
- High-level sprint plan
- UAT plan
Activity sequencing: During sprint planning, teams must understand the sequence of activities in a data and analytics project. Many novice Scrum teams fail to include this information in their effort estimation. Always account for interdependencies between tasks such as data ingestion, report creation and testing. For example, if two reports will be created in a two-week sprint, that does not mean the tester will have two weeks for her task. She can begin testing only after all other development work is complete, which may happen toward the end of the sprint, so the plan must still leave sufficient time for bug fixing and retesting as needed.
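To make the effect of these interdependencies concrete, here is a minimal scheduling sketch in Python. It is an illustration only: the task names, durations and the 10-working-day sprint are assumed values, and the model assumes every task starts as soon as its prerequisites finish (which implies enough people to build both reports in parallel). It uses the standard library's `graphlib.TopologicalSorter`, available from Python 3.9.

```python
from graphlib import TopologicalSorter

SPRINT_DAYS = 10  # assumed: a two-week sprint with 10 working days

# Hypothetical sprint tasks: name -> (duration in days, prerequisite tasks)
TASKS = {
    "data_ingestion": (3, []),
    "data_testing": (1, ["data_ingestion"]),
    "report_a_dev": (2, ["data_testing"]),
    "report_b_dev": (2, ["data_testing"]),
    "report_testing": (2, ["report_a_dev", "report_b_dev"]),
    "uat": (1, ["report_testing"]),
}


def schedule(tasks):
    """Return {task: (start_day, end_day)}, starting each task once all prerequisites end."""
    graph = {name: set(deps) for name, (_, deps) in tasks.items()}
    plan = {}
    # static_order() yields every task after all of its prerequisites.
    for name in TopologicalSorter(graph).static_order():
        duration, deps = tasks[name]
        start = max((plan[dep][1] for dep in deps), default=0)
        plan[name] = (start, start + duration)
    return plan


plan = schedule(TASKS)
for name, (start, end) in sorted(plan.items(), key=lambda item: item[1]):
    print(f"{name:15} day {start} -> day {end}")

# Days left after the last task ends, available for bug fixing and retesting.
buffer_days = SPRINT_DAYS - max(end for _, end in plan.values())
print("buffer days:", buffer_days)
```

With these assumed numbers, report testing cannot start before day 6 and only one day of buffer remains at the end of the sprint. Running the estimate this way during sprint planning shows early whether the sequence fits or whether a story should move to the next sprint.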
Taking advantage of everything Agile has to offer means being prepared. If your next data-intensive IT project needs an Agile boost, but the prospect of planning and executing all the necessary steps seems intimidating, Mindtree is here for you. Contact us to begin getting your organization ready for Agile success.