Tackling Data Deluge in the Era of Industry 4.0
The manufacturing industry is on an exciting path to unlock new possibilities with Industry 4.0 – a paradigm shift in which machines become more connected, intelligent, and eventually autonomous, powered by IoT and data. Even today, shop floors generate vast amounts of data. But are we using it effectively to move towards Industry 4.0? At the 15th Manufacturing Leadership Summit held in 2019, Mindtree moderated a think tank – Coping with Data Deluge in Manufacturing 4.0 – in which leaders from a range of manufacturing companies discussed four key aspects of the topic:
1. What are the challenges presented by this data deluge?
2. What are some of the paradigms to deal with these challenges?
3. How can one use this data for actionable insights?
4. What type of governance is required over the long run?
It’s easy to see that the key reason for the data deluge is the ever-expanding number of data sources – from the machines on the shop floor to enterprise systems, partner systems, and the wider world. The granularity of the data collected further compounds the challenge. One participant mentioned that their organization collects 12 billion data points a year. Participants also agreed that data is often collected “just in case it is needed at some point.”
Let’s deep dive into some of the aspects of the discussion:
1. What are the challenges presented by the data deluge?
Brownfield environments are an accepted fact of life, given the continued existence of factories that were set up some time ago. They have slightly different naming conventions, different tags, classifications, and hierarchies – and, as one manufacturer mentioned, different fault codes for the same event. This creates the fundamental challenge of disambiguating the data while still being able to look cohesively across multiple lines in multiple factories.
As another participant noted, their organization had been collecting data for 20 years without clarity on what to do with it. In addition to the three Vs of big data (volume, velocity, variety), two more Vs – veracity and value – must now be considered when deciding what to collect.
Another participant had questions about solving data-conflict problems in a siloed environment. Mindtree’s response: instead of trying to connect the silos physically, create unique identities to segment the data and use a data lake as a landing zone. As a next step, develop a catalog to organize it. Finally, create microservices-based intelligent consumption models.
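The steps above can be sketched in a few lines of code. This is a minimal, hypothetical illustration – the `LandingZone` class, source names, and payloads are invented for this example, not part of any Mindtree product – showing how a unique identity per silo, a raw landing zone, and a small catalog enable domain-scoped consumption without physically merging the silos.

```python
from dataclasses import dataclass, field
import uuid

@dataclass
class LandingZone:
    records: list = field(default_factory=list)
    catalog: dict = field(default_factory=dict)   # source_id -> metadata

    def register_source(self, name: str, domain: str) -> str:
        source_id = str(uuid.uuid4())             # unique identity per silo
        self.catalog[source_id] = {"name": name, "domain": domain, "count": 0}
        return source_id

    def land(self, source_id: str, payload: dict) -> None:
        # Raw data lands untransformed; only the source identity is attached.
        self.records.append({"source_id": source_id, **payload})
        self.catalog[source_id]["count"] += 1

    def by_domain(self, domain: str) -> list:
        # A microservices-style consumption model would query via the catalog.
        ids = {sid for sid, m in self.catalog.items() if m["domain"] == domain}
        return [r for r in self.records if r["source_id"] in ids]

zone = LandingZone()
mes = zone.register_source("plant-A-MES", "manufacturing")
erp = zone.register_source("central-ERP", "supply-chain")
zone.land(mes, {"machine": "M1", "fault_code": "F42"})
zone.land(erp, {"order": "PO-7", "qty": 100})
print(len(zone.by_domain("manufacturing")))  # → 1
```

The key design point is that consumers query through the catalog rather than reaching into the source systems, so silos keep producing data unchanged.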
2. What are some of the paradigms to deal with data deluge?
If we split the problem statement into three parts – (i) the cleanliness of data, (ii) the processing of data, and (iii) the use of data – then specific directions emerge to deal with the problem:
o Data dictionaries can help tackle variations in tag names across sources, but the sheer number of tags makes manual processing challenging. Data catalogs come to the rescue here: some can automatically disambiguate metadata, while others use advanced techniques such as machine learning (ML) and natural language processing (NLP) to identify relationships. For new data, it’s best to establish a formal naming convention (a declarative approach); for legacy data, use a derivative approach such as automated methods.
o Once that’s done, processing becomes much more straightforward, especially when leveraging the scalability of the cloud. The data required to train an ML model is fine-grained, but once training is complete, the model itself can be deployed to filter out pertinent data at the edge, reducing eventual processing volumes.
o The question of data usage elicited a number of interesting responses. One participant mentioned that their organization had 20 years’ worth of transaction data and was able to generate a 200K impact with just a quick, time-boxed analysis of production history and the history of tickets and jobs.
o Another participant emphasized the Pareto principle, or the 80:20 rule. Using it, their organization quickly determined that it needed to hire additional salespeople in a specific region – something that was not obvious earlier. The discussion also underscored how crucial it is to operate with a goal in mind, identify problem domains, and fix on use cases that have business value.
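The tag-disambiguation idea in the first bullet above can be sketched very simply. This is an illustrative toy, not a real catalog: the canonical tag list and variant spellings are invented, and the fuzzy match stands in for the ML/NLP techniques that production catalogs use.

```python
import difflib

# Hypothetical canonical dictionary; real catalogs hold thousands of tags.
CANONICAL = ["spindle_speed", "coolant_temperature", "vibration_rms"]

def normalize(tag: str) -> str:
    # Fold case and separator differences before matching.
    return tag.lower().replace("-", "_").replace(" ", "_")

def disambiguate(tag, cutoff=0.6):
    # Fuzzy-match the normalized tag against the canonical dictionary.
    matches = difflib.get_close_matches(normalize(tag), CANONICAL, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(disambiguate("Spindle-Speed"))   # → spindle_speed
print(disambiguate("coolant temp"))    # → coolant_temperature
```

Tags that clear the cutoff are mapped automatically; the remainder are queued for human review, which is where the declarative convention for new data pays off.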
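The edge-filtering idea in the second bullet can likewise be sketched. Here a trained ML model is stood in for by a simple learned threshold – an assumption made purely to keep the example self-contained – with the edge device forwarding only pertinent readings upstream.

```python
def make_edge_filter(threshold: float):
    # Stand-in for a trained model deployed to the edge device.
    def is_pertinent(reading: float) -> bool:
        return reading > threshold
    return is_pertinent

pertinent = make_edge_filter(threshold=0.7)
stream = [0.1, 0.92, 0.3, 0.88, 0.2, 0.75]      # invented sensor readings
forwarded = [r for r in stream if pertinent(r)]  # only these leave the edge
print(forwarded)  # → [0.92, 0.88, 0.75]
```

Training still needs the fine-grained data, but once deployed, the filter cuts the volume that ever reaches central processing.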
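The Pareto analysis in the last bullet amounts to a cumulative-share cutoff. A minimal sketch, using invented revenue figures rather than any participant's data, finds the smallest set of regions accounting for roughly 80% of the total:

```python
revenue_by_region = {"APAC": 120, "EMEA": 340, "AMER": 510, "LATAM": 30}

def pareto_head(data: dict, share: float = 0.8) -> list:
    # Walk regions in descending order until the cumulative share is reached.
    total = sum(data.values())
    head, running = [], 0
    for region, value in sorted(data.items(), key=lambda kv: -kv[1]):
        head.append(region)
        running += value
        if running >= share * total:
            break
    return head

print(pareto_head(revenue_by_region))  # → ['AMER', 'EMEA']
```

The few regions in the head of the list are where extra sales capacity pays off first – the kind of non-obvious, goal-directed insight the participants described.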
3. How can one use voluminous data for actionable insights?
A good data strategy is the starting point, but it’s important to keep it simple. There are three stages of analytics: descriptive, predictive, and prescriptive. Descriptive covers the here and now of running operations; predictive identifies what may happen – for example, an imminent failure; and prescriptive determines what to do about it in terms of concrete actions. While a number of participants cited using data mostly for operational aspects, a smaller number indicated they have begun the journey to predictive models, and most are yet to use prescriptive models. In general, operational efficiency is the key thrust area for leveraging data, followed by industry-specific data monetization.
4. What types of governance must be established to sustain value in the long run?
Unsurprisingly, this topic was highly debated. Earlier, both the generation and consumption of data rested with the same organization. Now, with multiple consumers of data and multiple correlations across data sets, it is important to identify not just governance but also stewardship. A well-established method to achieve this is to use a cascaded model:
a) At the organizational level, establish “rules” for semantics, security, and data movement.
b) At the domain level (for example, manufacturing, supply chain), establish structure, design, and consumption models.
c) At a team/user level, enable only consumption.
According to another participant, their organization started at the domain level and never moved up to the organizational level, because technology took the front seat and everyone lost sight of the structure. Similarly, another company got caught up in the security aspects of data and never looked into its structure or other areas. It’s important to note that technology cannot be the leading agent.
Overall, the think tank corroborated a number of observations and approaches discussed above. The bottom line: there’s always opportunity in the guise of a potential problem. The most important aspect to identifying that opportunity is to move from an intuition-based approach to one that is insight-based and data-driven.
At Mindtree, we help global manufacturers use design thinking and ideation workshops to prioritize their data and analytics challenges, and drive business value. Coupled with Decision Moments, a platform that helps rapidly explore data and generate initial insights to establish direction, we can help set the stage for your Industry 4.0 journey. Schedule a design thinking workshop at one of our Digital Pumpkins. We’d love to hear from you.
Learn how to transform your manufacturing enterprise into an intelligent enterprise with Mindtree
Also feel free to write to us here.