There is a perception that predictive analytics is a solution looking for a problem*. It is often not clear how this capability fits into the great jigsaw puzzle of SAP systems and solutions. It’s also often not clear to business decision makers how this mysterious force can be harnessed to improve their enterprise processes. Let me open your eyes to the potential…
In the past I would have explained that predictive analytics comes in several forms from an SAP perspective: from the “Predictive Analytics” tool to predictive algorithms running in SAP HANA using the “Predictive Analytics Library” (PAL). The selection and application of these tools or algorithm libraries would depend on the client’s use cases. Classic SAP Business Warehouse (BW) also has embedded functionality for predictive analytics in the form of the “Analysis Process Designer” (APD). This could be used for regression and data mining, but the options were limited. As a result, there was a disconnect between the “New World” of HANA, with all its enhanced capabilities, and the “Old World” of BW, because the user interfaces and toolsets were different.
Voyage to the New World
The introduction of “HANA Analysis Processes” (HAP) in BW version 7.4 SP5 has vastly changed the way predictive is deployed in BW solutions. APDs can now be replaced by HAPs; apart from providing us with yet another acronym, what does this actually mean? The key point to note is that rather than a limited and prescribed list of algorithms being available, the power of the HANA “Application Function Library” (AFL) as well as open-source libraries in the “R” statistical programming language are now available for use. And what do I mean by “use”? Well, now we can embed AFL functions and R script in data flow transformations, which provides massive potential to enhance data flows with more advanced analytics. Quite apart from the ultra-fast, optimised PAL algorithms delivered by SAP that run in-memory, we now have access to thousands of open-source algorithms in R. These are widely used globally within the science community and also increasingly in enterprise applications.
Figure 1: Example HAP scenario
Example: fraud detection in supplier invoicing
To help demonstrate the difference that a predictive algorithm could make to a BW data flow, here’s an example where fraudulent supplier invoices are detected. This type of fraud does occur and can be hard to detect, particularly if the fraudulent amounts are small and sporadic.
Old World: Supplier invoices received in ECC flow through to BW via batch processing and are transformed and aggregated before being presented in an invoice report. Potentially, an exceptions report based on heuristic business rules is used to flag potentially fraudulent behaviour. This is accurate as long as the patterns that indicate such behaviour are understood and included in the business rules behind the report. If conditions change, a number of supplier invoices could be flagged as suspicious although they are quite legitimate. The business rules would then need to be revised and re-deployed or, worse, ignored.
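To make the weakness of fixed rules concrete, here is a minimal Python sketch. The threshold, the function name and the invoice amounts are all made up for illustration; this is not how any BW exceptions report is actually configured.

```python
# Hypothetical heuristic rule: flag any invoice above a fixed amount.
FIXED_LIMIT = 1000.0  # assumed threshold, chosen for illustration only


def flag_by_rule(amounts, limit=FIXED_LIMIT):
    """Return the amounts that break the fixed business rule."""
    return [a for a in amounts if a > limit]


# Made-up invoice amounts before and after a legitimate 30% price rise.
before = [820.0, 910.0, 870.0, 4800.0, 950.0]
after = [round(a * 1.3, 2) for a in before]

flagged_before = flag_by_rule(before)
flagged_after = flag_by_rule(after)

print(len(flagged_before), "flagged before the price rise")
print(len(flagged_after), "flagged after the price rise")
```

Before the price rise only the one unusually large invoice breaks the rule. Afterwards every invoice does, so the report is no longer useful until somebody revises the limit.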
New World: Supplier invoices flow from ECC to BW (on HANA, BW/4HANA, etc.) and are processed automatically by an HAP embedded in a data flow that feeds the invoice report. This time, predictive algorithms kick in rather than heuristic business rules. They look for patterns in the invoice data that appear anomalous and could therefore be flagged as suspicious. The algorithms would have been tuned and refined using historical data, and this process would be ongoing as new data arrived. Importantly, what constitutes anomalous behaviour would be decided by the algorithm and validated in an exception report by a human user. If the machine got the wrong answer, this would be taken into account for future processing. The machine would learn from its mistakes.
So, in the “New World” there is no need to re-configure the business rules that deal with detection of fraudulent behaviour, since the predictive algorithm is constantly learning and refining its method for detecting fraud.
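The feedback loop described above can be sketched in a few lines of Python. This is only a stand-in under stated assumptions: it uses a simple median-based score (the "modified z-score") rather than whatever an HAP or PAL function does internally, and the class name, cutoff and invoice amounts are hypothetical.

```python
from statistics import median


class AdaptiveDetector:
    """Flags amounts that sit far from the median of the history seen so far."""

    def __init__(self, history, cutoff=3.5):
        self.history = list(history)
        self.cutoff = cutoff  # assumed cutoff for the modified z-score

    def is_suspicious(self, amount):
        med = median(self.history)
        # Median absolute deviation: a robust measure of spread.
        mad = median(abs(x - med) for x in self.history)
        if mad == 0:
            return amount != med
        return 0.6745 * abs(amount - med) / mad > self.cutoff

    def confirm_legitimate(self, amount):
        """Human feedback: the invoice was fine, so add it to the history."""
        self.history.append(amount)


# Made-up historical amounts at the old price level.
detector = AdaptiveDetector([820, 860, 870, 890, 905, 910, 950])
first_verdict = detector.is_suspicious(1150)  # new price level, not yet seen

# A user reviews the exception report and confirms these as legitimate.
for amount in [1150, 1180, 1130, 1200, 1160, 1170, 1190, 1140]:
    detector.confirm_legitimate(amount)

second_verdict = detector.is_suspicious(1150)
fraud_verdict = detector.is_suspicious(4800)
print(first_verdict, second_verdict, fraud_verdict)
```

In this toy run the invoice at the new price level is flagged at first, stops being flagged once the user has confirmed similar invoices, and the very large invoice stays suspicious throughout. Nobody had to edit a rule.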
A quick demo
HAPs have nine predictive functions available as standard. I applied the “Anomaly Detection” function (0BW_OPER_OUTLIERS) to the dataset provided in the example diagram above to see whether it could truly pick out the anomalous invoices. In order to set this up, I created a small BW data flow to load the transactional invoice data into a DSO as the source for the HAP. For convenience, the output went to a persisted “Analytic Index”, although this could have been another DSO. To compare the results with an industry standard I also processed the same dataset using the “outliers” library in R.
Figure 2: BW on HANA HAP setup
The results are shown below. As you can see, both the R and HAP algorithms identified the cluster of anomalous invoices in the top right of the scatter plot, although the two algorithms differ slightly and so a few points were classified differently. What this does show is how useful it is to be able to test the outputs of an HAP against an alternative but similar benchmark.
Figure 3: Comparison of performance of outlier detection algorithms: SAP vs. R. Detected outliers are coloured in red.
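The same kind of benchmarking can be tried on a small scale in Python. The sketch below compares two common textbook outlier rules on made-up invoice amounts. Neither rule is the 0BW_OPER_OUTLIERS function or the R “outliers” library; they are assumptions chosen only to show how two similar detectors can agree on the obvious cases and differ at the margin.

```python
from statistics import median, quantiles


def iqr_outliers(values, k=1.5):
    """Tukey's rule: flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return {v for v in values if v < low or v > high}


def mad_outliers(values, cutoff=3.5):
    """Modified z-score based on the median absolute deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return {v for v in values if v != med}
    return {v for v in values if 0.6745 * abs(v - med) / mad > cutoff}


# Made-up invoice amounts with two obvious anomalies and one borderline case.
amounts = [820, 860, 870, 890, 905, 910, 950, 1010, 1300, 4800, 5100]

by_iqr = iqr_outliers(amounts)
by_mad = mad_outliers(amounts)

print("flagged by both:", sorted(by_iqr & by_mad))
print("flagged by only one:", sorted(by_iqr ^ by_mad))
```

On this data both rules flag the two largest amounts, and only the median-based rule flags the borderline 1300. Differences like that are exactly what a benchmark is there to surface.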
Deploying this capability in your business
If you have BW on HANA at version 7.4 SP5 or above, or indeed BW/4HANA, then you already have this capability. The next step is to identify areas of your business that would benefit from advanced or predictive analytics. If needed, a small “Proof of Concept” (PoC) can then be built to test whether useful results can be achieved by applying this technology, before you take the plunge and fully integrate it with your BW landscape. It’s often a good idea to run a PoC or a pilot on a timescale of a few weeks, as this ensures the value of these seemingly complex solutions can be quantified and demonstrated to the business as quickly as possible.
SAP continues to commit to building predictive analytics into its product suite, and this is undoubtedly a good thing for all businesses that invest in the newest technologies SAP has to offer. Both algorithms and computing power are constantly improving, so the quality of predictive processes will keep improving too, and new use cases where predictive can play a valuable role will continue to emerge.
*The optical laser was also considered to be a solution looking for a problem, back when it was invented in 1960. The rest, as they say, is history.