Data-driven automation in the NOC
Let us share a case study on automatic incident formation, root-cause analysis and self-healing that we worked on as part of our research.
We applied machine intelligence principles (data mining and data science) to discover patterns of behavior in large historical datasets. These patterns are essentially correlations between alarms and their co-occurrences. One interesting aspect of our approach is that we did not treat the data only as a time series; we also considered how to process the largely symbolic or categorical information collected from the network and identify latent behaviors in it.
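To make the idea of co-occurrence between categorical alarms concrete, here is a minimal sketch that counts how often pairs of alarm types appear in the same fixed time window. It is only an illustration, not the method used in the study: the alarm names, the 60-second window and the `cooccurrence_counts` helper are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(alarms, window_seconds=60):
    """Count how often pairs of alarm types fall in the same time window.

    alarms: iterable of (timestamp_seconds, alarm_type) tuples.
    The fixed, non-overlapping window is an assumption of this sketch.
    """
    # Bucket the categorical alarm types by window index.
    windows = {}
    for timestamp, alarm_type in alarms:
        windows.setdefault(int(timestamp // window_seconds), set()).add(alarm_type)

    # Count each unordered pair once per window in which both types occur.
    pair_counts = Counter()
    for alarm_types in windows.values():
        for pair in combinations(sorted(alarm_types), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical sample data: (seconds since start, alarm type).
sample_alarms = [
    (0, "LINK_DOWN"), (5, "BGP_PEER_LOST"), (12, "HIGH_TEMP"),
    (65, "LINK_DOWN"), (70, "BGP_PEER_LOST"),
    (130, "HIGH_TEMP"),
    (185, "LINK_DOWN"), (190, "BGP_PEER_LOST"),
]
counts = cooccurrence_counts(sample_alarms)
for pair, count in counts.most_common():
    print(pair, count)
```

Pairs that keep showing up together across many windows are candidates for grouping; a real system would also need to handle sliding windows, topology and clock skew, which this sketch ignores.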
This approach helps domain experts learn unknown and evolving patterns of behavior in multi-technology, multi-vendor environments. The correlated patterns enable automatic grouping of alarms, which sets the stage for automated network incident detection, root-cause analysis and self-healing.
Using this approach, we can achieve intelligent grouping of alarms and tickets with minimal manual involvement, reduce or altogether avoid manual rule development by automatically identifying important and missing groupings, and reduce the overall number of trouble tickets.
Automatic incident detection
Grouping of fault conditions and alarms is made possible by:
- Embedding network information, such as alarms and events, into a telco knowledge graph that holds both raw and derived, insightful information; this graph forms the basis for automated intelligent behaviors at the NOC
- Automatically capturing the behaviors in the network data (alarms and events) in a data-driven manner as digitalized representations, which we refer to as machine learning (ML) generated rules
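As a rough illustration of how raw and derived information might sit side by side in a knowledge graph, the sketch below stores facts as (subject, predicate, object) triples and then derives a new relation from them. The `AlarmGraph` class, the predicate names and the identifiers are hypothetical and chosen only for this example; the post does not describe the actual graph schema.

```python
from collections import defaultdict


class AlarmGraph:
    """A tiny in-memory triple store (hypothetical, for illustration only)."""

    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))
        self.by_subject[subject].add((predicate, obj))

    def objects(self, subject, predicate):
        return {o for p, o in self.by_subject[subject] if p == predicate}


def derive_related(graph, alarm_ids):
    """Add a derived 'related_to' edge from alarm a to alarm b when b was
    raised on a node that a's node is connected to (edges are treated as
    directed in this sketch)."""
    for a in alarm_ids:
        for b in alarm_ids:
            if a == b:
                continue
            for node_a in graph.objects(a, "raised_on"):
                neighbours = graph.objects(node_a, "connected_to")
                if graph.objects(b, "raised_on") & neighbours:
                    graph.add(a, "related_to", b)


graph = AlarmGraph()
# Raw facts, as they might arrive from the network (assumed values).
graph.add("alarm:1", "type", "LINK_DOWN")
graph.add("alarm:1", "raised_on", "router:A")
graph.add("alarm:2", "type", "BGP_PEER_LOST")
graph.add("alarm:2", "raised_on", "router:B")
graph.add("router:A", "connected_to", "router:B")

# Derived fact, computed from the raw ones.
derive_related(graph, ["alarm:1", "alarm:2"])
print(graph.objects("alarm:1", "related_to"))
```

The point of the sketch is only that derived edges live in the same structure as raw ones, so later steps can query both uniformly.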
With this approach, we can automatically identify amalgamated and enriched conditions instead of looking at individual alarms one at a time. The data-driven capabilities automatically create composite conditions from historical information. In other words, pattern mining techniques perform intelligent grouping of cross-domain alarms. These composite conditions are expressed as ML-generated rules, which help detect groups of alarms; we call such a group an incident.
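The step from historical data to rules to incidents can be sketched as follows. This is a simplified stand-in, not the pattern mining technique used in the study: it enumerates small alarm sets by brute force and keeps those whose support (the fraction of historical windows containing the set) meets a threshold. The function names, alarm names, threshold and history are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_rules(transactions, min_support=0.5, max_size=3):
    """Return {alarm_set: support} for sets of 2..max_size alarm types whose
    support meets min_support. Brute-force enumeration, fine for tiny inputs."""
    total = len(transactions)
    counts = Counter()
    for alarm_set in transactions:
        for size in range(2, max_size + 1):
            for combo in combinations(sorted(alarm_set), size):
                counts[frozenset(combo)] += 1
    return {
        group: count / total
        for group, count in counts.items()
        if count / total >= min_support
    }


def detect_incidents(active_alarms, rules):
    """Return the largest mined groups fully present in the active alarms."""
    matched = [group for group in rules if group <= active_alarms]
    # Drop groups that are strict subsets of another matched group.
    return [g for g in matched if not any(g < other for other in matched)]


# Hypothetical history: each set is the alarm types seen in one time window.
history = [
    {"LINK_DOWN", "BGP_PEER_LOST", "OSPF_ADJ_DOWN"},
    {"LINK_DOWN", "BGP_PEER_LOST", "OSPF_ADJ_DOWN"},
    {"LINK_DOWN", "BGP_PEER_LOST"},
    {"HIGH_TEMP", "FAN_FAILURE"},
    {"HIGH_TEMP"},
    {"LINK_DOWN", "BGP_PEER_LOST", "OSPF_ADJ_DOWN"},
]
rules = mine_rules(history)

live = {"LINK_DOWN", "BGP_PEER_LOST", "OSPF_ADJ_DOWN", "HIGH_TEMP"}
incidents = detect_incidents(live, rules)
print(incidents)
```

Here the three routing-related alarms are reported as one incident, while the unrelated `HIGH_TEMP` alarm is left out because no mined rule covers it. A production system would likely use a scalable frequent-pattern algorithm and richer rule conditions than plain set membership.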