Revenue Assurance & Python
Abdalwahed Frea
RAFM Manager @ Almadar-Aljadid | Python Developer | Odoo/ERP Developer
Throughout my career as a Revenue Assurance Engineer, I have learned one thing for sure: there is no strict rule to follow, and there is no manual or guideline for doing RA yet. I once read RA in a telecom operator described as a "Jack of all trades". That phrase summarizes much of what an RA engineer in a large telecom operator should be, and after 5 years in the field I totally agree with it.
The challenges in RA are endless, starting with data ETL and the need for wide domain knowledge, especially of how each service and its technology works and of the OSS and BSS. Still, the biggest challenge is how to pinpoint revenue leakage, how to monitor and control it, and how to find the extreme cases that no one has come across, whether it is a margin recalculation or a technical detail whose risk no one has thought of.
Of course, there are controls and measures that have matured enough that any RA department or vendor will deploy them. They mostly emphasize general rating, charging, and usage counters for MSC, billing, roaming, etc. But the question remains: where exactly should they be deployed when dealing with a jungle of data sources and complex Call Detail Records (CDRs)? This still trips up even the most experienced people in the field, given each operator's unique case, deployed technology, and business model.
In a few lines, the RA team should meet two main requirements (in my opinion).
Firstly, they need a wide and in-depth understanding of telecom operations, preferably covering BSS, OSS, and the overall business models. Gaining this understanding takes time, and it might take several years to grasp fully, depending on the RA team's involvement and participation in all of the mentioned aspects, from voice, roaming, and data usage to SMS A2P and P2P assurance. Overall, it's an endless journey of learning to light up the dark spots.
Secondly, this job is about minimizing losses and leakage, so setting up controls and audit measures as soon as possible is the most critical task. What significantly boosts RA productivity and efficiency here is a deep knowledge of a general-purpose programming language such as Python, and that is what we will cover in this article.
Now let's bring it all together. Mobile data usage is in rising demand at every telecom operator with LTE and emerging 5G, and it contributes significantly to total revenue. It is therefore imperative for the RA team to pay close attention to data usage. But how and where to start? Through the following steps:
1. Gathering info: search for and fetch information from the related entities, formally or informally, and don't skip any involved party (Core, Billing, and the Mediation team), as each may hold valuable information.
2. Reading raw data: this is essential for building the full picture. Going through raw CDRs (GGSN, PGW, SGW, and billing) helps connect the dots and leads to a clearer overview. Gaps may become apparent at this stage; in that case we go back to point 1 for further searching, and it may take several cycles of gathering information and validating it against the raw CDRs.
3. Sorting things out: now that we have a good overview of how things work and what the CDRs contain, we deploy the ETL process. Decoding raw CDRs, parsing them, and mapping the data into the database is also challenging, especially given the large volume of data usage CDRs, but it is part of the process. Even in this phase we still have to refactor and collaborate based on what we learned in the previous steps.
4. Going extreme (risk assessment): this is where we really push the boundaries, exploring extreme cases and digging into scenarios that others may overlook. It's a time for brainstorming with colleagues, forming hypotheses, and conducting thorough searches. During this phase you will eventually come across cases where revenue leakage is confirmed to be taking place.
This method is also used in SMS A2P & P2P assurance, roaming and TAP assurance, VAS & DCB assurance, etc. All of this depends on each firm's culture and environment.
Now that we understand the data usage topic and have the required data, let's draw up the main controls and audits.
OK, how do we do all of that? Of course, we need a steady and continuous stream of GGSN and billing CDRs. Almost all CDRs are in ASN.1 format, but we will keep that for another topic; here we will work with CDRs in CSV format. The combined volume of GGSN and billing CDRs varies from operator to operator, depending on factors such as the size of the subscriber base and the network traffic. We will assume that the average user generates 500 CDRs per day, so a network with 1 million active data users will generate 500 million CDRs per day. The question is how to deal with this much data in a classic RDBMS, whether it's PostgreSQL, MySQL, or another. Even with good indexing and partitioning it is still a huge load, and even a simple query, such as aggregating the usage of some users or doing a simple join, might take hours to execute.
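To put the assumed volume in perspective, here is a small back-of-the-envelope calculation in Python. It only uses the two figures assumed above (1 million active data users and 500 CDRs per user per day); the variable names are illustrative.

```python
# Back-of-the-envelope CDR volume, using the figures assumed in the text.
ACTIVE_DATA_USERS = 1_000_000     # active data subscribers (assumed above)
CDRS_PER_USER_PER_DAY = 500       # average CDRs per user per day (assumed above)
SECONDS_PER_DAY = 24 * 60 * 60

cdrs_per_day = ACTIVE_DATA_USERS * CDRS_PER_USER_PER_DAY
cdrs_per_second = cdrs_per_day / SECONDS_PER_DAY

print(f"{cdrs_per_day:,} CDRs per day")
print(f"about {cdrs_per_second:,.0f} CDRs per second on average")
```

Under these assumptions, the pipeline has to absorb several thousand CDRs every second on average, before allowing for peak hours.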
We therefore came up with a new perspective on the issue: harnessing NoSQL can be a really good choice in this case. Instead of querying typical SQL tables, we aggregate the CDRs in real time, update each subscriber's document, and retain it in a NoSQL database such as MongoDB, all using a Python script.
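The idea can be sketched in a few lines of Python. This is a minimal illustration, not the script from the repository: the CSV columns (`msisdn`, `source`, `rat`, `bytes`), the sample rows, and the helper names `aggregate` and `to_update` are all assumptions made for the example. To keep it runnable without a database, the aggregation happens in memory; `to_update` shows how one CDR could be turned into a MongoDB-style filter and `$inc` update, which with pymongo could be passed to `update_one(filter, update, upsert=True)`.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout and made-up rows; real GGSN/billing exports
# differ per operator and mediation setup.
SAMPLE_CSV = """msisdn,source,rat,bytes
sub-001,ggsn,LTE,1000
sub-001,billing,LTE,1000
sub-001,ggsn,LTE,500
sub-002,ggsn,3G,200
"""


def aggregate(rows):
    """Fold CDR rows into one nested document per subscriber."""
    # docs[msisdn][source][rat] -> total bytes
    docs = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for row in rows:
        docs[row["msisdn"]][row["source"]][row["rat"]] += int(row["bytes"])
    return docs


def to_update(row):
    """Build a MongoDB-style (filter, update) pair for a single CDR."""
    field = f"{row['source']}.{row['rat']}"  # dotted path into the nested doc
    return {"_id": row["msisdn"]}, {"$inc": {field: int(row["bytes"])}}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
docs = aggregate(rows)

print(docs["sub-001"]["ggsn"]["LTE"])  # total GGSN bytes on LTE for sub-001
print(to_update(rows[0]))
```

Incrementing counters per CDR means a subscriber's totals are always up to date, so no large scan or join is needed later to answer "how much did this user consume today?".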
The result is a set of collections (the NoSQL counterpart of tables), one per day. Each collection contains a few million documents, depending on the number of active subscribers, and each document contains nested GGSN and billing CDR data: the SI, the radio access type (RAT), and the amount of data used.
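Once usage from both sources sits in the same document, a basic control is a per-subscriber comparison of GGSN volume against billed volume. The sketch below is an assumption about how such a check might look, not the control implemented in the repository; the document shape, the field names, and the `find_mismatches` helper are hypothetical, and the sample values are made up.

```python
# Hypothetical per-subscriber documents; field names and values are
# illustrative only and do not come from a real network.
documents = [
    {"_id": "sub-001", "ggsn": {"LTE": 1500}, "billing": {"LTE": 1500}},
    {"_id": "sub-002", "ggsn": {"LTE": 800, "3G": 200}, "billing": {"LTE": 800}},
    {"_id": "sub-003", "ggsn": {"LTE": 100}},  # no billing data at all
]


def find_mismatches(docs, tolerance_bytes=0):
    """Return (subscriber, ggsn_bytes, billed_bytes, gap) for each mismatch."""
    findings = []
    for doc in docs:
        ggsn_bytes = sum(doc.get("ggsn", {}).values())
        billed_bytes = sum(doc.get("billing", {}).values())
        gap = ggsn_bytes - billed_bytes
        if abs(gap) > tolerance_bytes:
            findings.append((doc["_id"], ggsn_bytes, billed_bytes, gap))
    return findings


for sub, ggsn_bytes, billed_bytes, gap in find_mismatches(documents):
    print(f"{sub}: ggsn={ggsn_bytes} billed={billed_bytes} gap={gap}")
```

A positive gap means the network carried more traffic than billing recorded, which is a candidate for leakage. In practice a tolerance would likely be needed to absorb rounding and timing differences between the two sources.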
Please find more detail in the repository, https://github.com/abdofrea/ra_data_assurance
Dealing with this amount of data requires a lot of algorithm optimization and maximum resource utilization. Writing Python every day against large volumes of data is good discipline for any software or data engineer, as it encourages them to write clean, efficient, and scalable code regardless of the task at hand. In addition, this script can be part of a larger ETL orchestration system such as Apache Airflow.
In summary, as a Revenue Assurance Engineer there is a lot to be done and an endless TODO list, and no vendor or solution can provide the full package. However, there is always a workaround, and acquiring a solid understanding of Python is like having a Swiss Army knife in your pocket.
Abdulwahed Frea
Revenue Assurance Unit Supervisor
Almadar Aljadid telecom