New reflections on telecom clouds
Thierry Van de Velde
Global Core Networks Presales Leader : showing business opportunities with our new Core Networks
Now that we telecom engineers have built a first generation of on-prem data centers running 3GPP Network Functions (NFs), in other words Telco Clouds, it is time to reflect on the lessons learned and to sketch a path forward that avoids the known pitfalls.
On the plus side, we demonstrated that we could strike the right balance between code running on generic x86 processors and a Linux OS, and offloading telecom-specific tasks to Intel QAT, DPDK, smarter Network Interface Cards, Hardware Security Modules, leaf switches, 100GE user plane appliances and so on.
In on-prem deployment projects we have seen that most effort is spent installing the Infrastructure as a Service (#IaaS) and K8s Container-as-a-Service (#CaaS) platforms, because these tasks reveal deficiencies in the underlying hardware, e.g. packet loss in a NIC driver or insufficient write IOPS to persistent storage. Pre-qualifying various hardware models for IaaS/CaaS is a daunting task: telecom applications will become more demanding over time, and older hardware generations will need to coexist with the very latest. We cannot assume homogeneity in the long run.
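To make the pre-qualification idea concrete, here is a minimal sketch of a qualification gate that checks measured packet loss and write IOPS per hardware model. The thresholds, the model names and the measured figures are all hypothetical placeholders, not values from any real qualification campaign.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical acceptance thresholds, for illustration only.
MAX_PACKET_LOSS_RATIO = 1e-6
MIN_WRITE_IOPS = 20_000


@dataclass
class HostMeasurement:
    """Results of a burn-in run on one hardware model (hypothetical fields)."""
    model: str
    packet_loss_ratio: float  # lost packets / sent packets
    write_iops: int           # sustained write IOPS to persistent storage


def qualification_failures(m: HostMeasurement) -> List[str]:
    """Return the list of criteria that this hardware model fails."""
    failures = []
    if m.packet_loss_ratio > MAX_PACKET_LOSS_RATIO:
        failures.append("packet loss")
    if m.write_iops < MIN_WRITE_IOPS:
        failures.append("write IOPS")
    return failures


# Two invented hardware generations that must coexist in the same cloud.
old_gen = HostMeasurement("gen1-server", packet_loss_ratio=5e-6, write_iops=8_000)
new_gen = HostMeasurement("gen3-server", packet_loss_ratio=0.0, write_iops=90_000)

for host in (old_gen, new_gen):
    print(host.model, qualification_failures(host) or "qualified")
```

Running the same gate against every hardware generation is one way to keep track of which models remain acceptable as the NFs grow more demanding.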
Climbing up the stack, another troubling choice is whether to
We learned that Helm charts can be either very complex or fairly simple, based on whether those values (variables making each NF instance unique) are
Finally, when the so-called "day 0 / day 1" configuration is not clearly delimited by Low Level Design (LLD) documents, the line is blurry between the
LLDs are a great practice but tend to get out of sync with reality when teams work late hours to set up a (pre-)production network on time.
Before any NF Lifecycle Management (LCM) can occur, the K8s Cluster must be prepared with NF-specific host tunings that optimize the performance and the security of the NF. When they are not coordinated between the NF product houses (SW development teams), and for third-party NFs, those host tunings tend to rip our beautiful uniform IaaS apart into IaaS Host Aggregates (K8s Clusters, aka "fragments") that are each suitable for only 1 type of NF. This de facto re-introduces the "hardware dependency" that we were combating in the first place...
If we succeed in preserving this uniformity, we preserve the K8s Scheduler's ability to place pods anywhere in the data center, based on observed load metrics. This will be extremely useful for rack expansions, energy savings at night and optimized resource utilization. Overlay networks (for IPVLAN and SR-IOV VLANs, BGP, BFD, etc.) will have to be either pre-configured or dynamically reconfigured as the pod moves, a process we have come to call Adaptive Cloud Networking (#ACN). This is an essential stepping stone towards CI/CD and FinOps.
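The fragmentation effect can be illustrated with a small sketch: group the NFs by their required host-tuning profile, and every distinct profile becomes one fragment. The NF names and tuning parameters below are hypothetical and chosen only to show the mechanism.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical NFs and the host tunings each one demands (illustration only).
NF_HOST_TUNINGS = {
    "nf-a": {"hugepages": "1G", "cpu_isolation": True},
    "nf-b": {"hugepages": "1G", "cpu_isolation": True},
    "nf-c": {"hugepages": "2M", "cpu_isolation": False},
}


def group_by_tuning(nf_tunings: Dict[str, dict]) -> Dict[Tuple, List[str]]:
    """Group NFs that share an identical host-tuning profile.

    Each resulting group corresponds to one host aggregate ("fragment")
    that can only run the NFs listed in it.
    """
    groups = defaultdict(list)
    for nf, tuning in nf_tunings.items():
        profile = tuple(sorted(tuning.items()))  # hashable, order-independent
        groups[profile].append(nf)
    return dict(groups)


fragments = group_by_tuning(NF_HOST_TUNINGS)
print(f"{len(fragments)} fragment(s) needed for {len(NF_HOST_TUNINGS)} NFs")
for profile, nfs in fragments.items():
    print(dict(profile), "->", nfs)
```

With coordinated tunings all NFs fall into a single group and the IaaS stays uniform; every uncoordinated deviation adds another fragment.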
Under the impact of increasing ordering delays for hardware, the on-prem telco cloud model is about to evolve in 3 directions, i.e. 3 Deployment Models:
Only the latter 2 models are compatible with traditional CAPEX-mode software pricing, where the Customer purchases a perpetual (or at least annual) SW license, plus an annual subscription to updates/upgrades.
All 3 models are container-on-bare-metal deployments (aka "Cloud Native B", CN-B), which are under scrutiny from national Regulators, who are rightfully concerned about security, i.e. container breakouts from privileged pods (which we combat with dockremap) or memory exhaustion attacks (against which we protect with Linux cgroup v2 memory limits). We have new global labs to demonstrate, negotiate and convince the Regulators, and last month we were selected to lead an EU Horizon project in this space.
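As an illustration of the memory-limit protection, the sketch below audits a cgroup v2 hierarchy for groups that have no memory limit. It relies on the cgroup v2 convention that each non-root group exposes a `memory.max` file containing either a byte count or the literal string `max` (no limit). The default root path `/sys/fs/cgroup` is the usual mount point but is an assumption about the host, and the helper names are hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def parse_memory_max(raw: str) -> Optional[int]:
    """Parse the content of a cgroup v2 memory.max file.

    Returns the limit in bytes, or None when the group is unlimited ("max").
    """
    value = raw.strip()
    if value == "max":
        return None
    return int(value)


def unlimited_cgroups(root: Path = Path("/sys/fs/cgroup")) -> List[Path]:
    """List cgroup directories under root whose memory.max is unlimited."""
    unlimited = []
    for limit_file in root.rglob("memory.max"):
        try:
            if parse_memory_max(limit_file.read_text()) is None:
                unlimited.append(limit_file.parent)
        except (OSError, ValueError):
            # Group vanished, is unreadable, or has unexpected content: skip it.
            continue
    return unlimited


if __name__ == "__main__":
    for cgroup in unlimited_cgroups():
        print("no memory limit:", cgroup)
```

On a K8s worker node, any pod cgroup reported by such an audit is a candidate for a memory exhaustion attack and deserves an explicit limit.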
None of the 3 models relieves the NF vendor from having to pre-qualify the dependencies, performance and security of the PaaS/CaaS/HW combination, although in the third model it happens "just once" and economies of scale result from it.
None of the 3 models relieves the Operator (CSP, Enterprise) from managing the solution in terms of Fault, Performance, (day 2) Configuration, Subscriber provisioning, KPI reporting, etc.
Public Clouds (AWS Regions, Local Zones, GDC Hosted) could one day compete with these on-prem models, although we believe they will initially be complementary. The essential difference is that the pricing of compute/storage becomes linear with network traffic. We heard interest from our CSPs and nationwide Enterprises (Transport, Utilities, Public Safety) in trying these out by building a Disaster Recovery site, as pricing appears to be prohibitive for 365-day operation.
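The Disaster Recovery argument follows from simple arithmetic, sketched below. All figures (on-prem annual cost, daily traffic, price per TB) are invented for illustration and do not reflect any provider's actual pricing; the only assumption taken from the text is that the public cloud cost grows linearly with traffic.

```python
def cloud_cost(days_active: int, traffic_tb_per_day: float, price_per_tb: float) -> float:
    """Public cloud cost under the assumption that it is linear in traffic."""
    return days_active * traffic_tb_per_day * price_per_tb


def break_even_days(onprem_annual_cost: float, traffic_tb_per_day: float,
                    price_per_tb: float) -> float:
    """Days of operation per year at which cloud cost equals on-prem cost."""
    return onprem_annual_cost / (traffic_tb_per_day * price_per_tb)


# Hypothetical inputs, for illustration only.
ONPREM_ANNUAL_COST = 1_000_000   # fixed cost per year, any currency
TRAFFIC_TB_PER_DAY = 100
PRICE_PER_TB = 50

dr_site = cloud_cost(10, TRAFFIC_TB_PER_DAY, PRICE_PER_TB)      # active 10 days/year
full_year = cloud_cost(365, TRAFFIC_TB_PER_DAY, PRICE_PER_TB)   # active all year
limit = break_even_days(ONPREM_ANNUAL_COST, TRAFFIC_TB_PER_DAY, PRICE_PER_TB)

print(f"DR site (10 days): {dr_site:,.0f}")
print(f"365-day operation: {full_year:,.0f}")
print(f"Break-even at {limit:.0f} days of operation per year")
```

With these made-up inputs, a DR site that carries traffic only a few days per year costs a small fraction of the on-prem budget, while year-round operation exceeds it.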
24x7 Managed Services are available on top of any business model, where the NF vendor
We could agree to call this model #NFaaS, Network-Function-as-a-Service. It will be quite essential to avoid the looming confusion with SaaS, on-prem deployment models or public cloud. NFaaS is compatible with (orthogonal to) any deployment model and any NF vendor.
Finally, the NF vendor may add subscriber provisioning, charging, peering, SIP trunking, interconnection, roaming, SIM supply and other services, although then we would probably have to call it #MNE / #MVNE (Mobile Network Enabler / Mobile Virtual Network Enabler) rather than NFaaS or SaaS.
Voilà, I hope you found this overview of models & challenges useful. In this complex world we will all need to agree on terminology, nothing's cast in stone yet, so feel free to agree/disagree/expand below.
Soon we are launching a moderated discussion channel on these topics, with polls to figure out market demand for various models, to discuss the security essentials; so stay tuned, we will soon be able to interact much more on this.
Unlocking Potential Through Technology, Innovation, and Creative Collaboration
2 years ago
As Danielle Royston likes to point out, the public cloud deployment model isn’t about porting traditional on-prem VNFs: if you do that, you hit the economic boundaries, as you state. Instead, consider fully public-cloud-native functions built as serverless entities that get triggered only when needed, using native database/storage features, etc. It requires breaking the software down into more fine-grained entities than VMs or containers. If you do it like that, the economics can make sense.
5G Lead Solution Architect. “The future was made by those would could take a leap of faith.”
2 years ago
Thanks Thierry Van de Velde for these very interesting notes. Indeed, the CN-B model is the one that every CSP should focus on, despite all the security challenges. Regarding NFaaS, should the NF vendor also focus on preconfiguring the overlay networks and be flexible enough to use MPLS inter-AS options A and B? Thank you, Thierry Van de Velde.