K8s + Telco: some thoughts
To support the requirements of service providers/telcos within the K8s community for existing (VM-based/OpenStack) network services, K8s as a platform has to evolve so that it can life-cycle both its core components and vendor-supplied components seamlessly.
Let's list how these gaps/requirements are being addressed by various open-source communities: the CNTT and telecom working groups in K8s, openCore/OpenRAN, OpenNESS, and others.
** Much of this understanding is derived from the work of many network/IaaS forums and leaders. This is my attempt to summarise it, so credit goes to my NFV/SDN/IaaS ecosystem.
A critical gap the K8s platform is working on is ensuring that the standard interfaces for SDN controllers are in line with telco network requirements. This would allow third-party SDN controllers to interact with the platform and offer competitive or differentiated solutions. The present interfaces seem insufficient for certain complex end-to-end use cases with which SDN vendors could showcase their technical prowess. If this is addressed, we are well positioned to manage the infrastructure along with the SDN blocks on a consistent basis. Consider services that represent the topology of the entire network in real time: making them cloud native means breaking them down into smaller services without letting latency creep in, which would be a challenge. Another example is the services that forward topology/state change events so that the forwarding state of the network stays optimal; these need more attention if being less intrusive is a design goal. It is generally believed that the network functions that build the control plane are getting the required traction nowadays, but they have a long way to go to be fully cloud native.
There are greater challenges in modelling the K8s platform to host the data plane requirements of a cloud native service. Standard interfaces to hardware-offload-capable network accelerators are needed now, so that other control path services can interact with them. Other services such as security and load balancing would need the events/counters/state of various flows/connections from these fast data path services to take informed actions, which in turn affect SLA adherence.
Another gap is load balancing. K8s's present LBaaS targets enterprise IT and is standardised on HTTP. Its resource arbitration, which maps incoming HTTP requests to available services, works well. On the telco/service provider side there is a need to support a variety of protocols and services other than HTTP. In this space the modelling of services and service arbitration is largely governed by quality of service and guaranteed resource allocation. The point is that balancing incoming traffic moves from a best-effort model (enterprise IT stack) to the deterministic delivery model that many telcos have to guarantee to their end users. Dynamic provisioning of load balancers based on incoming traffic models, coupled with perimeter/endpoint security integration, is another complex requirement to handle when converting these services to cloud native.
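To make the contrast between best-effort and deterministic balancing concrete, here is a minimal sketch of admission-based placement: a flow is only placed on a backend that can reserve the bandwidth it needs, and is rejected otherwise. The names Backend and admit, and the idea of expressing the guarantee purely in Mbps, are assumptions made for illustration and not part of any K8s API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    """Hypothetical backend with a fixed bandwidth budget in Mbps."""
    name: str
    capacity_mbps: int
    reserved_mbps: int = 0

    @property
    def free_mbps(self) -> int:
        return self.capacity_mbps - self.reserved_mbps


def admit(backends: List[Backend], required_mbps: int) -> Optional[str]:
    """Reserve bandwidth for a new flow, or refuse it.

    A best-effort balancer would always pick some backend. Here a flow is
    only admitted when a backend can guarantee the requested bandwidth.
    """
    candidates = [b for b in backends if b.free_mbps >= required_mbps]
    if not candidates:
        return None  # no backend can honour the guarantee, so reject
    # Pick the backend with the most headroom to spread reservations.
    chosen = max(candidates, key=lambda b: b.free_mbps)
    chosen.reserved_mbps += required_mbps
    return chosen.name
```

A real implementation would also have to account for protocol (for example SCTP or GTP rather than HTTP), latency budgets and release of reservations when flows end; the sketch only shows the reserve-or-reject step.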
The operational/business provisioning stacks (OSS/BSS) have to be revisited so that they stay relevant in a cloud native context.
One such (re)design challenge is managing the state of these network services, that is, making them stateless in line with cloud native design principles. In present OSS/BSS systems most of the state is derived from static provisioning of resources. This has to move to a more dynamic provisioning model, or to intent-based resource arbitration and lifecycle management.
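The move from static provisioning to intent-based lifecycle can be sketched as a reconcile step: the operator declares the desired state, and the system derives actions by comparing it with what is observed. The Intent fields and the action strings below are hypothetical and chosen only for illustration; they do not come from any OSS/BSS product.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """Hypothetical declared state for one network service."""
    service: str
    replicas: int
    bandwidth_mbps: int


def reconcile(desired: dict[str, Intent], observed: dict[str, Intent]) -> list[str]:
    """Return the actions needed to move the observed state to the desired state."""
    actions: list[str] = []
    for name, want in sorted(desired.items()):
        have = observed.get(name)
        if have is None:
            actions.append(f"create {name}")
        elif have != want:
            actions.append(f"update {name}")
    # Anything running that nobody asked for is removed.
    for name in sorted(observed.keys() - desired.keys()):
        actions.append(f"delete {name}")
    return actions
```

Because the function keeps no state of its own between calls, it can be run repeatedly against fresh observations, which is the property the stateless design goal is after.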
As these two workloads (VNF/CNF) will coexist on a cloud IaaS distribution of K8s, it is important to keep in mind how people (netops engineers) and process (CI/CD) are aligned to debug misconfigurations on the ToR, find incorrect EPAs that result in network anomalies, troubleshoot packet drops, validate control path settings post deployment, handle day-2 service telemetry events, and so on. The steps network operators need to follow should feel the same everywhere, with ease of use and ease of deployment as prime goals. The various observability/logging/alarm tools should be seamless regardless of how these workloads are hosted, and should offer a single dashboard to report and act.
VNFs will coexist with container platforms during their cloud native evolution. This is a clear requirement, both to justify the NFV investments made so far and to transition to a cloud native model. End-to-end CI pipelines that can capture and report latency, packets per second and jitter in a foolproof manner are a critical requirement in this evolution. With a reference for performance benchmarking, new features or an evolution to CNF can be introduced while keeping track of how the performance metrics hold up. An RFC 2544 test that best reflects the packet-processing capability of the infrastructure, characterising the load it can take before packet drops are seen, is therefore a critical component that needs to be standardised across IaaS, SDN, hardware accelerators and service providers during this evolution.
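The RFC 2544 throughput idea, finding the highest offered load at which no packets are dropped, is commonly implemented as a binary search over the offered rate. The sketch below assumes a hypothetical run_trial callback that offers traffic at a given rate and returns the number of dropped frames; fake_device is a stand-in for a real traffic generator and device under test, and the resolution value is an arbitrary choice.

```python
import math
from typing import Callable


def find_zero_loss_rate(
    run_trial: Callable[[float], int],
    line_rate_pps: float,
    resolution_pps: float = 1000.0,
) -> float:
    """Binary-search the highest offered rate (pps) with zero dropped frames."""
    if run_trial(line_rate_pps) == 0:
        return line_rate_pps  # device keeps up at full line rate
    low, high = 0.0, line_rate_pps  # low: known good, high: known lossy
    while high - low > resolution_pps:
        mid = (low + high) / 2.0
        if run_trial(mid) == 0:
            low = mid   # no loss, try a higher rate
        else:
            high = mid  # loss seen, back off
    return low


def fake_device(capacity_pps: float) -> Callable[[float], int]:
    """Hypothetical device that drops everything above its capacity."""
    def run_trial(rate_pps: float) -> int:
        return max(0, math.ceil(rate_pps - capacity_pps))
    return run_trial
```

In a CI pipeline the same search could be run per frame size on every build, with the resulting rate stored as the reference to compare against when a VNF is reworked into a CNF.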
4 years ago: Nice one VJ. How do you see K8s-managed containers / CNFs addressing the latency overhead? Telco networks, as you well know, have very little tolerance for network and compute latency. Compute resources are also going to be a challenge to overcome at the edge.