Realising Network Cloudification

1. Introduction

In this article, I address how a Network Operator can realise the network cloud across all of its domains. In today's technology landscape, cloudification is essential, and this paper shares opinions and thoughts on the various service chains required to achieve the target architecture.



2. Need for a Telecom Network Cloud

The Telecom landscape is ever-changing, especially with the advent of NFV and the need to roll out 5G services. As a result, operators are migrating their services to NFV Data Centers or launching their new Services out of these NFVi Data Centers.


Today, where scaling, real-time response, and demand for a variety of heavy data throughput (from voice, to video, to IoT devices) are anticipated, extending traditional networks is no longer enough. Network cloudification removes the need for specialised hardware and lets software deployment be independent of the hardware used, as long as the commercial-off-the-shelf hardware provides sufficient compute power. Network cloudification is the next step in network evolution and is manifest in specifications for software-defined networking (SDN) and 5G mobile edge or multi-access edge computing.

When the Network runs on the cloud, operators can be innovative and offer Networking-as-a-Service with much lower CAPEX and OPEX. In addition, with much greater flexibility and profitability, these "programmable networks" are simpler to design and easier to administer and manage.


3. Network Cloudification – Reference Architecture

The fundamental rationale behind the NFV (Network Function Virtualisation) Telecom Cloud is to host multiple Services from different vendors on a COTS platform. The NFV solution helps CSPs move from traditional proprietary hardware for individual Network Elements to a virtualised stack that can host multiple Virtual Network Functions and provide the Service. The CSP thus reduces its Opex and Capex drastically.

The picture below depicts ETSI's reference architecture to virtualise the Telecom Network Functions onto a COTS platform.

Figure 1: ETSI NFV Architecture Framework

To realise this solution, it is essential to understand the various facets of Network Cloudification.

1. NFV Infrastructure

This COTS infrastructure will have the necessary Compute, Storage and Network resources to host the VNFs and essentially deliver Telco Services like Voice, Data etc.

2. NFV Orchestrator

This function is responsible for the Management and Orchestration of the Network Functions and the Services they offer together. It manages the lifecycle of these functions and abstracts them to the layer above as multiple Resource functions or Network Service functions.

3. Virtual Network Functions and Network Services

The virtualised Network Functions, like the SGSN, PGW, MME etc., together with the compute, storage and network resources required to host them and connect them with other Network Functions to create a Network Service.

4. Integration to OSS, BSS, existing PNFs

The existing OSS, BSS and PNFs must be integrated with the NFVi and its VNFs and Network Services. It is therefore essential to understand how the integration will be done – SNMP, REST API, NETCONF, CLI etc.

5. Security and Reliability

It is essential to ensure the VNFs and Services comply with the security standards of the local government and the Operator. Moreover, the VNFs and the Network Services they offer should conform to Carrier-grade reliability standards like their PNF counterparts.

6. Managing the Network Cloud

The Network Cloud must be managed at various levels – monitor the underlying physical Infrastructure, NFVi, VNFs and Network Services, and manage the lifecycle of the VNFs and Network Services.

Apart from briefly touching on the above points, this document will cover Governance, Scope, Project planning and Migration for Network Services like Evolved Packet Core, Voice over LTE, SD-WAN etc.

4. Enabling Network Cloud Roll-out

A good governance structure and planning are essential to rolling out the Network cloud effectively. Below is a delivery model outlining what needs to be done.


Figure 2: NFV Delivery model


5. Project Governance – Build, Deploy and Rollout

The Network Service that has been designed has to be rolled out and integrated into the wider Operator ecosystem, e.g., OSS/BSS, the existing mobility building blocks etc. The integration must happen at various levels.

1. Core Network – Depending on the Network Service type, it must be integrated into the existing Core Network elements. For example, when rolling out a vEPC Network Service, we must integrate with (but not limited to) the Radio, IMS, MME, SBC and other inter-networking Network Elements.

2. OSS systems – The new Network Service and the NFV infrastructure must be connected to the OSS Assurance systems, Inventory and other key OSS elements.

3. BSS systems – The Network Services created by the NFV Orchestrator must be reflected in the BSS Catalog to be available for fulfilment.

4. IT systems – Connect the new Network Service to IT systems like Mediation for Billing, the Data Warehouse, Legal (Lawful Intercept) and any other systems.

The Project Governance team should track all aspects discussed and ensure they are followed. High-level activities of the E2E project are:

Figure: High-level activities of the E2E project

The right resources with the appropriate skill sets should be onboarded onto the team to achieve this. Discussing the skill set requirements is beyond the scope of this document, as it varies with the type of VNFs and Network Services to be deployed.


6. Designing the NFV Infrastructure

The Virtualised Network Functions to be hosted on the NFV Infrastructure dictate how it must be designed to match their physical counterparts. For example, Red Hat OpenStack and vanilla OpenStack use the KVM Hypervisor. KVM hosts use Non-Uniform Memory Access (NUMA), where the system's physical memory is divided into zones that are allocated to particular CPUs or sockets. As a result, access to memory that is local to a CPU is faster than access to memory connected to a remote CPU.

Physical NICs are placed in PCI slots on the compute hardware. These slots connect to specific CPU sockets associated with a particular NUMA node. For optimum performance, connect your data path NICs (SR-IOV or OVS-DPDK) to the same NUMA nodes as the CPUs in your configuration.

The performance impact of NUMA is significant, generally starting at a 10% performance hit or higher. Each CPU socket can have multiple CPU cores treated as individual CPUs for virtualisation purposes.

Figure 3: NUMA placement of NFVi

CPU pinning ties a specific virtual machine's virtual CPU to one particular physical CPU on the host. vCPU pinning provides similar advantages to task pinning on bare-metal systems. Since virtual machines run as userspace tasks on the host operating system, pinning increases cache efficiency.

In OpenStack, SMP CPUs are known as cores, NUMA cells or nodes as sockets, and SMT CPUs as threads. For example, a quad-socket, eight-core system with Hyper-Threading has four sockets, eight cores per socket and two threads per core, for a total of 64 CPUs.
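As a quick check of that terminology, the total CPU count is simply the product of sockets, cores and threads. A minimal illustration (the helper name is my own):

```python
def total_vcpus(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Total schedulable CPUs as the hypervisor counts them:
    sockets * cores per socket * threads per core."""
    return sockets * cores_per_socket * threads_per_core

# The quad-socket, eight-core, Hyper-Threaded host from the text above:
host_cpus = total_vcpus(4, 8, 2)  # -> 64
```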

Hyper-V is configured by default to allow instances to span multiple NUMA nodes, even when an instance has been configured to span only N NUMA nodes. This behaviour enables Hyper-V instances to have up to 64 vCPUs and 1 TB of memory.

The number of processor cores and threads impacts the number of worker threads that can run on a resource node. Therefore, design decisions must relate directly to the Service it hosts and provide a balanced infrastructure for all services.

Another option is to assess the average workloads and increase the number of instances running within the compute environment by adjusting the overcommit ratio. This ratio is configurable for CPU and memory. The default CPU overcommit ratio is 16:1, and the default memory overcommit ratio is 1.5:1.
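The overcommit arithmetic above can be sketched as follows. The function and the host sizes are illustrative, but the 16:1 CPU and 1.5:1 memory ratios are the OpenStack defaults quoted above:

```python
def overcommitted_capacity(physical_cpus: int, physical_mem_gb: int,
                           cpu_ratio: float = 16.0, mem_ratio: float = 1.5):
    """Virtual capacity exposed to the scheduler under the given
    overcommit ratios (OpenStack defaults: CPU 16:1, memory 1.5:1)."""
    return physical_cpus * cpu_ratio, physical_mem_gb * mem_ratio

# A hypothetical 64-core, 256 GB compute node:
vcpus, vmem_gb = overcommitted_capacity(64, 256)  # -> 1024.0 vCPUs, 384.0 GB
```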

While running high-performance workloads, the vCPUs executing processes should be on the same NUMA node, ensuring all memory accesses are local to that node and do not use the limited cross-node memory bandwidth, which adds latency. Huge pages are allocated from memory and deliver better performance than the same allocation made with standard pages.

In OpenStack, Flavors define the compute, memory and storage capacity of Nova compute instances. In simple words, a Flavor is an available hardware configuration for a server; it defines the size of a virtual server that can be launched.
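To illustrate how a Flavor can encode the NUMA, pinning and huge-page choices discussed in this section, here is a hypothetical flavor definition. The extra-spec keys (`hw:cpu_policy`, `hw:mem_page_size`, `hw:numa_nodes`) are standard Nova extra specs; the flavor name and sizes are invented for the example:

```python
# Hypothetical flavor for a DPDK-style VNF. Only the extra-spec key names
# are standard Nova keys; everything else is illustrative.
vnf_flavor = {
    "name": "vnf.dpdk.large",          # hypothetical name
    "vcpus": 8,
    "ram_mb": 16384,
    "disk_gb": 40,
    "extra_specs": {
        "hw:cpu_policy": "dedicated",  # CPU pinning, as discussed above
        "hw:mem_page_size": "large",   # back guest RAM with huge pages
        "hw:numa_nodes": "1",          # keep vCPUs and memory on one NUMA node
    },
}
```

In a real deployment these values would be set with `openstack flavor create` and `openstack flavor set`, matched to the xNF vendor's sizing guide.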

7. NFV Orchestrator

The NFV Management and Orchestration (MANO) function manages the lifecycle of the Virtual Network Functions (xNFs), the Network Services they offer and the resources required to operate them. The Management and Orchestration layer should therefore abstract Resources and Services at the different levels of the Network Architecture:

1. Abstraction at the Services level

2. Breaking down the Services into multiple xNFs

3. xNFs installed on the different physical elements

The BSS and OSS should have only Service-level views and not worry about resource-level or vendor-specific commands. The Service Orchestrator must convert and break down the Customer request into multiple Network Services with configuration data, and send the appropriate instructions depending on the Network Element vendor. MANO should be vendor-agnostic to provide E2E multi-vendor, multi-domain Service orchestration.

Figure 4: Multiple layers of Abstraction in the NFV Solution
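The layered abstraction above can be sketched as a toy orchestrator that maps a service-level request to per-vendor instructions, so the OSS/BSS never see vendor commands. The vendor names, adapters and command strings here are purely hypothetical:

```python
# Hypothetical vendor adapters: each turns (network function, config) into a
# vendor-specific instruction (e.g. NETCONF for one vendor, REST for another).
VENDOR_ADAPTERS = {
    "vendorA": lambda nf, cfg: f"netconf edit-config {nf} {cfg}",
    "vendorB": lambda nf, cfg: f"rest POST /{nf}/config {cfg}",
}

def orchestrate(service_request: dict) -> list:
    """Break a service-level request into per-xNF, per-vendor instructions."""
    return [VENDOR_ADAPTERS[nf["vendor"]](nf["name"], nf["config"])
            for nf in service_request["xnfs"]]

# A service-level request for a (hypothetical) two-xNF vEPC slice:
cmds = orchestrate({"service": "vEPC", "xnfs": [
    {"name": "mme", "vendor": "vendorA", "config": "s1-setup"},
    {"name": "pgw", "vendor": "vendorB", "config": "apn-default"},
]})
```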

The MANO is responsible for Network Service lifecycle management, including the following operations:

1. On-boarding and management of Network Service Descriptors

2. Instantiate Network Service

3. Scale Network Service

4. Update Network Service

5. Terminate Network Service

From the MANO's perspective, the following must be done to create the NFV stack:

1. Design and test Network Service Templates

2. Prepare xNF Packages with appropriate flavours, metadata, affinity / anti-affinity and High-Availability settings

3. Develop Resource Managers to interact with xNF Managers, OpenStack / OpenShift and any other Hypervisor
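The lifecycle operations listed above can be sketched as a simple state model. The class, state names and transitions are illustrative only, not taken from the ETSI specification:

```python
class NetworkService:
    """Toy model of MANO lifecycle management for one Network Service."""

    def __init__(self, descriptor: dict):
        self.descriptor = descriptor   # the on-boarded Network Service Descriptor
        self.state = "ONBOARDED"
        self.instances = 0

    def instantiate(self):
        self.state, self.instances = "INSTANTIATED", 1

    def scale(self, delta: int):
        assert self.state == "INSTANTIATED", "can only scale a running service"
        self.instances = max(1, self.instances + delta)

    def terminate(self):
        self.state, self.instances = "TERMINATED", 0
```

A real MANO would, of course, drive VIM and xNF Manager calls behind each transition; the sketch only captures the operation sequence.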

8. Designing the Network Service (Build)

Before NFV, a Network Service was created by configuring servers from different xNF vendors and then linking them together. This complexity has been removed today: the Network Service is created in one of the popular formats like JSON, TOSCA or YAML, whichever the MANO / Orchestrator prefers. What is being created today is essentially Infrastructure as Code, with Service as Code layered on top. In simple steps:

1. The underlying Compute, Storage and Networks are created (NFVi) – Infrastructure as Code.

2. The xNF software for the different Network Elements is loaded on top of the Compute – Network Service as Code.

3. The xNF configuration is applied to the loaded xNFs – Network Service as Code.

4. The IP addresses are configured onto the xNFs' VLANs and VRFs as per design – Network Service as Code.

5. The initiated VNFs are now connected using the VLANs / VRFs and any applicable scripts, creating the Service chain necessary to fulfil the Network Service.
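The five steps above amount to expressing the Network Service as code. A toy JSON descriptor chaining two hypothetical VNFs; the field names are illustrative, not a formal TOSCA/ETSI schema:

```python
import json

# Illustrative "Network Service as Code" descriptor: two VNFs, each placed on
# a flavor and a VLAN, chained in order to form the service.
nsd = {
    "name": "vEPC-demo",                       # hypothetical service name
    "vnfs": [
        {"id": "vMME", "flavor": "vnf.medium", "vlan": 101},
        {"id": "vSGW", "flavor": "vnf.large",  "vlan": 102},
    ],
    "service_chain": ["vMME", "vSGW"],         # step 5: the chaining order
}

# The orchestrator would consume the serialised template:
template = json.dumps(nsd)
```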

Figure 5: NFV Stack orchestrated using the Service / Resource Orchestrator

The following tasks have to be completed to build the Network Service:

1. Network Service Functional Design – Create the Network Service Templates with the appropriate xNFs encoded with the needed configuration per xNF vendor and Operator, duly considering Affinity / Anti-affinity and HA.

2. Service Networking and IP Planning – Create the necessary Networks, i.e. Virtual Router Functions / VLANs, required to offer the Service. Identify the number of IP addresses needed by the xNFs, whether IPv4 or IPv6, Management ports etc.

3. Service Configuration – Prepare the configuration scripts that have to be included as part of the Network Service Template (NST) to form the Service chain.

4. Service Policy Configuration – Set the policies that must be enforced and the KPIs to be met for the Services by the underlying Assurance systems.

5. Closed-Loop Policy Creation – Define the closed-loop policies and ensure they are passed down to the underlying Assurance systems as part of the Orchestration, or via Integration if the Assurance system does not support Zero-touch Service Management.
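Task 2 above (Service Networking and IP Planning) can be partly automated. A small sketch using Python's standard `ipaddress` module to carve per-VLAN subnets out of one management block; the CIDR and VLAN numbers are hypothetical:

```python
import ipaddress

def plan_subnets(cidr: str, vlans: list, new_prefix: int = 27) -> dict:
    """Assign one /new_prefix subnet from `cidr` to each VLAN, in order."""
    block = ipaddress.ip_network(cidr)
    subnets = block.subnets(new_prefix=new_prefix)
    return {vlan: str(next(subnets)) for vlan in vlans}

# Hypothetical management block and service VLANs:
plan = plan_subnets("10.10.0.0/24", [101, 102, 103])
# -> {101: '10.10.0.0/27', 102: '10.10.0.32/27', 103: '10.10.0.64/27'}
```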


9. Deploy the NFV Stack (Deploy)

9.1 Installation of the NFV Cloud

The foundation of the NFV Cloud is building the under-cloud on top of the chosen COTS hardware, usually Red Hat OpenStack / OpenShift on top of RHEL. Then the Storage, Compute and Networks are created. Finally, security hardening must be done and the Overcloud deployed.

Once the NFV infrastructure is hardened and tested, it is primed for deploying the xNFs and CNFs.

Figure 6: Creation of the Under-cloud by the Orchestrator

9.2 Installation of the xNFs (Network Elements) and Network Services

Onboard the xNFs and create the Network Service

The overarching Service or Resource Orchestrator is used to onboard the Virtual Network Functions or Containerised Network Functions. The templates (Infrastructure as Code) designed for the xNFs and Network Services are used to onboard the xNFs, configure them and service-chain the multiple xNFs to create the Network Service.

This is typically done by following an MoP document prepared as part of the Low-Level Design that outlines the procedures to create the Network Service.

Figure 7: Orchestrator onboarding the xNFs


Integrate the Network Service

After creating the Network Service, it is better to follow a phased process of integrating it into the existing Network. Typically, when rolling out vEPC in an existing network, the steps below are executed:


1. Test the Service within the NFV infrastructure and verify the results.

2. Connect one Network Element (external to the new NFVi) at a time. E.g., in the case of vEPC, connect to an existing MME and PCRF, and use the existing HSS to ensure connectivity to the new Network Service works.

3. After establishing that the new Service works with a few Network Elements of the existing Network, start rolling out the new Network Service entirely.

4. Once the new Network Service is working, the E2E testing plan is executed.

5. The new Network Service is integrated with the OSS, BSS and other IT systems.

Figure 8: Integrating the Network Service / NFV stack to existing systems


10. Managing the NFV deployment (Manage)

10.1 Managing the NFV Cloud locally

The newly deployed NFV can be managed through the local Element Management Systems, Openstack's Horizon and the NFV Orchestrator.

Though this is how it is managed until the final integration with the overarching OSS happens, that is not the goal we set out for. Instead, we want Zero-touch Service Management (ZSM) using an integrated Orchestration - Assurance - Orchestration flow.

10.2 Managing the NFV Cloud

Ideally, after the overarching Assurance system is integrated with the newly deployed NFV system, managing the System should be pretty simple as the 'Zero-touch Service Management' should have kicked in.

Managing the NFV Cloud is at different levels:

1. NFV Infrastructure – Monitor the Virtual Machine Infrastructure using OpenStack's Ceilometer or the interface of the Operator's VIM.

2. Container Infrastructure – The popular mechanism to host Containerised Network Functions (CNFs) is the Kubernetes platform (e.g., Red Hat OpenShift).

3. xNF (VNF / CNF / PNF) software – The Element Manager of each xNF should send Events / Faults through the agreed interfaces to the Assurance system. The events should be in line with the policies defined at the time of Orchestration and Closed-Loop Assurance.

4. Automated Lifecycle Management – 'The Holy Grail' of NFV is to manage the lifecycle of the NFV Infrastructure, the xNFs and the Network Services without manual intervention. For this to work effectively, the points below are crucial:

i. Define scaling policies and KPIs for the xNFs

ii. Define policies and KPIs for the Network Service

iii. Define policies and KPIs for the whole NFV Infrastructure

5. Integrate with a Ticketing System / ChatOps – If possible, integrate the Operator's Assurance system with the Ticketing System / ChatOps to provide mature ZSM.
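A closed-loop policy of the kind described in points 4 and 5 can be sketched as a simple KPI-threshold check that the Assurance system would run before invoking the Orchestrator. The KPI name and thresholds are illustrative only:

```python
def closed_loop_action(kpis: dict, scale_out_at: float = 0.8,
                       scale_in_at: float = 0.3) -> str:
    """Compare a reported KPI against policy thresholds and decide which
    lifecycle action (if any) the Orchestrator should execute."""
    load = kpis["cpu_util"]          # hypothetical KPI reported by Assurance
    if load > scale_out_at:
        return "SCALE_OUT"
    if load < scale_in_at:
        return "SCALE_IN"
    return "NO_ACTION"
```

In a mature ZSM flow, the returned action would feed straight back into the MANO scale operations described in section 7, closing the loop without manual intervention.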


11. Conclusion – fully automated Telco Cloud

The objective of Network Cloudification is to build a fully automated Telco cloud. Using the guidelines and principles outlined in this document, the ETSI / TM Forum specifications and the best practices of the Service Integrator, the Telco cloud can be built from the ground up.

As we have seen, the essentials to consider while building the Telco cloud are:

• Governance – a rock-solid, structured way of managing the various activities required to build the Telco Cloud. The governance team is crucial to managing the different vendors, integration partners and various Operator teams. In addition, project plans should exist for every phase of the project under the umbrella of the whole Program.
• Design – encompassing the underlying Infrastructure and the Network Services it shall host now and in the future, keeping in mind how it will integrate without affecting BAU.
• Build – plans for rolling out the Network Service, testing every possible scenario.
• Integration – integrating with the existing Network, OSS, BSS and other IT systems while causing minimum disruption to BAU.
• Automation policy – design, build and commission the effective Automation policies required to serve a fully automated dynamic service.
• Network Management – the methodologies and tools required to manage the fully automated Network Service.

Figure 9: Building Blocks of the Telco Cloud


14. Review and Contribution

I would like to thank the below IBM colleagues for their valuable review and support:

Manish Pathak

Tapas Kar

Ramaswamy Ganapathy
