Telco's Approach to Distributed Cloud
Unlike several other industry sectors embracing the Cloud Computing paradigm, a telecom company faces a somewhat different set of purposes, choices, risks, challenges, and operating models when adopting Cloud for digital transformation. A telecommunication network operator or service provider typically runs several Cloud infrastructure instances within the organisation to serve internal use or external requirements, but Cloud itself is perceived as an enabler for Cost Savings and new Revenue Streams.
From a Telco perspective, the organisation hosts different Cloud instances for its own internal usage:
- IT Cloud for hosting enterprise applications (e.g. BSS, OSS) is most often built using the VMware virtualization stack over a commodity hardware platform and integrated with public Cloud Service Providers (e.g. AWS, Google, Azure, Bluemix, Digital Ocean, Oracle, Salesforce, Workday, ServiceNow, Office 365, and so on). There may be several cloud environments (e.g. development, test, staging, production) to serve different needs of the organization (e.g. application hosting, big data & analytics, HPC). IT requires only a few data centre facilities (e.g. primary, secondary). Critical workloads are hosted in the private cloud only, while non-critical ones could be hosted in a virtual private cloud of a public cloud service provider to benefit in terms of cost, high availability, elastic scalability, and so on.
- Network Cloud for hosting physical or virtual network functions is a newer trend. In legacy Central Offices, the network elements were vendor-specific proprietary physical equipment with unique requirements in terms of power supply, cooling, high availability, cabling, bandwidth, etc., which are incompatible with an IT data centre. A carrier-grade Cloud infrastructure requires high performance guarantees due to the time-critical nature of network functions and regulatory compliance requirements. The majority of operators are choosing the OpenStack virtualization platform to build this private network cloud and reduce software licensing costs. There would be several geographically distributed data centre facilities (e.g. global, national, regional, edge, cell sites) leveraging a common hardware and software platform. In the best case, standardization of the underlying hardware and software offers portability of network functions, and thereby their mobility to any data centre facility, which would enhance a Telco's ability to implement high availability while reducing overall costs.
There is a lack of coherence among decision makers within the organisation, since the Cloud platforms are chosen independently; the virtualization stacks used by different teams therefore often vary. Irrespective of the virtualization stack, the management stack must also be isolated from the end-user application or function hosting stacks, either physically or logically. One could do this using virtual PODs or multi-tenancy mechanisms within the virtualization stack. Both the IT Cloud and the Network Cloud must eventually be integrated using APIs to realize a truly programmable network for the Telco.
Besides virtualized compute and storage infrastructure based on a private cloud platform, all virtualized network resources are managed by a software defined network (SDN) controller using open standards, such as OpenFlow, that apply a flow-based accounting model. It is easier to view the network itself as a graph, where flows connect nodes. A flow can be considered a resource to which quotas and policies apply. An end-to-end network slicing view takes this further, allowing multi-tenancy to be implemented in the network graph, as the sketch below illustrates. Therefore, a next-generation heterogeneous network would implement network slicing as a feature offering flexibility and isolation. End-to-end management of a sliced network resource requires an efficient network function virtualization orchestrator (NFV-O). Besides software orchestration of network topological models, it needs software-driven Management & Orchestration (MANO) to enable operational efficiency in complex environments. Telecom companies are trying to adopt all of these components as ingredients of their Network Cloud.
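To make the flow-as-a-resource idea concrete, here is a minimal sketch in Python of flows being admitted against per-tenant slice quotas. The names (Flow, Slice, admit) and the quota figures are illustrative assumptions, not the API of any real SDN controller:

```python
from dataclasses import dataclass, field

@dataclass
class Flow:
    """A flow between two graph nodes, treated as a resource with a bandwidth cost."""
    src: str
    dst: str
    mbps: float

@dataclass
class Slice:
    """A network slice: one tenant's view of the graph, isolated by its own quota."""
    tenant: str
    quota_mbps: float
    flows: list = field(default_factory=list)

    def admit(self, flow: Flow) -> bool:
        used = sum(f.mbps for f in self.flows)
        if used + flow.mbps > self.quota_mbps:
            return False          # policy: reject flows that would exceed the slice quota
        self.flows.append(flow)
        return True

# Two tenants sharing the same physical graph, isolated by per-slice quotas.
iot_slice = Slice(tenant="iot", quota_mbps=100.0)
video_slice = Slice(tenant="video", quota_mbps=500.0)

assert iot_slice.admit(Flow("cell-42", "edge-dc-1", 60.0))
assert not iot_slice.admit(Flow("cell-43", "edge-dc-1", 50.0))   # over quota: rejected
assert video_slice.admit(Flow("edge-dc-1", "core-dc", 300.0))
```

The point of the sketch is only that, once a flow is modelled as a quantifiable resource, quota and policy enforcement per tenant falls out naturally; a real controller would enforce this in the data plane rather than in application code.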
Broadly speaking, the key motives of a telecom company to build a Network Cloud are:
- To eliminate dependency on product vendors for the evolution of network functions or services. While the network function virtualization strategy seeks to remove dependency on proprietary hardware, the software defined networking strategy strives to unbundle the internal complexity of a vendor-specific network function and expose it as standard models, templates, orchestration blueprints, or network service descriptors based on open standards. Management & Orchestration frameworks seek to standardize and automate all network operations using software and eliminate human intervention as far as possible.
- To retain all intellectual property of managing a programmable network within the organisation. It must be captured in automation blueprints and machine learning models that would drive a self-organising network without human intervention.
- To gain network slicing capability in order to restructure network resources on an ad-hoc basis, depending on dynamically changing demand. This can enable business and service agility for communication service providers, enhance efficiency, and let them easily serve niche vertical-specific markets for new revenue streams.
- To reduce costs by introducing extensive automation and standardization of software, hardware, and data centre facilities leveraging open source Cloud technologies. The ability to place a specific network function in any data centre creates new cost optimization opportunities. Automation of cloud infrastructure and operational tasks reduces overall costs further and reduces the number of workers needed to run the network.
- To enhance customer experience and reduce churn. The ability to deploy network functions nearer to their consumers would reduce latency. New smart applications hosted at the network Edge can change the perception of the network itself; it won't be a dumb pipe anymore. In fact, geographic distribution of compute resources also improves the contextual relevance of information delivered to consumers, justifying the value proposition of a smart network. An intelligent flow of information to the right consumer, at the right time, in the right form would make a Smart Planet.
Obviously enough, most traditional product vendors understand this threat very well and therefore try to evade this new customer expectation as long as possible. So, some of them are trying to find new methods of vendor lock-in and propose the following:
- An all-in-one, fully integrated cloud solution. The vendor tries to convince the customer by showcasing the complexity of virtual network functions and the associated performance issues vis-a-vis a self-assembled solution. Certainly, some of these drawbacks exist today with open source software, due to the fact that networking-specific knowledge remained hidden within the products while vendors charged premium amounts for services. But improving your employees' technical skills and adopting a standards-compliant Cloud Orchestration technique would help you capture that hidden knowledge and retain it forever as intellectual property in the form of a software blueprint. An all-in-one solution is trying to hide knowledge from you.
- A high-performance, hardware-optimized, well-tested solution vis-a-vis open source software (e.g. router, switch, firewall, load balancer, or other VNF) running on general-purpose hardware. Certainly, some of these drawbacks will be overcome in the near future as capabilities improve in network interface cards, memory access, processing units, storage, etc. in emerging converged infrastructure platforms. But you should look at the long-term gains (i.e. freedom, flexibility) as well.
- A standards-compliant, feature-rich, next-generation solution that runs on a Cloud platform but works efficiently only with the vendor's own full stack. The vendor thus introduces a kind of partial lock-in by discouraging you from adopting open tools, on the grounds that your staff cannot keep up with emerging standards. By the way, such vendors also try to influence the standards development itself.
It's a battle resembling MS Windows vs. Linux, but closed source must die one day to benefit all of us. In fact, the more open source a Telco adopts, the cheaper it becomes to do business for all sectors. The more intelligent the Telco network becomes, the easier and more enriched life will be for all.
Considering these factors, one needs to be careful about technology adoption options. One needs to prepare staff with new technical skills related to Cloud, NFV, SDN, MANO, Big Data, Analytics, IoT/M2M, Security, DevOps, etc. More than technology deployment skills, it is paramount to train them in SDN design and testing.
It is important to note that the Telco Cloud is not just a replacement of physical network elements with virtual network functions. In fact, it would be a Business Transformation exercise for the entire organisation. Therefore, one needs to think about new Business Models, Revenue Strategy, Pricing Models, Product Strategy, Market Strategy, Network Infrastructure, Data Centre Infrastructure Strategy, Supply Chain Strategy, Operating Models, etc. A competitive advantage is achievable through strategic investments in overall infrastructure, development of a partner ecosystem, embedded intelligence of the middleware driving the network, rapid innovation, business agility, cost transformation, and so on. A differentiating value proposition must be created using the Telco Cloud.
Besides deploying a new set of technologies, a lot of legacy operational business processes (e.g. related to order fulfilment, service assurance, invoicing, payments) would change. Therefore, one must consider change management thoroughly. The Telecom organisation's capability matrix must be reviewed to understand the current state, align with the future state of the business, and create a transformation roadmap for tangible gains over a period of time. One would build a new ecosystem of partners, suppliers, and customers around an emerging products & services catalog; contract management would therefore be significantly impacted, and it must be fully automated, leveraging real-time pricing techniques.
By adopting a set of Cloud technologies for the network functions, one basically changes the manner in which traditional network elements have worked in the past. For example:
- The physical Network Elements were dimensioned beforehand to serve a specific capacity and performance. Capacity was planned frequently and manually, and performance problems were resolved by investigating issues manually. You could see and feel the physical network elements. Virtual network functions, however, do not remain static: their instances can move from one physical machine to another automatically, start up or shut down dynamically based on rules or policies, and auto-scale. Hence, you need fully automated, near real-time analytics to detect problems and resolve them quickly, without human intervention as far as possible. You need the skills to set thresholds correctly for triggering actions based on pre-defined rules (see the scaling sketch after this list).
- A lot of automated interoperability testing is now required within your environment, since most vendors use plenty of 3rd party or open source software that could be pushed into your network functions as a software patch update and create havoc in the entire network. So, new designs must incorporate quality gates to contain the propagation of problems across the network. You must have automated means to identify vulnerable open source software components and assess the risks of VNF software patch updates using standard methods. DevOps plays an important role here. You cannot use performance models as-is from a product vendor who has tested them in an isolated environment. Your team needs to own, design, and develop performance models customized for your network, since it now runs on a common platform that is partitioned very differently, using parameters unknown to the product vendors.
- The network slicing idea and the dynamic nature of Cloud significantly impact the Telecom Inventory and Network Management Systems. The placement and discovery of virtual network functions, event logging, performance monitoring, elastic scaling, metering, chargeback, policy enforcement, etc., and the related cloud orchestration and operational automation must be reviewed. The entire Network Cloud must be tested for resource leakage using in-life test tools. The workload placement and resource allocation engines must be checked regularly for orchestration deadlocks, and these must be resolved.
- The network must leverage a set of machine learning models to make decisions quickly, without any human intervention. Such model libraries evolve very quickly as you introduce more network-facing or customer-facing services into the programmable network. These models must be tested frequently to assure correct behaviour of your self-optimising network.
- Traditionally, the precision of time and its synchronization in a physical network element was tested and assured by the product vendor. Since virtual network functions running on commodity hardware in a common Cloud platform do not have high-precision clocks, you need to think about time-synchronization strategies for a distributed cloud architecture, because specific protocols in the Telco environment demand it. Different strategies for deploying virtual network functions create a new set of challenges for the propagation of precise time and its control within the hardware and software components. On commodity hardware, the clock is available on the motherboard, but precise-time requirements are spread across the device: in the virtual machine, the hypervisor, virtual functions deployed in network interface cards, and so on. Even if only one virtualized network function behaves incorrectly due to time precision, it must be possible to identify and isolate the protocol misbehaviour (the offset calculation sketched after this list is the basic building block).
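As a concrete illustration of the threshold-and-rules point in the first bullet above, here is a minimal sketch of a pre-defined scaling policy for a VNF. The metric names and threshold values are illustrative assumptions, not from any particular orchestrator:

```python
def scaling_decision(cpu_util: float, active_sessions: int,
                     scale_out_cpu: float = 0.75, scale_in_cpu: float = 0.30,
                     max_sessions_per_instance: int = 10_000) -> str:
    """Return 'scale_out', 'scale_in', or 'hold' from near real-time metrics."""
    if cpu_util > scale_out_cpu or active_sessions > max_sessions_per_instance:
        return "scale_out"
    if cpu_util < scale_in_cpu and active_sessions < max_sessions_per_instance // 4:
        return "scale_in"    # the gap between thresholds adds hysteresis, avoiding oscillation
    return "hold"

print(scaling_decision(cpu_util=0.82, active_sessions=4_000))   # scale_out
print(scaling_decision(cpu_util=0.20, active_sessions=1_000))   # scale_in
print(scaling_decision(cpu_util=0.50, active_sessions=4_000))   # hold
```

Setting thresholds "rightly", as argued above, is exactly the hysteresis question: scale-out and scale-in bounds that are too close together make the function flap between states.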
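For the time-synchronization concern in the last bullet, the basic building block is the four-timestamp offset estimate used by NTP-style protocols. A minimal sketch, with illustrative timestamps assuming a client clock running 5 ms behind the reference and a 2 ms one-way delay:

```python
def clock_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """
    t1: client send time, t2: server receive time,
    t3: server send time,  t4: client receive time.
    Returns (estimated offset of the client clock, round-trip network delay).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # assumes roughly symmetric path delay
    delay = (t4 - t1) - (t3 - t2)            # time on the wire, excluding server processing
    return offset, delay

offset, delay = clock_offset_and_delay(t1=100.000, t2=100.007, t3=100.008, t4=100.005)
print(offset, delay)   # ~0.005 s offset, ~0.004 s round-trip delay
```

A repeatedly drifting or asymmetric offset estimate for a single VNF is one practical signal for identifying and isolating the kind of protocol misbehaviour described above.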
The Distributed Cloud can be analysed from the following perspectives:
- Hosting Facilities offer data centres of different sizes, capabilities, locations, etc. in which to place physical or virtual network functions. Software defined networking is applied within data centres to connect workloads hosted in a network function virtualization infrastructure. A data centre can be created from containerized modules to ease cost management. Several physical containerized modules can be aggregated into a central pool of resources and divided into a set of logical zones for allocation, depending on workload characteristics (e.g. high/low bandwidth, large/small storage, high performance computing). The design of physical or logical resource-allocation zones within a distributed data centre requires careful consideration of the security requirements, high availability, bandwidth demands, and so on applicable to the network functions or services. Data centre networking could also apply network slicing techniques.
- Network Resources offer physical or virtual connectivity and bandwidth between hosting facilities. Software defined networking is applied across data centres to connect workloads over the transport network (e.g. fiber, microwave). Network slicing could be applied to achieve performance guarantees on a connectivity link. Specific protocols and switching or routing methods also influence design decisions. A set of logical zones is created to simplify allocation of network bandwidth to workloads hosted in the various data centre facilities. Network topological patterns are made available to orchestrators and applied in network elements for flow-based control.
- Service Chaining offers an end-to-end topological view of services. Network function virtualization orchestrator platforms are used to stitch the services together (a minimal sketch follows this list). Path computation elements and analytical models play a significant role in the design of service models. A set of network topological patterns must be integrated with the service catalog and its policies to achieve fully dynamic services.
- Management View offers an end-to-end perspective of network operational tasks. Several management & orchestration platforms and frameworks have emerged recently; selecting one requires careful analysis and a well-defined selection method driven by your business needs. Integration with the OSS could be achieved using various models (e.g. via OpenFlow, an SDN Controller, or MANO). The benefits and complexity vary among the available options, and one might choose a different integration option for each network domain.
- Unified Service Catalog offers a product or service perspective that encompasses the integration of all self-hosted network services and partner products or services. The service catalog is designed to serve B2C, B2B, and B2B2C scenarios, thereby realizing the Digital Service Provider vision. However, it requires a lot of backend integration in terms of deployment automation, policy enforcement, event logging, debugging, remote support, etc. It is based on capability-leverage principles.
- API Management offers the network programmability perspective. It also requires integration with the BSS, OSS, and MANO layers. Besides order orchestration, it must integrate metering, rating, chargeback, and billing on a near real-time basis across the ecosystem. The APIs will be utilized by various service delivery channels to offer services agnostic to the end-user platform. More complex services could be composed from the APIs of partners and the Telco's own products or services. So it offers more freedom to innovate and bundle services without being limited by geographic boundaries.
- Operating Model of the Telco is disrupted drastically. Instead of a network-centric view, the organization now has a customer-centric view, where teams must learn to work in a software-driven, multi-party infrastructure similar to that of IT companies. DevOps plays a key role in simplifying data collection, analysis, and decision making in a dynamic Distributed Cloud environment.
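As referenced in the Service Chaining bullet above, the stitching step can be sketched minimally as resolving an ordered VNF descriptor against a placement catalog. The catalog, the function names, and the facility names here are illustrative assumptions; real descriptors (e.g. ETSI-style network service descriptors) are far richer, and a real path computation element would also weigh latency and load:

```python
# Hypothetical VNF catalog: function name -> data centre facility hosting an instance.
catalog = {
    "firewall":        "edge-dc-1",
    "nat":             "edge-dc-1",
    "video-optimizer": "regional-dc-3",
}

def stitch_chain(descriptor: list[str]) -> list[tuple[str, str]]:
    """Resolve each VNF in the ordered chain and return the hop-by-hop path."""
    hops = []
    prev = "ingress"
    for vnf in descriptor:
        facility = catalog[vnf]                    # simple placement lookup
        hops.append((prev, f"{vnf}@{facility}"))
        prev = f"{vnf}@{facility}"
    hops.append((prev, "egress"))
    return hops

for hop in stitch_chain(["firewall", "nat", "video-optimizer"]):
    print(" -> ".join(hop))
# ingress -> firewall@edge-dc-1 -> nat@edge-dc-1 -> video-optimizer@regional-dc-3 -> egress
```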
The Telco perspective on external use of its Cloud services is very different. It is driven by Cost Savings and new Revenue Stream motives. For example:
- The decentralized deployment of Network Functions is expected to reduce the overall cost of transporting bits & bytes to end users and to deliver a good quality of experience. Decentralizing a network function would also demand software refactoring, leveraging data replication, efficient signalling, local analytics & decision making, and self-optimization based on dynamic resource cost models.
- The edge computing concept expects 3rd party software (e.g. IoT, cache) to be hosted at an Edge data centre to create new services. It would introduce various business models leveraging user-plane breakout at different geographic locations and peering or integration with 3rd party services. A new set of smart applications for IoT/M2M are candidates. Efficient transport of data and local decision making could enhance the usability of the network for interested parties.
The edge computing methods can be applied to fixed, mobile, aerial, etc. networks. Due to increasing user demand for bandwidth and low latency in mobile networks, technology evolution is focused on multi-radio access technologies, multiple-input multiple-output antenna technologies, carrier aggregation, radio access network infrastructure virtualization, standardization of fronthaul, etc., in order to justify the investments in radio networks by showcasing new revenue opportunities. The ability to host 3rd party applications integrated with the radio access network could be a game changer and offer the Telco a competitive advantage over traditional Cloud Service Providers. But Telcos are still very slow in adopting such technologies due to the huge investment required, an immature market, and slow returns on the investment.
Besides the software-driven approach of Cloud Orchestration & Automation, it is important to realise that the economy-of-scale proposition of Cloud applies differently in a Telco Cloud environment. In a standard Cloud, physical resources (e.g. data centre facilities, racks, servers, storage) are relatively concentrated in a centralised manner at different regions or zones and can be pooled logically for seamless allocation. Telco resources (e.g. cell sites, switching offices), by contrast, gain very little from pooling, because geographic distribution is a mandatory requirement and a minimum footprint is needed at each location for the communication function itself. Therefore, standardization of the underlying hardware and software becomes a key requirement if economy of scale is to be realized in a Distributed Cloud. Hence, the cost advantage is expected to come from optimal relocation of network functions to minimise data transportation, and from just-in-time resource allocation, as the placement sketch below illustrates.
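A minimal sketch of that relocation trade-off, under illustrative cost and latency assumptions: centralised facilities enjoy economy of scale in hosting, edge facilities minimise data transport, and a latency budget bounds the choice. All figures below are invented for illustration:

```python
facilities = {
    # name: (hosting cost/month, transport cost/month for this function's traffic, latency in ms)
    "cell-site-agg": (900, 10, 2),     # cheapest transport, most expensive hosting
    "edge-dc":       (400, 60, 8),
    "regional-dc":   (150, 220, 25),
    "national-dc":   (100, 500, 60),   # economy of scale in hosting, costly transport
}

def place(latency_budget_ms: int) -> str:
    """Pick the facility with the lowest total cost that meets the latency budget."""
    eligible = {name: hosting + transport
                for name, (hosting, transport, lat) in facilities.items()
                if lat <= latency_budget_ms}
    if not eligible:
        raise ValueError("no facility meets the latency budget")
    return min(eligible, key=eligible.get)

print(place(latency_budget_ms=10))    # edge-dc: 460 beats cell-site-agg's 910
print(place(latency_budget_ms=100))   # regional-dc: 370 is the overall minimum
```

Even this toy version shows why standardized, portable network functions matter: the optimal facility changes with the latency budget and the cost figures, so the function must be free to move.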
Public Cloud Service Providers build very large-scale facilities to optimize the costs associated with land, power, backup, cooling, fire suppression, security, etc., which get apportioned over huge customer demand. The Telco, however, would build only medium-scale data centre facilities, since both internal demand (i.e. for hosting network elements) and external demand (i.e. for hosting edge computing resources) are still very low. Emerging IoT/M2M applications may take a long time to justify huge investments, and facility capacity planning is even more complex for a Telco Cloud.
Typically, enterprise IT data centres use AC power, which is relatively easy to access and cheap to distribute, but the legacy Switching Offices where the majority of network elements are hosted have used DC power for many years. Therefore, migrating these facilities to the new architecture will take some time. In some cases, a hybrid approach is being suggested for the time being.
Considering that several 3rd parties are involved in the creation, delivery, and management of services for the buyer, a federated security model would be the norm in a Distributed Cloud. Telcos now have an opportunity to evolve and implement their own security functions in a network function virtualization infrastructure, thereby reducing the options available to attackers. Of course, the gateways in the control plane and user plane must comply with regulatory standards, but the internal network is free to evolve, since the entire central office can now run in a data centre.
DevOps for network and IT is a key ingredient for seamless operations of a Distributed Cloud. It must be embraced from the start, and network teams must align quickly. The integration of Machine Learning with DevOps can offer significant benefits.
The Distributed Cloud will be most suitable for low-latency, high-performance applications (e.g. Augmented Reality, Internet of Things) that offer local contextual intelligence by collecting information from various sources, applying analytical models, and delivering a personalized experience to end-users. This means the Telco Cloud must embrace an open, standards-based platform-as-a-service model, so that software developers can build new applications that leverage the distributed infrastructure to generate intelligent responses to end-user needs. The intelligence of hosting smart applications, routing data, and presenting results must integrate seamlessly within the Distributed Cloud.