Peek under the Hood of Telco-Grade OpenStack
By Charlie Ashton
Over the past couple of months, we’ve been involved in a series of conversations about the challenges of using OpenStack in telecom infrastructure. Since OpenStack was designed for enterprise-class IT applications, these challenges are formidable, but we’ve demonstrated that they can be solved.
On June 15th we participated in a webcast with Peter Willis from BT. Peter described the issues that he’s observed through detailed evaluations of vanilla (non-hardened) OpenStack distributions, and we talked about the solutions that Wind River has incorporated into the Titanium Server virtualization platform.
We discussed this topic in more detail in a white paper that you can download, and we highlighted some additional, equally critical OpenStack enhancements in a post published around the same time, “Don’t be scared of OpenStack for telecom”.
Now we’re delighted to bring you a technical deep dive into this topic. On July 20th, SDxCentral will host a webinar, “Using OpenStack to solve real-world NFV problems today and tomorrow”, presented by Ian Jolliffe, the Chief Architect of Titanium Server, who has been working on these problems for longer than pretty much anyone else in our industry.
In this post, we’ll give a preview of some of the key topics that Ian will address in this webinar. To get all the details, make sure you register here and block the time on your calendar to enjoy Ian’s in-depth explanation of what’s required to implement a telco-grade version of OpenStack.
Wind River: leaders in the community
If you review the list of companies contributing to the various OpenStack projects, you’ll see that the vast majority of them are in the enterprise market, working on patches and enhancements designed for enterprise needs. Wind River is one of very few companies focused on solving important OpenStack problems that are critical for telecom applications.
Our strategy is straightforward: once we develop a telecom-oriented patch or enhancement, we verify with our customers that it solves an important problem and then upstream it to the community for inclusion in a future OpenStack release. This process ensures that our technology has addressed real-world problems before it’s upstreamed.
Since most of the maintainers are focused on enterprise topics, it sometimes takes longer than we would like for our contributions to be accepted, but we work with the maintainers to streamline the process as much as possible.
Our experts focus on ten core OpenStack projects, with the majority of our work on the Nova Compute project, which happens to be the largest project in the community. A recent glance at Stackalytics data showed us ranked in the top 10% of contributors, right next to AT&T in the overall table.
System upgrades: the #1 issue for Operations teams
When we talk to the Operations teams at service providers about their strategies for network virtualization, their top concern is inevitably how to handle system upgrades. Even with traditional physical infrastructure, this complex activity can cause costly, unplanned service outages if performed incorrectly. It’s critical for service providers to deliver rolling upgrades in real time, with operating systems, applications, databases and protocols smoothly updated in a way that avoids any disruptions to customers’ services.
In an enterprise environment, downtime can be planned and systems can be shut down for hours while new software is installed, deployed, tested and, if necessary, rolled back. OpenStack hasn’t addressed the telecom scenario, where no such maintenance window is acceptable, because the current community focus is on API compatibility and configuration changes at the project level. As a result, upgrades require significant, error-prone manual intervention.
As Ian will discuss in the upcoming webinar, support for seamless updates has been a major focus for the developers of Titanium Server. They’ve implemented a wide range of OpenStack enhancements, resulting in a telco-grade platform that enables hitless updates between releases. Live migration of Virtual Machines (VMs) ensures that there’s no service downtime during system updates, and the installation of new releases is completely transparent to end users. None of this is possible with a vanilla enterprise-class OpenStack distribution.
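To make the idea of a hitless, rolling update concrete, here is a minimal Python sketch of the general pattern: drain one host at a time by moving its VMs to peers with spare capacity, update the empty host, then move on. This is a toy in-memory model and not Titanium Server’s implementation or any OpenStack API; the `Host` class, `drain` and `rolling_upgrade` are hypothetical names introduced purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical compute host: a name, a software version and VM slots."""
    name: str
    version: str
    capacity: int
    vms: list = field(default_factory=list)

    def free_slots(self) -> int:
        return self.capacity - len(self.vms)


def drain(host: Host, others: list) -> None:
    """Move every VM off `host`, standing in for live migration."""
    for vm in list(host.vms):
        # Pick the peer with the most spare capacity for each VM.
        target = max(
            (h for h in others if h.free_slots() > 0),
            key=Host.free_slots,
            default=None,
        )
        if target is None:
            raise RuntimeError(f"no spare capacity to evacuate {host.name}")
        host.vms.remove(vm)
        target.vms.append(vm)


def rolling_upgrade(hosts: list, new_version: str) -> None:
    """Upgrade hosts one at a time; a host is only touched once it is empty."""
    for host in hosts:
        drain(host, [h for h in hosts if h is not host])
        host.version = new_version  # the host carries no VMs at this point
```

The sketch only works if the cluster has enough headroom to absorb one host’s VMs at a time, which is the kind of capacity planning a real rolling update also depends on.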
Critical challenges for OpenStack in telecom
In order for OpenStack to be usable in telco applications, our engineers had to add a number of critical new features.
Besides the hitless update technology mentioned above, these features include: the ability to perform live migration of VMs, including VMs based on Intel® DPDK; SR-IOV support for direct VM access to Network Interface Cards (NICs), implemented in a way that is deployable and manageable; scheduler enhancements including NUMA awareness; fast fault detection with automatic recovery; Enhanced Platform Awareness (EPA); guaranteed recovery from startup storms and stampedes; and telco-grade AAA security (authentication, authorization and accounting).
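As an illustration of what NUMA awareness in a scheduler means, the following Python sketch places a VM only on a single NUMA node that can hold all of its vCPUs and memory, rather than letting it span nodes. It is a simplified assumption of how such a placement decision could look, not the Nova scheduler or the Titanium Server code; `NumaNode` and `pick_numa_node` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NumaNode:
    """Hypothetical view of one NUMA node's free resources on a host."""
    node_id: int
    free_cpus: int
    free_mem_mb: int


def pick_numa_node(nodes: list, vcpus: int, mem_mb: int) -> Optional[int]:
    """Return the id of a node that fits the whole VM, or None if none does.

    Uses a best-fit rule (fewest CPUs left over) so that large nodes stay
    available for large VMs. The chosen node's free resources are reduced.
    """
    candidates = [
        n for n in nodes
        if n.free_cpus >= vcpus and n.free_mem_mb >= mem_mb
    ]
    if not candidates:
        # Total free resources may be sufficient, but not on any single node.
        return None
    best = min(
        candidates,
        key=lambda n: (n.free_cpus - vcpus, n.free_mem_mb - mem_mb),
    )
    best.free_cpus -= vcpus
    best.free_mem_mb -= mem_mb
    return best.node_id
```

Note that the function refuses a request that would only fit by splitting the VM across nodes, even when the host as a whole has enough free CPUs and memory. Keeping a VM’s CPUs and memory on one node avoids cross-node memory access, which is why this matters for latency-sensitive workloads.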
All these features and more will be covered in detail in the webinar.
Open solutions bring interoperability and portability
“Open solutions” remain a hot topic throughout the telecom industry. Service providers all say they want them and vendors all say they provide them, but what does the term really mean? A recent survey by TelecomTV produced some interesting answers. First, there was strong consensus amongst the respondents that open solutions ensure interoperability between vendor products while mitigating the risk of vendor lock-in. Second, there was clear acknowledgement that, while open-source projects and open standards provide a solid foundation, they need vendor support and enhancements in order to be commercially viable.
This topic will be a major focus of Ian’s presentation. He’ll explain how we work with partners through the Titanium Cloud ecosystem to validate the correct operation of their hardware and software products when used with Titanium Server. He’ll relate this to our strategy of developing telecom-oriented patches and enhancements, verifying with our customers that they solve important problems and then upstreaming them to the community.
Ian will talk about the comprehensive processes we’ve established to validate and guarantee the compatibility and interoperability that our customers expect, so that they can implement multi-vendor, end-to-end use cases while selecting from pre-integrated elements. This enables our customers to accelerate their deployment cycles while minimizing their schedule risk, representing major business advantages as the industry transitions to virtualized solutions.
Be sure to register for the webinar now: you can look forward to an in-depth discussion of all the topics previewed here, as well as many other innovations that Wind River has implemented to make sure that OpenStack is finally usable in demanding telco applications.