High-Speed Ethernet

Optimizing Datacenters for RoCEv2 Performance

Author: Sudip Basu

The demand for data is growing exponentially, and datacenters need to keep pace. RoCEv2 has proven its value over TCP/IP by providing by far the lowest latency in a legacy Ethernet datacenter. Optimization will be key to achieving the best results in large-scale deployments.

In today's world, fast data transfer is critical to accessing information efficiently. Datacenter services such as search, storage, database, financial, and other high-transaction-rate applications are latency sensitive and bandwidth hungry. In addition, mobile applications, gaming, artificial intelligence (AI) and machine learning (ML) workloads, and over-the-top (OTT) video all depend on low latency to improve the user experience.

Modern datacenters are struggling to meet the demand for high bandwidth with extremely low latency. RoCEv2, the latest development in Remote Direct Memory Access (RDMA), is well suited to the requirements of a high-performance, low-latency, low-cost data transfer network. It increases the efficiency of existing network infrastructure and improves overall CPU utilization and host memory usage for running applications. This results in reduced power, cooling, and rack space requirements, with lower cost of ownership and higher return on investment for organizations.

Brief history of RoCEv2

RDMA is a technology that offers an efficient and fast way to move data between networked computers without involving their operating systems or CPU resources. This improves host performance by reducing CPU load, and network performance through lower latency and higher bandwidth.

RDMA was invented in 1993 as a concept and was initially applied to create low-cost supercomputers using distributed computing. InfiniBand was among the first to develop this concept into a mainstream RDMA specification, which later evolved into the favored networking technology for high-performance computing (HPC) cluster design. This technology was productized by Mellanox Technologies (now Nvidia Networking) in 2001.

At that point, RDMA was supported only over InfiniBand, which worked extremely well but saw limited adoption. RDMA started getting noticed in 2010, when the InfiniBand Trade Association (IBTA) brought RDMA technology to popular Ethernet networks. RDMA over Converged Ethernet (RoCE) brings all the advantages of RDMA to existing Ethernet networks.

This move made RDMA popular because it avoided the massive capital expenditure of replacing Ethernet with InfiniBand. RoCE was limited to a single Layer 2 domain until 2014, when the IBTA added routing capability by introducing RDMA over Converged Ethernet version 2 (RoCEv2). RoCEv2 enables routing by changing the packet encapsulation to include IP and UDP headers. With RoCEv2, RDMA can now be used across both L2 and L3 networks with multiple subnets. This allows efficient clustering for elastic and scale-out deployments.
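
To make the encapsulation change concrete, here is a minimal Python sketch of the transport header that RoCEv2 carries inside UDP. It assumes the standard layout: UDP destination port 4791, a 12-byte InfiniBand Base Transport Header (BTH), and a 4-byte invariant CRC (ICRC) trailer. The function names, the example queue pair number, and the choice of opcode 0x0A (RDMA WRITE Only on a reliable connection) are illustrative assumptions, not taken from any particular product.

```python
import struct

ROCEV2_UDP_DPORT = 4791  # UDP destination port registered for RoCEv2


def pack_bth(opcode, dest_qp, psn, pkey=0xFFFF, ack_req=False):
    """Pack a 12-byte Base Transport Header (illustrative helper).

    Solicited-event, migration, pad-count and version bits are left
    at zero in this sketch.
    """
    flags = 0
    # 8 reserved bits followed by the 24-bit destination queue pair
    word1 = dest_qp & 0xFFFFFF
    # AckReq bit, 7 reserved bits, then the 24-bit packet sequence number
    word2 = (int(ack_req) << 31) | (psn & 0xFFFFFF)
    return struct.pack("!BBHII", opcode, flags, pkey, word1, word2)


def rocev2_overhead_bytes(vlan=True):
    """Per-packet header and trailer bytes for RoCEv2 over IPv4."""
    ethernet = 14 + (4 if vlan else 0)  # MAC header, optional 802.1Q tag
    return ethernet + 20 + 8 + 12 + 4 + 4  # IPv4 + UDP + BTH + ICRC + FCS


bth = pack_bth(opcode=0x0A, dest_qp=0x12, psn=1)
print(len(bth), rocev2_overhead_bytes())
```

Because the outer headers are ordinary IP and UDP, any L3 switch or router can forward the packet; only the endpoints need to interpret the BTH.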

Why invest in testing RoCEv2 networks?

To help improve network efficiency, many organizations have started deploying RoCEv2 in their datacenters. Compared with TCP/IP, RoCEv2 not only improves the performance of host machines by freeing up CPU resources, but also increases available bandwidth and reduces network latency.

Switch fabric performance is key to achieving high bandwidth and extremely low latency in datacenters. Incorrect or non-optimized network settings in a large-scale datacenter can result in poor application or storage performance. To maximize the benefits of a RoCEv2 deployment in a datacenter network, the underlying interconnect must be optimized by running realistic traffic and measuring the relevant network KPIs (such as throughput, end-to-end latency, frame loss, and network stability) while varying switch and network settings (such as buffer size and QoS settings).
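
As a rough illustration of how such KPIs can be derived from raw test counters, the Python sketch below computes throughput, frame loss, and latency statistics for one run. The `PortCounters` fields and the `summarize` function are hypothetical names chosen for this example; a real test tool exposes its own counter set.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PortCounters:
    """Hypothetical per-run counters collected at the test ports."""
    tx_frames: int
    rx_frames: int
    rx_bytes: int
    latencies_us: List[float]  # per-frame end-to-end latency samples


def summarize(c: PortCounters, duration_s: float) -> dict:
    """Turn raw counters into the KPIs discussed above."""
    lost = c.tx_frames - c.rx_frames
    loss_pct = 100.0 * lost / c.tx_frames if c.tx_frames else 0.0
    throughput_gbps = c.rx_bytes * 8 / duration_s / 1e9
    n = len(c.latencies_us)
    avg_us = sum(c.latencies_us) / n if n else 0.0
    max_us = max(c.latencies_us) if n else 0.0
    return {
        "throughput_gbps": throughput_gbps,
        "frame_loss_pct": loss_pct,
        "avg_latency_us": avg_us,
        "max_latency_us": max_us,
    }


run = PortCounters(tx_frames=1000, rx_frames=990, rx_bytes=990_000,
                   latencies_us=[2.0, 4.0, 6.0])
kpis = summarize(run, duration_s=1.0)
print(kpis)
```

Repeating the same calculation for each buffer size or QoS setting gives a comparable set of numbers per configuration, which is what makes tuning decisions possible.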

TCP/IP traffic vs RoCE traffic

Thus far, the need to test RoCEv2 network performance has been underrated, and testing has been limited to functional validation of servers and Host Channel Adapters (HCAs). Now the rapid influx of data into datacenters is driving large-scale RoCEv2 deployments, and the need for network performance validation and optimization has finally started receiving due attention.

Testing the RoCEv2 switch fabric

Until recently, RoCEv2 testing has been carried out on homegrown test beds with physical servers or open-source solutions to validate functionality. While this works well at small scale, it falls short of the scale and efficiency requirements of a modern datacenter. Building a test bed with racks of servers is expensive and lacks the scalability to keep up with the growing performance demands of a network. Hundreds of physical servers in a setup are also difficult to manage effectively, and the test results are limited and not repeatable.

RoCEv2 Switch Fabric

The right solution for testing a RoCEv2 switch fabric should be able to emulate a real network at high scale, replacing the racks of servers in the test bed and reducing costs significantly. To find its maximum performance, the switch or network should be stressed to its limits. To do that, the test solution should be able to generate RoCEv2 traffic at line rate consistently and realistically, with a mix of bursty and continuous traffic flows, to validate per-queue-pair (QP) congestion control and verify lossless packet delivery with Priority Flow Control (PFC). It is also important that the solution provide relevant measurements and repeatable tests. Most importantly, it should scale easily to keep up with the incremental demand for high performance.
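
For readers less familiar with PFC, the Python sketch below shows what a pause request looks like and how long it holds a priority. It assumes the standard PFC frame layout (MAC Control opcode 0x0101, a class-enable vector, and eight 16-bit timers) and the standard pause quantum of 512 bit times. The function names and the use of priority 3 for RoCEv2 traffic are assumptions made for illustration only.

```python
import struct

PFC_OPCODE = 0x0101        # MAC Control opcode for Priority Flow Control
PAUSE_QUANTUM_BITS = 512   # one pause quantum equals 512 bit times


def pfc_payload(quanta_by_priority):
    """Build the 20-byte PFC payload that follows EtherType 0x8808.

    quanta_by_priority maps a priority (0-7) to a pause time in quanta.
    """
    enable_vector = 0
    timers = [0] * 8
    for prio, quanta in quanta_by_priority.items():
        if not 0 <= prio <= 7 or not 0 <= quanta <= 0xFFFF:
            raise ValueError("priority must be 0-7, quanta 0-65535")
        enable_vector |= 1 << prio
        timers[prio] = quanta
    return struct.pack("!HH8H", PFC_OPCODE, enable_vector, *timers)


def pause_duration_s(quanta, link_bps):
    """Seconds a priority stays paused for a given quanta value."""
    return quanta * PAUSE_QUANTUM_BITS / link_bps


# Pause priority 3 (an assumed lossless class) for the maximum time.
payload = pfc_payload({3: 0xFFFF})
max_pause = pause_duration_s(0xFFFF, 100e9)
print(len(payload), max_pause)
```

On a 100 Gbps link the longest single pause is only a few hundred microseconds, which is why a lossless-delivery test needs to observe pause frames and frame loss together under sustained bursts rather than in a short functional check.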

A highly scalable RoCEv2 storage fabric test solution

There is growing demand to accurately measure the performance of RoCEv2 switch fabrics in datacenters. To address this, Spirent introduced a solution based on Spirent TestCenter and the high-density, multi-speed FX3/MX3 test modules. These test modules are already proven in the industry for performance and have a large installed base.

RoCEv2 testing capability is offered for the FX3/MX3 high-density test modules via a firmware upgrade. RoCEv2 traffic generation and congestion control are implemented in hardware to ensure a reliable traffic rate and low latency. The solution is highly scalable: a fully loaded chassis can generate up to 3.6 terabits of RoCEv2 traffic and offers the flexibility to run popular performance benchmarking methodologies (e.g., RFC 2544) in the same test setup.
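
As background on the RFC 2544 methodology mentioned above, the Python sketch below shows the general idea of its throughput test: search for the highest offered load at which the device under test drops no frames. This is a simplified binary search, not Spirent's implementation; `run_trial` is a hypothetical callback that sends traffic at a given percentage of line rate and returns the number of lost frames.

```python
def max_frame_rate(link_bps, frame_bytes):
    """Theoretical Ethernet frames per second at line rate.

    Each frame costs 20 extra bytes on the wire:
    7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap.
    """
    return link_bps / ((frame_bytes + 20) * 8)


def rfc2544_throughput(run_trial, lo=0.0, hi=100.0, resolution=0.1):
    """Highest load (% of line rate) with zero frame loss, by binary search."""
    if run_trial(hi) == 0:
        return hi  # the device forwards at full line rate
    best = lo
    while hi - lo > resolution:
        rate = (lo + hi) / 2
        if run_trial(rate) == 0:
            best = lo = rate   # no loss: try a higher load
        else:
            hi = rate          # loss seen: back off
    return best


def fake_dut(rate_pct):
    """Stand-in device that starts dropping frames above 87.5% load."""
    return 0 if rate_pct <= 87.5 else 1


result = rfc2544_throughput(fake_dut)
print(result, round(max_frame_rate(100e9, 64)))
```

In practice each trial runs for a fixed duration per frame size, and the search is repeated across the frame sizes of interest.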

Learn more about the industry’s highest-density test solution for validating RoCEv2 storage fabric.

Sudip Basu

Sr. Product Manager, Routing and Data Center Solutions, Cloud and IP Business Unit

Sudip Basu is Senior Product Manager for the Cloud and IP business unit at Spirent Communications. In his current role, he is responsible for the product definition, strategy, and delivery of routing and data center testing solutions for Network Equipment Manufacturers, Data Center operators, and Service Providers. Sudip has over 14 years of experience in the networking test and measurement industry, and his professional experience includes hands-on product development and product management. Prior to joining Spirent, Sudip worked as an engineer at Ixia and Agilent, where he helped develop products like IxNetwork and N2X. He holds a Bachelor’s degree in Computer Science and Engineering. To connect with Sudip, please go to LinkedIn at https://www.linkedin.com/in/sudip-basu-77510521/