Welcome to Read Book Online


Centralized Performance Control For Datacenter Networks

Author : Jonathan Perry (Ph.D.)
Language Used : en
Release Date : 2017
Publisher :

ISBN : OCLC:1005140050

An ideal datacenter network should allow operators to specify policy for resource allocation between users or applications, while providing several properties, including low median and tail latency, high utilization (throughput), and congestion (loss) avoidance. Current datacenter networks inherit the principles that went into the design of the Internet, where packet transmission and path selection decisions are distributed among the endpoints and routers; this distribution impedes obtaining the desired properties. Instead, we propose that a centralized controller should tightly regulate senders' use of the network according to operator policy, and evaluate two architectures: Fastpass and Flowtune.

In Fastpass, the controller decides when each packet should be transmitted and what path it should follow. Fastpass incorporates two fast algorithms: the first determines the time at which each packet should be transmitted, while the second determines the path to use for that packet. We deployed and evaluated Fastpass in a portion of Facebook's datacenter network. Our results show that Fastpass achieves high throughput comparable to current networks with a 240× reduction in queue lengths, achieves much fairer and more consistent flow throughputs than the baseline TCP, scales to schedule 2.21 Terabits/s of traffic in software on eight cores, and achieves a 2.5× reduction in the number of TCP retransmissions in a latency-sensitive service at Facebook.

In Flowtune, congestion control decisions are made at the granularity of a flowlet, not a packet, so allocations change only when flowlets arrive or leave. The centralized allocator receives flowlet start and end notifications from endpoints, and computes optimal rates using a new, fast method for network utility maximization. A normalization algorithm ensures allocations do not exceed link capacities. Flowtune updates rate allocations for 4600 servers in 31 µs regardless of link capacities. Experiments show that Flowtune outperforms DCTCP, pFabric, sfqCoDel, and XCP on tail packet delays in various settings, and converges to optimal rates within a few packets rather than over several RTTs. EC2 benchmarks show a fairer rate allocation than Linux's Cubic.
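The normalization step described above can be illustrated with a minimal sketch. This is not Flowtune's actual algorithm, only a simplified model with invented flow, path, and capacity names: each flow's rate is scaled down by the worst oversubscription factor along its path, so no link exceeds its capacity.

```python
from collections import defaultdict

def normalize(rates, flow_paths, capacity):
    """Scale each flow's rate by the tightest link it crosses so that
    no link's total allocation exceeds its capacity (toy normalizer)."""
    # Total allocated rate per link.
    load = defaultdict(float)
    for flow, rate in rates.items():
        for link in flow_paths[flow]:
            load[link] += rate
    # Per-link scaling factor (< 1 only when a link is oversubscribed).
    scale = {l: min(1.0, capacity[l] / load[l]) for l in load}
    # Each flow is scaled by the worst factor along its path.
    return {f: r * min(scale[l] for l in flow_paths[f])
            for f, r in rates.items()}

# Two flows share link "a" (capacity 10); raw rates would overload it.
rates = {"f1": 8.0, "f2": 6.0}
paths = {"f1": ["a", "b"], "f2": ["a"]}
cap = {"a": 10.0, "b": 10.0}
print(normalize(rates, paths, cap))  # both flows scaled so "a" carries 10
```

In this toy version every flow crossing an oversubscribed link shrinks proportionally; the real allocator instead recomputes utility-maximizing rates, with normalization as a safety net.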



Tenant Level Network Performance Isolation In Flowtune

Author : Brendan S. Chang
Language Used : en
Release Date : 2016
Publisher :

ISBN : OCLC:1014181329

Performance isolation is a major concern for multi-tenant datacenters. Many service level agreements include a specification on the allotment of resources. For tenants, these resource guarantees are critical to the availability and efficiency of their services. While CPU, disk, and memory isolation are well understood, network performance isolation is less straightforward. In this thesis, I investigate methods for enforcing bandwidth fairness guarantees for logical networks in a datacenter and implement network performance isolation in Flowtune. Flowtune is a datacenter network architecture that introduces a centralized arbiter to enforce congestion control at the flowlet level. Flowtune achieves rapid convergence to a desired allocation of network resources in addition to reducing tail latencies in various settings. However, Flowtune currently does not provide tenant-level network performance isolation.
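The bandwidth fairness guarantees discussed above can be pictured with a toy two-level allocator. This is purely illustrative, not the mechanism implemented in the thesis: a link's capacity is first split across tenants by weight, then evenly across each tenant's active flows, so one tenant opening many flows cannot grab another tenant's share.

```python
def tenant_allocations(link_capacity, tenant_weights, tenant_flows):
    """Two-level hierarchy: weighted split across tenants,
    then an even split across each tenant's active flows."""
    total_w = sum(tenant_weights[t] for t in tenant_flows if tenant_flows[t])
    alloc = {}
    for tenant, flows in tenant_flows.items():
        if not flows:
            continue  # idle tenants hold no share in this simple model
        share = link_capacity * tenant_weights[tenant] / total_w
        for f in flows:
            alloc[f] = share / len(flows)
    return alloc

# Tenant A (weight 2) gets twice tenant B's share even though
# B has more active flows.
alloc = tenant_allocations(
    link_capacity=9.0,
    tenant_weights={"A": 2, "B": 1},
    tenant_flows={"A": ["a1"], "B": ["b1", "b2"]},
)
print(alloc)  # {'a1': 6.0, 'b1': 1.5, 'b2': 1.5}
```

Contrast this with per-flow fairness, where each of the three flows would get 3.0: per-flow fairness rewards tenants for opening more flows, which is exactly what tenant-level isolation prevents.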



Improving Datacenter Performance With Network Offloading

Author : Yanfang Le
Language Used : en
Release Date : 2020
Publisher :

ISBN : OCLC:1245425211

There has been a recent emergence of distributed systems in datacenters, such as MapReduce and Spark for data analytics and TensorFlow and PyTorch for machine learning. These frameworks are not only computation- and memory-intensive, they also place high demands on the network for distributing data. Fast-growing Ethernet speeds mitigate this demand somewhat. However, as Ethernet speed outgrows CPU processing power, it not only requires us to rethink the existing algorithms for different network layers, but also provides opportunities to innovate with new application designs, such as datacenter resource disaggregation [3] and in-network computation applications [4, 5, 6]. Fast network devices come with a programmability feature, which enables offloading computation tasks from the CPU to NICs or switches. Network offloading to programmable hardware is a promising approach to relieve processing pressure on the CPU for computation-intensive applications, e.g., Spark, or to reduce network traffic for network-intensive applications, e.g., TensorFlow. However, leveraging programmable hardware effectively is challenging due to limited memory capacity and a restricted programming model. To understand how to leverage network offloading in developing new network stacks, network protocols, and applications, the following question needs to be answered: how do we judiciously divide work between programmable hardware and software, given limited resources and restricted programming models? Driven by real application demand while exploring the answer to this question, we first propose RoGUE, a new congestion control and recovery mechanism for RDMA over Converged Ethernet that does not rely on PFC while preserving the benefits of running RDMA, i.e., low CPU usage and low latency. To preserve the low-CPU benefit, RoGUE offloads packet pacing to the NIC.

Though RoGUE achieves good performance in extensive testbed evaluations, the architecture for optimal congestion control should be a centralized packet scheduler [7], which has global visibility into packet reservation requests from all the servers. Since all hosts are connected through switches and emerging programmable switch hardware can hold stateful objects, we designed a centralized packet scheduler at the switch, called PL2, to provide stable, near-zero queuing in the network by proactively reserving switch buffers for packet bursts in the appropriate time slots.

Congestion control is an essential component of the networking stack because application demand for the network is higher than link speed. To eliminate the need for network congestion control, the fundamental solution is to reduce network traffic so that the application demand for the network is no more than the link speed. We observed that we can reduce network traffic for distributed training systems by offloading a critical function, gradient aggregation, to the programmable switch. Each worker in a distributed training system sends gradients over the network to special components, parameter servers, for aggregation, which is a simple add operation. Thus, we propose ATP, a network service for in-network aggregation aimed at modern multi-rack, multi-job distributed training (DT) settings. ATP performs decentralized, dynamic, best-effort aggregation, enables efficient and equitable sharing of limited switch resources across simultaneously running DT jobs, and gracefully accommodates heavy contention for switch resources.
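The "simple add operation" that makes gradient aggregation a good candidate for in-network offload is just an element-wise sum over the workers' gradient vectors. A minimal sketch of what a parameter server (or, in ATP's case, a programmable switch operating on fixed-point chunks) computes:

```python
def aggregate(worker_gradients):
    """Element-wise sum across workers' gradient vectors --
    the add operation a parameter server applies per update."""
    return [sum(vals) for vals in zip(*worker_gradients)]

# Three workers each report a 2-element gradient.
grads = [[0.1, -0.2], [0.3, 0.4], [-0.1, 0.2]]
print(aggregate(grads))  # each position summed across the three workers
```

Because the operation is associative and commutative, a switch can fold in each worker's packet as it arrives and forward only the running sum, which is the traffic reduction the thesis exploits.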



High Performance Datacenter Networks

Author : Dennis Abts
Language Used : en
Release Date : 2011-02-02
Publisher : Morgan & Claypool Publishers

ISBN : 9781608454037

Datacenter networks provide the communication substrate for large parallel computer systems that form the ecosystem for high performance computing (HPC) systems and modern Internet applications. The design of new datacenter networks is motivated by an array of applications ranging from communication-intensive climatology, complex material simulations, and molecular dynamics to such Internet applications as Web search, language translation, collaborative Internet applications, streaming video, and voice-over-IP. For both supercomputing and cloud computing, the network enables distributed applications to communicate and interoperate in an orchestrated and efficient way. This book describes the design and engineering tradeoffs of datacenter networks. It describes interconnection networks from topology and network architecture to routing algorithms, and presents opportunities for taking advantage of the emerging technology trends that are influencing router microarchitecture. With the emergence of "many-core" processor chips, it is evident that we will also need "many-port" routing chips to provide a bandwidth-rich network to avoid the performance-limiting effects of Amdahl's Law. We provide an overview of conventional topologies and their routing algorithms and show how technology, signaling rates, and cost-effective optics are motivating new network topologies that scale up to millions of hosts. The book also provides detailed case studies of two high performance parallel computer systems and their networks. Table of Contents: Introduction / Background / Topology Basics / High-Radix Topologies / Routing / Scalable Switch Microarchitecture / System Packaging / Case Studies / Closing Remarks
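One standard example of scaling a network from commodity parts, in the spirit of the topologies the book surveys, is the k-ary fat-tree: built entirely from k-port switches, it supports k³/4 hosts at full bisection bandwidth. The counting is sketched below (textbook formulas, not taken from this book's case studies):

```python
def fat_tree_stats(k):
    """Host and switch counts for a k-ary fat-tree built from
    identical k-port switches (k must be even)."""
    assert k % 2 == 0
    hosts = k ** 3 // 4          # k pods, each with (k/2)^2 hosts
    edge = k * k // 2            # k pods * k/2 edge switches
    agg = k * k // 2             # k pods * k/2 aggregation switches
    core = (k // 2) ** 2         # (k/2)^2 core switches
    return {"hosts": hosts, "switches": edge + agg + core}

print(fat_tree_stats(48))  # 48-port switches already reach 27,648 hosts
```

Raising the switch radix k grows host count cubically, which is why high-radix routing chips are the lever for reaching the "millions of hosts" scale mentioned above.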



Management Of Data Center Networks

Author : Nadjib Aitsaadi
Language Used : en
Release Date : 2021-05-11
Publisher : John Wiley & Sons

ISBN : 9781119647423

Discover state-of-the-art developments in DCNs from leading international voices in the field. In Management of Data Center Networks, accomplished researcher and editor Dr. Nadjib Aitsaadi delivers a rigorous and insightful exploration of the network management challenges that arise within intra- and inter-data-center networks, including reliability, routing, and security. The book also discusses new architectures found in data center networks that aim to minimize the complexity of network management while maximizing Quality of Service, like wireless/wired DCNs, server-only DCNs, and more. As DCNs become increasingly popular with the spread of cloud computing and multimedia social networks employing new transmission technologies like 5G wireless and wireless fiber, the editor provides readers with chapters written by world-leading authors on topics like routing, the reliability of inter-data center networks, energy management, and security. The book also offers:

- A thorough overview of the architectures of data center networks, including the classification of switch-centric, server-centric, enhanced, optical, and wireless DCN architectures
- An exploration of resource management in wired and wireless data center networks, including routing and wireless channel allocation and assignment challenges and criteria
- Practical discussions of inter-data center networks, including an overview of basic virtual network embedding
- Examinations of energy and security management in data center networks

Perfect for academic and industrial researchers studying the optimization of data center networks, Management of Data Center Networks is also an indispensable guide for anyone seeking a one-stop resource on the architectures, protocols, security, and tools required to effectively manage data centers.



Data Center Networks

Author : Yang Liu
Language Used : en
Release Date : 2013-09-26
Publisher : Springer Science & Business Media

ISBN : 9783319019499

This SpringerBrief presents a survey of data center network designs and topologies and compares several properties in order to highlight their advantages and disadvantages. The brief also explores several routing protocols designed for these topologies and compares the basic algorithms to establish connections, the techniques used to gain better performance, and the mechanisms for fault-tolerance. Readers will be equipped to understand how current research on data center networks enables the design of future architectures that can improve performance and dependability of data centers. This concise brief is designed for researchers and practitioners working on data center networks, comparative topologies, fault tolerance routing, and data center management systems. The context provided and information on future directions will also prove valuable for students interested in these topics.



The Policy Driven Data Center With Aci

Author : Lucien Avramov
Language Used : en
Release Date : 2015
Publisher : Pearson Education

ISBN : 9781587144905

Use policies and Cisco® ACI to make data centers more flexible and configurable, and deliver far more business value. Using the policy-driven data center approach, networking professionals can accelerate and simplify changes to the data center, construction of cloud infrastructure, and delivery of new applications. As you improve data center flexibility, agility, and portability, you can deliver far more business value, far more rapidly. In this guide, Cisco data center experts Lucien Avramov and Maurizio Portolani show how to achieve all these benefits with Cisco Application Centric Infrastructure (ACI) and technologies such as Python, REST, and OpenStack. The authors explain the advantages, architecture, theory, concepts, and methodology of the policy-driven data center. Next, they demonstrate the use of Python scripts and REST to automate network management and simplify customization in ACI environments. Drawing on experience deploying ACI in enterprise data centers, the authors review design considerations and implementation methodologies. You will find design considerations for virtualized datacenters, high performance computing, ultra-low latency environments, and large-scale data centers. The authors walk through building multi-hypervisor and bare-metal infrastructures, demonstrate service integration, and introduce advanced telemetry capabilities for troubleshooting.
- Leverage the architectural and management innovations built into Cisco® Application Centric Infrastructure (ACI)
- Understand the policy-driven data center model
- Use policies to meet the network performance and design requirements of modern data center and cloud environments
- Quickly map hardware and software capabilities to application deployments using graphical tools, or programmatically via the Cisco APIC API
- Increase application velocity: reduce the time needed to move applications into production
- Define workload connectivity instead of (or along with) subnets, VLAN stitching, and ACLs
- Use Python scripts and REST to automate policy changes, parsing, customization, and self-service
- Design policy-driven data centers that support hypervisors
- Integrate OpenStack via the Cisco ACI APIC OpenStack driver architecture
- Master all facets of building and operating multipurpose cloud architectures with ACI
- Configure ACI fabric topology as an infrastructure or tenant administrator
- Insert Layer 4-Layer 7 functions using service graphs
- Leverage centralized telemetry to optimize performance; find and resolve problems
- Understand and familiarize yourself with the paradigms of programmable policy-driven networks
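The Python-and-REST automation the authors demonstrate starts with authenticating to the APIC controller. As a minimal offline sketch, the snippet below only builds the URL and JSON body for APIC's documented aaaLogin call; the controller address is hypothetical and nothing is sent on the wire:

```python
import json

APIC = "https://apic.example.com"  # hypothetical controller address

def login_payload(user, password):
    """JSON body for the APIC aaaLogin REST call."""
    return json.dumps(
        {"aaaUser": {"attributes": {"name": user, "pwd": password}}}
    )

url = APIC + "/api/aaaLogin.json"  # ACI's REST login endpoint
print(url)
print(login_payload("admin", "secret"))
```

A real script would POST this body to the URL (e.g., with the `requests` library), capture the returned session token, and attach it to subsequent policy API calls.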



Virtualized Cloud Data Center Networks Issues In Resource Management

Author : Linjiun Tsai
Language Used : en
Release Date : 2016-04-19
Publisher : Springer

ISBN : 9783319326320

This book discusses the characteristics of virtualized cloud networking, identifies the requirements of cloud network management, and illustrates the challenges in deploying virtual clusters in multi-tenant cloud data centers. The book also introduces network partitioning techniques to provide contention-free allocation, topology-invariant reallocation, and highly efficient resource utilization, based on the Fat-tree network structure. Managing cloud data center resources without considering resource contentions among different cloud services and dynamic resource demands adversely affects the performance of cloud services and reduces the resource utilization of cloud data centers. These challenges are mainly due to strict cluster topology requirements, resource contentions between uncooperative cloud services, and spatial/temporal data center resource fragmentation. Cloud data center network resource allocation and reallocation that copes well with such challenges will allow cloud services to be provisioned with predictable network performance, mitigate service performance degradation, and even guarantee service level agreements. Virtualized Cloud Data Center Networks: Issues in Resource Management tackles the challenges of managing cloud data center networks and introduces techniques to efficiently deploy large-scale distributed computing applications that require predictable performance in multi-tenant cloud data centers.



Improving Datacenter Network Performance Via Intelligent Network Edge

Author : Keqiang He
Language Used : en
Release Date : 2017
Publisher :

ISBN : OCLC:1013189973

Datacenter networks are critical building blocks for modern cloud computing infrastructures. In this dissertation, we show how we can leverage the flexibility and high programmability of the datacenter network edge (i.e., end-host networking) [101, 102] to improve the performance of three key functionalities in datacenter networks: traffic load balancing, congestion control, and rate limiting.

Datacenter networks need to deal with a variety of workloads, ranging from latency-sensitive small flows to bandwidth-hungry large flows. In-network, hardware-based load balancing schemes based on flow hashing, e.g., ECMP, cause congestion when hash collisions occur. To solve this problem, we propose a soft-edge load balancing scheme called Presto. Presto load-balances on near-uniform-sized small data units (flowcells) and spreads flowcells across the symmetric network via the virtual switches on the senders. Because of fine-grained flowcell-level load balancing, packets may arrive out of order at the receiver, so we propose a mechanism to handle reordering in the Generic Receive Offload (GRO) functionality below the TCP layer. Presto avoids the hash collision problem and significantly improves traffic load balancing performance.

Optimized traffic load balancing alone is not sufficient to guarantee high-performance datacenter networks. Virtual machine (VM) technology plays an integral role in modern multi-tenant clouds by enabling a diverse set of software to run on a unified underlying framework. This flexibility, however, comes at the cost of dealing with outdated, inefficient, or misconfigured TCP stacks implemented in the VMs. We propose a congestion control virtualization technique called AC/DC TCP. AC/DC TCP exerts fine-grained control over arbitrary tenant TCP stacks by enforcing per-flow congestion control in the virtual switch (vSwitch) in the hypervisor. AC/DC TCP is lightweight, flexible, scalable, and can police non-conforming flows.

Besides queueing latency in network switches, we observe that rate limiters on end-hosts can also increase network latency by an order of magnitude or more. To this end, we propose two techniques, DEM and SPRING, to improve the performance of rate limiters. Our experimental results demonstrate that DEM- and SPRING-enabled rate limiters can achieve high, stable throughput and low latency.
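Presto's flowcell idea can be sketched in a few lines. This is an illustrative model, not Presto's vSwitch implementation: a flow is chopped into near-uniform cells (Presto's cells are bounded by the 64 KB TSO segment size) and the cells are spread round-robin across the symmetric paths, instead of pinning the whole flow to one hashed path as ECMP does.

```python
import itertools

FLOWCELL_BYTES = 64 * 1024  # cell bound, matching the TSO segment size

def send(flow_bytes, paths):
    """Chop a flow into flowcells and spread them round-robin
    across the available symmetric paths (soft-edge balancing)."""
    cells = []
    path_cycle = itertools.cycle(paths)
    offset = 0
    while offset < flow_bytes:
        size = min(FLOWCELL_BYTES, flow_bytes - offset)
        cells.append((offset, size, next(path_cycle)))
        offset += size
    return cells

# A 200 KB flow becomes four cells striped across four paths.
for off, size, path in send(200 * 1024, ["p0", "p1", "p2", "p3"]):
    print(off, size, path)
```

Because consecutive cells of one flow take different paths, cells can arrive out of order, which is exactly why the dissertation pairs this scheme with reordering support in GRO below TCP.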



A Scalable Adaptive And Extensible Data Center Network Architecture

Author : Mohammad Abdulaziz Al-Fares
Language Used : en
Release Date : 2012
Publisher :

ISBN : 1267250631

Today's largest data centers contain tens of thousands of servers, and they will encompass hundreds of thousands in the very near future. These machines are designed to serve a rich mix of applications and clients with significant aggregate bandwidth requirements; distributed computing frameworks like MapReduce/Hadoop significantly stress the network interconnect, which, when compounded with progressively oversubscribed topologies and inefficient multipath forwarding, can cause a major bottleneck for large computations spanning hundreds of racks. Non-uniform bandwidth among data center nodes also complicates application design and limits overall system performance. Furthermore, using the highest-end, high-port-density commercial switches at the core and aggregation layers incurs tremendous cost.

To overcome these limitations, this dissertation advocates three major goals. First, the horizontal, rather than vertical, expansion of data center networks, using commodity off-the-shelf switch components and rearrangeably non-blocking topologies such as fat-trees. We show that these topologies have several advantages in overall equipment and operational cost and power compared to traditional hierarchical trees. However, the corresponding increase in the degree of multipathing makes traffic forwarding more challenging. Traditional multipath techniques like static hashing (ECMP) can waste bisection bandwidth due to local and downstream hash collisions. To overcome this inefficiency, we next describe the architecture, implementation, and evaluation of Hedera: a centralized flow scheduling system for data center networks with global knowledge of traffic patterns and link utilization. Hedera computes max-min fair flow-bandwidth demands and uses one of several online placement heuristics to find flow paths that maximize the achievable network bisection bandwidth.

Finally, to enable rapid network extensibility, we describe the system architecture and implementation of NetBump: a platform for data-plane modifications "on the wire." By using low-latency kernel bypass and user-level application development, NetBump allows examining, marking, and forwarding packets at line rate, and enables a host of active queue management disciplines and congestion control mechanisms. This allows the prototyping and adoption of innovative functionality such as DCTCP and 802.1Qau quantized congestion notification (QCN). We show that augmenting top-of-rack switches with NetBumps effectively enables bypassing the slow adoption of data center protocols by commercial switch vendors.
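One of Hedera's online placement heuristics is Global First Fit: place each large flow on the first candidate path whose links can all absorb its estimated demand. The sketch below is a simplification (demands are given directly rather than estimated by Hedera's max-min fair demand algorithm, and "paths" are reduced to single hypothetical core links):

```python
def global_first_fit(flows, capacity):
    """Greedy placement: each flow takes the first candidate path
    with enough spare capacity on every link (simplified Hedera)."""
    load = {link: 0.0 for link in capacity}
    placement = {}
    for flow, (demand, candidate_paths) in flows.items():
        for path in candidate_paths:
            if all(load[l] + demand <= capacity[l] for l in path):
                for l in path:
                    load[l] += demand  # reserve capacity along the path
                placement[flow] = path
                break
    return placement

flows = {
    "f1": (6.0, [["c1"], ["c2"]]),
    "f2": (6.0, [["c1"], ["c2"]]),  # c1 would overflow, so f2 gets c2
}
print(global_first_fit(flows, {"c1": 10.0, "c2": 10.0}))
```

A hash-based scheme could place both flows on c1 and lose bisection bandwidth to the collision; the global view lets the scheduler steer the second flow to the idle core link.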