Exploring Elasticity for Wide Area Traffic Management in Geo-distributed Datacenters

Project: Research

Description

With millions of users worldwide, cloud providers deploy geo-distributed datacenters to support Internet-scale services. Wide area networks (WANs) are the critical underpinning of such an infrastructure. Architecturally, a provider operates two WANs: a user-facing WAN that connects users with datacenters, and a backbone WAN that provides connectivity among datacenters. As cloud services proliferate, how to manage and orchestrate the sheer amount of traffic flowing across datacenter WANs becomes a pressing yet challenging issue.

The main thesis of this proposal is to explore elasticity as a new degree of freedom for traffic management in datacenter WANs. Specifically, for the user-facing WAN, traffic management needs to route user requests to datacenters for processing. The common approach is to consider the geographical diversity of energy prices, and to balance the tradeoff between user latency and energy cost. We propose to consider another kind of datacenter workload, batch jobs, and their inherent elasticity. By reducing the capacity allocated to batch jobs, we can make room at cost-efficient locations for request routing and thereby achieve further cost savings. We will further consider the elasticity of interactive requests, by allowing partial execution to trade a small degradation in response quality for reduced energy use, given the concave quality–processing time curve. For the backbone WAN, traffic management needs to make routing decisions, i.e. traffic engineering. We will explore the elasticity of bulky replication traffic, using rate control as a new control knob. Since the provider owns the applications and servers in addition to the routers, it can adjust flow sending rates according to the network state for improved performance.

Our objective is to systematically investigate and prototype novel traffic management solutions with elasticity taken into account. We will develop optimization frameworks to capture the essential tradeoffs in each scenario and maximize efficiency. For practicality, we will draw upon recent advances in the alternating direction method of multipliers to devise novel distributed optimization algorithms. These algorithms are massively parallelizable and can efficiently solve cloud-scale problems with millions of variables. We will use both testbed implementation with software prototypes and simulations with empirical traces for realistic performance evaluation. The research results can be readily adopted by industry for managing user-facing and backbone networks with salient cost savings and performance improvements. The results will also appeal to a wide audience in the optimization and networking communities, and may even be applicable to large-scale convex optimization problems with big data in machine learning, data mining, and other domains.
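
To illustrate why the alternating direction method of multipliers (ADMM) lends itself to parallel execution, the following is a minimal sketch on a toy consensus problem. It is not the project's algorithm: the function name `consensus_admm`, the quadratic per-site costs, and the parameter values are all assumptions chosen for illustration. Each hypothetical site i holds a private target a_i, and the sites must agree on one shared value z that minimizes the sum of 0.5 * (z - a_i)^2, whose exact answer is the mean of the targets.

```python
def consensus_admm(targets, rho=1.0, iterations=100):
    """Toy global-consensus ADMM (illustrative assumption, not the project's method).

    Minimizes sum_i 0.5 * (x_i - a_i)^2 subject to x_i = z for all i,
    where a_i are the entries of `targets`.
    """
    n = len(targets)
    x = [0.0] * n  # local copies, one per site
    u = [0.0] * n  # scaled dual variables, one per site
    z = 0.0        # shared consensus variable

    for _ in range(iterations):
        # Local step: each x_i has a closed-form update that depends only on
        # its own a_i and u_i plus the shared z, so all sites can run this
        # step independently and in parallel.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        # Coordination step: averaging is the only global communication.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: each site accumulates its own disagreement with z.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return z


if __name__ == "__main__":
    print(consensus_admm([1.0, 2.0, 6.0]))
```

In this sketch the only coupling between sites is the averaging step, which is the property that makes this style of decomposition attractive for problems with many variables spread across many locations.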

Detail(s)

Project number: 9048007
Grant type: ECS
Status: Finished
Effective start/end date: 1/09/14 to 11/02/19

    Research areas

  • traffic management, wide area networks, traffic engineering, cloud computing