Bohr: Similarity Aware Geo-distributed Data Analytics

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

9 Citations (Scopus)

Abstract

We propose Bohr, a similarity-aware geo-distributed data analytics system that minimizes query completion time (QCT). The key idea is to exploit similarity between data in different data centers (DCs) and to transfer similar data from the bottleneck DC to other sites with more WAN bandwidth. Although these sites then have more input data to process, the data are more similar and can be aggregated more efficiently by the combiner, which reduces the intermediate data that must be shuffled across the WAN. Our similarity-aware approach thus reduces the shuffle time and, in turn, the QCT.
We design and implement Bohr based on OLAP data cubes to perform efficient similarity checking among datasets at different sites. An evaluation across ten AWS EC2 sites shows that Bohr decreases the QCT by 30% compared to state-of-the-art solutions.
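The combiner effect described in the abstract can be illustrated with a minimal sketch. It is not Bohr's implementation: the abstract says Bohr checks similarity using OLAP data cubes, whereas this example assumes, purely for illustration, that "similar" means "shares aggregation keys". The names `combine` and `shuffle_records` and all of the sample data are hypothetical.

```python
from collections import Counter


def combine(records):
    """Locally sum (key, value) pairs, as a MapReduce-style combiner would."""
    totals = Counter()
    for key, value in records:
        totals[key] += value
    return dict(totals)


def shuffle_records(records):
    """Count the intermediate records left to shuffle after local combining."""
    return len(combine(records))


# Hypothetical data already held at a site with spare WAN bandwidth.
local = [("a", 1), ("b", 2), ("a", 3)]

# Two hypothetical batches that could be moved from the bottleneck DC.
similar_batch = [("a", 5), ("b", 1)]      # shares keys with the local data
dissimilar_batch = [("x", 5), ("y", 1)]   # shares no keys with the local data

similar_cost = shuffle_records(local + similar_batch)
dissimilar_cost = shuffle_records(local + dissimilar_batch)

print(similar_cost, dissimilar_cost)
```

Under this assumption, moving the similar batch adds input data to the site but no new keys, so the number of combined records to shuffle stays the same, while the dissimilar batch increases it.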
Original language: English
Title of host publication: HotCloud'17
Subtitle of host publication: Proceedings of the 9th USENIX Conference on Hot Topics in Cloud Computing
Publisher: USENIX Association
Publication status: Published - Jul 2017
Event: The 9th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud '17), Santa Clara, United States
Duration: 12 Jul 2017 to 14 Jul 2017
https://www.usenix.org/conference/hotcloud17

Publication series

Name: HotCloud: Proceedings of the USENIX Conference on Hot Topics in Cloud Computing
Publisher: USENIX Association

Workshop

Workshop: The 9th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud '17)
Place: United States
City: Santa Clara
Period: 12/07/17 to 14/07/17

Bibliographical note

Research Unit(s) information for this publication is provided by the author(s) concerned.
