CISC849: Advanced Topics in Big Data and Cloud Computing (Fall 2015)



Instructor


The outline will be updated over the coming weeks. Please check the readings one week before each lecture date.

Lecture | Date | Topics | Papers | Notes/Resources
1 | 09/02 | Class Overview | |
2 | 09/09 | Introduction | How to read a paper (CCR'07) |
3 | 09/16 | Cloud Computing | Cloud Computing and Grid Computing 360-Degree Compared (GCE'08) |
 | | | Above the Clouds: A Berkeley View of Cloud Computing (TechRep'09) |
 | | | CloudCmp: Comparing Public Cloud Providers (IMC'10) |
 | | | The Cost of a Cloud: Research Problems in Data Center Networks (CCR'08) |
4 | 09/23 | Pricing and Resource Management | How to Bid the Cloud (SIGCOMM'15) |
 | | | A PTAS Mechanism for Provisioning and Allocation of Heterogeneous Cloud Resources (TPDS'15) |
 | | | A Two-Sided Market Mechanism for Trading Big Data Computing Commodities (Big Data'14) |
 | | | | Project Discussion
5 | 09/30 | Cloud Federations | Dynamically Scaling Applications in the Cloud (CCR'11) |
 | | | Cloud Federations in the Sky: Formation Game and Mechanism (TCC'15) |
 | | | A Framework for Data Protection in Cloud Federations (ICPP'14) |
 | | | A Reputation-Based Mechanism for Dynamic Virtual Organization Formation in Grids (ICPP'12) |
6 | 10/07 | MapReduce | MapReduce: Simplified Data Processing on Large Clusters (OSDI'04) |
 | | | Improving MapReduce Performance in Heterogeneous Environments (OSDI'08) |
 | | | MapReduce and parallel DBMSs: friends or foes? (CACM'10) |
 | | | | Project Discussion
 | | | | Project Proposal Due
7 | 10/14 | Parallel Processing | Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks (SIGOPS'07) |
 | | | Spark: Cluster Computing with Working Sets (HotCloud'10) |
 | | | Naiad: A Timely Dataflow System (SOSP'13) |
 | | | An introduction to Docker for reproducible research (SIGOPS'15) |
8 | 10/21 | Resource Management | Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center (NSDI'11) |
 | | | Apache Hadoop YARN: Yet Another Resource Negotiator (SoCC'13) |
 | | | | Project Discussion
9 | 10/28 | Scheduling | Dominant Resource Fairness: Fair Allocation of Multiple Resource Types (NSDI'11) |
 | | | Large-scale cluster management at Google with Borg (EuroSys'15) |
 | | | Omega: flexible, scalable schedulers for large compute clusters (EuroSys'13) |
 | | | Hopper: Decentralized Speculation-aware Cluster Scheduling at Scale (SIGCOMM'15) |
10 | 11/04 | INFORMS | | No Class
11 | 11/11 | Energy | Energy-aware Scheduling of MapReduce Jobs for Big Data Applications (TPDS'15) |
 | | | Evaluation and Analysis of GreenHDFS: A Self-Adaptive, Energy-Conserving Variant of the Hadoop Distributed File System (CloudCom'10) |
 | | | Towards Energy Efficiency in Heterogeneous Hadoop Clusters by Adaptive Task Assignment (ICDCS'15) |
 | | | | Project Discussion
12 | 11/18 | BigData | The Google File System (SIGOPS'03) |
 | | | Bigtable: A distributed storage system for structured data (TOCS'08) |
 | | | Managing Data Transfers in Computer Clusters with Orchestra (SIGCOMM'11) |
 | | | Low Latency Geo-distributed Data Analytics (SIGCOMM'15) |
- | 11/25 | Thanksgiving Break | | No Class
13 | 12/02 | Final Project Presentations | |
14 | 12/09 | Final Project Presentations | | Project Paper Due

Final paper template