## Lecture Summaries

- 8 May
Undergraduate reports/presentations on project progress; finish AllPlaces.pdf for remaining graduate student proposals.

- 6 May
Some student-inspired ideas for future Big Data Infrastructure; slides are in AllPlaces.pdf (each student gets to make a 5-minute pitch).

- 1 May
MapReduce undergraduate projects, progress and further considerations on AWS EMR. A brief consideration of the topic Beyond Hadoop, which motivates things like Apache Hama and research such as HaLoop.

- 29 Apr
A look at the infrastructure of big data analytics tools, which exploit the systems technology. Examples are Mahout, Vowpal Wabbit, Yahoo! Samoa, and the paper Iterative MapReduce for Large Scale Machine Learning.

- 24 Apr
Solution to matrix multiplication (pseudocode and Python with mrjob). Then, topics about Hadoop, MRJob and AWS MapReduce, see additions to the page AWS EMR.
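
As a rough illustration of the one-pass MapReduce formulation of matrix multiplication, the sketch below simulates the map, shuffle, and reduce phases in plain Python. It is not the solution presented in class and does not use mrjob; the names `mapper`, `reducer`, and `multiply` are hypothetical, and the matrix dimensions are assumed to be known to every mapper.

```python
from collections import defaultdict

def mapper(record, m, p):
    """Emit ((i, k), (matrix, j, value)) for every output cell the entry feeds."""
    name, row, col, value = record
    if name == "A":
        # A[i][j] contributes to C[i][k] for every column k of B
        for k in range(p):
            yield (row, k), ("A", col, value)
    else:
        # B[j][k] contributes to C[i][k] for every row i of A
        for i in range(m):
            yield (i, col), ("B", row, value)

def reducer(key, values):
    """Pair A and B entries that share the inner index j and sum the products."""
    a, b = {}, {}
    for name, j, value in values:
        (a if name == "A" else b)[j] = value
    return key, sum(a[j] * b[j] for j in a if j in b)

def multiply(A, B):
    """Simulate map, shuffle, and reduce for C = A x B on in-memory lists."""
    m, p = len(A), len(B[0])
    records = [("A", i, j, v) for i, row in enumerate(A) for j, v in enumerate(row)]
    records += [("B", j, k, v) for j, row in enumerate(B) for k, v in enumerate(row)]
    groups = defaultdict(list)  # stands in for the shuffle phase
    for record in records:
        for key, value in mapper(record, m, p):
            groups[key].append(value)
    C = [[0] * p for _ in range(m)]
    for key, values in groups.items():
        (i, k), total = reducer(key, values)
        C[i][k] = total
    return C
```

For example, `multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`. Each input entry is replicated once per output cell it feeds, which is simple but shuffle-heavy; a two-pass formulation trades that replication for an extra job.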

- 22 Apr
Last of round two Paper Presentations.

- 17 Apr
The penultimate from round two Paper Presentations, and more depth on Virtualization.

- 15 Apr
More Paper Presentations.

- 10 Apr
On track with Paper Presentations.

- 3 Apr
Two presentations from Paper Presentations. Reminder: undergrads should choose data for Project Schedule.

- 1 Apr
Second Exam.

- 27 Mar
For undergrads, another look at Big Data (how to choose a data set). Consider some of the Glossary terms in more depth. Continued discussion on slots for Paper Presentations.

- 25 Mar
Resume after break. Schedule for more presentations, and Project Schedule for undergraduates. Exam on April 1st: for graduate students, will be over the Glossary and the papers from the first round of Paper Presentations; for undergraduate students, the exam will again be about MapReduce, asking how to solve some problem using mappers and reducers.

- 13 Mar
Looking at Hadoop Partitioners and how they can be used for merge and sort tasks. Another demonstration of AWS EMR (see additional notes added for this lecture), plus the use of S3 Tools (which are installed on the test machine). Then a brief look at some MapReduce applications (added to the MapReduce page), and a presentation featuring one example in more depth, triangles.pdf.
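
To make the sorting use of partitioners concrete, here is a minimal sketch of range partitioning simulated in plain Python. Hadoop's own partitioners are Java classes; the functions `range_partition` and `total_sort` and the split points below are hypothetical and only mimic the idea: if keys are routed to reducers by range, then each reducer's sorted output can be concatenated in reducer order to obtain a total sort.

```python
from bisect import bisect_right

def range_partition(key, boundaries):
    """Return the reducer index for key; boundaries are sorted split points."""
    return bisect_right(boundaries, key)

def total_sort(keys, boundaries):
    """Route each key to a reducer by range, then sort within each reducer."""
    partitions = [[] for _ in range(len(boundaries) + 1)]
    for key in keys:
        partitions[range_partition(key, boundaries)].append(key)
    # In Hadoop the per-reducer sort happens during the shuffle; here it is explicit.
    return [sorted(part) for part in partitions]

parts = total_sort([42, 7, 19, 88, 3, 55], boundaries=[10, 50])
# parts == [[3, 7], [19, 42], [55, 88]]
```

A hash-based partitioner would balance load without any knowledge of the keys, but the reducer outputs would then interleave and need a further merge step.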

- 11 Mar
For undergraduates, a lecture devoted to applications of MapReduce (looking at Chapter 3 of the MapReduce textbook to understand memory constraints and how combiners can save network bandwidth). Then a short demonstration of using AWS EMR. Graduate students should watch the Raft lecture to understand a working consensus algorithm.
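
The bandwidth saving from combiners can be seen by counting how many key-value pairs would leave each mapper. The sketch below is a plain-Python simulation under simple assumptions (word count, one mapper per input line); the function names are hypothetical and this is not code from the textbook.

```python
from collections import Counter

def map_words(line):
    """Mapper: emit (word, 1) for every word on the line."""
    return [(word, 1) for word in line.split()]

def combine(pairs):
    """Combiner: sum counts locally on the mapper's node before the shuffle."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())

def shuffled_pairs(lines, use_combiner):
    """Count the key-value pairs that would cross the network."""
    sent = 0
    for line in lines:  # one simulated mapper per line
        pairs = map_words(line)
        if use_combiner:
            pairs = combine(pairs)
        sent += len(pairs)
    return sent

lines = ["a a a b", "b b a"]
without = shuffled_pairs(lines, use_combiner=False)  # 7 pairs
with_combiner = shuffled_pairs(lines, use_combiner=True)  # 4 pairs
```

The combiner is safe here because addition is associative and commutative; the price is the memory needed to hold the partial counts on the mapper side.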

- 6 Mar
The final two Paper Presentations of the first round. New assignment (for graduate students) is to complete a glossary, assigned by pin numbers, of the Terminology and Buzzwords.

**Due 13th March.**

- 4 Mar
Two more Paper Presentations and HomeworkTwo (for undergrads).

- 27 Feb
Continue Paper Presentations. More about MapReduce and the mrjob way of invoking MapReduce.

- 25 Feb
Two more from Paper Presentations. Some Terminology and Buzzwords and remarks about Trade-Offs.

- 20 Feb
Two presenters from Paper Presentations, plus some review of Databases.

- 18 Feb
One more attempt to show the MapReduce demo in class. Brief coverage of Advice on paper presentations. Introduction to the Test Machine.

- 13 Feb
First lecture on MapReduce -- graduate students should skim/read the Dean & Ghemawat paper to prepare for this. Then assignments of papers to read and present: Paper Presentations for graduate students; for undergraduates, assignment is to select some Big Data and think about a MapReduce project.

- 11 Feb
In-class exam.

- 6 Feb
Some practical background on working in the cloud, starting with useful things to know about the Linux Environment.

- 30 Jan
More on coroutines, illustrated with coroutine.py (just as one example). More on barrier synchronization and systolic computing in Parallel Programming. Reading assignment for undergraduates: please read pages 1--38 of this book as background to lectures on 6 and 13 February (in-class exam on 11 February).
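
For readers who want a small coroutine to experiment with, here is a minimal generator-based sketch. It is not the course's coroutine.py; `running_average` is a hypothetical example chosen only to show how `send()` passes values into a suspended function and how the function resumes where it left off.

```python
def running_average():
    """Coroutine: receive numbers via send() and yield the average so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # suspend here until the caller sends a value
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)              # prime the coroutine: run it up to the first yield
first = avg.send(10)   # 10.0
second = avg.send(20)  # 15.0
```

The state (`total`, `count`) lives in the suspended frame between calls, which is what distinguishes a coroutine from an ordinary function call.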

- 28 Jan
Finish coverage of FLP'85 impossibility proof.

- 23 Jan
Introduction to Parallel Programming.

- 21 Jan
Start of course (Syllabus and overview of topics). The last half hour began a presentation of the FLP'85 impossibility result. Graduate students, please download and read the paper (the link works on-campus).