BCS SPA2015

Software Practice Advancement Conference

SPA Conference session: Distributed Databases

One-line description: Exploring the main ideas behind distributed, highly available databases.
 
Session format: Presentation and practical.
 
Abstract: We're going to explore the main ideas underpinning distributed, highly available databases, in particular the databases that are heavily inspired by the ideas in the Dynamo paper (http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf), such as Riak and Cassandra. We'll cover things like:

* what availability and consistency are and how you can trade them off for each other
* how to carry on reading and writing in the face of node failures
* eventual consistency and repairs

Then we'll try out these ideas by building our own simple distributed data store on top of a bunch of in-memory hash-maps, and see what happens when we wreak havoc by simulating node failures.
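To give a flavour of the exercise, here is a minimal sketch of such a store, assuming quorum-style reads and writes with read repair. It is not the session's reference solution: the names `Node`, `Cluster`, `w`, and `r` are made up for illustration, every key is replicated to every node, and a single counter stands in for the vector clocks or timestamps a real Dynamo-like system would use.

```python
class NodeDown(Exception):
    """Raised when a simulated node is unavailable."""


class Node:
    """One replica: an in-memory hash map plus an 'up' flag to simulate failure."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}  # key -> (version, value)

    def put(self, key, version, value):
        if not self.up:
            raise NodeDown(self.name)
        # Keep only the newest version (last write wins).
        if key not in self.data or self.data[key][0] < version:
            self.data[key] = (version, value)

    def get(self, key):
        if not self.up:
            raise NodeDown(self.name)
        return self.data.get(key)


class Cluster:
    """Toy coordinator: a write needs w acks, a read needs r replies."""

    def __init__(self, nodes, w, r):
        self.nodes = nodes
        self.w = w
        self.r = r
        self.clock = 0  # assumption: one coordinator, so a counter is enough

    def put(self, key, value):
        self.clock += 1
        acks = 0
        for node in self.nodes:
            try:
                node.put(key, self.clock, value)
                acks += 1
            except NodeDown:
                pass  # carry on writing despite the failure
        if acks < self.w:
            # Nodes that did ack keep the value; this sketch does not roll back.
            raise RuntimeError(f"write failed: {acks} acks, needed {self.w}")

    def get(self, key):
        replies = []
        for node in self.nodes:
            try:
                replies.append((node, node.get(key)))
            except NodeDown:
                pass
        if len(replies) < self.r:
            raise RuntimeError(f"read failed: {len(replies)} replies, needed {self.r}")
        found = [entry for _, entry in replies if entry is not None]
        if not found:
            return None
        newest = max(found, key=lambda entry: entry[0])
        # Read repair: push the newest version to any live replica that is stale.
        for node, entry in replies:
            if entry != newest:
                node.put(key, newest[0], newest[1])
        return newest[1]
```

With three nodes and `w=2, r=2`, you can take one node down, write, bring it back, and a subsequent read returns the new value while also repairing the stale replica. Raising `w` and `r` trades availability for consistency; lowering them does the reverse.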
 
Audience background: Developers who are interested in learning more about Dynamo-like distributed databases.
 
Benefits of participating: Familiarity with the main ideas underlying Dynamo-like distributed databases, which will come in handy when using, or considering using, these sorts of databases in the future.
 
Materials provided: I'll provide links to the slides, and a GitHub repo that has:

a) an implementation of a hashmap wrapped in a simple HTTP API, which can be used as a single node in the distributed system

b) a clear list of tasks to build upon when creating the database yourself (plus a hint sheet with code snippets for those who get very stuck)
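As a rough idea of what item (a) could look like, here is a minimal sketch using only the Python standard library. It is not the code from the repo: `NodeServer`, `NodeHandler`, `serve_in_background`, and the convention of using the request path as the key with PUT and GET are all assumptions made for illustration.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class NodeHandler(BaseHTTPRequestHandler):
    """Maps PUT /<key> and GET /<key> onto the server's in-memory hash map."""

    def do_GET(self):
        value = self.server.store.get(self.path)
        if value is None:
            self.send_response(404)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(value)))
        self.end_headers()
        self.wfile.write(value)

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        self.server.store[self.path] = self.rfile.read(length)
        self.send_response(204)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the console quiet


class NodeServer(ThreadingHTTPServer):
    """A single node: an HTTP server that owns one hash map (path -> bytes)."""

    def __init__(self, address):
        super().__init__(address, NodeHandler)
        self.store = {}


def serve_in_background(server):
    """Run the node on a daemon thread so several nodes can share one process."""
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread
```

Starting a node is then `serve_in_background(NodeServer(("127.0.0.1", 8001)))`, where the port is arbitrary. Calling `shutdown()` on the server is one simple way to simulate a node failure.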
 
Process: First I'll present the ideas, and then I'll present how we're going to code our own version. This will be outlined as a series of tasks that build upon each other. Then people can program on their own or in groups to get through the tasks, and I'll go around trying to help!
 
Detailed timetable: Not sure how long a session should be. Maybe 45 mins presentation, 15 mins exercise explanation, and 90-120 mins for the exercise?
 
Outputs:
 
History: None. I've presented on Cassandra before, including on the Dynamo paper, but this idea was generated just for SPA :-)
 
Presenters
1. Emily Green
SoundCloud