The Google Code channel on YouTube is hosting a series of lectures on cluster computing and Hadoop. I am currently watching them all, and I can say they are worth your time. The first one in the series is given by Aaron Kimball, "Problem solving on Large-Scale clusters".
A few nice quick quotes:
- Parallelization is "easy" if processing can be cleanly split into n units.
- Processing more data means using more machines at the same time.
- Cooperation between processes requires synchronization.
- Designing real distributed systems requires consideration of networking topology.
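To make the first and third quotes concrete, here is a minimal sketch of my own (it is not taken from the lecture). It assumes Python threads as a stand-in for cluster machines, and the helper names `split`, `parallel_sum_of_squares` and `shared_counter_sum` are hypothetical. The first function splits the work cleanly into n chunks that never touch each other; the second has the workers cooperate on one shared total, which is where a lock becomes necessary.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def split(items, n):
    """Cut items into n contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(items, n):
    """Clean split: each worker owns its chunk, so no locking is needed."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        partials = pool.map(lambda chunk: sum(x * x for x in chunk),
                            split(items, n))
        # Combining the independent partial results is the only shared step.
        return sum(partials)


def shared_counter_sum(items, n):
    """Cooperating workers: every update to the shared total takes a lock."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        for x in chunk:
            with lock:  # synchronization point between workers
                total += x * x

    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(work, split(items, n)))  # wait for all workers
    return total
```

Both functions return the same answer, but the second one serializes every update through the lock, which hints at why cooperation between processes gets expensive as the number of workers grows.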
Aaron covers the fundamentals in this lecture, and the upcoming lectures dive into more detail. You can watch the presentation below. You can also find all the lectures, along with some of the questions and answers, on the Google Code site. I will post the other lectures with some of my comments after I have viewed them.