06/01 MapReduce
How does MapReduce work?
Developer's to-do list
- input/output files
- M Map tasks
- R reduce tasks
- W machines
No reduce task can begin until all map tasks are complete.
If a map worker fails at any time, its task must be restarted.
Tasks are scheduled based on the location of the data.
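A minimal single-process sketch of the pieces a developer supplies, using word count as the example. The names map_fn, reduce_fn and run_mapreduce are hypothetical, not part of any real MapReduce API. M is taken to be the number of input splits and R the number of reduce partitions; the W machines and locality-aware scheduling are not modeled here.

```python
from collections import defaultdict


def map_fn(split_id, text):
    """Hypothetical user map function: emit (word, 1) for every word."""
    for word in text.split():
        yield word, 1


def reduce_fn(word, counts):
    """Hypothetical user reduce function: sum the counts for one word."""
    return word, sum(counts)


def run_mapreduce(splits, R):
    """Run M = len(splits) map tasks, then R reduce tasks, sequentially."""
    # Each reduce partition collects the values for the keys hashed to it.
    partitions = [defaultdict(list) for _ in range(R)]

    # Map phase: one map task per input split.
    for split_id, text in enumerate(splits):
        for key, value in map_fn(split_id, text):
            partitions[hash(key) % R][key].append(value)

    # Reduce phase: starts only after every map task above has finished.
    result = {}
    for partition in partitions:
        for key in sorted(partition):
            out_key, out_value = reduce_fn(key, partition[key])
            result[out_key] = out_value
    return result


counts = run_mapreduce(["a b a", "b c"], R=2)
```

Here the two input strings stand in for input file blocks, so M = 2 and R = 2.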
Applications of MapReduce
a,b, c
Observation
- No reduce can begin until map is complete
- If a map worker fails at any time, its task must be completely re-run
- barrier between map and reduce
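The barrier in the observations above can be sketched with a thread pool standing in for the worker machines. This is an illustration only: square_map and run_with_barrier are made-up names, and a real system coordinates separate machines through the master rather than threads in one process.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def square_map(x):
    """Hypothetical map task: square one input value."""
    return x * x


def run_with_barrier(inputs, workers=3):
    """Run all map tasks, wait for every one of them, then run the reduce."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        map_futures = [pool.submit(square_map, x) for x in inputs]

        # Barrier: block until every map task has finished.
        done, not_done = wait(map_futures)
        assert not not_done

        # Only now is the (single) reduce task allowed to start.
        intermediate = [f.result() for f in map_futures]
        return pool.submit(sum, intermediate).result()


total = run_with_barrier([1, 2, 3, 4])
```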
Fault Tolerance
- workers are periodically pinged by master
- no response = failed worker
- only one master, writes periodic checkpoints
- on errors, workers send 'last gasp' UDP packet to master
- Detect records that cause deterministic crashes and skip them
- input file blocks stored on multiple machines
- When the computation is almost done, reschedule in-progress tasks on other workers as backups
- Avoids "stragglers"
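Two of the mechanisms above, ping-based failure detection and skipping of bad records, can be sketched as bookkeeping on the master. Everything here is an assumption for illustration: the class names, the timeout value, and the rule of skipping a record after more than one reported crash are not taken from these notes.

```python
class PingTracker:
    """Master-side record of when each worker last answered a ping."""

    def __init__(self, timeout):
        self.timeout = timeout      # seconds of silence before a worker counts as failed
        self.last_seen = {}         # worker name -> time of last response

    def record_response(self, worker, now):
        self.last_seen[worker] = now

    def failed_workers(self, now):
        # No response within the timeout = failed worker.
        return sorted(w for w, t in self.last_seen.items()
                      if now - t > self.timeout)


class BadRecordTracker:
    """Counts 'last gasp' crash reports per record (assumed skip threshold)."""

    def __init__(self, max_failures=1):
        self.max_failures = max_failures
        self.failures = {}          # record id -> number of crash reports

    def report_crash(self, record_id):
        self.failures[record_id] = self.failures.get(record_id, 0) + 1

    def should_skip(self, record_id):
        # Skip once the same record has crashed a worker more than max_failures times.
        return self.failures.get(record_id, 0) > self.max_failures


pings = PingTracker(timeout=10)
pings.record_response("w1", now=0)
pings.record_response("w2", now=0)
pings.record_response("w1", now=8)
failed = pings.failed_workers(now=15)   # w2 has been silent for 15 s
```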
Questions
Q: Why do some tasks need to execute twice?
Q: How does MapReduce handle the failure of a worker?
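- Detect failure via heartbeats
- Re-execute completed and in-progress map tasks (map output lives on the failed worker's local disk, so it is lost)
- Re-execute in-progress reduce tasks (completed reduce output is already in the global file system)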
- Task completion committed through master
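The re-execution rules in this answer can be written down as a small state update on the master's task table. This is a sketch under assumptions: the Task fields, the state names and handle_worker_failure are hypothetical, chosen only to mirror the bullets above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    kind: str                     # "map" or "reduce"
    state: str                    # "idle", "in_progress" or "completed"
    worker: Optional[str] = None  # worker the task was assigned to, if any


def handle_worker_failure(tasks, failed_worker):
    """Reset the tasks that must be re-executed; return their names, sorted."""
    rescheduled = []
    for name, task in tasks.items():
        if task.worker != failed_worker:
            continue
        # Map tasks are redone whether in progress or already completed.
        redo_map = task.kind == "map" and task.state in ("in_progress", "completed")
        # Reduce tasks are redone only if they were still in progress.
        redo_reduce = task.kind == "reduce" and task.state == "in_progress"
        if redo_map or redo_reduce:
            task.state = "idle"
            task.worker = None
            rescheduled.append(name)
    return sorted(rescheduled)


tasks = {
    "m1": Task("map", "completed", "w1"),
    "m2": Task("map", "in_progress", "w2"),
    "r1": Task("reduce", "in_progress", "w1"),
    "r2": Task("reduce", "completed", "w1"),
}
to_rerun = handle_worker_failure(tasks, "w1")
```

In this example m1 and r1 go back to idle, while r2 stays completed and m2 is untouched because it runs on a different worker.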
ACK: https://kradnangel.gitbooks.io/operating-system/content/mapreduce.html