Medusa: An Experiment in Distributed Operating System Structure
Main point
- Build a distributed system which is memory-efficient, robust and modular
- Use several disjoint utilities that communicate with each other through messages
- Bunch of descriptors (UDL, PDL, SDL, XDL)
- Spin-waiting
- multi-user operating system
- distributed system structure
3 ways to distribute the control structure
- Centralized OS : A single OS serving all processors
- Pros: Easy to keep consistent, simpler, easy to manage and maintain
- Cons:
- Reliability: a single point of failure brings the whole service down
- Performance: one OS instance serving every processor can become a bottleneck
- Replicate OS in every processor
- Cons: Consumes too much memory
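- Divide the OS into disjoint utilities; a processor executes the code of a particular utility only if it can do so locally
- Reason: avoids a single point of failure without replicating the whole OS on every processor
- Pros: Reliability: redundant copies of a module on several processors. Performance: utility code is only ever executed locally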
Terminology
Utility(*)
- A single OS module, abstraction
- Communication via messages
- Distributed among Cm* nodes for parallelism
- Multiple instances for load balancing and reliability (like GFS and MapReduce, which keep multiple copies as backups)
Task force
- a collection of concurrent activities that cooperate closely in the execution of a single logical task
Activity
- Process on a given node for a given utility
Pipes
- Communication channels among activities
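The relationship between utilities, activities and pipes can be sketched in a few lines. This is only an illustration, not Medusa's actual interface: Python threads stand in for activities, `queue.Queue` objects stand in for pipes, and the names `file_utility`, `invoke` and `demo` are hypothetical. The request message carries a return pipe so that the utility knows where to send its reply.

```python
import queue
import threading


def file_utility(requests: queue.Queue) -> None:
    """Hypothetical utility activity: serves requests arriving on its pipe."""
    store = {"notes.txt": "hello"}  # made-up data owned by this utility
    while True:
        msg = requests.get()
        if msg is None:  # shutdown signal used only by this sketch
            break
        name, reply_pipe = msg
        # Reply on the pipe that the caller passed along with the request.
        reply_pipe.put(store.get(name))


def invoke(requests: queue.Queue, name: str):
    """Send a request message and wait for the answer on a return pipe."""
    reply_pipe = queue.Queue(maxsize=1)
    requests.put((name, reply_pipe))
    return reply_pipe.get(timeout=1.0)


def demo():
    requests = queue.Queue()
    worker = threading.Thread(target=file_utility, args=(requests,))
    worker.start()
    found = invoke(requests, "notes.txt")
    missing = invoke(requests, "absent.txt")
    requests.put(None)  # stop the utility activity
    worker.join()
    return found, missing
```

The caller never touches the utility's data directly; everything goes through messages, which is what lets the utility live on a different node.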
Descriptors
References to kernel-managed objects
- UDL descriptors
- Entry point into utilities, like system call
- PDL descriptors
- Reference to private pages, pipes
- SDL descriptors
- Reference to shared pages
- contains descriptors that can be used by all of the activities of the task force
- XDL descriptors
- External descriptors
- Mapping a local descriptor onto a remote PDL/SDL descriptor
- Unsealing
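As a rough mental model of the four descriptor lists, the sketch below resolves a (list, index) pair to the object it names. It assumes, for illustration only, that descriptors are plain strings, that the SDL belongs to the task force and that an XDL entry is a triple pointing at a PDL or SDL slot of another activity. The class and function names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class TaskForce:
    # SDL: descriptors usable by every activity of the task force.
    sdl: Dict[int, str] = field(default_factory=dict)


@dataclass
class Activity:
    task_force: TaskForce
    udl: Dict[int, str] = field(default_factory=dict)  # utility entry points
    pdl: Dict[int, str] = field(default_factory=dict)  # private pages, pipes
    # XDL: local index -> (owning activity, "PDL" or "SDL", remote index)
    xdl: Dict[int, Tuple["Activity", str, int]] = field(default_factory=dict)


def resolve(activity: Activity, list_name: str, index: int) -> str:
    """Return the object named by a descriptor (illustrative model only)."""
    if list_name == "UDL":
        return activity.udl[index]
    if list_name == "PDL":
        return activity.pdl[index]
    if list_name == "SDL":
        return activity.task_force.sdl[index]
    if list_name == "XDL":
        owner, remote_list, remote_index = activity.xdl[index]
        if remote_list not in ("PDL", "SDL"):
            raise ValueError("an XDL entry must map onto a PDL or SDL descriptor")
        return resolve(owner, remote_list, remote_index)
    raise ValueError(f"unknown descriptor list: {list_name}")
```

For example, if activity `b` holds a pipe in `b.pdl[0]` and activity `a` has `a.xdl[0] = (b, "PDL", 0)`, then `resolve(a, "XDL", 0)` returns the same pipe that `b` sees privately.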
Co-scheduling
Co-scheduling is the principle for concurrent systems of scheduling related processes to run on different processors at the same time (in parallel).
A task force is said to be co-scheduled if all of its runnable activities are simultaneously scheduled for execution on their respective processors.
If a large task force is not co-scheduled, a single activity that is not currently running can stall the whole task force, because the other activities block on locks held by that activity.
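The definition of co-scheduling can be written down as a small check. The sketch assumes a simplified model that is not taken from the paper: `placement` maps each runnable activity to the processor it lives on, and `running` maps each processor to the activity it is executing right now. Both function names are hypothetical.

```python
from typing import Dict


def is_coscheduled(placement: Dict[str, int], running: Dict[int, str]) -> bool:
    """True if every runnable activity is executing on its own processor."""
    return all(running.get(cpu) == act for act, cpu in placement.items())


def holder_is_stalled(placement: Dict[str, int],
                      running: Dict[int, str],
                      lock_holder: str) -> bool:
    """True if the activity holding a lock is not running at this moment.

    In that case the rest of the task force can only wait for the lock.
    """
    return running.get(placement[lock_holder]) != lock_holder
```

With `placement = {"a1": 0, "a2": 1}`, the state `running = {0: "a1", 1: "a2"}` is co-scheduled, while `running = {0: "a1", 1: "other"}` is not; if `a2` holds a lock in the second state, `a1` makes no progress until `a2` is scheduled again.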
Spin-wait = busy waiting
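- Makes sense in a distributed (multiprocessor) system, but not on a uniprocessor
- The paper is often cited for spin-waiting: an activity that needs a lock keeps polling for it and does not release the CPU
- Why is it an optimization?
- Critical sections should be short, so the expected wait is short
- On a single machine/CPU it is not good: the lock holder cannot run while the waiter spins
- In a distributed system it is better: the holder runs on another processor and may release the lock soon, so spinning can be more efficient than blocking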
- Spin-block: spin for some time, then block
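A minimal sketch of the spin-block idea, assuming a `threading.Lock` as the lock and a fixed spin budget in seconds. The function name `spin_then_block` and the default budget are made up for illustration. In CPython the spin loop itself gains little because of the global interpreter lock, so treat this as a description of the control flow rather than a performance recipe.

```python
import threading
import time


def spin_then_block(lock: threading.Lock, spin_seconds: float = 0.001) -> str:
    """Acquire `lock`: poll for up to `spin_seconds`, then fall back to blocking.

    Returns "spun" if the lock was obtained while polling,
    or "blocked" if the caller had to wait in a blocking acquire.
    """
    deadline = time.monotonic() + spin_seconds
    while time.monotonic() < deadline:
        # Non-blocking attempt: returns immediately with True or False.
        if lock.acquire(blocking=False):
            return "spun"
    # Spin budget used up: give up the CPU and wait for the lock.
    lock.acquire()
    return "blocked"
```

The spin budget is the design knob: it should be on the order of a short critical section, so that a waiter pays for blocking only when the holder is clearly not about to release the lock.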
Take away points
- distributed OS structure
- invoking a file system function is likely to be a remote invocation
- Co-schedule the activities of a task force
- spin waiting