Running a job on a collection of dynamic machines, with on-lin restarts
We consider the problem of running a job on a selected machine of a collection of machines. Each of these machines may become temporarily unavailable (busy) without warning, in which case the scheduler is allowed to restart the job on a different machine. The behaviour of machines is characterized by a Markov chain, and objective is to minimize completion time of the job. For several types of Markov chains, we present optimal policies.