Each backup VM serves its dedicated primary VM and can be shared by secondary VMs during its idle time using the heuristic time sharing policy.
The regret of the heuristic time sharing policy is the reward lost relative to the optimal policy when the states of all the backup VMs can be observed, and it is defined as:
The description of the heuristic time sharing policy
Algorithm: The Heuristic Time Sharing Policy
Input: the state-transition matrices of backup VMs, the initial belief state $\Omega(1) = \{\omega_1(1), \ldots, \omega_N(1)\}$ where $\omega_j(1) = p^j_{01}/(p^j_{01} + p^j_{11})$, $j \in \{1, 2, \ldots, N\}$.
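The initial belief state in the algorithm input can be illustrated with a minimal Python sketch. It assumes each backup VM j is described by a 2x2 state-transition matrix indexed as `P[a][b]` (probability of moving from state a to state b), and it applies the formula exactly as stated in the input description. The function name `initial_belief` and the example matrices are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def initial_belief(
    transition_matrices: Sequence[Sequence[Sequence[float]]],
) -> List[float]:
    """Return [omega_1(1), ..., omega_N(1)] for N backup VMs.

    Uses the expression from the algorithm input as stated there:
    omega_j(1) = p01 / (p01 + p11), with P[a][b] the assumed
    probability of a transition from state a to state b.
    """
    beliefs = []
    for P in transition_matrices:
        p01 = P[0][1]
        p11 = P[1][1]
        beliefs.append(p01 / (p01 + p11))
    return beliefs


# Hypothetical transition matrices for two backup VMs.
matrices = [
    [[0.8, 0.2], [0.4, 0.6]],
    [[0.5, 0.5], [0.5, 0.5]],
]
print(initial_belief(matrices))
```

Each entry of the returned list is the initial belief for one backup VM, in the same order as the input matrices.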
Theorem 1 The heuristic time sharing policy for backup VMs achieves optimality under the condition $1/E(D^j_f) + 1/E(D^j_r) < 1$.
So the proposed heuristic time sharing policy for the backup VMs achieves optimality under the condition $1/E(D^j_f) + 1/E(D^j_r) < 1$.
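The condition of Theorem 1 is simple to check numerically. The following sketch assumes the expected durations E(D_f) and E(D_r) of each backup VM are already known as positive numbers. The function names and the example values are hypothetical, not taken from the paper.

```python
from typing import Iterable, Tuple


def satisfies_theorem1(mean_d_f: float, mean_d_r: float) -> bool:
    """Check 1/E(D_f) + 1/E(D_r) < 1 for a single backup VM."""
    if mean_d_f <= 0 or mean_d_r <= 0:
        raise ValueError("expected durations must be positive")
    return 1.0 / mean_d_f + 1.0 / mean_d_r < 1.0


def all_satisfy_theorem1(means: Iterable[Tuple[float, float]]) -> bool:
    """Check the condition for every (E(D_f), E(D_r)) pair."""
    return all(satisfies_theorem1(f, r) for f, r in means)


# Hypothetical expected durations for two backup VMs.
example = [(4.0, 2.0), (10.0, 5.0)]
print(all_satisfy_theorem1(example))
```

Note that the inequality is strict: a pair such as (2.0, 2.0), for which the sum equals exactly 1, does not satisfy the condition.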
Theorem 2 The heuristic time sharing policy for backup VMs achieves optimality under the condition $\sum_{i=1}^{T} \left( \max_{j \in N} \left( E(D^j_f) - 1/E(D^j_f) - 1/E(D^j_r) \right) \right)^{i} < 1$.
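The structure of the Theorem 2 condition (take the maximum of a per-VM term over all backup VMs, then sum its powers from 1 to T) can be sketched as follows. The per-VM term inside the maximum is passed in precomputed, so the sketch illustrates only the max-then-power-sum check. The function names and the example values are hypothetical.

```python
from typing import Sequence


def power_sum(x: float, horizon: int) -> float:
    """Return x^1 + x^2 + ... + x^T for T = horizon."""
    return sum(x ** i for i in range(1, horizon + 1))


def satisfies_theorem2(per_vm_terms: Sequence[float], horizon: int) -> bool:
    """Check sum_{i=1}^{T} (max_j term_j)^i < 1.

    per_vm_terms holds the value of the term inside the maximum
    for each backup VM, computed by the caller.
    """
    worst = max(per_vm_terms)
    return power_sum(worst, horizon) < 1.0


# Hypothetical per-VM terms and a horizon of T = 3 slots.
print(satisfies_theorem2([0.25, 0.5], 3))
```

Because the sum grows with T whenever the maximum term is positive, a set of VMs that satisfies the condition for a short horizon may fail it for a longer one.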
Compared with the exponential complexity imposed by dynamic programming methods, the time complexity is significantly reduced using the heuristic time sharing policy.
In this section, the heuristic time sharing policy is evaluated with numerical simulation experiments and a prototype system deployed on a small-scale cloud platform.