We consider a many-server queue in which each server can serve multiple customers in parallel. Such multitasking phenomena occur in various application areas (e.g., in hospitals and contact centers), although the impact of the number of customers who are simultaneously served on system efficiency may vary. We establish diffusion limits of the queueing process under the quality-and-efficiency-driven scaling and for different policies of assigning customers to servers depending on the number of customers they serve. We show that for a broad class of routing policies, including routing to the least busy server, the same one-dimensional diffusion process is obtained in the heavy-traffic limit. In the case of assignment to the most busy server, there is no state-space collapse, and the diffusion limit involves a custom regulator mapping. Moreover, we show that assigning customers to the least (most) busy server is optimal when the cumulative service rate per server is concave (convex), motivating the routing policies considered. Finally, we derive diffusion limits in the nonheavy-traffic scaling regime and in the heavy-traffic scaling regime where customers can be reassigned during service.
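The intuition behind the concave/convex optimality statement can be illustrated with a small static sketch. It is not the paper's model: it ignores arrivals, departures, and the diffusion scaling, and only compares the total service rate of one snapshot of customers routed greedily to the least or most busy server. The function names, the per-server capacity, and the two rate functions (`math.sqrt` as a concave example, `x**2` as a convex one) are assumptions chosen for illustration.

```python
import math


def assign(num_customers, num_servers, capacity, policy):
    """Route customers one at a time; return the customer count per server.

    `policy` is "least_busy" or "most_busy" (hypothetical labels).
    `capacity` is an assumed cap on customers served in parallel per server.
    """
    loads = [0] * num_servers
    for _ in range(num_customers):
        # Only servers with a free slot can accept a new customer.
        open_servers = [i for i, x in enumerate(loads) if x < capacity]
        if not open_servers:
            break  # every slot is occupied; remaining customers would queue
        pick = min if policy == "least_busy" else max
        target = pick(open_servers, key=lambda i: loads[i])
        loads[target] += 1
    return loads


def total_rate(loads, mu):
    """Sum of cumulative service rates mu(x) over servers holding x customers."""
    return sum(mu(x) for x in loads)


def concave_rate(x):
    return math.sqrt(x)  # assumed concave example with mu(0) = 0


def convex_rate(x):
    return x ** 2  # assumed convex example with mu(0) = 0


balanced = assign(6, 3, 4, "least_busy")
packed = assign(6, 3, 4, "most_busy")

print("least busy:", balanced, total_rate(balanced, concave_rate), total_rate(balanced, convex_rate))
print("most busy: ", packed, total_rate(packed, concave_rate), total_rate(packed, convex_rate))
```

In this snapshot, spreading customers evenly gives the larger total rate under the concave example, and packing them onto few servers gives the larger total rate under the convex example, which matches the direction of the optimality result stated in the abstract.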

doi.org/10.1287/moor.2021.0051
Mathematics of Operations Research

Storm, J., Berkelmans, W., & Bekker, R. (2024). Diffusion-based staffing for multitasking service systems with many servers. Mathematics of Operations Research, 49(4), 2684–2722. doi:10.1287/moor.2021.0051