Fault tolerant placement of stateful VNFs and dynamic fault recovery in cloud networks

Authors: Yuan, G., Xu, Z., Yang, B., Liang, W., Chai, W., Tuncer, D., Galis, A., Pavlou, G. and Wu, G.

http://eprints.bournemouth.ac.uk/32997/

Journal: Computer networks (1976)

Publisher: North-Holland Publ Co

ISSN: 0376-5075

Traditional network functions such as firewalls are implemented in costly dedicated hardware. By decoupling network functions from physical devices, network function virtualization enables virtual network functions (VNFs) to run in virtual machines (VMs). However, VNFs are vulnerable to various faults such as software and hardware failures. To enhance VNF fault tolerance, backup VNFs need to be deployed in stand-by VM instances. In the case of stateful VNFs, stand-by instances require constant state updates from active instances during their operation. This guarantees a correct and seamless handover from failed instances to stand-by instances after failures. Nevertheless, such state updates to stand-by instances could consume significant network bandwidth resources and lead to potential admission failures for VNF requests. In this paper, we study the fault-tolerant VNF placement problem with the optimization objective of admitting as many requests as possible. In particular, the placement of active/stand-by VNF instances, the request routing paths to active instances, and the state transfer paths to stand-by instances are jointly considered. We devise an efficient heuristic algorithm to solve this problem. For the special case of the problem without computing or bandwidth constraints, we also propose two bicriteria approximation algorithms with performance guarantees. Once VNFs have been placed, some of them may become faulty. We thus consider the dynamic fault recovery problem, for which we propose an approximation algorithm that dynamically switches traffic processing from faulty VNFs to normal ones. Simulations with realistic settings show that our algorithms can significantly improve the request admission rate compared to conventional approaches.

This data was imported from Scopus:

Authors: Yuan, G., Xu, Z., Yang, B., Liang, W., Chai, W.K., Tuncer, D., Galis, A., Pavlou, G. and Wu, G.

http://eprints.bournemouth.ac.uk/32997/

Journal: Computer Networks

Volume: 166

ISSN: 1389-1286

DOI: 10.1016/j.comnet.2019.106953

Traditional network functions such as firewalls and Intrusion Detection Systems (IDS) are implemented in costly dedicated hardware, making networks expensive to manage and inflexible to change. Network function virtualization enables flexible and inexpensive operation of network functions by implementing virtual network functions (VNFs) as software in virtual machines (VMs) that run on commodity servers. However, VNFs are vulnerable to various faults such as software and hardware failures. Without efficient and effective fault-tolerance mechanisms, the benefits of deploying VNFs in networks can be offset. In this paper, we investigate the problem of fault-tolerant VNF placement in cloud networks, by proactively deploying VNFs in stand-by VM instances when necessary. The problem is challenging because VNFs are usually stateful. This means that stand-by instances require continuous state updates from active instances during their operation, and the fault-tolerance methods need to handle such states carefully. Specifically, the placement of active/stand-by VNF instances, the request routing paths to active instances, and the state transfer paths to stand-by instances need to be jointly considered. To tackle this challenge, we devise an efficient heuristic algorithm for fault-tolerant VNF placement. We also propose two bicriteria approximation algorithms with provable approximation ratios for the problem without compute or bandwidth constraints. We then consider the dynamic fault recovery problem, in which some placed active VNF instances may become faulty, and propose an approximation algorithm that dynamically switches traffic processing from faulty VNFs to stand-by instances. Simulations with realistic settings show that our algorithms can significantly improve the request admission rate compared to conventional approaches. We finally evaluate the performance of the proposed algorithm for the dynamic fault recovery problem in a real test-bed consisting of both physical and virtual switches, and the results demonstrate that our algorithms have the potential to be applied in real scenarios.
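
To make the joint decision described in the abstract concrete, the following is a minimal illustrative sketch and not the authors' heuristic or either of their approximation algorithms. It assumes a toy model in which each request needs one active and one stand-by instance on different servers, a routing path from its source to the active instance, and a state transfer path from the active to the stand-by instance, all within residual compute and link bandwidth. The names `find_path`, `reserve` and `admit`, the request tuple layout, and the first-fit search order are all hypothetical choices made for illustration.

```python
from collections import deque
from itertools import permutations


def find_path(bw, src, dst, demand):
    """BFS for a fewest-hop path whose links all have >= `demand` residual bandwidth."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for (a, b), residual in bw.items():
            if residual < demand or u not in (a, b):
                continue
            v = b if u == a else a
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None


def reserve(bw, path, demand):
    """Subtract `demand` from every (undirected) link along `path`."""
    for u, v in zip(path, path[1:]):
        key = (u, v) if (u, v) in bw else (v, u)
        bw[key] -= demand


def admit(request, cpu, bw):
    """First-fit admission of one request (assumed model, not the paper's algorithm).

    request = (source, compute_demand, traffic_rate, state_update_rate)
    cpu     = {server: residual compute}
    bw      = {(node, node): residual bandwidth}, undirected links
    Mutates `cpu` and `bw` on success; leaves both unchanged on rejection.
    """
    src, need, rate, state_rate = request
    for active, standby in permutations(cpu, 2):  # active and stand-by on distinct servers
        if cpu[active] < need or cpu[standby] < need:
            continue
        trial = dict(bw)  # tentative reservations only
        route = find_path(trial, src, active, rate)
        if route is None:
            continue
        reserve(trial, route, rate)
        # State updates compete with request traffic for the same link bandwidth.
        sync = find_path(trial, active, standby, state_rate)
        if sync is None:
            continue
        reserve(trial, sync, state_rate)
        bw.clear()
        bw.update(trial)
        cpu[active] -= need
        cpu[standby] -= need
        return {"active": active, "standby": standby, "route": route, "sync": sync}
    return None
```

Calling `admit` repeatedly on a stream of requests and counting the non-`None` results gives a crude admission rate for this toy model. Because the state transfer path is reserved on the same links as request traffic, the sketch also shows how state updates alone can cause a request to be rejected even when compute capacity remains.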

The data on this page was last updated at 05:12 on February 26, 2020.