Diagnostic and troubleshooting of OpenFlow‐enabled switches using kernel and userspace traces
Author(s) - Belkhiri Adel, Dagenais Michel
Publication year - 2021
Publication title - International Journal of Communication Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.344
H-Index - 49
eISSN - 1099-1131
pISSN - 1074-5351
DOI - 10.1002/dac.4920
Subject(s) - computer science, troubleshooting, openflow, forwarding plane, software defined networking, debugging, linux kernel, software, embedded system, networking hardware, operating system, distributed computing, computer network, network packet
Summary Although software‐defined networking (SDN) provides a flexible way to provision and control networks, it also makes network debugging and troubleshooting more complex. In SDN, the network is managed entirely by software programs, which increase flexibility and sophistication but are prone to bugs. Pinpointing those bugs is challenging because they can occur at multiple locations, such as the forwarding plane, the controller OS, and the network services running on top of the controller. Compared with functional bugs, performance bugs are particularly troublesome because of their non‐failure semantics: they cause performance loss (reduced throughput, increased latency, and wasted resources) while network connectivity is maintained. The literature reports several tools and techniques for diagnosing SDN bugs, but most of them are ineffective against performance bugs. In this paper, we propose a novel monitoring and diagnostic framework capable of diagnosing performance bugs in the SDN data plane. The proposed tool works within Open vSwitch (OVS), a popular software switch, although it can easily be adapted to any OpenFlow switch. Tracing techniques are used to collect low‐level performance data from the monitored switches. Our tool derives adapted performance metrics from kernel and userspace traces and displays them in time‐synchronized graphical views. These views provide valuable insights into OVS operation and enable practitioners to discover performance‐related issues and analyze their root causes. A few use cases are presented to demonstrate the efficiency of our tool in optimizing OVS performance and diagnosing its performance bugs.
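As a rough illustration of how a metric might be derived from kernel and userspace traces on a shared timeline, the following minimal Python sketch merges two timestamp-sorted event lists and computes a per-packet latency. It is not the authors' implementation: the event names (`upcall_start`, `upcall_done`), the `packet_id` correlation key, and the sample timestamps are all hypothetical, and the sketch assumes both traces use the same clock.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts_ns: int      # timestamp in nanoseconds (assumed: same clock for both traces)
    source: str     # "kernel" or "userspace"
    name: str       # hypothetical event name, not a real tracepoint
    packet_id: int  # hypothetical key correlating events of one packet


def merge_traces(kernel, userspace):
    """Merge two timestamp-sorted event lists into one time-ordered list."""
    return list(heapq.merge(kernel, userspace, key=lambda e: e.ts_ns))


def upcall_latencies(events):
    """Return {packet_id: ns elapsed between 'upcall_start' and 'upcall_done'}."""
    start = {}
    latency = {}
    for e in events:
        if e.name == "upcall_start":
            start[e.packet_id] = e.ts_ns
        elif e.name == "upcall_done" and e.packet_id in start:
            latency[e.packet_id] = e.ts_ns - start.pop(e.packet_id)
    return latency


# Made-up sample data, for illustration only.
kernel = [
    Event(100, "kernel", "upcall_start", 1),
    Event(250, "kernel", "upcall_start", 2),
]
userspace = [
    Event(180, "userspace", "upcall_done", 1),
    Event(400, "userspace", "upcall_done", 2),
]

merged = merge_traces(kernel, userspace)
latencies = upcall_latencies(merged)
print(latencies)  # {1: 80, 2: 150}
```

Merging by timestamp before computing the metric keeps events from both layers in a single ordered stream, which is the property a time-synchronized view relies on.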