4.4 System core and filtering rule updates
4.4.1 Proposal model for updating security rules
The proposed architecture for updating the DDoS detection engines, namely StateFit, and the workflow of the system are illustrated in Fig. 4.4.1. The proposed framework consists of two parts:
• StateFit App (located on the controller and representing the central detectors/nodes): analyzes traffic and issues the filtering rules installed on P4-based switches. The core modules of this app are the StateFit Detector, Policy Generator, StateFit Rules Management, and StateFit Distributor.

Figure 4.4.1: The proposed architecture for updating the DDoS detection engines, namely StateFit, and the workflow of the system.
• StateFit Interpreter (located on each P4-based switch and representing the local detectors/nodes): parses the traffic, interprets the received rules, and triggers the proper forwarding actions such as drop. This component involves several modules: StateFit Parser, StateFit Forwarding and Filtering, StateFit Collector, and StateFit Deparser.
Note that the StateFit App can be installed on multiple instances of distributed SDN controllers, if any.
StateFit App: a Java-based program packaged and installed on the ONOS application layer. The core of this app involves three key tasks:
1. Analyzing traffic statistics to identify attack traffic for further treatment, e.g., drop, pass, or forward. The treatment actions are stored in table entries, the so-called policies, which are then managed by StateFit Rules Management.
2. Initializing and setting up the default network configuration for all connected P4-based switches (in the device/host discovery stage).
3. Listening for and standing ready to deliver patches/new implementations to any switch.
Task (1) is handled by the StateFit Detector and Policy Generator, while the StateFit Distributor handles tasks (2) and (3). The traffic policy in task (1) can be flexibly customized and implemented to address and support a large number of protocols. For this purpose, ONOS Core provides many APIs to handle the underlying layers, such as those for devices (org.onosproject.net), traffic flows, and pipeline processing behavior. Using these APIs, we can build an extensive filter for any traffic, such as IP-based packets. For the architecture evaluation, we built a simple traffic policy: if the number of incoming ICMP packets on a switch port within a period exceeds a threshold, e.g., 1000 packets per second, a drop action is executed on the detected port, as sketched below.
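As a concrete illustration, the following Java fragment sketches how this threshold policy could be expressed. DeviceId and PortNumber are standard ONOS types, whereas IcmpThresholdPolicy, PolicyEntry, PolicyAction, IpProtocol, and StateFitDistributor are hypothetical placeholders for the StateFit Detector and Policy Generator logic, not actual ONOS APIs.

import org.onosproject.net.DeviceId;   // standard ONOS type
import org.onosproject.net.PortNumber; // standard ONOS type

// IcmpThresholdPolicy, PolicyEntry, PolicyAction, IpProtocol and
// StateFitDistributor are hypothetical placeholders, not ONOS APIs.
public class IcmpThresholdPolicy {

    private static final long ICMP_PPS_THRESHOLD = 1000; // packets per second

    private final StateFitDistributor distributor;

    public IcmpThresholdPolicy(StateFitDistributor distributor) {
        this.distributor = distributor;
    }

    /** Invoked by the StateFit Detector with the per-port ICMP counter
        collected over one statistics interval. */
    public void evaluate(DeviceId device, PortNumber port,
                         long icmpPackets, long intervalSeconds) {
        long pps = icmpPackets / intervalSeconds;
        if (pps > ICMP_PPS_THRESHOLD) {
            // The Policy Generator turns the detection into a drop entry,
            // which the StateFit Distributor pushes to the switch.
            PolicyEntry drop = PolicyEntry.builder()
                    .matchInPort(port)
                    .matchProtocol(IpProtocol.ICMP)
                    .action(PolicyAction.DROP)
                    .build();
            distributor.install(device, drop);
        }
    }
}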
StateFit Interpreter: Unlike the StateFit App, this interpreter is a compiled P4 program integrated into an SDN switch. In its core, we define the abstract format representing the packet header and parse the data from the buffer of a switch ingress port. We also build the ingress/egress parsers and the processing mechanism for the incoming/outgoing ports. Other core processing parts are the functions that filter traffic on the ingress ports (traffic treatment). Two functions (IngressFilter, EgressFilter) process the traffic on the ingress and egress pipelines, respectively. The IngressFilter handles a received packet as follows. First, if the packet arrives from CPU_PORT (i.e., a packet-out sent by the controller), it skips pipeline processing and sets the egress port as requested by the controller (in the packet_out header). Second, if the packet arrives from a regular switch port, we apply the rule table as the filter. The table is composed of lookup keys and a corresponding set of actions with their parameters. The table entries follow the target-independent protocol format [79].
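The control flow of IngressFilter can be summarized as follows; it is rendered here in Java-style pseudocode for readability, whereas the actual logic lives in the P4 ingress pipeline, and all names (Packet, StandardMetadata, ruleTable) are illustrative.

// Java-style rendering of the IngressFilter control flow; the actual logic
// is implemented in the P4 ingress pipeline, and all names are illustrative.
void ingressFilter(Packet pkt, StandardMetadata meta) {
    if (meta.ingressPort == CPU_PORT) {
        // Case 1: packet-out from the controller. Skip pipeline processing
        // and forward to the egress port carried in the packet_out header.
        meta.egressSpec = pkt.packetOutHeader().egressPort();
        pkt.stripPacketOutHeader();
    } else {
        // Case 2: packet from a regular switch port. Apply the rule table:
        // the lookup keys select an action (e.g., drop, forward) and its
        // parameters from the installed entries.
        ruleTable.apply(pkt, meta);
    }
}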
To make the system perform basic operations such as forwarding ARP requests/responses, we initialize several default entries in the table. These records can be reconfigured later, for example, when the network configuration changes. The EgressFilter in this implementation does nothing, although it could change the output port selection; however, doing so makes it harder to keep the output port selected in the ingress processing step consistent. The security policies have the same format as the forwarding rules of P4-based switches, differing only in the action field: the egress port can be null (drop) or a new value (a path towards an extensive IDS or a clean network such as a scrubbing center), as illustrated below. The StateFit Interpreter program is loaded onto a switch when it first connects to the controller. Therefore, in practice, a vendor can ship its own interpreter with a default network configuration (for ready-to-work operation and performance optimization) on the switches it sells. Note that these devices must install an interpreter such as our SI engine before they can handle network traffic; otherwise, no action will be applied to the incoming traffic.
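To make the action-field difference concrete, the following sketch shows one possible in-memory representation of a policy entry. SecurityPolicyEntry and FlowKey are hypothetical names; only the null/non-null egress-port convention comes from the design above.

import org.onosproject.net.PortNumber; // standard ONOS type

// Hypothetical representation of a StateFit security policy entry; it
// reuses the P4 forwarding-rule format, differing only in the action field.
public class SecurityPolicyEntry {
    private final FlowKey match;         // lookup keys (e.g., in_port, protocol)
    private final PortNumber egressPort; // null -> drop; otherwise redirect,
                                         // e.g., towards an IDS or scrubbing center

    private SecurityPolicyEntry(FlowKey match, PortNumber egressPort) {
        this.match = match;
        this.egressPort = egressPort;
    }

    public static SecurityPolicyEntry drop(FlowKey match) {
        return new SecurityPolicyEntry(match, null);
    }

    public static SecurityPolicyEntry redirectTo(FlowKey match, PortNumber port) {
        return new SecurityPolicyEntry(match, port);
    }

    public boolean isDrop() {
        return egressPort == null;
    }
}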
Interaction between StateFit App and StateFit Interpreter
The interaction requires a communication protocol. In this study, we use the P4Runtime protocol [79] for the communication role and the Protocol Independent (PI) framework for coding the connection maintenance services. Moreover, to keep the policies up to date between the StateFit App in the controller and the StateFit Interpreter in the P4-based switches, we build a new client-server program, called StateFit Synchronization. This service involves a P4Runtime Controller and a P4Runtime Agent. The former is a function that pushes the rule data to the target on request; the latter is a gRPC server. The controller connects to each switch's gRPC server (in the device discovery stage) and deploys the network configuration.
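A minimal sketch of the push side of StateFit Synchronization is shown below. It assumes a hypothetical gRPC service definition (StateFitSyncGrpc, UpdateRequest, UpdateReply, RuleUpdate) generated from a .proto file; only the use of grpc-java's ManagedChannel is standard.

import io.grpc.ManagedChannel;        // grpc-java
import io.grpc.ManagedChannelBuilder; // grpc-java
import java.util.List;

// StateFitSyncGrpc, UpdateRequest, UpdateReply and RuleUpdate are
// hypothetical stubs assumed to be generated from a .proto definition.
public class StateFitSyncController {

    /** Pushes a rule batch to one switch's gRPC server (P4Runtime Agent). */
    public boolean pushRules(String switchAddress, List<RuleUpdate> rules) {
        ManagedChannel channel = ManagedChannelBuilder
                .forTarget(switchAddress)
                .usePlaintext()
                .build();
        try {
            StateFitSyncGrpc.StateFitSyncBlockingStub stub =
                    StateFitSyncGrpc.newBlockingStub(channel);
            UpdateReply reply = stub.updateRules(
                    UpdateRequest.newBuilder().addAllRules(rules).build());
            return reply.getAck(); // positive acknowledgement = success
        } finally {
            channel.shutdown();
        }
    }
}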
Consistent Policy Update
The delivery process of policies in StateFit faces the same consistency issues as SDN network updates [80], such as network security policy violations (e.g., untrustworthy traffic incorrectly allowed to pass through the network) when the forwarding paths are updated on many switches simultaneously. Several researchers have put effort into solving consistent updates in SDN, e.g., the two-phase commit [81] and the time-based approach [80]. To guarantee consistency, the switches should be updated as fast as possible, ideally following a strict order (e.g., the dependency model [80]). In practice (e.g., in an asynchronous network environment), it is impossible to ensure that the patches are applied at all switches simultaneously due to possible message delays and losses. Therefore, there is a trade-off between the update speed of the switches, rule space utilization, controller overhead, and the waiting time of packets in the switch queues. Note that all of those solutions are based on the OpenFlow protocol.
Synchronizing the states (e.g., the counter values) in a stateful SDN data plane is even more troublesome due to the runtime structure of the P4 program, i.e., it is impossible to predict the states until the compiled P4 program runs. For simplicity, we design the update module as part of the StateFit Interpreter implementation; it works independently of the packet processing to update the StateFit policy rules. The counter states are reported directly to the StateFit App at the controller.
Regarding the update mechanism, the StateFit App assesses the existing policies and the counter states of each switch in the control app to build a set of update rules and a list of switches to update. Each rule updates a specific flow (the longest-prefix match of symmetric end-point information such as srcIP, protocol, dstIP, and dstPort), as sketched below.
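The assembly of an update batch can be sketched as follows; buildUpdates, SwitchState, FlowRecord, RuleUpdate, and PolicyAction are hypothetical names introduced only to make the per-flow keying explicit, while DeviceId is a standard ONOS type.

import org.onosproject.net.DeviceId; // standard ONOS type
import java.util.*;

// buildUpdates, SwitchState, FlowRecord, RuleUpdate and PolicyAction are
// hypothetical; only the match fields come from the design above.
public Map<DeviceId, List<RuleUpdate>> buildUpdates(Map<DeviceId, SwitchState> states) {
    Map<DeviceId, List<RuleUpdate>> updates = new HashMap<>();
    for (Map.Entry<DeviceId, SwitchState> e : states.entrySet()) {
        List<RuleUpdate> rules = new ArrayList<>();
        for (FlowRecord flow : e.getValue().suspiciousFlows()) {
            // One rule per flow, keyed by the longest-prefix match of the
            // symmetric end-point information (srcIP, protocol, dstIP, dstPort).
            rules.add(new RuleUpdate(flow.srcPrefix(), flow.protocol(),
                                     flow.dstPrefix(), flow.dstPort(),
                                     PolicyAction.DROP));
        }
        if (!rules.isEmpty()) {
            updates.put(e.getKey(), rules); // this switch joins the update list
        }
    }
    return updates;
}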
We update the switches with the following atomic task: the StateFit App posts the updates (embedded in messages) to the gRPC servers of the switches that need to be updated (e.g., those on the blacklist of the StateFit Detector); if it receives a positive acknowledgement, the switches have been updated successfully. The update is performed on all the switches and follows the client-server model. For consistency, we use two tables at the SI engine, a live table (flag value 1) and a temporary table (flag value 0), whose roles can be exchanged by switching a bit flag. The former is used only by the SI engine to match incoming packets, while the latter receives the new policies. The temporary table has two states: locked and unlocked. When the update is done, the temporary table turns into the locked state. The lock event triggers a role switch, so that the temporary table becomes the live table and vice versa, as modeled below.
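The two-table scheme can be modeled as a double-buffered structure; the Java sketch below mirrors the bit-flag and lock behavior described above (the real tables live in the P4 pipeline, and RuleTable, Packet, PolicyAction, and RuleUpdate are illustrative names).

import java.util.List;

// Java model of the two-table consistency scheme at the SI engine; the
// real tables live in the P4 pipeline, and all names are illustrative.
public class DoubleBufferedRuleTable {

    private final RuleTable[] tables = { new RuleTable(), new RuleTable() };
    private volatile int liveIndex = 1;   // bit flag: 1 selects the live table
    private boolean tempLocked = false;   // temporary table state: locked/unlocked

    /** Packet matching only ever reads the live table. */
    public PolicyAction match(Packet pkt) {
        return tables[liveIndex].lookup(pkt);
    }

    /** New policies are written to the (unlocked) temporary table. */
    public synchronized void installUpdate(List<RuleUpdate> rules) {
        if (tempLocked) {
            throw new IllegalStateException("temporary table is locked");
        }
        tables[1 - liveIndex].replaceAll(rules);
        tempLocked = true;                // update done -> lock the temporary table
        swapOnLock();                     // lock event triggers the role switch
    }

    /** Lock event: swap roles by flipping the bit flag. */
    private void swapOnLock() {
        liveIndex = 1 - liveIndex;        // temporary table becomes the live table
        tempLocked = false;               // the old live table is writable again
    }
}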
In our model, we calculate the delay of delivering the updates as the round-trip time of the atomic task. This delay involves several network delay parameters, i.e., the statistics collection interval, transmission delay, propagation delay, queuing delay, and packet processing delay. The statistics collection interval is set in the flow monitor of the StateFit Collector. The delay may be longer if the switch sits at the lowest level of a hierarchical network topology (the update message must travel through several intermediate switches).
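Making these components explicit, the update delay for a switch reached through $h$ hops can be written as follows; the notation is ours, introduced only for clarity, and the factor of two reflects the round trip of the atomic task under the assumption of symmetric forward and acknowledgement paths:

\[
T_{\text{update}} = T_{\text{stat}} + 2 \sum_{i=1}^{h} \left( T_{\text{trans},i} + T_{\text{prop},i} + T_{\text{queue},i} + T_{\text{proc},i} \right)
\]

where $T_{\text{stat}}$ is the statistics collection interval configured in the StateFit Collector, and the summation accumulates the transmission, propagation, queuing, and processing delays at each of the $h$ hops on the path to the target switch.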
Regarding the update frequency, we argue that delivering a security patch to counter an attack happens more frequently, e.g., every few minutes or hours, than updating the interpreter program itself. A core update (the whole interpreter) may only be required when there are significant changes to the network configuration, such as the topology. Also, the size of a patch with n policies is much smaller than the size of a compiled interpreter.