DPI in Controller Networks

10/28/2013

In a 3-tier software defined network (SDN) that has control and data plane separation leveraging a protocol such as OpenFlow, there are generally data plane devices, controllers, and applications/control programs. Pretty straightforward.

If a packet enters the network switch (data plane device) and doesn’t have a match in the flow table, it’s punted to the controller, which decides how to handle that packet and the subsequent packets in that flow. This is classic reactive forwarding. Due to latency and possible scaling issues, it’s recommended to deploy proactive flow forwarding whenever possible.

If you were able to define and know about every possible flow in the network, it would be possible to deploy a full-blown proactive forwarding model, reducing both the load on the controller(s) and the concern about scaling them.
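To make the difference concrete, here is a minimal sketch in Python. It is a toy model, not OpenFlow itself: the `Switch` class, the `controller` function, and the addresses are all hypothetical, and it assumes the punt-on-miss behavior described above (in OpenFlow 1.3 and later that requires a table-miss entry, as Alvaro points out in the comments). It simply counts how many packets reach the controller in each model.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Toy data plane device with an exact-match flow table (hypothetical model)."""
    flow_table: dict = field(default_factory=dict)  # (src, dst) -> action
    punts: int = 0                                  # packets sent to the controller

    def install(self, match, action):
        self.flow_table[match] = action

    def receive(self, src, dst, controller):
        match = (src, dst)
        if match not in self.flow_table:
            # Table miss: punt to the controller, which installs a flow
            # so later packets in the same flow stay in the data plane.
            self.punts += 1
            self.install(match, controller(src, dst))
        return self.flow_table[match]


def controller(src, dst):
    """Hypothetical policy: forward everything out of port 1."""
    return "output:1"


packets = [("10.0.0.1", "10.0.0.2")] * 3 + [("10.0.0.3", "10.0.0.2")] * 2

# Reactive: the first packet of each flow is punted.
reactive = Switch()
for src, dst in packets:
    reactive.receive(src, dst, controller)

# Proactive: every flow is known and installed before traffic arrives.
proactive = Switch()
for src, dst in set(packets):
    proactive.install((src, dst), controller(src, dst))
for src, dst in packets:
    proactive.receive(src, dst, controller)

print(reactive.punts, proactive.punts)  # prints "2 0"
```

Two flows means two punts in the reactive case and none in the proactive case. Multiply that by the number of new flows per second in a real network and the scaling concern becomes obvious.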

Adding in Network Services

What about load balancers and firewalls in these networks? With OpenFlow, the network can be turned into a giant network security device (lightweight FW) and load balancer since you would be predefining what to do with all traffic. Maybe the OpenFlow action is drop, turning a switch into a basic firewall, which is exactly what Goldman said they were doing 6 months ago. Maybe the action is to do an L3 re-write, punt to the “load balancer” application that is running on top of the controller, and see which real server IP to forward the traffic to. In both of these examples, very few to no packets are sent to the controller, and policy is distributed and enforced throughout the edges of the network, largely based on Layer 3-4 information.
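Here is a rough sketch of what those predefined rules could look like, again as a toy Python model rather than real OpenFlow messages. The `Rule` fields, the VIP, and the real server address are assumptions for illustration, and the real server is assumed to have been picked ahead of time by a hypothetical load balancer app on the controller.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    priority: int
    dst_net: str                  # destination prefix to match (Layer 3)
    dst_port: Optional[int]       # destination port to match (Layer 4), None = any
    action: str                   # "drop" or "rewrite"
    new_dst: Optional[str] = None # real server IP for the L3 re-write


def lookup(rules, dst_ip, dst_port):
    """Return the highest-priority rule matching the packet, or None."""
    ip = ipaddress.ip_address(dst_ip)
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if ip in ipaddress.ip_network(rule.dst_net) and rule.dst_port in (None, dst_port):
            return rule
    return None


def forward(rules, dst_ip, dst_port):
    """Return the (possibly rewritten) destination, or None if the packet is dropped."""
    rule = lookup(rules, dst_ip, dst_port)
    if rule is None or rule.action == "drop":
        return None  # explicit drop rule, or no match at all
    return (rule.new_dst, dst_port)


VIP = "192.0.2.10/32"  # hypothetical virtual IP
rules = [
    Rule(priority=200, dst_net=VIP, dst_port=23, action="drop"),      # basic firewall
    Rule(priority=100, dst_net=VIP, dst_port=80, action="rewrite",
         new_dst="10.0.0.11"),                                         # basic load balancer
]

print(forward(rules, "192.0.2.10", 80))  # ('10.0.0.11', 80)
print(forward(rules, "192.0.2.10", 23))  # None
print(forward(rules, "192.0.2.99", 80))  # None
```

Everything here is decided on Layer 3-4 fields and happens in the switch, so the controller never sees the traffic.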

What about DPI? Is the control path now the data path?

In the 3-tier SDN model, load balancing and firewalling are often drawn above the controller as “apps” in the network. With Deep Packet Inspection (DPI), where every packet needs to be inspected for compliance, security, Layer 7 load balancing, visibility, etc., does this mean the controller becomes a full-blown inline device? I hope not, but of course, there may be solutions that offer this. I want to point this out because of a recent article on F5’s DevCentral by Lori MacVittie. Lori states, “The closer you get to applications, i.e. every step that takes you above layer 4 (TCP), the closer you get to turning what's supposed to be a separated control path into part of the data path. … At some point, you cross the line between control and data path and make the controller by virtue of its hosting the SDN application, part of the data path.”

Just as it’s more optimal to deploy proactive forwarding, it would be more optimal to distribute policy and to keep the control and data paths de-coupled, even for DPI.

So, what are the options?

Deploy a kernel-based software implementation of the L4-7 device and integrate with vSphere, KVM, etc. If the implementation integrates with network virtualization solutions, that is even better.

Put a virtual appliance on every physical host

LXC or virtual appliance(s) on each physical switch. How high does CPU utilization on ToR switches get anyway? Almost all switches can do this; you just need vendor support. It’s likely easier to do on Pluribus, Arista, and Cumulus switches.

Deploy a cluster of Layer 4-7 network services VMs so they aren't deployed on every server. Look at the service nodes NSX deploys for BUM traffic. It would be a very similar approach, but for more advanced L4-7 network services. Simply scale out as needed. Doesn't this sound like the Embrane approach?

Wasn't going to go here, but in case you were wondering, you can also deploy a cluster of physical appliances instead of going virtual.

That all sounds good, but how do you get the traffic you want to the right network service device? Well, this is the million dollar question and will vary based on the network deployed, but from what I can tell right now, there are at least a few options:

Cisco vPath – dynamically insert services when running the Nexus 1000V. While this is a Cisco-only solution, pay attention to the detail because vPath is on its way to becoming a standard known as Network Services Header (NSH). I believe the thanks for this go to Cisco (Kyle Mestery and others?).
Network Services Header (NSH) – is already making its way into Open vSwitch. I am unaware of backwards compatibility with vPath and have yet to hear a vendor say they are building towards NSH-enabled load balancers, firewalls, and other L4-7 devices, although I am hearing Cisco SEs start to mention this in general conversation.
With native OpenFlow fabrics such as Big Switch’s Cloud Fabric (see their Tech Field Day presentation), you will rely on their controller to dynamically insert services by programming flows as needed in the network. This seems like a large undertaking, especially as NAT is introduced to the environment, since the controller has to keep track of flows entering and exiting the L4-L7 appliances. Once NSH is supported by more devices and even OpenFlow, maybe that will streamline solutions like this.
Leverage solutions compatible with hypervisors and network virtualization solutions like the Palo Alto integration with VMware’s NSX and vSphere. Juniper’s virtual FW also follows a similar model. When you basically have a “programmable tap” in front of every vNIC, the possibilities are all there to enforce whatever policy you want on a per-VM basis. This is slick; you just need to integrate with the hypervisor’s APIs for these solutions.
Another solution I’m beginning to explore is leveraging Cisco Topology Independent Forwarding (TIF) as a way to scale out services. Re-direct traffic to a particular network service device based on source address or your favorite N-tuple, and make sure you have a different NAT/PAT (general L3 re-write) address on egress per FW/LB to assist with the traffic as it comes back in. This may just work for some use cases.
Here is an easy one – make the L4-7 device the default gateway. It’ll work for niche/small environments and for domain/edge enforcement.
This list isn't exhaustive as you may find other proprietary or vendor-specific ways to do this as well.
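The redirect-plus-NAT idea from the list (steer traffic by N-tuple, give each FW/LB its own re-write address on egress) can be sketched roughly as follows. This is only an illustration under my own assumptions: the device names, the NAT addresses, and the hash-based selection are hypothetical and do not reflect how Cisco TIF or any other product actually does it.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDevice:
    name: str
    nat_ip: str  # unique L3 re-write address this device uses on egress


# Hypothetical scale-out pool of firewalls.
DEVICES = [
    ServiceDevice("fw-1", "198.51.100.1"),
    ServiceDevice("fw-2", "198.51.100.2"),
]
BY_NAT = {d.nat_ip: d for d in DEVICES}


def pick_device(src_ip, dst_ip, proto, src_port, dst_port):
    """Forward direction: hash the 5-tuple so a flow always lands on the same device."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return DEVICES[int.from_bytes(digest[:4], "big") % len(DEVICES)]


def return_device(dst_ip):
    """Return direction: the per-device NAT address identifies the device to steer back to."""
    return BY_NAT[dst_ip]


flow = ("10.1.1.5", "203.0.113.7", "tcp", 40000, 443)
dev = pick_device(*flow)

# After the device re-writes the source to its own NAT address, the reply is
# destined to that address, so it finds its way back to the same device.
print(dev.name, return_device(dev.nat_ip).name)
```

The point of the unique address per device is that the return path needs no shared flow state: the destination address of the reply is enough to pick the right FW/LB.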

Oh, you want to service chain now too? I bet you also want to know how many of these are possible today. So demanding... Those are posts for a different day.

The underlying theme in many of these options is to eliminate monolithic choke points (L3-L7 devices) while having the ability to dynamically insert network services as needed, scale on demand, and offer linear pricing models for the consumer.

We definitely shouldn't be creating a data path choke point out of any SDN controller. Now we can wait and see which of these options start to get adopted. That’s the fun part. In the meantime, maybe we can start to draw L4-7 devices along with switches as data plane devices in the typical 3-tier SDN architecture with the emphasis on the control/management logic happening via Northbound APIs on the controller.

Does this make sense? Tell me what you think.

Thanks,
Jason

Twitter: @jedelman8

2 Comments

Alvaro Pereira

10/29/2013 05:37:35 am

As a note, the default behavior of punting unmatched packets to the controller has changed starting with OpenFlow 1.3.0. OpenFlow 1.3.3 section 5.4 says: "If the table-miss flow entry does not exist, by default packets unmatched by flow entries are dropped (discarded)."
PS: A table-miss flow does not exist by default.

Jason

10/30/2013 12:28:54 am

Alvaro,

Thanks for sharing. Sounds like a safer and more secure approach, but based on use case, that implicit drop won't do any good unless it is a security application or you know EVERY flow on the network. In any case, it seems like it could make sense as a default action.

Thanks again,
Jason
