If a packet enters the network switch (data plane device) and doesn’t have a match in the flow table, it’s punted to the controller to see how to handle that packet and the subsequent packets in that flow. This is classic reactive forwarding. Due to latency and possible scaling issues, it’s recommended to leverage and deploy proactive flow forwarding whenever possible.
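To make the reactive vs. proactive difference concrete, here is a minimal sketch in Python. It is a toy model, not a real OpenFlow implementation: the `Controller` and `Switch` classes, the destination-IP-only match, and the action strings are all assumptions made up for illustration. It simply counts how many packets get punted to the controller in each mode.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical controller holding a policy of dst_ip -> action."""
    policy: dict
    packet_ins: int = 0  # number of packets punted up from the switch

    def decide(self, dst_ip):
        self.packet_ins += 1
        return self.policy.get(dst_ip, "drop")


@dataclass
class Switch:
    """Hypothetical data plane device with a flow table keyed on dst_ip."""
    controller: Controller
    flow_table: dict = field(default_factory=dict)

    def install(self, dst_ip, action):
        self.flow_table[dst_ip] = action

    def forward(self, dst_ip):
        action = self.flow_table.get(dst_ip)
        if action is None:
            # Table miss: punt to the controller, then cache its answer
            # so later packets in the flow stay in the data plane.
            action = self.controller.decide(dst_ip)
            self.install(dst_ip, action)
        return action


def run(proactive):
    policy = {"10.0.0.1": "output:1", "10.0.0.2": "output:2"}
    ctrl = Controller(policy)
    sw = Switch(ctrl)
    if proactive:
        # Proactive: push every known flow before any traffic arrives.
        for dst_ip, action in policy.items():
            sw.install(dst_ip, action)
    packets = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.1"]
    actions = [sw.forward(dst_ip) for dst_ip in packets]
    return actions, ctrl.packet_ins
```

In this toy model both modes forward traffic identically. The only difference is that the reactive run sends the first packet of each flow to the controller, while the proactive run sends none.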
Adding in Network Services
What about load balancers and firewalls in these networks? With OpenFlow, the network can be turned into a giant network security device (a lightweight firewall) and load balancer, since you would be predefining what to do with all traffic. Maybe the OpenFlow action is drop, which turns a switch into a basic firewall, exactly what Goldman said they were doing 6 months ago. Maybe the action is to do an L3 re-write, after a punt to the “load balancer” application that is running on top of the controller to see which real server IP to forward the traffic to. In both of these examples, very few to no packets are sent to the controller, and policy is distributed and enforced, largely based on Layer 3-4 information, throughout the edges of the network.
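Here is a rough sketch of what those two actions could look like as predefined flow rules. Everything in it is an assumption for illustration: the `Rule` and `Packet` types, the priorities, the addresses, and the simplification that the “load balancer” application has already picked one real server for the virtual IP. It is not the OpenFlow wire format or any controller's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    proto: str
    dst_port: int


@dataclass(frozen=True)
class Rule:
    priority: int
    action: str  # "drop", "forward", or "rewrite_dst"
    dst_ip: Optional[str] = None    # None means wildcard
    proto: Optional[str] = None
    dst_port: Optional[int] = None
    new_dst_ip: Optional[str] = None

    def matches(self, pkt):
        # A field matches if it is wildcarded or equal to the packet's value.
        pairs = (
            (self.dst_ip, pkt.dst_ip),
            (self.proto, pkt.proto),
            (self.dst_port, pkt.dst_port),
        )
        return all(want is None or want == got for want, got in pairs)


# Hypothetical, pre-installed policy based only on Layer 3-4 fields.
RULES = [
    Rule(priority=200, action="drop", proto="tcp", dst_port=23),
    Rule(priority=100, action="rewrite_dst", dst_ip="10.0.0.100",
         proto="tcp", dst_port=80, new_dst_ip="10.0.1.11"),
    Rule(priority=0, action="forward"),
]


def apply_rules(rules, pkt):
    """Return the packet to forward, or None if it is dropped."""
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if rule.matches(pkt):
            if rule.action == "drop":
                return None
            if rule.action == "rewrite_dst":
                return Packet(pkt.src_ip, rule.new_dst_ip,
                              pkt.proto, pkt.dst_port)
            return pkt
    return None  # no match at all: dropped in this sketch
```

With these rules, Telnet traffic is dropped at the switch, web traffic to the virtual IP 10.0.0.100 is re-written toward 10.0.1.11, and everything else is forwarded untouched, all without a trip to the controller.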
What about DPI? Is the control path now the data path?
In the 3-tier SDN model, load balancing and firewalling are often drawn above the controller as “apps” in the network. With Deep Packet Inspection (DPI), where there is a need to inspect every packet for compliance, security, Layer 7 load balancing, visibility, etc., does this mean the controller becomes a full-blown inline device? I hope not, but of course, there may be solutions that offer this. I want to point this out because of a recent article on F5’s DevCentral by Lori MacVittie. Lori states, “The closer you get to applications, i.e. every step that takes you above layer 4 (TCP), the closer you get to turning what's supposed to be a separated control path into part of the data path. … At some point, you cross the line between control and data path and make the controller by virtue of its hosting the SDN application, part of the data path.”
Just as it’s more optimal to deploy proactive forwarding, it would be more optimal to distribute policy and to keep the control and data paths decoupled, even for DPI.
So, what are the options?
- Deploy a kernel-based software implementation of the L4-7 device and integrate with vSphere, KVM, etc. If the implementation integrates with network virtualization solutions, that is even better.
- Put a virtual appliance on every physical host
- LXC or virtual appliance(s) on each physical switch – how high does CPU utilization on ToR switches get anyway? Almost all switches can do this; you just need vendor support. It’s likely easier to do on Pluribus, Arista, and Cumulus switches.
- Deploy a cluster of Layer 4-7 network services VMs so they aren't deployed on every server. Look at the services nodes NSX deploys for BUM traffic. It would be a very similar approach, but for more advanced network L4-7 services. Simply scale out as needed. Doesn't this sound like the Embrane approach?
- Wasn't going to go here, but in case you were wondering, you can also deploy a cluster of physical appliances instead of going virtual.
That all sounds good, but how do you get the traffic you want to the right network service device? Well, this is the million dollar question, and the answer will vary based on the network deployed, but from what I can tell right now, there are at least a few options:
- Cisco vPath – dynamically insert services when running the Nexus 1000V. While this is a Cisco-only solution, pay attention to the detail because vPath is on its way to becoming a standard known as Network Services Header (NSH). I believe the thanks for this go to Cisco (Kyle Mestery and others?).
- Network Services Header (NSH) – is already making its way into Open vSwitch. I am unaware of backwards compatibility with vPath and have yet to hear a vendor say they are building towards NSH-enabled load balancers, firewalls, and other L4-7 devices, although I am hearing Cisco SEs start to mention this in general conversation.
- With native OpenFlow fabrics such as Big Switch’s Cloud Fabric (<-- link to Tech Field Day presentation), you will rely on their controller to dynamically insert services by programming flows as needed in the network. This seems like a large undertaking especially as NAT is introduced to the environment to keep track of flows entering and exiting the L4-L7 appliances. Once NSH is supported by more devices and even OpenFlow, maybe that will streamline solutions like this.
- Leverage solutions compatible with hypervisors and network virtualization solutions like the Palo Alto integration with VMware’s NSX and vSphere. Juniper’s virtual FW also follows a similar model. When you basically have a “programmable tap” in front of every vNIC, the possibilities are all there to enforce whatever policy you want on a per-VM basis. This is slick; you just need to integrate with the hypervisor’s APIs for these solutions.
- Another solution I’m beginning to explore is leveraging Cisco Topology Independent Forwarding (TIF) as a way to scale out services. Re-direct traffic to a particular network service device based on source address or your favorite N-tuple, and make sure you have a different NAT/PAT (general L3 re-write) address on egress per FW/LB to assist with the traffic as it comes back in. This may just work for some use cases.
- Here is an easy one – make the L4-7 device the default gateway. It’ll work for niche/small environments and for domain/edge enforcement.
- This list isn't exhaustive as you may find other proprietary or vendor-specific ways to do this as well.
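The re-direct option in the list above (steer on source address, then use a different NAT address per FW/LB for the return path) can be sketched roughly as follows. This is not how TIF or any vendor feature is implemented. The node names, the addresses, and the choice of a SHA-256 hash on the source IP are all assumptions picked to show the idea.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceNode:
    name: str
    nat_ip: str  # unique egress (NAT/PAT) address owned by this node


# Hypothetical scale-out cluster of firewalls, one NAT address each.
NODES = [
    ServiceNode("fw-1", "192.0.2.1"),
    ServiceNode("fw-2", "192.0.2.2"),
    ServiceNode("fw-3", "192.0.2.3"),
]


def pick_node(src_ip, nodes=NODES):
    """Outbound: choose a service node deterministically from the source IP."""
    digest = hashlib.sha256(src_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(nodes)
    return nodes[index]


def node_for_return(dst_ip, nodes=NODES):
    """Inbound: the NAT address alone identifies the node holding the state."""
    by_nat_ip = {node.nat_ip: node for node in nodes}
    return by_nat_ip.get(dst_ip)  # None if the address is not a NAT address
```

Because each node re-writes traffic to its own address on egress, return traffic lands on the same node without the network having to track individual flows. One caveat of a plain modulo hash like this is that adding or removing a node re-maps many sources, so a real design would likely want something more stable.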
Oh, you want to service chain now too? I bet you also want to know how many of these are possible today. So demanding... Those are posts for a different day.
The underlying theme in many of these options is to eliminate monolithic choke points (L3-L7 devices) while having the ability to dynamically insert network services as needed, scale on demand, and offer linear pricing models for the consumer.
Does this make sense? Tell me what you think.
Thanks,
Jason
Twitter: @jedelman8