Today, we define how much CPU, RAM, and disk space a virtual server gets. What do we define for virtual networks? By virtual network, I mean not just a vswitch, but all related L2-7 services.
When virtual networks are created, several properties should be defined: overall throughput of a network node, for example. This can be enforced in a variety of places, but why not keep it simple and enforce it at the network edge? If we had real network virtualization, an aggregate throughput could be defined per vswitch (one vswitch per tenant), but since that doesn't exist, it'll have to be enforced per interface or on an L4-7 device. Embrane can do this very nicely on its devices as it spins up new network services VMs.
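As a rough sketch of what such a property set might look like: the class, field names, and even-split policy below are all hypothetical, not any vendor's API. The idea is simply that an aggregate cap defined per tenant gets translated into per-interface limits at the edge when no true per-vswitch enforcement exists.

```python
from dataclasses import dataclass

@dataclass
class VirtualNetworkProfile:
    """Hypothetical per-tenant virtual network definition."""
    tenant: str
    aggregate_mbps: int      # total throughput cap for the tenant's vswitch
    per_interface_mbps: int  # ceiling enforced at each edge interface

    def interface_limit(self, interface_count: int) -> int:
        """Without per-vswitch enforcement, split the aggregate cap
        evenly across the tenant's edge interfaces, never exceeding
        the per-interface ceiling."""
        return min(self.per_interface_mbps,
                   self.aggregate_mbps // max(interface_count, 1))

profile = VirtualNetworkProfile("tenant-a", aggregate_mbps=10_000,
                                per_interface_mbps=2_000)
print(profile.interface_limit(8))  # 1250: the aggregate split across 8 edges
```

An L4-7 services device could apply the same aggregate figure directly, which is why defining the property once per network, rather than per enforcement point, matters.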
Security and QoS are defined today in port groups tied to the virtual switches. We need these abstractions tied into the physical network as well. Some physical switches support port profiles, but as far as I know they are device specific. These abstractions should be configurable data center wide. In the case of Cisco, if the VSM managed more than the Nexus 1000V, this would already exist.
P2V
We should be able to profile physical networks and P2V them. This may entail making changes to the physical network (depending on the vendor), but it should be an automated process. Will VMware or someone else create a Converter that does this? How complex would it be to define network pools, vApp networks, organization networks, and external networks? Those naming conventions will be different for every M&O tool out there. Being able to P2V a network would sure simplify our lives over the coming years.
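A minimal sketch of what such a conversion step could look like. Everything here is assumed: the input fields, the vCloud-style output names (organization network, network pool, external network), and the mapping itself; a real Converter would be vendor specific and far richer.

```python
def p2v_network(physical):
    """Hypothetical mapping of a profiled physical segment to
    virtual network constructs (vCloud-style naming)."""
    return {
        "external_network": physical["uplink_vlan"],
        "org_network": f"org-{physical['vlan_id']}",
        "network_pool": f"pool-{physical['vlan_id']}",
        # L4-7 services discovered in the path (firewalls, LBs, etc.)
        "services": physical.get("l4_7_services", []),
    }

segment = {"vlan_id": 110, "uplink_vlan": 10, "l4_7_services": ["firewall"]}
print(p2v_network(segment))
```

The hard part, as noted above, is that every M&O tool would name and nest these constructs differently, so the output side of this mapping is where standardization would pay off.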
Once a network is virtualized, it should be possible to clone and replicate it, with a final template created from it. Once in production, a snapshot can be taken. Imagine a network-wide or data-center-wide rollback; it could also help compliance and audit teams examine the configuration, or even the traffic traversing a given network services VM, at a selected point in time. Imagine what this could do for rebuilding a network used for Disaster Recovery. Depending on whose offering it is, network templates may include components within the physical network.
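To make the snapshot/rollback idea concrete, here is a toy sketch: a versioned network object whose configuration can be snapshotted and restored. The class and config keys are invented for illustration; a real implementation would version physical-network state too.

```python
import copy
import time

class NetworkTemplate:
    """Hypothetical versioned virtual network: snapshot and rollback."""
    def __init__(self, config):
        self.config = config
        self.snapshots = []

    def snapshot(self):
        # Deep-copy so later in-production edits don't mutate saved state.
        self.snapshots.append((time.time(), copy.deepcopy(self.config)))

    def rollback(self):
        # Restore the most recent snapshot: a network-wide undo.
        _, self.config = self.snapshots.pop()

net = NetworkTemplate({"vlans": [10, 20], "acl": ["permit any"]})
net.snapshot()
net.config["acl"] = ["deny any"]   # a bad change hits production
net.rollback()
print(net.config["acl"])           # ['permit any']
```

Keeping timestamps on each snapshot is what would let an audit team ask, "what did this network look like at 2 a.m. last Tuesday?"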
The next two areas focus on bridging the gap between the physical and virtual network.
Network vMotion
Google had the money to engineer its own switches and a traffic engineering (TE) controller, which helped it achieve near-maximum bandwidth utilization on its WAN links. Without that kind of budget, I have to ask: does it make sense to move away from port channels for traffic flow distribution? TOR switches have up to 16 x 10GE uplinks today. In a port channel, one of many hashes can be used, but are switch-based hashes the best we can do? Can we look at which applications are in use (who is communicating with whom) and intelligently distribute the flows over those 16 uplinks? This could be similar to what Plexxi is doing; start thinking about Affinities.

I call this "Network vMotion" because, just as a VM can be moved between hosts while in production, the network flow is in effect being moved as well. Without moving the VM, we should have the ability to modify which uplink is used for a given flow. This would require a feature on TOR or edge switches. VMware recently announced load-based hashing coming out of a physical server, which may be a good start. Another benefit, depending on your vantage point: this eliminates the need to troubleshoot LAG, LACP, etc. between multiple vendors.
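A toy sketch of the difference between hashing and intelligent placement: instead of a static hash, assign flows greedily to the least-loaded uplink, and allow a single flow to be re-pinned live. Function names, the flow tuples, and the greedy policy are all illustrative assumptions, not any switch's feature.

```python
def assign(flows, n_uplinks):
    """Greedy least-loaded placement of flows (flow_id, mbps) on uplinks,
    largest flows first -- a stand-in for flow-aware distribution."""
    load = [0] * n_uplinks
    placement = {}
    for flow_id, mbps in sorted(flows, key=lambda f: -f[1]):
        uplink = load.index(min(load))   # least-loaded uplink right now
        placement[flow_id] = uplink
        load[uplink] += mbps
    return placement, load

def move_flow(placement, load, flows, flow_id, new_uplink):
    """'Network vMotion': re-pin one flow to a different uplink
    without touching the VM."""
    mbps = dict(flows)[flow_id]
    load[placement[flow_id]] -= mbps
    placement[flow_id] = new_uplink
    load[new_uplink] += mbps

flows = [("web", 8), ("db", 6), ("backup", 4), ("logs", 2)]
placement, load = assign(flows, 2)
print(load)                              # [10, 10]: balanced, no hash
move_flow(placement, load, flows, "logs", 1)
print(load)                              # [8, 12]: one flow moved live
```

A static 5-tuple hash can land two elephant flows on the same uplink and leave them there; the point of the sketch is that placement becomes a decision that can be revisited per flow.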
Network Driven DRS
Building on Network vMotion, we can now implement automated vMotion for networks based on VM network I/O: Network Driven DRS. Based on who is communicating with whom, can we bring these systems closer together rather than looking only at compute (CPU/memory) resources? First, let's put them on the same physical host (if possible). If not, let's get them on a different server within the same rack (if possible). This means we need rack awareness. Beyond that, that's it! In large networks, you will need a Clos-based fabric, or something like Plexxi.
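The placement preference above (same host, else same rack, else stay put) can be sketched in a few lines. The topology dict, host names, and function signature are hypothetical; a real DRS-style engine would also weigh compute constraints before moving anything.

```python
def preferred_host(vm_host, peer_host, topology, candidates):
    """Pick where a VM should land to sit near its chatty peer.
    topology maps host -> rack; candidates are hosts with capacity."""
    if vm_host == peer_host:
        return vm_host                   # already co-located: no move
    if peer_host in candidates:
        return peer_host                 # best: same physical host
    peer_rack = topology[peer_host]
    same_rack = [h for h in candidates if topology[h] == peer_rack]
    if same_rack:
        return same_rack[0]              # next best: same rack
    return vm_host                       # otherwise, leave it alone

topology = {"h1": "r1", "h2": "r1", "h3": "r2"}
# VM on h3 talks heavily to a peer on h1; only h2 has spare capacity.
print(preferred_host("h3", "h1", topology, candidates=["h2"]))  # h2
```

The `topology` dict is exactly the rack awareness the paragraph above calls for: without it, the scheduler can't distinguish "different host, same rack" from "across the fabric."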
It would still be nice to see a network hypervisor – I’m sure we will one day.
This is the second post I didn’t mention SDN. I think that is a positive thing because the focus is shifting to more concrete examples and relevant use cases. For now, I hope and assume any device (physical or virtual) going forward will have programmatic interfaces to be able to achieve maximum agility to meet the demands of those application and business owners.
As always, we’ll have to wait and see how this will all pan out in the end.
Regards,
Jason
Follow me on Twitter: @jedelman8