We know virtualization, both server and network, offers a means of abstracting the underlying physical hardware. Once the hardware is abstracted, though, how much visibility should there be into the virtual networks or virtual servers?
How much information about virtual servers can be gleaned from accessing a physical server OS?
With all of that said, while voice and video traffic are kept local to an Enterprise, network operators have full visibility into this traffic on a hop-by-hop basis. This comes in handy for those trying to troubleshoot why their CXO just had a call dropped for the third time in two hours. In current forms of network virtualization, where an overlay is created between hypervisor switches, this wouldn’t be possible. Rather, the network operator would likely troubleshoot a tunnel. Tunnel is up, network is up. Can it be that simple? The big question is: how many organizations, and which types, will deploy network virtualization without regard for being able to troubleshoot individual flows in the physical network?
What about carriers and Telcos? What about Internet-based voice applications like Skype? Are those services good enough?
Random : In general APM/NPM are often deployed in a passive manner (via SPAN/tap) using something like RVBD Cascade mentioned above, but for those who have access to the actual servers to load agents on, Boundary has a slick solution that is able to gather flow level data from every server and correlate it together. This makes Boundary a great solution if servers are deployed in multiple data centers and in the public cloud. How did Boundary get that domain name is what I want to know? :)
In a recent post, I talked about Network vMotion and Network Driven DRS. I still believe there could be benefit in integrating the virtual network and physical network. In this model, intelligence would be exchanged between the network virtualization manager (NVP, etc.) and the physical top of rack switches, not the full network including Core/Distribution. Just the virtual edge and physical edge.
Resource pools are created within vSphere today, and a virtual server admin would not add a new VM to a resource pool that is already at 98%. Why would we allow a VM, or group of VMs, to be added to physical hosts that connect to a Top of Rack (TOR) switch that already has its uplinks at 98% utilization? Simple answer. We can’t prevent it today because there is no communication between the physical and virtual networks to know this utilization or dynamically react to it, hence Network Driven DRS. Rather than placement being driven by the network, network I/O should be an attribute the hypervisor uses to calculate proper VM placement.
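To make the idea a bit more concrete, here is a minimal Python sketch of what treating TOR uplink utilization as a placement attribute might look like. Everything in it is hypothetical: `TorSwitch`, `Host`, `pick_host`, and the 80% ceiling are names and numbers I made up for illustration, not anything vSphere or DRS exposes today.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TorSwitch:
    """Hypothetical view of a TOR switch's uplinks as seen by the hypervisor manager."""
    name: str
    uplink_capacity_gbps: float
    uplink_used_gbps: float

    def utilization_after(self, extra_gbps: float) -> float:
        # Projected uplink utilization if extra_gbps of demand is added.
        return (self.uplink_used_gbps + extra_gbps) / self.uplink_capacity_gbps


@dataclass
class Host:
    """Hypothetical physical host and the TOR switch it connects to."""
    name: str
    tor: TorSwitch


def pick_host(hosts: List[Host], vm_demand_gbps: float,
              max_utilization: float = 0.80) -> Optional[Host]:
    """Return the host whose TOR uplinks would be least utilized after
    placement, or None if every TOR would exceed max_utilization."""
    candidates = [
        h for h in hosts
        if h.tor.utilization_after(vm_demand_gbps) <= max_utilization
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda h: h.tor.utilization_after(vm_demand_gbps))


# Example: one TOR already at 98% on its uplinks, one lightly loaded.
tor_a = TorSwitch("tor-a", uplink_capacity_gbps=40.0, uplink_used_gbps=39.2)
tor_b = TorSwitch("tor-b", uplink_capacity_gbps=40.0, uplink_used_gbps=10.0)
hosts = [Host("esx-01", tor_a), Host("esx-02", tor_b)]

chosen = pick_host(hosts, vm_demand_gbps=2.0)
print(chosen.name if chosen else "no host has network headroom")
```

In this toy setup the host behind the 98% TOR is filtered out, the same way an admin would skip a resource pool that is nearly full. A real scheduler would weigh this alongside CPU and memory rather than using it alone.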
How do we mitigate this? Can we deploy something like Call Admission Control (CAC) for network virtualization?
- Ensure there is no oversubscription, and effectively unlimited bandwidth, in the data center by using a non-blocking fabric; this may be the recommendation from vendors who sell only network virtualization
- Use optical in the data center to bring bandwidth where it’s needed – a moving partial mesh; this would be the recommendation for those who sell physical networks with optical integration that tie back into the hypervisor
- Load a hypervisor (VMware, etc.) agent on each physical TOR switch (VMware tools-ish) so total bandwidth capacity and utilization are known and bandwidth can be a factor in VM placement; while this sounds interesting, I’m not sure incumbent network vendors would entertain this. Possibly on open platforms or whitebox solutions.
- Because what I described is applicable to network virtualization using VXLAN or VLANs, there could be parallel communication between the hypervisor manager (vCenter) and a focal point of physical network control. Hmmm --- we don’t have controllers in the physical network, yet. This could be an ideal solution once there is a controller in the data center communicating/controlling TOR physical switches. This solution would differentiate a particular vendor’s hardware solution.
- Develop rack awareness for physical hosts and monitor total network bandwidth utilization for all VMs in a given rack. This could be something a server virtualization vendor can accomplish fairly simply with CDP/LLDP and bandwidth tracking.
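The last option is probably the easiest to picture in code. Below is a rough Python sketch, assuming the host-to-TOR mapping has already been learned from CDP/LLDP neighbor data and that per-VM bandwidth is already being tracked; how either gets collected is outside the sketch. The function names (`rack_utilization`, `admit`), the sample data, and the 80% threshold are all hypothetical.

```python
from collections import defaultdict
from typing import Dict


def rack_utilization(host_to_tor: Dict[str, str],
                     vm_to_host: Dict[str, str],
                     vm_gbps: Dict[str, float]) -> Dict[str, float]:
    """Sum tracked per-VM bandwidth by the TOR switch each VM's host attaches to."""
    totals: Dict[str, float] = defaultdict(float)
    for vm, host in vm_to_host.items():
        tor = host_to_tor[host]            # rack awareness, e.g. from LLDP neighbors
        totals[tor] += vm_gbps.get(vm, 0.0)
    return dict(totals)


def admit(tor: str, demand_gbps: float,
          totals: Dict[str, float],
          capacity_gbps: Dict[str, float],
          threshold: float = 0.80) -> bool:
    """CAC-style check: admit the VM only if the rack's projected load
    stays at or below threshold * uplink capacity."""
    projected = totals.get(tor, 0.0) + demand_gbps
    return projected <= threshold * capacity_gbps[tor]


# Hypothetical inventory: two hosts in rack A, one in rack B.
host_to_tor = {"esx-01": "tor-a", "esx-02": "tor-a", "esx-03": "tor-b"}
vm_to_host = {"web01": "esx-01", "web02": "esx-02", "db01": "esx-03"}
vm_gbps = {"web01": 4.0, "web02": 3.5, "db01": 1.0}
capacity_gbps = {"tor-a": 10.0, "tor-b": 10.0}

totals = rack_utilization(host_to_tor, vm_to_host, vm_gbps)
for tor in sorted(capacity_gbps):
    print(tor, "admit 1 Gbps VM:", admit(tor, 1.0, totals, capacity_gbps))
```

This is the Call Admission Control idea applied per rack: a new VM is refused where the uplinks would run hot and allowed where there is headroom. It only sees VM traffic the hypervisor tracks, so anything else sharing those uplinks would be invisible to it.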
What do you think? I’d love to hear your thoughts.
Thanks,
Jason
Twitter: @jedelman8