
Visibility in the Network Going Forward

4/14/2013

What sort of insight should the physical network fabric offer network operators when it comes to deploying network virtualization? It is a great question, and the answer is really going to vary based on who answers it. Martin Casado and co. recently voiced their perspective here. As always, Martin's blogs are a great read, and I encourage you to follow him at NetworkHeresy if you aren't already, although there haven't been many posts since the Nicira acquisition. It looks like he is making it a community-based blog going forward, so let's hope to see more soon.

We know virtualization, server and network alike, offers a means of abstracting the underlying physical hardware. Once the hardware is abstracted, though, how much visibility should there be into the virtual networks or virtual servers?

As I said before, I think it will come down to the type of customer and what they require. A large-scale, multi-tenant public cloud will likely view the network as an IP fabric. An Enterprise, on the other hand, may want to maintain end-to-end visibility because that is what they know, and because of the fear of not being able to troubleshoot when a problem arises. In the long run, though, that could very well change. Time will tell.

How much information about virtual servers can be gleaned from accessing a physical server OS?

While I usually agree with much of what Martin says, I'm not sure I like comparing voice to network virtualization. The part I don't agree with concerns the Enterprise customer. As he has stated, with advanced codecs, abundant bandwidth, and some QoS, voice on an IP network has come a long way; we no longer even discuss RSVP when it comes to IP voice. Phone calls are peer to peer, with call setup handled by a call controller, much like tunnel setup in network virtualization. One could draw even further similarities between call conferencing, which requires multicast to join multiple parties, and what the current control plane, or lack thereof, looks like in some forms of network virtualization.

With all of that said, while voice and video traffic are kept local to an Enterprise, network operators have full visibility into this traffic on a hop-by-hop basis. This comes in handy for those trying to troubleshoot why their CXO just had a call dropped for the third time in two hours. In current forms of network virtualization, where an overlay is created between hypervisor switches, this wouldn't be possible. Rather, the network operator would likely troubleshoot a tunnel. The tunnel is up, so the network is up. Can it be that simple? The big question is: how many, and which types of, organizations will deploy network virtualization without regard for being able to troubleshoot individual flows in the physical network?

What about carriers and Telcos?  What about Internet-based voice applications like Skype?  Are those services good enough?

That doesn't paint the best picture, because some visibility can still be maintained with network virtualization, but it must be done locally on the vSwitch. For example, NetFlow data can still be pulled from the virtual switch. In addition, APM/NPM tools such as Riverbed Cascade (as an example) can be deployed on each physical host to correlate data entering the tunnel (pre-encapsulation) and exiting the tunnel (post-decapsulation), tying together a single flow within a particular tenant. This would allow a network operator to see end-to-end latency and packet loss, along with full traffic analysis/statistics. If SLAs were being offered, it would still be simple to prove whether they are being met or broken. This type of network visibility could be offered under or over the cloud. If the physical network had different paths in its IP core, it could then *possibly* shift that traffic appropriately based on traditional routing metrics.
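To make that more concrete, here's a rough sketch in Python of how flow records exported at the ingress vSwitch (pre-encapsulation) and at the egress vSwitch (post-decapsulation) could be matched on the inner 5-tuple per tenant to estimate end-to-end latency and loss. The record fields and numbers are invented for illustration; they aren't tied to any specific NetFlow export or Cascade feature.

# Sketch: correlate per-flow records seen at the ingress vSwitch
# (pre-encapsulation) and the egress vSwitch (post-decapsulation).
# FlowRecord and its fields are made up for illustration.

from collections import namedtuple

FlowRecord = namedtuple(
    "FlowRecord", "tenant src_ip dst_ip src_port dst_port proto ts_ms packets"
)

def correlate(ingress_records, egress_records):
    """Match flows on (tenant, inner 5-tuple) and report latency/loss."""
    egress_by_key = {
        (r.tenant, r.src_ip, r.dst_ip, r.src_port, r.dst_port, r.proto): r
        for r in egress_records
    }
    results = []
    for i in ingress_records:
        key = (i.tenant, i.src_ip, i.dst_ip, i.src_port, i.dst_port, i.proto)
        e = egress_by_key.get(key)
        if e is None:
            continue  # flow was never seen leaving the tunnel
        results.append({
            "tenant": i.tenant,
            "flow": key[1:],
            "latency_ms": e.ts_ms - i.ts_ms,
            "packets_lost": i.packets - e.packets,
        })
    return results

# Example: one tenant flow observed on both ends of the tunnel.
ingress = [FlowRecord("tenant-a", "10.0.0.5", "10.0.1.9", 49152, 443, "tcp", 1000, 120)]
egress = [FlowRecord("tenant-a", "10.0.0.5", "10.0.1.9", 49152, 443, "tcp", 1004, 118)]
print(correlate(ingress, egress))

The point is simply that the correlation key lives entirely in the inner headers, so the physical fabric in the middle never needs to understand the overlay for this kind of visibility to exist.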

Random: in general, APM/NPM tools are often deployed in a passive manner (via SPAN/tap) using something like the Riverbed Cascade mentioned above, but for those who have access to the actual servers to load agents on, Boundary has a slick solution that can gather flow-level data from every server and correlate it together. This makes Boundary a great fit if servers are deployed in multiple data centers and in the public cloud. How did Boundary get that domain name, is what I want to know? :)

In a recent post, I talked about Network vMotion and Network Driven DRS. I still believe there could be benefit in integrating the virtual network and the physical network. In this model, intelligence would be exchanged between the network virtualization manager (NVP, etc.) and the physical top-of-rack switches, not the full network including Core/Distribution. Just the virtual edge and the physical edge.

Resource pools are created within vSphere today, and a virtual server admin would not add a new VM to a resource pool that is already at 98%. Why would we allow a VM, or group of VMs, to be added to physical hosts that connect to a Top of Rack (ToR) switch that already has its uplinks at 98% utilization? Simple answer: we can't prevent it today, because there is no communication between the physical and virtual networks to learn this utilization or dynamically react to it, hence Network Driven DRS. Rather than being driven by the network, network I/O should be an attribute the hypervisor uses to calculate proper VM placement.
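If that communication did exist, the placement logic itself wouldn't need to be complicated. Here's a rough sketch in Python of what a network-aware placement check could look like; the host-to-ToR mapping, the utilization numbers, and the 80% ceiling are all made up for illustration, since as noted above nothing feeds this data to the hypervisor today.

# Sketch of "network-driven DRS": treat ToR uplink utilization as one more
# input to VM placement.  All data structures and thresholds are invented
# for illustration; today nothing supplies this data to the hypervisor.

def pick_host(hosts, uplink_util, max_util=0.80):
    """hosts: {host: tor_switch}; uplink_util: {tor_switch: 0.0-1.0}.

    Return the host whose ToR uplinks are least utilized, skipping any
    ToR already above the ceiling (think of the 98% example above).
    """
    candidates = [
        (uplink_util[tor], host)
        for host, tor in hosts.items()
        if uplink_util[tor] < max_util
    ]
    if not candidates:
        return None  # no rack has headroom; placement should be refused
    return min(candidates)[1]

hosts = {"esx-01": "tor-1", "esx-02": "tor-1", "esx-03": "tor-2"}
uplink_util = {"tor-1": 0.98, "tor-2": 0.41}
print(pick_host(hosts, uplink_util))  # -> "esx-03"

Once uplink utilization is visible to the scheduler, it simply becomes one more constraint alongside CPU and memory.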

How do we mitigate this?  Can we deploy something like Call Admission Control (CAC) for network virtualization?

  1. Ensure there is no oversubscription and effectively unlimited bandwidth in the data center by using a non-blocking fabric; this may be the recommendation from vendors who sell only network virtualization.
  2. Use optics in the data center to bring bandwidth where it's needed, a moving partial mesh; this would be the recommendation from those who sell physical networks with optical integration that ties back into the hypervisor.
  3. Load a hypervisor (VMware, etc.) agent on each physical ToR switch (VMware Tools-ish) so total bandwidth capacity and utilization would be known and bandwidth could be a factor in VM placement; while this sounds interesting, I'm not sure incumbent network vendors would entertain it. Possibly on open platforms or whitebox solutions.
  4. Because what I described is applicable to network virtualization using VXLAN or VLANs, there could be parallel communication between the hypervisor manager (vCenter) and a focal point of physical network control. Hmmm, we don't have controllers in the physical network yet. This could be an ideal solution once there is a controller in the data center communicating with and controlling the physical ToR switches. This would differentiate a particular vendor's hardware offering.
  5. Develop rack awareness for physical hosts and monitor total network bandwidth utilization for all VMs in a given rack. This could be something a server virtualization vendor could accomplish fairly simply with CDP/LLDP and bandwidth tracking; a rough sketch of the idea follows this list.
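To show how little would actually be needed for option 5, here's a rough sketch in Python. It assumes the hypervisor manager already has LLDP/CDP neighbor data per host and per-VM throughput stats; the dictionaries and the 80% threshold are illustrative only, not any real vCenter API.

# Sketch of option 5: build rack awareness from LLDP/CDP neighbor data on
# each host and track aggregate VM bandwidth per rack.  The input
# dictionaries stand in for data a hypervisor manager could already collect.

def racks_over_budget(lldp_neighbors, vm_throughput_mbps, vm_host, uplink_capacity_mbps):
    """lldp_neighbors: {host: tor_switch} learned via LLDP/CDP.
    vm_throughput_mbps: {vm: observed Mbps}; vm_host: {vm: host}.
    uplink_capacity_mbps: {tor_switch: total uplink Mbps}.
    Returns {tor_switch: utilization} for racks above 80% of uplink capacity.
    """
    per_rack = {}
    for vm, mbps in vm_throughput_mbps.items():
        tor = lldp_neighbors[vm_host[vm]]
        per_rack[tor] = per_rack.get(tor, 0.0) + mbps
    return {
        tor: load / uplink_capacity_mbps[tor]
        for tor, load in per_rack.items()
        if load / uplink_capacity_mbps[tor] > 0.80
    }

lldp_neighbors = {"esx-01": "tor-1", "esx-02": "tor-2"}
vm_host = {"web-01": "esx-01", "db-01": "esx-01", "app-01": "esx-02"}
vm_throughput_mbps = {"web-01": 6000.0, "db-01": 2500.0, "app-01": 1000.0}
uplink_capacity_mbps = {"tor-1": 10000.0, "tor-2": 10000.0}
print(racks_over_budget(lldp_neighbors, vm_throughput_mbps, vm_host, uplink_capacity_mbps))
# -> {'tor-1': 0.85}

Racks that come back over budget could then be excluded from placement, which ties this option directly back to the Network Driven DRS idea above.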

What do you think?  I’d love to hear your thoughts.

Thanks,
Jason

Twitter: @jedelman8
