Jason Edelman's Blog

The OpenStack Network Node - Layer 3 Agent

6/2/2014

When networks are deployed in a box-by-box model, network admins know exactly what, where, and how something is being configured.  In highly dynamic environments, this may not be the case, which is why it's crucial to understand what is really going on behind the scenes.  In OpenStack, several components together comprise OpenStack Networking (aka Neutron).  These include the Neutron server, the DHCP agent, the metadata agent, the L3 agent, and the agents that reside in the infrastructure being programmed (on physical and/or virtual switches).  For example, in Open vSwitch deployments, there is a Neutron OVS agent on each host/server.  The exact set of agents can also vary based on which vendor plugin is being used!
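If you want to see which of these agents are alive in a given deployment, the Neutron CLI can list them.  A minimal sketch (the output columns and the exact set of agents shown will vary by deployment and plugin):

    $ neutron agent-list
    # Typical agent types in an Open vSwitch deployment include:
    #   L3 agent             (neutron-l3-agent, often on a network node)
    #   DHCP agent           (neutron-dhcp-agent)
    #   Metadata agent       (neutron-metadata-agent)
    #   Open vSwitch agent   (neutron-openvswitch-agent, one per host)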
In this post, I’m going to mainly focus on the Neutron Layer 3 agent because I had a hard time grasping this one at first.  It turns out that it’s not so bad after all.

When I first started reading about Neutron, I saw many references stating that only one (1) layer 3 agent was supported in a given deployment.  That just didn't seem to make sense, because it would surely be a massive chokepoint.  It was indeed the case at one point, but fast forward to the last few releases and multiple layer 3 agents are now supported.

Let’s dive in…

First, we need to understand that a layer 3 gateway is required to communicate with the outside world, i.e. the non-OpenStack environment.  Sometimes this is referred to as the external network, the provider network, or simply the Internet.  The terminology varies based on who you're talking to and whether they have any kind of network background.

For the rest of this post, I'm going to change it up and go with a Q&A style, answering a few questions about the L3 agent/gateway and 'neutron-server' that helped me understand this a bit better.  I hope it can do the same for others too.

The sample diagram below shows a network node with the L3 agent installed.  With three tenants, there would be three virtual routers on the network node.

[Figure: High-level view of an OpenStack Neutron network node. Source: docs.openstack.org]


What’s the difference between a layer 3 gateway and a logical (or virtual) router?

This may be where I was stuck initially.  You can take a beefed-up Linux machine and call that the layer 3 gateway.  This would be the typical “physical” perimeter router for North/South traffic.  Within, or inside, that physical server/router, there will be logical/virtual routers.  Said a little differently, logical routers are instantiated on the layer 3 gateway node.  Going a step further, each logical router is deployed using a Linux network namespace.  The namespaces are analogous to VRFs in the network world.
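On the network node itself, you can see these namespaces directly with the iproute2 tools.  A minimal sketch; qrouter- is Neutron's naming convention for router namespaces, and the UUID below is hypothetical:

    # Each tenant router shows up as a qrouter-<router-uuid> namespace
    $ ip netns list
    qrouter-a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d

    # Inspect the router's routing table from inside its namespace,
    # much like checking a VRF's table on a traditional router
    $ ip netns exec qrouter-a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d ip route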

Note: the layer 3 gateway could be a physical or virtual machine.  However, it is common to use a physical machine due to potentially high network I/O requirements.  It is even more common to have all OpenStack services on one or two machines in test/dev OpenStack environments!

Tangent: I think my initial struggle to understand this came down to the question of why you would only have one network node in a multi-tenant environment.  My instinct was to have one (1) virtual router per tenant, using virtual machines to offer a scale-out network perimeter with guaranteed performance per tenant.  It turns out this is technically possible, but OpenStack environments that need to operate at scale would then burn an enormous amount of compute on gateway services.  This would be analogous to NOT using VRFs in the traditional network world.

Okay, that’s layer 3 gateways and logical routers, but what is a gateway service?

A gateway service is one step higher in the levels of abstraction being used.  I can best describe this using VMware NSX.  NSX supports up to three (3) layer 3 gateway services.  Each gateway service supports up to ten (10) layer 3 gateways.  Each gateway could support tens to hundreds or more (based on capacity) logical routers.  Don’t forget each logical router is a Linux namespace.  The tenant’s virtual router is deployed onto a gateway service, which in turn picks a gateway to deploy it on.  The scheduling and selection of which gateway to use is still primitive, and it seems there are things coming in the next OpenStack/Neutron release to improve this.

Note: OpenStack documents state Neutron itself supports multiple network nodes (layer 3 agents), but I couldn’t find any specifics to compare to the numbers I used for NSX above.  If anyone has these numbers, please feel free to comment below.
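For what it's worth, the Neutron CLI does let you see how routers are scheduled across multiple L3 agents.  A sketch ('router1' is a hypothetical router name):

    # Show which L3 agent(s), and therefore which host(s), are
    # hosting a given router
    $ neutron l3-agent-list-hosting-router router1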


Update: While OpenStack Neutron itself supports multiple layer 3 gateways, the "gateway service" is a VMware NSX-specific construct (not a Neutron one).

Where does the default gateway reside for hosts in a given logical network segment? 

As you may guess, this would be the “inside” interface of the logical router, i.e. the network namespace.  The namespace connects to at least two Open vSwitch bridges within the network node: one that connects to the inside OpenStack environment and one that connects to the “outside world.”
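You can see both sides of this from the network node.  By convention, the namespace's qr- interface faces the tenant network (its address is the tenants' default gateway) and the qg- interface faces the external network; br-int and br-ex are the usual bridge names in an Open vSwitch deployment.  A sketch with a hypothetical UUID:

    # Inside the router namespace:
    #   qr-* = "inside" interface (the tenants' default gateway)
    #   qg-* = "outside" interface toward the external network
    $ ip netns exec qrouter-<router-uuid> ip addr

    # And on the network node itself, the OVS bridges:
    #   br-int = integration bridge (inside the OpenStack environment)
    #   br-ex  = external bridge (toward the outside world)
    $ ovs-vsctl show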

What if a single tenant needs multiple layer 2 segments? 

Multiple logical segments can be attached to a single logical router.  On the backend, this just means the namespace will have more than two interfaces.
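Attaching another segment is one CLI call per subnet.  A sketch using hypothetical names (router1, web-subnet, db-subnet, ext-net):

    # Create a tenant router and set its external gateway
    $ neutron router-create router1
    $ neutron router-gateway-set router1 ext-net

    # Attach two tenant subnets; each one adds another qr- interface
    # (and a connected route) inside the router's namespace
    $ neutron router-interface-add router1 web-subnet
    $ neutron router-interface-add router1 db-subnet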

If a logical router can have multiple internal interfaces and route between them, does this mean all layer 3 traffic between those network segments needs to be hair-pinned all the way back to the network node where the L3 agent resides?

Interestingly enough, this was the case, and it may still be the case depending on a few factors.  But it is in fact possible to leverage the layer 3 daemon (ovs-l3d) in Open vSwitch to enable and take advantage of distributed layer 3 routing in the kernel.  In this deployment, each local (per-host) instance of OVS uses the MAC addresses of the logical router (namespace) interfaces on the gateway node to enable in-kernel routing on each host for all East/West traffic that shares a common logical router.  Only traffic destined for the outside world then needs to transit the network node (layer 3 agent).

Note:  if NAT is a requirement, traffic would also need to transit the network node.
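A simple way to check whether East/West traffic is hair-pinning through the network node is to watch the router namespace while two instances on different segments talk to each other.  A sketch with hypothetical names:

    # If inter-subnet traffic shows up here, it is transiting the
    # L3 agent (hair-pinning); with distributed routing in place,
    # only North/South and NAT traffic should appear
    $ ip netns exec qrouter-<router-uuid> tcpdump -n -i qr-<interface-id>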

Does the layer 3 agent need to be on a dedicated “network node?”

No.  As can be seen in many development/test OpenStack distributions, the layer 3 agent is often found on the “OpenStack Controller,” where the neutron-server also resides.  These test stacks can make it more confusing to understand all of the components, because everything resides on the same machine.

Does “OpenStack Controller” have anything to do with an “SDN controller?”

No, no, no.  From a network/Neutron perspective, the OpenStack controller is where the ‘neutron-server’ service resides, basically exposing the Neutron APIs for other OpenStack services to consume.  As defined by OpenStack, “neutron-server provides a webserver that exposes the Neutron API, and passes all webservice calls to the Neutron plugin for processing.”  This means that in an SDN or network virtualization environment that happens to use an SDN controller, neutron-server passes the API calls on to the SDN controller, e.g. OpenDaylight, NSX controllers, Cisco APIC, etc.
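You can see which plugin a given neutron-server is forwarding to right in its config.  A minimal illustration (the ML2 value below is one common choice; a controller-backed deployment would point at the vendor's plugin class instead):

    # On the controller node, check which plugin neutron-server loads
    $ grep core_plugin /etc/neutron/neutron.conf
    core_plugin = neutron.plugins.ml2.plugin.Ml2Plugin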

These technologies are still emerging, so if you happen to be someone deeply immersed in OpenStack Neutron and feel I need to make any corrections, please don’t hesitate to leave a comment below.


Thanks,
Jason

Twitter: @jedelman8

4 Comments
Michael
6/2/2014 01:22:10 pm

Hi Jason,
I've got a couple of questions:
1. Does OpenStack have an OpenFlow API? In other words, how difficult is it to integrate with SDN controllers?
2. You mentioned that OpenStack is typically deployed on a dedicated physical machine. However, I've seen somewhere that OpenStack is typically deployed in virtual environments, sometimes using "Linux containers". Is your conclusion based on real-world observations?
Cheers,
Michael

Jason Edelman
6/2/2014 02:03:51 pm

Hi Michael,

1. As you hopefully read in my last question, neutron-server exposes the Neutron API and then passes the calls to the specific Neutron plugin, which in turn configures the network resources. OpenFlow may be used as a southbound protocol by a given plugin, but it's in no way tied to Neutron or OpenStack.

To answer the second part of the question, as far as difficulty, it depends on skill set and, more importantly, the documentation of the plugin you'd like to use. Lean on the vendor whose network solution you are trying to use. For example, you will likely find detailed instructions and docs from VMware, Cisco, PLUMgrid, Nuage, BigSwitch, and the list goes on, if you wish to use their network platforms in an OpenStack environment. The public docs are always a good start. Ultimately, I'd like to say it wouldn't be difficult, but running into a bug or two wouldn't be uncommon :)

Please reference the link for the 'how-to' on configuring specific Neutron plug-ins:

http://docs.openstack.org/admin-guide-cloud/content/section_plugin-config.html


2. Need more clarity. I was specifically referring to the "network node" running the layer 3 agent as a physical machine in prod environments. OpenStack itself is for sure a cloud management platform (CMP) for highly dynamic virtual environments. As mentioned earlier, you can also go ahead and download an open source distro to get started with all services on a SINGLE node (VM) - this is done to ease testing and not require a bunch of kit to begin testing with OpenStack, but could also make things confusing when trying to learn what a solid design should be.

Does this all make sense? Clear as mud?

Thanks,
Jason

Chris Marino
6/3/2014 04:15:16 am

The Distributed Virtual Routing (DVR) function you describe (to avoid tenant segment hair-pinning to the network node) is not available in the current Icehouse release. It's supposed to be included in the upcoming Juno release.

Jason Edelman
6/3/2014 04:58:10 am

Hi Chris,

Thanks for the clarification. A point to note for all readers (myself included): if solutions are offering this or something similar today, they are using custom vendor extensions/functionality beyond what Neutron can do today.



