Currently in Migration - Jason Edelman's Old Blog

Leveraging Python on Network Devices to Monitor Interfaces in Realtime

2/9/2014

In a recent post, I wrote about some Python work I was testing on the Nexus 3000.  The conclusion was that open Linux platforms offer more flexibility to the consumer of the technology, ultimately the customer.  In this post, we’ll take a look at an example that integrates Python with the native Linux operating system.
In the context of networking, the question often arises: what does having access to Linux really gain you?  For one, as you can see from my last post and will see in this one, native scripting in bash and Python is extremely valuable in itself.  You would also be able to load any Linux-compatible software you want (think tools, management/monitoring platforms, etc.).

Okay, so you’d have the ability to use Python on a network switch.  So what?  What about running onboard analytics on the switch?  What about sending the exact data you need, the data you use to troubleshoot, the data that is part of your operational workflow, directly upstream to a head end server, or simply to an existing syslog server?  Think about getting exactly what you want from each device rather than being told what you can have access to.

In this example application, the goal was threefold: 1) build a lightweight, network-centric client/server application in which the server is optional, 2) learn and test basic Python network socket programming, and 3) run some basic analytics on the switch.

The use case was detecting errors on an interface and being alerted if there were more than a certain number of transmit errors in a given time interval.  As you’ll see in the code, I was actually detecting more than 3 packets sent in a 5 second interval, because packets sent were easier to simulate and test; the error counters are available in the same data.  It’s possible to poll for data like this with SNMP, but the polling intervals are usually too long to capture the data desired.  Here, the agent on the Linux device, which I call FBIAgent, constantly examines the packets, errors, etc. and alerts the head end server (no polling required in this case).  As stated earlier, the goal was to make the head end optional.  It’s optional because instead of triggering an alert through the socket built in Python, you can simply generate a normal syslog message if that is easier and, more importantly, REACT to the event!
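
To make the trigger concrete, here is a minimal Python sketch of that kind of check.  This is not the original FBIAgent code: the names `ThresholdWatcher` and `alert_via_syslog` are made up for illustration, and the defaults simply mirror the 3 packets / 5 seconds used in this post.  It assumes the agent hands it a timestamp and the current counter value each time it samples the interface.

```python
from collections import deque


class ThresholdWatcher:
    """Flag when a counter grows by more than `threshold` within `window` seconds.

    Hypothetical helper for illustration; not taken from the original agent.
    """

    def __init__(self, threshold=3, window=5.0):
        self.threshold = threshold
        self.window = window
        self.samples = deque()  # (timestamp, counter_value) pairs, oldest first

    def update(self, timestamp, counter):
        """Record one sample and return True if the alert condition is met."""
        self.samples.append((timestamp, counter))
        # Drop samples that have aged out of the time window.
        while timestamp - self.samples[0][0] > self.window:
            self.samples.popleft()
        oldest_counter = self.samples[0][1]
        return counter - oldest_counter > self.threshold


def alert_via_syslog(interface, delta, window):
    """Optional head end replacement: write a normal syslog message instead."""
    import syslog  # Unix-only standard library module

    syslog.syslog(
        syslog.LOG_WARNING,
        "FBIAgent: %s counter rose by %d in %s seconds" % (interface, delta, window),
    )
```

Feeding it samples at 0, 1 and 2 seconds with counter values 100, 101 and 105 would trip the alert on the third sample, since the counter grew by 5 inside the 5 second window.  Swapping the packets-sent counter for the error counter would not change this logic at all.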

The high level architecture and flow goes like this:

  • FBIAgent actively issues a ‘netstat -i’
  • The result is then stored in a file.  It didn’t have to be stored in a file, but I felt it was easier to parse individual lines of a file than to parse one long string.
  • The result from the file is then parsed and all interfaces starting with ‘eth’ are entered into individual dictionaries.  By the way, dictionaries are pure awesomeness in Python.
  • Actually, a dictionary of dictionaries is also created, just in case it may be needed in future applications
  • The dictionary is made up of key/value pairs of everything in the netstat -i output for each Ethernet interface.  Note: only the packets sent/errors were needed for this application, but storing everything as a dictionary makes the other data, like receive packets and receive errors, accessible if ever needed.
  • My interest was in monitoring eth0, so this application specifically tracks eth0’s total packets sent.  Remember, the purpose is to track errors, but to simulate them I’m using the total packets.  Since the same dictionary has the key/value pair for errors, the change would be extremely easy to make.
  • If there are more than 3 packets (read errors) sent in the last 5 seconds, a trigger is sent to the server.
  • Data sent over a Python socket has to be a string of bytes rather than a Python object.  Ideally, I wanted to send the dictionary itself, but that isn’t possible without serializing it first, so you’ll notice I sent it as a string.
  • In a future version, I may look into using a JSON-RPC library, so the data can be serialized and sent over the wire in JSON and this will eliminate the funkiness of what I’m doing today.
  • The server then receives the data over the socket.  The data arrives as a string, is parsed back into a dictionary (funkiness), and is output in JSON format on the server.
  • The process repeats every time there are more than 3 packets sent in a 5 second interval.
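
The parsing and serialization steps above can be sketched roughly as follows.  Again, this is an illustration and not the code from the slides: the function names are hypothetical, the sample `netstat -i` text is made up (it follows the classic net-tools column layout, which can vary between versions), and it uses Python's standard `json` module rather than a JSON-RPC library to turn the dictionary into bytes and back.  A local `socket.socketpair()` stands in for the real client/server connection so the example runs on one machine.

```python
import json
import socket
import subprocess

# Illustrative output only; not captured from the device used in this post.
SAMPLE_NETSTAT = """\
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0   1500   0    1234      0      0      0     5678      2      0      0 BMRU
eth1   1500   0       0      0      0      0        0      0      0      0 BMU
lo    65536   0     100      0      0      0      100      0      0      0 LRU
"""


def run_netstat():
    """Return live `netstat -i` output (assumes net-tools is installed)."""
    return subprocess.check_output(["netstat", "-i"]).decode()


def parse_netstat(text, prefix="eth"):
    """Build a dictionary of dictionaries keyed by interface name."""
    lines = text.splitlines()
    # The column names come from the header row that starts with "Iface".
    header_idx = next(i for i, line in enumerate(lines) if line.startswith("Iface"))
    columns = lines[header_idx].split()
    interfaces = {}
    for line in lines[header_idx + 1:]:
        fields = line.split()
        if len(fields) != len(columns) or not fields[0].startswith(prefix):
            continue  # skip other interfaces and malformed rows
        interfaces[fields[0]] = {
            key: int(value) if value.isdigit() else value
            for key, value in zip(columns, fields)
        }
    return interfaces


def encode_stats(stats):
    """Serialize a dictionary to UTF-8 JSON bytes for the socket."""
    return json.dumps(stats, sort_keys=True).encode("utf-8")


def decode_stats(payload):
    """Turn received JSON bytes back into a dictionary."""
    return json.loads(payload.decode("utf-8"))


def send_over_socketpair(payload):
    """Push payload through a local socket pair and return what the peer reads."""
    client, server = socket.socketpair()
    try:
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)  # signal end of message to the reader
        chunks = []
        while True:
            chunk = server.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    stats = parse_netstat(SAMPLE_NETSTAT)
    wire = send_over_socketpair(encode_stats(stats["eth0"]))
    print(json.dumps(decode_stats(wire), indent=2))
```

Because the column names are read from the header row, the same dictionary carries TX-ERR alongside TX-OK, so switching from packets sent to transmit errors is a one-key change.  Serializing with JSON on both ends also removes the need to hand-parse a string back into a dictionary on the server.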

I’m on a loaner laptop and don’t have access to Camtasia right now to create a video, so here are some screen shots of the client and server output.  

Note: The slides also have the code used to create this.  The code isn't very easy to read in the slides, but my blog platform is making it very difficult to post source code.  Hope to update it soon.  
I’ll never claim my code to be perfect or best practice, since I’m learning as I go, but feel free to comment below and let me know what you think about this or more generally about agent based network monitoring and analytics tools running on network devices.

By the way, this testing was done on a server and will be tested soon on Arista EOS and hopefully Cumulus Linux.

Thanks,
Jason

Twitter:  @jedelman8
10 Comments
Michael Bushong link
2/10/2014 12:40:07 am

This is fantastic stuff, Jason.

Beyond the obvious (that you are providing examples of real-world automation), I want to point out that this is exactly the kind of thing that people need to be thinking about. SDN comes along, and a lot of people start looking at the future (ephemeral-state-based networks, machine learning, fully orchestrated systems). But there is a lot of ground to cover between where people are and the nirvana that is SDN.

There are practical things that people can do within their current architectures. This type of work is perhaps more impacting at times because it is approachable. And it helps network engineers add new skills to their repertoire.

-Mike

Jason Edelman link
2/10/2014 11:28:15 am

Mike, thanks for the comment and I agree on all fronts. Really hoping these examples start to show what's possible because users shouldn't be limited by a tool or a CLI, even if it ends up being 80% COTS and 20% custom development.

Todd Craw
2/14/2014 04:57:54 am

Really good example of the type of functions that modern network OS need to offer and why. It is also an example of the direction that Network Engineer skills have to start to move toward. Just being CLI jockeys is not going to cut it in 3-5 years IMHO. Great stuff Jason!

Jason Edelman link
2/14/2014 10:12:43 pm

Thanks, Todd. Yep, time will tell what skills will be needed for the future. I'm sure you'd like it if more network guys brush up on their Linux skills ;)

Brandon
2/14/2014 06:12:51 am

Great post. This is what network engineers of the future will be doing(its amazing what you can do with a decent amount of bash/python knowledge). Is there a Github/Gist that you could paste the code on?

Jason Edelman link
2/14/2014 10:11:25 pm

Thanks, Brandon. It is true, the more knowledge gained in bash/python, the more you can do. Until then though, if you ever have an issue, find yourself frustrated, or find yourself doing routine tasks, ask yourself why? Maybe it's not even network related. There is always a solution and who knows, maybe Linux and Python can help. I'm currently working with customers who don't have perceived problems, but spend lots of time logging into devices to get certain information out of them.

Unfortunately, I don't have the code on github yet. Soon though! If you'd like email me at jedelman8 at gmail dot com and I'll send it your way.

mat
2/25/2014 04:07:16 pm

hi jason,

i read your source code from the post, and client side, i see you read a file named 'ifacelog', and analysis the file.

so my question is, how do you get the file? it generated by another scripts? can i get the source code?

thanks.

Jason Edelman link
5/28/2014 04:15:34 am

Mat,

So sorry for not responding sooner. I must have overlooked your comment.

You may have already figured this out, but when I opened the file, I used "w+" which means the following:

w+ : Opens a file for both writing and reading. Overwrites the existing file if the file exists. If the file does not exist, creates a new file for reading and writing.
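
As a rough sketch of that behavior (the file contents here are made up, and only the file name 'ifacelog' comes from the question above), writing and then reading through the same "w+" handle looks like this:

```python
import os
import tempfile

# Put the example file in a scratch directory so nothing real is overwritten.
path = os.path.join(tempfile.mkdtemp(), "ifacelog")

with open(path, "w+") as log:
    log.write("eth0 1500 0 1234\n")  # made-up line standing in for netstat output
    log.seek(0)  # rewind; after writing, the file position is at the end
    first_pass = log.read()

# Reopening with "w+" truncates the file, so the earlier contents are gone.
with open(path, "w+") as log:
    second_pass = log.read()

print(repr(first_pass))
print(repr(second_pass))
```

The `seek(0)` call is the easy part to forget: without it, the read starts at the end of the file and returns nothing.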

It probably could have been stored as a string too and then parsed. I used the file...well, just because :)

I would need to dig up the source code as it has been a few months, but if you're still interested, please write in via the contact page.

Thanks,
Jason

Network Devices link
4/28/2014 02:14:31 am

Great post! Been reading a lot network devices recently. Thanks for the info here!

Brant link
3/5/2015 05:59:45 am

Thanks so much for this post! I've been doing a lot of research on this, and this was a really great read. Keep up the great work on this blog!


    Author

    Jason Edelman, Founder of Network to Code, focused on training and services for emerging network technologies. CCIE 15394.  VCDX-NV 167.

