Ansible Performance – Moving to Jinja2 for Automated Documentation

When Ivan Pepelnjak has advice for you – take it!

I wrote a post about untangling dynamic nested loops in Ansible.

In another recent post about trying to improve Ansible performance I didn’t get very far. This could be the silver bullet I’ve been looking for: a way to make my Fact / Genie parsing playbooks more elegant and to bring my run times down, so I can take this from the lab to production.

Jinja2 Templates

One of the reasons I perked up at Ivan’s generous suggestion is that I am already a big fan and heavy user of Jinja2 templates, both to generate intended configurations (Cisco IOS, NXOS configurations; JSON files for API POST) and documentation (intended configs in CSV, markdown, and HTML). I had just never thought of using them to create my documentation from received data!

My old way involved taking the structured JSON and using lineinfile or copy to create my output files. This was slow. Very slow.

Copy method:

Line In File method:

How to refactor this?

So I already have everything I need content-wise, a header row and the data rows. I just need to move this into Jinja2 format. As it turns out, there are some added benefits beyond performance that I will highlight.

My quick use case was my CiscoNXOSFacts.yml playbook, run against two Nexus 7Ks, just gathering facts (nxos_facts) and transforming the structured JSON into business documentation.

– Create Nice JSON file from facts – Ansible | to_nice_json filter
– Create Nice YAML file from facts – Ansible | to_nice_yaml filter
– Create CSV file from facts
– Create markdown file from facts
– Generate HTML from markdown

So the first refactoring is the task itself, moving from copy or lineinfile to template. Template needs a source (a new Jinja2 template file we will create in our next step).

Template also needs a destination. Here is where we can use the programmatic capabilities of Jinja2 to simplify, optimize, and massively improve performance by setting up a simple loop that creates both files. Wait, files plural? Yes. My old way involved creating 2 separate files in 2 separate tasks. Now that I am using Jinja2 I can loop over two items, “csv” and “md”, and pass each to the template for processing.

So create a Jinja2 template file called CiscoNXOSFactsTemplate.j2 to create your CSV and Markdown files.

Before I show the template I want to highlight another massive improvement that comes with Jinja2: it can iterate naturally over dictionaries, while my previous method had to pass the structured JSON through the | dict2items Ansible filter (again adding processing time). This simplifies the code quite a bit.
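To make the difference concrete, here is a minimal Python sketch (a Jinja2 for loop walks a dictionary's items much as Python does). The `facts` dictionary is made-up sample data, and the `dict2items` function below is only a stand-in that mimics the list-of-key/value shape the Ansible filter returns; it is not the filter itself.

```python
# Hypothetical sample data; real nxos_facts output has many more keys.
facts = {"hostname": "nxos-7k-01", "model": "Nexus7000"}


def dict2items(mapping):
    """Stand-in mimicking the shape of Ansible's dict2items filter output."""
    return [{"key": key, "value": value} for key, value in mapping.items()]


# Old approach: convert the dictionary to a list first, then loop over it.
old_pairs = [(item["key"], item["value"]) for item in dict2items(facts)]

# Template-style approach: loop over the dictionary's items directly.
new_pairs = list(facts.items())

print(old_pairs == new_pairs)
```

Both approaches reach the same key/value pairs; the second simply skips the intermediate conversion step.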

In the template we will test if the loop is on csv or md and create either a csv or md formatted output file.

Else if item is md create the markdown file format
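The two branches above can be sketched in plain Python. This is not the actual CiscoNXOSFactsTemplate.j2; the `render_facts` function, the two-column layout, and the sample data are all assumptions made for illustration.

```python
def render_facts(facts, fmt):
    """Render a flat facts dictionary as CSV text or as a markdown table."""
    if fmt == "csv":
        lines = ["Fact,Value"]
        lines += [f"{key},{value}" for key, value in facts.items()]
    elif fmt == "md":
        lines = ["| Fact | Value |", "| --- | --- |"]
        lines += [f"| {key} | {value} |" for key, value in facts.items()]
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return "\n".join(lines) + "\n"


# Hypothetical sample data standing in for gathered facts.
sample = {"hostname": "nxos-7k-01", "model": "Nexus7000"}

# One loop over the formats replaces two separate file-writing tasks.
outputs = {fmt: render_facts(sample, fmt) for fmt in ("csv", "md")}
print(outputs["md"])
```

The point of the sketch is the shape: one piece of rendering logic, a test on the loop item, and two output formats from a single pass.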

One last, very important benefit of Jinja2 is that I do not need to use regular expressions “as much” to clean up the JSON. | dict2items leaves a lot of garbage JSON characters behind, which I previously had to clean up with processor-intensive RegEx tasks. Now Jinja2 does this cleanup and conversion from RAW to Nice JSON for me!

Results

I have only tested 1 playbook but I am very excited about this new refactored code!

Again, this playbook “only” touches 2 physical devices, but I have playbooks that could potentially be gathering facts and generating artifacts for hundreds of devices. The results are pretty clear, particularly the system time.

Old way:

New way:

So roughly half the “real” time, but look at the system time: from 36 seconds down to a third of that, 12 seconds! WOW!

Thanks again!

A big thanks to Ivan for taking the time to comment and point me in a better direction. You may not know this but when I started my automation journey one of my resources along with several books, Cisco DevNet, trial and error, was my IPSpace.net subscription. If you are looking for a very affordable and very comprehensive library of networking and automation knowledge this is a good place to start.

Can we make Ansible go faster?

When I describe Ansible to people I tend to use many positive adjectives. Amazing, incredible, easy, revolutionary, powerful, and a few others. One adjective I never use, however, is fast. I would not describe Ansible as a high performance tool. Compared to manually doing the things I’ve come to automate with Ansible there is still no doubt I am saving hours if not days of effort. But now that I’m using Ansible for almost everything and at scale it would be great if I could get better performance out of the tool.

Over the years I’ve learned to run the ansible-playbook command and then, chanting it like the late-night infomercial, “Set it and forget it!”

It’s the one, sometimes painful, drawback I can find with Ansible. There has yet to be an infrastructure problem I have not been able to solve with Ansible, provided I am comfortable with waiting. “How long does this take?” change managers or operations will ask. “A while” is usually as optimistic as I can be.

(Note: It is a bit ironic that the crowd complaining about how long a playbook takes to run are usually the same people who were comfortable with pre-automation, manual-at-the-CLI-of-every-device execution times that ran into days or weeks. Now anything more than a 10 minute playbook run seems like a long time. Go figure.)

TL;DR

– Ansible is an amazing automation tool
– Ansible is not known for its performance
– Three modifications tried to make it go faster:
  – LibSSH
  – Forks
  – Pipelining
– No real improvements found with any of the above

Moving to LibSSH

The driving factor for me to bring my Ansible ecosystem into the shop and put it up on the lift to get underneath and into the mechanics of the configurations is this latest official blog post from Ansible.

“Not only is the new LibSSH connection plugin enabling FIPS readiness, but it was also designed to be more performant than the existing Paramiko SSH subsystem.”

This particular section of the blog post is what drives my exploration today. And yes FIPS readiness is important to me – the hook for me here is “designed to be more performant” – and yes the link they provide is great but I want to take the Pepsi Challenge myself.

Playbooks tested

I will be using the following playbooks with 2 different scale sets.

Cisco IOS Facts – Against my lab distribution layer (4 Cisco 4500s) and my access layer (about 20 to 25 devices of various Catalyst flavours: 2960, 3560, 3750, 3850, 9300).

Cisco NXOS Facts – Same idea but against NXOS. 2 Nexus 7000 and 2 Nexus 5000.

The above playbooks use the Ansible facts modules. Let’s do some Genie Parsing of a show command as well.

IOS Genie show ip interface status

Methodology

In Linux you can prepend the time command to any other command. It then reports three timers, the real time, the user time, and the system time, showing how long the command took to execute.
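For readers who want the same three numbers from a script, here is a rough Python sketch. The `time_command` helper is hypothetical, and the command it times here is only a placeholder; on Linux the child user and system times come from `os.times()`.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run a command and return its (real, user, sys) times in seconds."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    real = time.perf_counter() - start
    after = os.times()
    # CPU time consumed by finished child processes, split by user/kernel.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


# Placeholder command; swap in something like ["ansible-playbook", "playbook.yml"].
real, user, system = time_command([sys.executable, "-c", "pass"])
print(f"real {real:.3f}s  user {user:.3f}s  sys {system:.3f}s")
```

Real time is wall-clock time, so it includes waiting on the network and on devices, while user and sys count only CPU time spent in user code and in the kernel.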

Result Set #1 – Defaults

With no changes to default Ansible here are the results. I will be standardizing on the sys results, because input and other factors can cause deviations in the real and user times:

Install LibSSH and modify Ansible

First step is to pip install the Ansible library we need:

Then we need to update our persistent connection settings:

Refresh your Git repo and re-run the playbooks.

Result Set #2 – LibSSH

Forks

Ansible can be set to fork, which allows multiple independent remote connections simultaneously.

Forks are intelligent: Ansible will only fork up to the number of remote targets. So I will set my forks in ansible.cfg to 50.

Now I don’t think this will help playbooks with under 5 targets, because I believe Ansible defaults to 5 forks, but maybe this will improve the Access Layer Facts playbook, which targets around 25 hosts. So let’s just test against that one playbook.
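The reasoning above can be put into rough numbers. The sketch below uses a deliberately simple model, assuming hosts are processed in fixed batches of at most `forks` per task; real scheduling is more fluid than this, and the `fork_plan` helper is hypothetical.

```python
import math


def fork_plan(hosts, forks):
    """Return (parallel workers, batches per task) under a simple model."""
    workers = min(hosts, forks)          # never more workers than targets
    batches = math.ceil(hosts / forks)   # passes needed to cover every host
    return workers, batches


for forks in (5, 50):
    workers, batches = fork_plan(hosts=25, forks=forks)
    print(f"forks={forks}: {workers} workers, {batches} batch(es) per task")
```

Under this model, 25 hosts need five passes per task at the default of 5 forks and a single pass at 50, while a playbook with 4 targets gains nothing from the higher setting.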

Result Set #3 – Forks

Pipelining

Enabling pipelining reduces the number of SSH operations required to execute a module on the remote server, by executing many Ansible modules without actual file transfer. According to the documentation this can result in a very significant performance improvement when enabled.

Add pipelining=true to the SSH connection section in ansible.cfg:
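As a sketch of what that change looks like, the Python below adds the setting to an in-memory stand-in for ansible.cfg and prints the result. The starting contents are an assumption; a real script would read the actual file instead.

```python
import configparser
import io

config = configparser.ConfigParser()
# Stand-in for an existing ansible.cfg; a real script would use config.read(path).
config.read_string("[defaults]\nforks = 50\n")

# Create the SSH connection section if it is missing, then enable pipelining.
if not config.has_section("ssh_connection"):
    config.add_section("ssh_connection")
config.set("ssh_connection", "pipelining", "True")

buffer = io.StringIO()
config.write(buffer)
print(buffer.getvalue())
```

Note that configparser drops comments when it rewrites a file, so for a hand-maintained ansible.cfg it is usually simpler to add the two lines under `[ssh_connection]` by hand.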

Result Set #4 – Pipelining

Summary

I didn’t have much success making Ansible go any faster with the new LibSSH library, with forking, or with pipelining. I absolutely love Ansible, but the above times are from my small-scale lab. Imagine these run times in production at 5-10x the scale.

Have you found a way to make Ansible go faster that I’ve overlooked? Drop me a line!

Sean also jumped in to mention the driver for LibSSH was FIPS not performance and there are some performance improvements coming soon! Great!