Getting VIP and Server Farm stats from a Cisco ACE Load Balancer

Sunday, 15 Sep 2019

As a continuing reminder that we don't all work in Cool Kid Hipster Service Mesh-using Companies ("Kuber-Hetes? She's that 'Seven Stages of Grief Model' author, yeah?"), some of us still work in a fairytale land of Managed Service Providers, ITIL and old kit - like the Cisco ACE30 Load Balancer. At $work, I've got four of these bad boys; two per Data Centre (I know, I know - "Psscht, all the Cool Kids do Cloudless now..."), hosted inside a Cisco Catalyst 6500-series Chassis that does little else than provide power and a backplane for these ACE30 Modules.

An ACE what-what now?

An Application Control Engine 30 (because there were 29 before it; that's how Cisco number things) was Cisco's prime Load Balancing offering, right after the ACE20 and at about the same time as the ACE4710 Appliance; shortly before Cisco saw the writing on the wall in the ADC market and promptly exited Load Balancing entirely, stage right. But no matter; those of us who work for companies that have been around for longer than twenty minutes will likely have encountered one of these and, as it's so old, might be thinking of moving to something more modern, classy, and less EoL/EoS.

While we're doing that, we may as well take the opportunity to clean up all the old cruft that has built up in it over the years; or in ACE-speak, that's:

  • Unused Virtual IP (VIP) Addresses
  • Unused Server Farm (SFarm) Pools/IP Addresses
  • Unused Real Server (RServer) IP Addresses

So what better time to pretend you're a DevOps Cool Kid and break out some Scripting Foo and scrape those stats and figures automagically!

Telnet-scraping: Never as easy as you think

My first attempt was flawed because I assumed Cisco might be up to their old tricks and provide nothing but a Telnet interface in - which I wasn't far wrong about. As these things go in the Enterprise real-world (with a variety of MSPs and technical silos running things like Firewalls, Networks, Servers, Data Centres), you get things like:

  • Only one of the two Data Centres lets you through the Firewall to Telnet to those ACE Load Balancers
    • From certain Source IPs in one and other IPs in another
  • Nobody ever bothered to initialise the RSA Keypair, so SSH doesn't work
  • We couldn't afford the separate ACNM NMS-like Solution to monitor all these
    • Because in old Cisco-land, an NMS was just a software 1:1 extension of the product; they ain't making no money having you able to manage it from one of the many existing Cisco-based NMS Platforms* you've already got
  • Web Browsing to it needs to be done via RDPing to a Box behind the Firewall, and then opening a browser as old as Internet Explorer 9
    • At which point you're met with a hideously basic Page that provides little more than an XML DSD Schema

* = I grew up with CiscoWorks as an NMS for everything; I quickly realised it was just a poorly cobbled-together set of IBM bits, and Java crap - and unlike its plucky name, it rarely ever did (work).

So, you start a Telnet scraper script in PHP - easy enough, you've done this before, and have a box able to run PHP and Telnet to the ACE Load Balancers via the poorly-made Firewall net (by luck, rather than Design). Roughly three hours in, you realise that it's got some weird non-standard Telnet Control Characters everywhere, so your "Expect Scripts" (Send <Username>; Wait n seconds; Send <Password>...) aren't gonna do jack. Hmm, not good; let's go back to the drawing board - didn't it have an HTML UI again?

Get to the Code already!

It does have an HTML UI, but no obvious clue as to what you can do with it... But that DSD Schema download thing, that's XML isn't it? Why would you provide an XML Schema, unless... *Ten minutes of Googling later* AHHH! It's got an XML-based API! And someone has been through this pain with it before.

The XML-API

It's not as well documented as the newfangled HTTP-based RESTful APIs, but ignoring the configuration-set based options, for "show" commands there are two styles of data retrieval:

  1. Get via a Cisco IOS-like "show" command
    1. xml_cmd=<request_raw>show context | inc Name</request_raw>
  2. Get via an element in the XML DSD hierarchy
    1. xml_cmd=<request_xml context-name="ContextName"><show_serverfarm info-level="detail"/></request_xml>
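
The two styles above can be sketched as a pair of string builders. This is only a minimal sketch: the function names buildRawShowCommand and buildXmlShowCommand are hypothetical (they aren't in the script further down), and the strings they produce are simply the two examples above, with no escaping or encoding applied.

```php
<?php
// Hypothetical helpers (not part of the original script): build the xml_cmd
// POST body for each of the two "show" styles.

// Style 1: wrap an IOS-like "show" command in <request_raw>.
function buildRawShowCommand(string $showCommand): string
{
    return "xml_cmd=<request_raw>" . $showCommand . "</request_raw>";
}

// Style 2: request an element from the XML hierarchy within a given Context.
function buildXmlShowCommand(string $context, string $element, string $attribute, string $value): string
{
    return sprintf(
        'xml_cmd=<request_xml context-name="%s"><%s %s="%s"/></request_xml>',
        $context,
        $element,
        $attribute,
        $value
    );
}

echo buildRawShowCommand("show context | inc Name"), "\n";
echo buildXmlShowCommand("ContextName", "show_serverfarm", "info-level", "detail"), "\n";
```

Either string can then be handed to cURL as the POST body, which is what the script does with its hard-coded equivalents.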

Unfortunately, because Cisco's gonna Cisco, much like how their own Business Units rarely seem to talk to each other, so too does the XML API have some inconsistencies such as, for the "detailed flags" (i.e. "show command detail-flag-here"):

  • Sometimes it might get called "info-level"
    • xml_cmd=<request_xml context-name="ContextABC"><show_serverfarm info-level="detail"/></request_xml>
  • But other times it might get called "info-type"
    • xml_cmd=<request_xml context-name="ContextABC"><show_service-policy info-type="summary"/></request_xml>
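
One way to stop that inconsistency biting you is a small lookup table keyed on the show element. A rough sketch, assuming only the two pairings seen above (the constant ACE_DETAIL_ATTRIBUTE and function detailAttributeFor are hypothetical names; any other element would need checking against the schema first):

```php
<?php
// Detail attribute per show element, as observed in the two examples above.
// Only these two pairings come from the post; nothing else is assumed.
const ACE_DETAIL_ATTRIBUTE = [
    'show_serverfarm'     => 'info-level',
    'show_service-policy' => 'info-type',
];

// Hypothetical helper: return the attribute name for a show element,
// failing loudly for anything not yet verified.
function detailAttributeFor(string $element): string
{
    if (!array_key_exists($element, ACE_DETAIL_ATTRIBUTE)) {
        throw new InvalidArgumentException("No known detail attribute for <$element>");
    }
    return ACE_DETAIL_ATTRIBUTE[$element];
}

echo detailAttributeFor('show_serverfarm'), "\n";
echo detailAttributeFor('show_service-policy'), "\n";
```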

The Script

Finally, onto the script. It's coded in PHP for no other reason than I'm familiar with it; it could easily be ported to a cool kid language like Python, as the concepts are transferable. You'll note from the Input and Output Filename Constants (i.e. ACE_FILE) that it's designed to be run on a Windows box. With PHP on Windows, I flip the File Path separators from "\" to "/", because a backslash inside a double-quoted PHP string starts an escape sequence; I don't know if the same is true for other languages, such as Python, on Windows:

  • This Path
    • D:\Folder\file.txt
  • Becomes this in a PHP-on-Windows variable
    • D:/Folder/file.txt

Whereas in a *NIX distro, this would likely just be something like:

/home/script/file.txt
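
To see why the flip is worth doing, here's a small sketch (the helper toForwardSlashes and the D:\temp\new.txt path are made up for illustration). Single-quoted PHP strings keep these backslashes literal, but a double-quoted string quietly turns "\t" and "\n" into a tab and a newline:

```php
<?php
// Hypothetical helper: normalise a Windows-style path to forward slashes.
function toForwardSlashes(string $path): string
{
    return str_replace('\\', '/', $path);
}

// Single quotes keep these backslashes literal, so the flip works as expected.
echo toForwardSlashes('D:\Folder\file.txt'), "\n";

// Double quotes interpret "\t" and "\n" as a tab and a newline,
// so this path is already mangled before it reaches any file function.
$mangled = "D:\temp\new.txt";
var_dump(strpos($mangled, "\t") !== false);
```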

Script Inputs

  • CSV file of all ACE Management Details
    • Variable (Constant): ACE_FILE
    • File: ace_ip.csv
    • Type: CSV file
      • Formatted like "ace_hostname,ace_mgmt_ip,ace_user,ace_pass", i.e.:
        • loadbalancer01,10.99.0.1,nmsuser,Pasword2019
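
As a rough illustration of that input format, a row can be split into named fields like so. The function name parseAceRow is hypothetical; the script itself just indexes $row[0] to $row[3] straight from fgetcsv().

```php
<?php
// Hypothetical helper: turn one input CSV line into named fields,
// using the column order described above.
function parseAceRow(string $line): array
{
    $fields = str_getcsv($line, ",", "\"", "\\");
    if (count($fields) !== 4) {
        throw new InvalidArgumentException("Expected 4 columns, got " . count($fields));
    }
    return array_combine(
        ['ace_hostname', 'ace_mgmt_ip', 'ace_user', 'ace_pass'],
        $fields
    );
}

$row = parseAceRow("loadbalancer01,10.99.0.1,nmsuser,Pasword2019");
echo $row['ace_hostname'], " => ", $row['ace_mgmt_ip'], "\n";
```

Naming the columns makes a short or malformed row fail up front, rather than surfacing later as an undefined index halfway through a cURL call.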

Script Outputs

  • CSV file of all VIP stats on all Contexts of all ACE Load Balancers
    • Variable (Constant): OUTPUT_FILE_VIP
    • File: ace_vip_stats_<Year>-<Month>-<Day>.csv
    • Type: CSV file
      • Formatted like "load_balancer,context,name,state,address,protocol,port,curr_conns,drop_conns,hit_count", i.e.:
        • loadbalancer01,ContextABC,CM-VIPABC,OUT-SRVC,10.99.0.2,tcp,443,0,0,0
  • CSV file of all Server Farm stats on all Contexts of all ACE Load Balancers
    • Variable (Constant): OUTPUT_FILE_SF
    • File: ace_serverfarm_stats_<Year>-<Month>-<Day>.csv
    • Type: CSV file
      • Formatted like "load_balancer,context,name,type,state,description,predictor,rserver,address,port,state,curr_conns,total_conns", i.e.:
        • loadbalancer01,ContextABC,SF-Group1,HOST,ACTIVE,"Serverfarm for ServersGroup1",ROUNDROBIN,H.10.98.0.2,10.98.0.2,80,OPERATIONAL,0,165
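
The script below builds these lines by hand with string concatenation and only quotes the description. A possible alternative, sketched here under the assumption that you'd rather let PHP do the quoting, is to push each row through fputcsv() (toCsvLine is a hypothetical name, not something the script uses):

```php
<?php
// Hypothetical helper: format one output row with fputcsv(), so fields
// containing spaces, commas or quotes are enclosed for us.
function toCsvLine(array $fields): string
{
    $fh = fopen('php://memory', 'r+');
    fputcsv($fh, $fields, ',', '"', '\\');
    rewind($fh);
    $line = stream_get_contents($fh);
    fclose($fh);
    // fputcsv() ends the line with "\n"; the script's files use "\r\n".
    return rtrim($line, "\n") . "\r\n";
}

echo toCsvLine([
    'loadbalancer01', 'ContextABC', 'SF-Group1', 'HOST', 'ACTIVE',
    'Serverfarm for ServersGroup1', 'ROUNDROBIN', 'H.10.98.0.2',
    '10.98.0.2', '80', 'OPERATIONAL', '0', '165',
]);
```

The upside over manual concatenation is that a Server Farm description containing a comma or a double quote won't silently shift the columns.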

Script Code

<?php
# Cisco ACE Load Balancer Stats Scraper via XML-API v0.2
# Description: Scrape the Server Farm and VIP Stats from all Contexts on a Cisco ACE Load Balancer
# Input: (CSV Header) ace_hostname,ace_mgmt_ip,ace_user,ace_pass
# Author: notworkd.io
# Created: 12-Sep-2019

# Define constants
# Local IP Addresses CSV File
define("ACE_FILE","D:/ace-stats/ace_ip.csv");
# Process ACE Server Farm Stats CSV File
define("OUTPUT_FILE_SF","D:/ace-stats/ace_serverfarm_stats_".date("Y-m-d").".csv");
# Process ACE VIP Stats CSV File
define("OUTPUT_FILE_VIP","D:/ace-stats/ace_vip_stats_".date("Y-m-d").".csv");

# Define variables
$i = 0;
$outputfilea_content = null;
$outputfileb_content = null;

# Main program
# Functions
# Cisco ACE Load Balancer XML-API Call to get Contexts as array
function getCiscoAceApiContexts($ace_ip, $ace_user, $ace_pass) {
 # Initiate cURL Session
 $ch = curl_init();

 # Setup cURL Options
 curl_setopt($ch, CURLOPT_USERPWD, $ace_user.":".$ace_pass);
 curl_setopt($ch, CURLOPT_URL, "http://".$ace_ip."/bin/xml_agent");
 curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
 curl_setopt($ch, CURLOPT_POST, 1);
 curl_setopt($ch, CURLOPT_POSTFIELDS, "xml_cmd=<request_raw>show context | inc Name</request_raw>");
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

 # Perform cURL Data Get
 $curl_data = curl_exec($ch);
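 # Close cURL Session
 curl_close($ch);
 # Return no Contexts if the request failed (curl_exec() returns false on error)
 if ($curl_data === false) {
  return array();
 }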
 
 # Match Context name out, parse as array
 # Format: Name: Admin , Id: 0
 $api_xml = new SimpleXMLElement($curl_data);
 preg_match_all("/Name: (.*)\\s{1,}, Id(.*)/", $api_xml->exec_command->xml_show_result, $api_regex, PREG_PATTERN_ORDER);
 
 # Return each Context Name as array element
 return $api_regex[1];
}

# Cisco ACE Load Balancer XML-API Call to get Server Farms from a Context as CSV lines
function getCiscoAceApiServerFarms($ace_ip, $ace_user, $ace_pass, $ace_context, $ace_hostname) {
 # Initialise variables
 $output = null;
 
 # Initiate cURL Session
 $ch = curl_init();

 # Setup cURL Options
 curl_setopt($ch, CURLOPT_USERPWD, $ace_user.":".$ace_pass);
 curl_setopt($ch, CURLOPT_URL, "http://".$ace_ip."/bin/xml_agent");
 curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
 curl_setopt($ch, CURLOPT_POST, 1);
 curl_setopt($ch, CURLOPT_POSTFIELDS, "xml_cmd=<request_xml context-name=\"".$ace_context."\"><show_serverfarm info-level=\"detail\"/></request_xml>");
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

 # Perform cURL Data Get
 $curl_data = curl_exec($ch);
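 # Close cURL Session
 curl_close($ch);
 # Return nothing for this Context if the request failed (curl_exec() returns false on error)
 if ($curl_data === false) {
  return $output;
 }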
 
 # Match Server Farm sf_entry (parent) and Real Servers sf_rs_entry (child) attributes out, parse as array
 # Format sf_entry: name,type,sf_reals,sf_state,sf_reals_active,sf_description,sf_predictor
 # Format sf_rs_entry: sf_realserver,address,rs_port,rs_state,rs_curr_conns,rs_total_conns
 $api_xml = new SimpleXMLElement($curl_data);
 
 # Loop through each sf_entry parent element
 foreach($api_xml->exec_command->xml_show_result->xml_show_serverfarm->sf_entry as $key) {
  # Output to logfile
  echo "  Processing Server Farm (context,name,type,state,description,predictor): [".$ace_context."],[".$key->name."],[".$key->type."],[".trim($key->sf_state)."],[".$key->sf_description."],[".trim($key->sf_predictor)."]... Done\r\n";
  # Loop through each sf_rs_entry child element
  foreach($key->sf_rs_entry as $inner_key) {
   # Output to logfile
   echo "   Processing Server Farm-RealServer (rserver,address,port,state,curr_conns,total_conns): [".$inner_key->sf_realserver."],[".$inner_key->address."],[".trim($inner_key->rs_port)."],[".trim($inner_key->rs_state)."],[".trim($inner_key->rs_curr_conns)."],[".trim($inner_key->rs_total_conns)."]... Done\r\n";
   # Augment Output File return string (parent)
   $output .= $ace_hostname.",".$ace_context.",".$key->name.",".$key->type.",".trim($key->sf_state).",\"".$key->sf_description."\",".trim($key->sf_predictor);
   # Augment Output File return string (child)
   $output .= ",".$inner_key->sf_realserver.",".$inner_key->address.",".trim($inner_key->rs_port).",".trim($inner_key->rs_state).",".trim($inner_key->rs_curr_conns).",".trim($inner_key->rs_total_conns)."\r\n";
  }
 }
 
 # Return output CSV
 return $output;
}

# Cisco ACE Load Balancer XML-API Call to get VIPs from a Context as CSV lines
function getCiscoAceApiVips($ace_ip, $ace_user, $ace_pass, $ace_context, $ace_hostname) {
 # Initialise variables
 $output = null;
 $i = 0;
 
 # Initiate cURL Session
 $ch = curl_init();

 # Setup cURL Options
 curl_setopt($ch, CURLOPT_USERPWD, $ace_user.":".$ace_pass);
 curl_setopt($ch, CURLOPT_URL, "http://".$ace_ip."/bin/xml_agent");
 curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
 curl_setopt($ch, CURLOPT_POST, 1);
 curl_setopt($ch, CURLOPT_POSTFIELDS, "xml_cmd=<request_xml context-name=\"".$ace_context."\"><show_service-policy info-type=\"summary\"/></request_xml>");
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

 # Perform cURL Data Get
 $curl_data = curl_exec($ch);
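 # Close cURL Session
 curl_close($ch);
 # Return nothing for this Context if the request failed (curl_exec() returns false on error)
 if ($curl_data === false) {
  return $output;
 }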
 
 # Match VIP entry sp_class_map (parent) and Class Maps sp_loadbalance (child attributes) out, parse as array
 $api_xml = new SimpleXMLElement($curl_data);
 
 # Loop through each sp_class_map parent element
 foreach($api_xml->exec_command->xml_show_result->xml_show_service_policy->service_policy->sp_class_map as $key) {
  # Output to logfile
  echo "  Processing VIP Class Map (load_balancer,context,class_name,vip_state,vip_curr_cons,vip_drop_cons,vip_hits): [".$ace_hostname."],[".$ace_context."],[".trim($key->sp_class_name)."],[".trim($key->sp_loadbalance->sp_lb_vip_state)."],[".trim($key->sp_loadbalance->sp_curr_conns)."],[". trim($key->sp_loadbalance->sp_drop_conns)."],[".trim($key->sp_loadbalance->sp_hit_count)."]... Done\r\n"; 
  # Loop through each vip-address/etc as child element
  foreach($key->sp_loadbalance->{"vip-address"} as $inner_key) {
   # Output to logfile
   echo "   Processing VIP-Inner (vip_address,vip_proto,vip_port): [".trim($inner_key)."],[".trim($key->sp_loadbalance->{"protocol-type"}[$i])."],[".trim($key->sp_loadbalance->{"match-port"}[$i])."]... Done\r\n";
   # Augment Output File return string (parent)
   $output .= $ace_hostname.",".$ace_context.",".trim($key->sp_class_name).",".trim($key->sp_loadbalance->sp_lb_vip_state);
   # Augment Output File return string (child)
   $output .= ",".trim($inner_key).",".trim($key->sp_loadbalance->{"protocol-type"}[$i]).",".trim($key->sp_loadbalance->{"match-port"}[$i]).",".trim($key->sp_loadbalance->sp_curr_conns).",".trim($key->sp_loadbalance->sp_drop_conns).",".trim($key->sp_loadbalance->sp_hit_count)."\r\n";
   
   # Increment Line counter
   $i++;
  }
  # Zeroize the incrementer for the next loop
  $i = 0;
 }
 
 # Return output CSV
 return $output;
}

# Procedural
# Output to logfile
echo "Cisco ACE Load Balancer Stats Scraper v0.2\r\n";
echo "==========================================================================\r\n";
echo "JOB START: ".date(DATE_RFC2822)."\r\n";

# Output to logfile
echo "Opening input CSV file...\r\n";

# Iterate through CSV input file and make XML-API Calls for each ACE Load Balancer
if(!$fh = fopen(ACE_FILE, "r")) {
 # Output to logfile
 echo " Failed\r\n\r\n";
} else {
 # Output to logfile
 echo " Success\r\n\r\n";
 
 # Add header line to Processed Server Farm CSV file
 $outputfilea_content = "load_balancer,context,name,type,state,description,predictor,rserver,address,port,state,curr_conns,total_conns\r\n";
 # Add header line to Processed VIP CSV file
 $outputfileb_content = "load_balancer,context,name,state,address,protocol,port,curr_conns,drop_conns,hit_count\r\n";
 
 # Loop through each row and make XML-API Calls to that ACE Load Balancer
 while(($row = fgetcsv($fh, 0, ",")) !== FALSE) {
  # Increment line counter
  $i++;
  
  # Output to logfile
  echo "Processing ACE Load Balancer #".$i." ".$row[0]." (".$row[1].")\r\n";
   
  # Make XML-API Calls to get the Contexts, then the Server Farms and VIPs in each Context
  foreach(getCiscoAceApiContexts($row[1], $row[2], $row[3]) as $value) {
   echo " Processing ACE Context \"".$value."\"...\r\n";
   $outputfilea_content .= getCiscoAceApiServerFarms($row[1], $row[2], $row[3], $value, $row[0]);
   $outputfileb_content .= getCiscoAceApiVips($row[1], $row[2], $row[3], $value, $row[0]);
  }
 }
}

# Output to logfile for FileA
echo "\r\nIteration through all input ACE Load Balancers...\r\n Complete\r\n\r\n";
echo "Outputting Processed Server Farm Stats CSV file to ".OUTPUT_FILE_SF."...\r\n";

# Output Processed Server Farm CSV to file
if (!file_put_contents(OUTPUT_FILE_SF, $outputfilea_content)) {
 # Output to Processed Server Farm CSV file failed; output to logfile
 echo " Failed\r\n\r\n";
} else {
 # Output to Processed Server Farm CSV file successful; output to logfile
 echo " Successful\r\n\r\n";
}

# Output to logfile for FileB
echo "Outputting Processed VIP Stats CSV file to ".OUTPUT_FILE_VIP."...\r\n";

# Output Processed VIP CSV to file
if (!file_put_contents(OUTPUT_FILE_VIP, $outputfileb_content)) {
 # Output to Processed VIP CSV file failed; output to logfile
 echo " Failed\r\n\r\n";
} else {
 # Output to Processed VIP CSV file successful; output to logfile
 echo " Successful\r\n\r\n";
}

# Output to logfile
echo "JOB STOP: ".date(DATE_RFC2822);
?>
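
If you want to sanity-check the Context matching without an ACE to hand, here's a small sketch. It uses a slightly tighter variant of the script's regex and assumes Context names contain no whitespace; parseContextNames is a hypothetical name, and the second sample line is made up to match the "Name: Admin , Id: 0" format noted in the code comments.

```php
<?php
// Hypothetical helper: pull Context names out of "show context | inc Name"
// output, where each line looks like "Name: Admin , Id: 0".
// Assumption: Context names contain no whitespace.
function parseContextNames(string $showOutput): array
{
    preg_match_all('/Name:\s*(\S+)\s*,\s*Id:/', $showOutput, $matches);
    return $matches[1];
}

// Made-up sample in the format noted in the script's comments.
$sample = "Name: Admin , Id: 0\nName: ContextABC , Id: 1\n";
print_r(parseContextNames($sample));
```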

The End

There you go; if you liked (or didn't) this, or just have some suggestions, feel free to tweet me @notworkd

Automation - The "Script it" versus "Do it" continuum

Sunday, 03 Feb 2019

A recent tweet from @nickrusso4258 got me thinking about something I've been trying to express in my professional (don't laugh, people sometimes say I am) life for a while now, that can strike a nerve with the "Automate ALL THE THINGS!" crowd: scripting something (and by extension automating something) isn't always the right answer for an Organisation's use of Time (read: your 9-5 they pay for).

As I appreciate that not everyone is a Coder, DevOps or new-kid (some of us still get paid to be Cisco Mario; not everything is up in Toad Cloud yet...), this concept can apply a little wider than just to Developers, and even probably to the Business-y people all us IT Folk interact with on the daily. Using my finely-honed MS Paint skills (side-note: you've not lived until you've done a Network Diagram in MS Paint), here's a sexy graphical approximation of the theory:

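[Figure: MS Paint graph of the "Script it" versus "Do it" continuum, marking #1 Payback sweet spot, #2 Positive opportunity cost and #3 Negative opportunity cost]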

Making stuff up #1 - Payback sweet spot

What the graph is trying to demonstrate is that the world of repetitive tasks can loosely be split into two partisan camps:

  1. "Script it"
    1. i.e. Put the additional effort (more than to just "bang another <repetitive task> out") in, and automate it/script it/somehow make it easier to perform than just doing the do over-and-over, with two tangible outputs:
      1. Completion of <the task>
      2. Automation of <the task>
  2. "Do it"
    1. i.e. Don't worry about the why, just repeat the manual steps you'd normally do and "bang another <repetitive task> out", with one tangible output:
      1. Completion of <the task>

The obvious sweet spot here is that, for a given number of repetitions of <the task> over time, the additional effort of "scripting it" (the time taken to do the automation, on top of that of just <doing the task>) will eventually pay itself back; after a given "Payback sweet spot", you've got time back to do other stuff, which you'd otherwise have spent just doing <the task> again and again.
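
As a back-of-an-envelope sketch of that sweet spot (my simplification, not a law of nature): assume the script is a one-off cost, each manual run costs the same, and scripted re-runs cost nothing. The function name breakEvenRuns is made up, and the 16 and 5 hour figures are borrowed from the worked example further down.

```php
<?php
// Hypothetical helper: how many manual runs it takes before the one-off
// scripting effort has paid for itself.
// Assumptions: fixed cost per manual run, scripted re-runs cost nothing.
function breakEvenRuns(float $scriptHours, float $manualHoursPerRun): int
{
    if ($manualHoursPerRun <= 0) {
        throw new InvalidArgumentException("Manual hours per run must be positive");
    }
    return (int) ceil($scriptHours / $manualHoursPerRun);
}

// 16 hours to script versus 5 hours per manual run.
echo breakEvenRuns(16, 5), "\n";
```

With those numbers, the script only pays back if <the task> genuinely needs doing a fourth time.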

Alright, I'm buying it #2 - Positive opportunity cost

Or in other words, you're now in "Positive opportunity cost" - that is, <the task> is in some way automated, and you can dedicate your time to the other fifteen-million items on your "To Do" list, instead of this <task>. All is well in the world, you've automated all the things - and bar a little troubleshooting and debugging you unexpectedly have to do (i.e. when you discover your vGhetto VCB Cron Job uses a file that gets overwritten at ESXi System Reboot...), you're actually "earning time" saved through the script parallel-working the task for you.

Bully for you; your life is complete, you've moved all the things to teh Cloudz, and you're about to marry Princess Toadstool, and live in the Kingdom of the Mushroom Cloud forever mor...

Wait a minute, what's this #3 - Negative opportunity cost

But look over there on the left-hand side of this conceptual model; what's that pesky "Negative opportunity cost" all about then? I'm just about to pop the ring on Princess Toadstool, you saying I've got a problem here?

What I'm referring to here is the cold reality of Work; you're ultimately paid to produce output that a Customer wants - whether that's direct tangible stuff ("Hi, make this Network Switch go now please") or otherwise intangible stuff ("Hi, move these Apps to the Cloud? Mkay, you'll need to make a Project Plan for that, I get it...") - it's all output that's working towards a tangible goal.

You know what isn't output working towards a tangible goal? Scripting.

You know what you can't accurately do? Predict the future.

You know what the problem is here? Scripting a <task> that, in future, actually gets run so few times that the script took longer to write than manually repeating the <task> would have taken you (and you'd have got multiple lots of tangible output for that).

Let's give you a worked example; suppose you need to write a script to output all the IP Helper Addresses from your Cisco IOS devices, and you're not great with Bash Shell script (well, you do know that...), so (you don't know it yet) it'll end up taking you 16 hours. Sounds great; much easier than ripping through 500 devices and doing that manually, right - that'd take you maybe, I dunno, 5 hours and a bit of hand-cramp?

But what if I said to you that, unbeknown to you, we're about to swap out all that Cisco IOS kit for <SD-WAN Vendor XYZ> kit; where stuff like this (IP DHCP Relay Address) is pushed out in a programmatic, templated fashion anyway. What's happened now? Well, in Business Output terms, you've just wasted the time it would have taken to do it manually (5 hours) subtracted from the time it took you to script it and get the tangible output (16 hours), so you've cost the Business 11 hours of time you could have been doing something else productive.
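
The sums above, as a tiny sketch (netHoursSaved is a hypothetical name; it assumes the same fixed-cost-per-manual-run simplification, with a negative result meaning time lost to scripting):

```php
<?php
// Hypothetical helper: hours gained (positive) or lost (negative) by
// scripting a task instead of doing it manually for a number of runs.
function netHoursSaved(float $scriptHours, float $manualHoursPerRun, int $runs): float
{
    return ($manualHoursPerRun * $runs) - $scriptHours;
}

// The worked example: 16 hours scripting, 5 hours manually, needed once.
echo netHoursSaved(16, 5, 1), "\n";
```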

Which would be 11 hours' worth of "Negative opportunity cost", and seems to be something the Automation Crowd rarely focus on; none of you are Mystic Meg; none of you have Crystal Balls; none of you can predict the future.

Something to dwell on. Meanwhile, I hope Princess Toadstool likes Hula Hoop crisps as engagement rings...
