John D'Emic's blog about programming, integration, system administration, etc...

Tuesday, March 3, 2009

Mule, Smooks and Nagios

I've been working on upgrading our integration infrastructure on and off since the new year.  This began with the OpenMQ migration I previously blogged about and was followed by upgrading our Mule 1.4.3 services to Mule 2.x.  In addition to the technology changes, I wanted to use the upgrade as an excuse to clean up some messy stuff we had in place, one example being the amount of custom transformation we were doing in Java code.

Our integration implementation makes heavy use of the Canonical Data Model pattern.  In short, we accept data in a variety of formats (XML, CSV or proprietary) and map them to an XML schema and/or Java object model.  Beyond the standard transport transformations supplied by Mule, we needed to implement a zoo of custom transformers to move to the canonical format.  I was looking for a framework that would take on some of this complexity.

I had read this article on InfoQ about Smooks around the time I was thinking about this, and it seemed like a good fit, especially since there is a Mule module for it.  To make a long story short, we were able to upgrade to Mule 2.x and, using Smooks, not have to implement any model-specific Mule transformers.

Smooks works by streaming data in, transforming it and streaming it out.  "Cartridges" supply various transformation capabilities and exist for common data formats like XML, JSON and CSV.  The streaming model means that a transformation doesn't require the entire document to be loaded into memory, so large documents can be transformed without the associated memory footprint.
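
To make the streaming idea concrete, here is a minimal sketch that uses the JDK's own SAX parser rather than Smooks itself.  The class name ElementCounter and the element names in the usage note are made up for illustration.  The point is only that the handler sees one event at a time and never holds the whole document:

```java
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration, not Smooks code: counts elements with a given
// name while the parser streams through the input.
final class ElementCounter extends DefaultHandler {
    private final String target;
    private int count;

    ElementCounter(String target) {
        this.target = target;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // Called once per opening tag, as soon as the parser reaches it.
        if (target.equals(qName)) {
            count++;
        }
    }

    // Parses the source with a default (non-namespace-aware) SAX parser and
    // returns how many elements named elementName were seen.
    static int count(InputSource source, String elementName)
            throws ParserConfigurationException, SAXException, IOException {
        ElementCounter handler = new ElementCounter(elementName);
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        return handler.count;
    }
}
```

Calling ElementCounter.count(new InputSource(reader), "alert") walks the input once, and the only state kept between events is the running count.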

The transformations can be accomplished via XML configuration, assuming the data formats being used have an associated cartridge.  The same holds if the data can easily be converted into a format that does have one.  For instance, we have Nagios 2.x instances that use a semicolon-delimited status.log to write alert data.  A simple Groovy script allowed me to replace the semicolons with commas.  I was then able to use the CSV cartridge to convert the data to XML.
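
The Groovy script itself isn't reproduced here.  As a rough Java sketch of the same idea, the following turns one semicolon-delimited line into a CSV line.  The class and method names are hypothetical, and the quoting of fields that already contain a comma or a double quote is my own addition, since a blind replace would corrupt such fields:

```java
// Hypothetical helper, not the original Groovy script: converts a
// semicolon-delimited line into a comma-separated one.
final class SemicolonToCsv {

    static String toCsvLine(String line) {
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        String[] fields = line.split(";", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                out.append(',');
            }
            out.append(quoteIfNeeded(fields[i]));
        }
        return out.toString();
    }

    // Wraps a field in double quotes when it contains a comma or a quote,
    // doubling any embedded quotes as CSV readers conventionally expect.
    private static String quoteIfNeeded(String field) {
        if (field.contains(",") || field.contains("\"")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }
}
```

Each line of the log would be passed through toCsvLine before being handed to the CSV cartridge.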

The above Nagios instances are being upgraded to Nagios 3.x.  In Nagios 3.x, the status.log format is different.  Instead of being semicolon-delimited, it is in a proprietary format that looks a bit like JSON.  Here's an example:


servicestatus {
host_name=liro_url_laces0
service_description=liro_https://acmesoft.com/VI/Pages/General/TestConn.aspx
modified_attributes=0
check_command=check_https!/VI/
check_period=24x7
notification_period=24x7
check_interval=15.000000
retry_interval=2.000000
event_handler=
has_been_checked=1
..
}

There obviously isn't a Smooks cartridge that supports this format.  One solution might be to convert the above format to JSON.  This would probably work, but it would likely be error-prone (and annoying to implement).  An alternative is to implement an XMLReader to parse the above file and spit out an XML document.
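
To picture the XML such a reader would produce, here is a small sketch that hand-feeds SAX events for a single field of the block above into the JDK's TransformerHandler and serializes them.  The class name is hypothetical and nothing here is Smooks-specific.  Unlike a Smooks reader, the JDK serializer needs a real qualified name on every element:

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical illustration: emits the SAX events for one servicestatus
// block with a single field and returns the serialized XML.
final class StatusBlockEvents {

    static String emitSample() throws TransformerConfigurationException, SAXException {
        SAXTransformerFactory factory = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        AttributesImpl none = new AttributesImpl();
        handler.startDocument();
        handler.startElement("", "statusLog", "statusLog", none);         // document root
        handler.startElement("", "servicestatus", "servicestatus", none); // one block
        handler.startElement("", "host_name", "host_name", none);         // one key=value line
        char[] value = "liro_url_laces0".toCharArray();
        handler.characters(value, 0, value.length);
        handler.endElement("", "host_name", "host_name");
        handler.endElement("", "servicestatus", "servicestatus");
        handler.endElement("", "statusLog", "statusLog");
        handler.endDocument();
        return out.toString();
    }
}
```

The result is a statusLog root containing one servicestatus element per block, with one child element per key=value line.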

Smooks uses implementations of XMLReader to parse arbitrary file formats as XML.  It then operates on the resulting SAX stream or DOM as dictated by a configuration file.  The following illustrates an implementation of the parse method of XMLReader that will parse the status.log format above:



// Relies on members defined elsewhere in the reader class: contentHandler,
// EMPTY_ATTRIBS and getString(InputSource).  StringUtils and StringEscapeUtils
// are presumably the Apache Commons Lang classes.
public void parse(InputSource inputSource) throws IOException, SAXException {
    if (contentHandler == null) {
        throw new IllegalStateException("'contentHandler' not set. Cannot parse Nagios status stream.");
    }

    // Name of the block currently open, or null when between blocks.
    String currentBlock = null;

    contentHandler.startDocument();
    contentHandler.startElement(XMLConstants.NULL_NS_URI, "statusLog", "", EMPTY_ATTRIBS);

    for (String line : getString(inputSource).split("\n")) {

        // Skip comment lines.
        if (line.startsWith("#")) {
            continue;
        }

        // A "servicestatus {" line opens a block; the text before the brace
        // becomes the element name.
        if (line.contains("servicestatus")) {
            String block = StringUtils.deleteWhitespace(line.split("\\{")[0]);
            contentHandler.startElement(XMLConstants.NULL_NS_URI, block, "", EMPTY_ATTRIBS);
            currentBlock = block;
        }

        if (currentBlock != null) {
            // Each key=value line becomes a child element.  Split on the first
            // "=" only, since values (URLs, commands) may contain "=" too.
            if (line.contains("=")) {
                String[] fields = line.split("=", 2);
                String fieldName = StringEscapeUtils.escapeXml(StringUtils.deleteWhitespace(fields[0]));

                contentHandler.startElement(XMLConstants.NULL_NS_URI, fieldName, "", EMPTY_ATTRIBS);
                if (fields.length > 1) {
                    String content = StringEscapeUtils.escapeXml(fields[1]);

                    contentHandler.characters(content.toCharArray(), 0, content.length());
                } else {
                    contentHandler.characters(" ".toCharArray(), 0, 1);
                }
                contentHandler.endElement(XMLConstants.NULL_NS_URI, fieldName, "");
            }

            // A closing brace ends the current block.
            if (line.contains("}")) {
                contentHandler.endElement(XMLConstants.NULL_NS_URI, currentBlock, "");
                currentBlock = null;
            }
        }

    }

    contentHandler.endElement(XMLConstants.NULL_NS_URI, "statusLog", "");
    contentHandler.endDocument();
}
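
The parse method leans on a couple of members that aren't shown.  The following is only a guess at what they might look like, under a hypothetical class name: an empty attribute set to pass with each element, and a getString helper that drains the InputSource into a String.  The UTF-8 fallback for byte streams is an assumption.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical supporting members for the reader; not taken from the original class.
final class NagiosReaderSupport {

    // Shared empty attribute list, since none of the emitted elements carry attributes.
    static final Attributes EMPTY_ATTRIBS = new AttributesImpl();

    // Reads the whole source into memory, preferring the character stream and
    // falling back to the byte stream decoded as UTF-8 (an assumption).
    static String getString(InputSource source) throws IOException {
        Reader reader = source.getCharacterStream();
        if (reader == null) {
            if (source.getByteStream() == null) {
                throw new IOException("InputSource has neither a character stream nor a byte stream");
            }
            reader = new InputStreamReader(source.getByteStream(), StandardCharsets.UTF_8);
        }
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[4096];
        int read;
        while ((read = reader.read(buffer)) != -1) {
            text.append(buffer, 0, read);
        }
        return text.toString();
    }
}
```

Reading the whole file into a String gives up some of the streaming benefit on the input side.  For very large status files, wrapping the stream in a BufferedReader and calling readLine() in the loop would be the line-by-line alternative.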

We can plug the reader into the Smooks XML config:


<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd"
                      xmlns:csv="http://www.milyn.org/xsd/smooks/csv-1.1.xsd"
                      xmlns:ftl="http://www.milyn.org/xsd/smooks/freemarker-1.1.xsd"
>

    <params>
        <param name="stream.filter.type">SAX</param>
        <param name="default.serialization.on">false</param>
    </params>

    <reader class="net.opsource.osb.reader.NagiosReader"/>

    <resource-config selector="servicestatus">
        <resource>org.milyn.delivery.DomModelCreator</resource>
    </resource-config>

    <ftl:freemarker applyOnElement="statusLog">
        <ftl:template><!--
<ApplicationResponseTimes>
<?TEMPLATE-SPLIT-PI?>
</ApplicationResponseTimes>
-->
        </ftl:template>
    </ftl:freemarker>

    <ftl:freemarker applyOnElement="servicestatus">
        <ftl:template>smooks/monitoring/application_response_time/metric.ftl</ftl:template>
    </ftl:freemarker>

</smooks-resource-list>



Now we plug it into Mule using the Smooks module and we're ready to go.


<smooks:transformer name="nagiosStatusLineToXML"
                    configFile="smooks/monitoring/application_response_time/smooks-config.xml"
                    resultType="STRING"/>


I'm pretty excited about this because I'm no longer writing a dedicated transformer for each domain model I'm mapping data to. I just need to implement XMLReaders when I come across a data format not already supported by a Smooks cartridge.

2 comments:

Tom Fennelly said...

Cool John... missed this post :)

Regarding the nagios file for which you had to implement a Groovy script to convert the semicolons to commas... the csv:reader config allows you to specify the field separator (default is comma), so I'm fairly sure it could have handled your nagios file directly.

johndemic said...

That was obvious; I spaced on the csv reader when implementing that piece. Thanks for the pointer!