From JMeter to the HTTP Archive (HAR) format

Posted by Pieter Ennes on August 11th, 2010

Hi folks,

At WatchMouse we’ve been offering worldwide Web Application Testing (a.k.a. ‘scripting’) based on a customised version of Apache JMeter for some years now. This has been working really well in a wide variety of applications, like:

  • Full page load simulation (including page assets)
  • Website transaction monitoring (including HTTP headers, cookies etc.)
  • SOAP or REST-like API testing (including variable extraction and conditional execution)

But the real strength of using JMeter for our functional monitoring is that our users remain in control: they can craft monitoring scripts on their desktop using powerful tools and then easily upload them via our website. Or they can reuse a set of existing load testing scripts and upload them for periodic monitoring after deployment. There is no lock-in, so whenever you choose to leave, you can download your scripts and still run them using free and open-source tools. WatchMouse just adds indispensable bling like professional alerting, visualisations, and a worldwide network of 50 stations.

The HTTP Archive format as common denominator

Even though our clients keep finding new ways to employ JMeter, what do these different types of scripts have in common? They are similar in that all sequences:

  • Are based on HTTP
  • Can be multi-step
  • Can contain subrequests
  • Measure ‘performance’

Looking at these factors, we were very excited to learn about the introduction and rapid adoption of the HTTP Archive format (HAR), an open specification for page load measurements, because HAR can store exactly all of the above, plus more if you want.

JMeter to HAR converter

Based on the observed similarities between JMeter and HAR, I decided to build a simple JMeter-to-HAR converter as a quick proof of concept. The output of the first conversion turned out to be accepted by Jan Odvarko’s authoritative online HAR viewer which displayed the multiple steps and subrequests correctly, as well as some of the HTTP properties. In a second version of the script it also proved easy to convert many of the other elements common to both JMeter and HAR, e.g.:

  • HTTP status, response message, method and version
  • HTTP request and response headers
  • HTTP Cookies
  • MIME types
  • Redirected URLs
  • Page sizes and compression
  • Most of the performance metrics (DNS, connect, wait, receive)

Additionally, I chose to convert JMeter’s notion of assertions into a private _assertions element in the HAR file, although we don’t currently use this elsewhere. If someone wants to try it, I have made the code of the converter available on BitBucket. It’s written in PHP since that matches our own web environment, and I’ll be updating the script occasionally to stay in line with the most recent version of HAR.
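To give an impression of the mapping described above, here is a minimal sketch in Python. It is not the PHP converter from the repository. The input is a plain dict with made-up keys (timestamp_ms, elapsed_ms, latency_ms, connect_ms and so on) standing in for a JMeter sample, and the HAR version string "1.2" and the creator name are assumptions too. The one structural point it tries to show is how JMeter’s elapsed time and latency can be split into HAR’s connect, wait and receive phases, and where a private _assertions field could live.

```python
import json
from datetime import datetime, timezone


def sample_to_entry(sample, pageref):
    """Map one sample (a plain dict with hypothetical keys) to a HAR entry."""
    started = datetime.fromtimestamp(sample["timestamp_ms"] / 1000, tz=timezone.utc)
    connect = sample.get("connect_ms", -1)  # -1 means "not available" in HAR
    # JMeter's latency runs from the start of the sample to the first byte,
    # so the connect phase is subtracted to get the HAR "wait" phase.
    wait = sample["latency_ms"] - max(connect, 0)
    receive = sample["elapsed_ms"] - sample["latency_ms"]
    return {
        "pageref": pageref,
        "startedDateTime": started.isoformat(),
        "time": sample["elapsed_ms"],
        "request": {
            "method": sample["method"],
            "url": sample["url"],
            "httpVersion": "HTTP/1.1",
            "cookies": [],
            "headers": [],
            "queryString": [],
            "headersSize": -1,
            "bodySize": -1,
        },
        "response": {
            "status": sample["status"],
            "statusText": sample["message"],
            "httpVersion": "HTTP/1.1",
            "cookies": [],
            "headers": [],
            "content": {"size": sample["bytes"], "mimeType": sample.get("mime", "")},
            "redirectURL": sample.get("redirect_url", ""),
            "headersSize": -1,
            "bodySize": sample["bytes"],
        },
        "cache": {},
        "timings": {"dns": -1, "connect": connect, "send": 0,
                    "wait": wait, "receive": receive},
        # Custom HAR fields start with an underscore.
        "_assertions": sample.get("assertions", []),
    }


def build_har(page_title, samples):
    """Wrap the converted samples (at least one) in a single-page HAR log."""
    page_id = "page_1"
    entries = [sample_to_entry(s, page_id) for s in samples]
    return {
        "log": {
            "version": "1.2",
            "creator": {"name": "jmeter-to-har-sketch", "version": "0.1"},
            "pages": [{
                "startedDateTime": entries[0]["startedDateTime"],
                "id": page_id,
                "title": page_title,
                "pageTimings": {"onLoad": -1},
            }],
            "entries": entries,
        }
    }
```

Calling json.dumps(build_har("Home", samples), indent=2) on a list of such dicts should give a document that a HAR viewer can load, although headers and cookies are left empty in this sketch.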

HAR at WatchMouse: Script breakdowns and waterfalls

Internally at WatchMouse, we have already been using the converter for some time. We’ve incorporated Jan’s excellent waterfall charts in our dashboard, making Firebug-like visualisations available for all standard script monitors. We are planning to add more support for HAR in the future!

Support for non-HTTP protocols?

However simple it is, one of the things the converter shows is the close relation between JMeter and HAR. Yet there are incompatibilities, as JMeter can do much more than HTTP monitoring alone. So can JMeter support HAR? As an output format for all its HTTP requests it certainly could. But the current HAR format is very much geared towards HTTP only (if only in its name), so it would not be suited as a generic output format.

In parallel, at WatchMouse we are steadily moving towards a new storage solution for our check data based on MongoDB, and for that, we are researching an archive format that supports storing any type of performance measurement in JSON format.

Sounds like a generalised version of HAR in both cases? It could be! The success of HAR shows that performance measurements are ubiquitous nowadays and that the sector can benefit a lot from open formats. I’m confident that a generic archive format, if done properly, can set a standard and be used by many tools (like JMeter) and companies (like WatchMouse), both internally and in public interfaces.

If you have any thoughts on this subject, please feel free to leave a comment or email me directly.

Repository: http://bitbucket.org/watchmouse/har/
Issue tracker: http://bitbucket.org/watchmouse/har/issues

Pieter Ennes
VP Engineering

Amazon CloudFront movements

Posted by mark on July 20th, 2010

This article, a follow-up to an earlier blog post, gives a more detailed look at the location of the Amazon CloudFront service. This location is derived from the time it takes to connect to it from a number of locations.

The summary is that CloudFront is on average about 40-50 milliseconds away from a random point on the Internet. This is pretty good compared to a site located in e.g. New York (120 milliseconds) and is in the same league as other content distribution networks. In specific markets, it is pretty near: San Francisco: 3 milliseconds, New York: 13 milliseconds, Western Europe: 1-30 milliseconds.

According to Amazon, CloudFront is in 16 locations, in contrast to the S3 storage service and the EC2 compute service, which have only 4 points of presence around the world.

The following table gives distances (in milliseconds) from selected locations of the monitoring network to Amazon CloudFront (cities annotated with CF have CloudFront locations):

CloudFront distances

As the table shows, the proximity of CloudFront is uneven around the world.

CloudFront changes its connectivity regularly, mostly for the better. An interesting data point for example is that on April 8, CloudFront created a presence near Hong Kong dropping the distance from 160 milliseconds to 4 milliseconds. The following graph gives more detail.

Amazon CloudFront movements

In line with our earlier research, this data too shows that maintaining good proximity in an ever changing Internet is not a trivial thing to do. See for example the ever changing proximities to New York and New Zealand.

Our research was done in collaboration with Jitscale (a cloud consultancy) and WatchMouse. Distances are determined by timing a TCP connect to an HTTP URL of an object provided by CloudFront, and do not include the DNS lookup.
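The measurement method described above can be approximated with a few lines of Python. This is only a sketch of the idea, not the code used for the research: the function name tcp_connect_ms and the default port and timeout are assumptions. The host name is resolved first, so that the DNS lookup stays outside the timed section, and only the TCP connect itself is timed.

```python
import socket
import time


def tcp_connect_ms(host, port=80, timeout=5.0):
    """Return the TCP connect time in milliseconds, excluding the DNS lookup."""
    # Resolve the name before starting the clock, so the lookup is not timed.
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        sock.connect(sockaddr)  # completes once the TCP handshake is done
        return (time.perf_counter() - start) * 1000.0
```

A single connect is noisy, so in practice one would repeat the call from each location and look at, for example, the median of the results.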

This is a guest column by Peter van Eijk, owner of Digital Infrastructures, a consultancy firm. He publishes a blog at http://petersgriddle.net and is currently setting up the Computer Measurement Group’s Dutch chapter. If you speak Dutch, follow the blog and join the NLCMG LinkedIn group.  Contact information is listed on LinkedIn and on www.digitalinfrastructures.nl/contact

LDAP monitoring? Yes we do…

Posted by stan on June 22nd, 2010

Not many people run public facing LDAP (Lightweight Directory Access Protocol) servers, but if you do, we can help you keep tabs on the availability and performance of your LDAP services, as seen by our monitoring stations worldwide.

In this post I’ll explain how to set up an LDAP monitor in your account. Beware, this is not for the faint of heart! If you have never encountered an LDAP server, or do not know the access parameters of your organisation’s LDAP service, this post is not for you… Okay, here we go!

Setting up an LDAP monitor

First, check that your LDAP server is indeed public facing, i.e. that it can be reached over the Internet. It may be secured with user name and password, and only accessible over SSL, but it should be reachable from our monitoring stations, or at least the ones you select for the monitoring of your server.

Then go to your monitoring settings, add a new monitor, and select ldap as its type. Then click on expert mode and fill out the following fields:

  • Host: The host name of the server
  • Port: The TCP port where the LDAP service is listening, typically 389, or 636 with SSL.
  • Path: The key to the information you want to retrieve. See below.
  • Encryption: SSL or none.
  • User name: If your LDAP server is only available to registered users, specify the account here. Note that this will probably be an LDAP path too, for example: cn=stan,ou=mt,dc=watchmouse,dc=com
  • Password: the password associated with the user name.
  • Match string: optionally a string (or regular expression, when enclosed in /-signs) to use to verify the output of the LDAP query.
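How the match string might be interpreted can be sketched as follows. This is an illustration under assumptions, not the actual WatchMouse implementation: the function name output_matches is made up, and the sketch assumes a case-sensitive substring test for plain strings and a regular expression search for anything wrapped in slashes.

```python
import re


def output_matches(output, match_string):
    """Check query output against a plain string or a /regex/ match string."""
    if not match_string:
        return True  # the match string is optional
    if (len(match_string) >= 2 and match_string.startswith("/")
            and match_string.endswith("/")):
        # Enclosed in /-signs: treat the inner part as a regular expression.
        return re.search(match_string[1:-1], output) is not None
    return match_string in output
```

With this reading, "Mark Christo Smith" is a literal match, while "/uid: m\w+/" would match any uid starting with an m.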

Optionally you may want to change timing settings, IPv6 preference, interval, certificate checks, etc.

Now the path is a bit complex, as it is composed of four parts, separated by question marks.

/base_dn?attributes?scope?filter

Here we follow the LDAP URL Format (RFC2255):

  1. base_dn: Distinguished name (DN) of an entry in the directory. This DN identifies the entry that is the starting point of the search. If no base DN is specified, the search starts at the root of the directory tree.
  2. attributes: The attributes to be returned. To specify more than one attribute, use commas to separate the attributes (for example, “cn,mail,telephoneNumber”). If no attributes are specified in the URL, all attributes are returned.
  3. scope: The scope of the search, which can be base, one, or sub. If no scope is specified, the server performs a base search.
    • base retrieves information about the distinguished name (base_dn) specified in the URL only.
    • one retrieves information about entries one level below the distinguished name (base_dn) specified in the URL. The base entry is not included in this scope.
    • sub retrieves information about entries at all levels below the distinguished name (base_dn) specified in the URL. The base entry is included in this scope.
  4. filter: Search filter to apply to entries within the specified scope of the search.
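To make the four parts concrete, here is a small Python sketch that splits such a path and applies the defaults listed above. The function name parse_ldap_path is made up for this illustration, and the sketch skips the percent-decoding that a full LDAP URL parser would do.

```python
def parse_ldap_path(path):
    """Split /base_dn?attributes?scope?filter into its four parts."""
    # Split at most three times, so a "?" inside the filter is preserved.
    parts = path.lstrip("/").split("?", 3)
    parts += [""] * (4 - len(parts))  # missing trailing parts become empty
    base_dn, attributes, scope, search_filter = parts
    scope = scope or "base"  # a base search is the default scope
    if scope not in ("base", "one", "sub"):
        raise ValueError("scope must be base, one or sub, got %r" % scope)
    return {
        "base_dn": base_dn,  # empty: start at the root of the tree
        "attributes": [a for a in attributes.split(",") if a],  # empty: all
        "scope": scope,
        "filter": search_filter,
    }
```

For the path dc=umich,dc=edu?uid,cn?sub?uid=mcs this gives the base DN dc=umich,dc=edu, the attributes uid and cn, the scope sub and the filter uid=mcs.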

An LDAP example

Host ldap.itd.umich.edu and path dc=umich,dc=edu?uid,cn?sub?uid=mcs

LDAP monitoring settings

This path indicates that the search starts at dc=umich,dc=edu, that only the UID and CN attributes are retrieved, that all levels below the start node are searched (sub), and that only the nodes with uid equal to ‘mcs’ are returned. Finally, we check if the string “Mark Christo Smith” is in the results.

Are you still with us? Good! Now try to set up a monitor for your own LDAP server. And while you set it all up, why not add this new monitor you created to your Public Status Page and share the status of your server with your visitors? Transparency is the new trend!

Hope you enjoyed this post, and if you have any problems setting up a monitor for your LDAP service, just open a ticket at the helpdesk, and we’ll follow up immediately.

- Stan


Publish your monitor status to Twitter

Posted by stan on April 26th, 2010

There are many hidden features* in the WatchMouse portal and today I’d like to tell you about such a small gem: Publish your monitor status to Twitter.

Some of our clients are very open about the status and performance of their services and website(s) and like to inform their users ASAP if there is an issue. Twitter is a perfect channel for that: we can tweet for you when there is an issue with your website and when it’s okay again.

To set it up in the WatchMouse portal, go to the ‘Contacts’ page and add a new contact of type “twitter”. Normally, you would just type your Twitter account in the ‘Screen name’ field and you will get alerts via Twitter DM (direct message). To have tweets in your public timeline, type your Twitter account name, followed by a colon, and your Twitter password (sorry, no OAuth yet**). Enter a name for this contact and save.

Now you can test this new contact by clicking on the test button in the contact overview, and a test tweet should show up in your Twitter timeline. Also, the message will show up in the message log page of the WatchMouse portal. Next: add this contact to a contact group, or create a group with this contact in it. This is not really required, but in most cases it is convenient to have an (escalation) group, where the first alert is sent to e.g. your cell phone, and if the issue remains, a Tweet is inserted in your timeline.

To finalise the set-up, select a monitor in the settings page, and assign the newly created contact or contact group to it, and you’re all done!

This all fits nicely with the Public Status Pages (PSP) feature of WatchMouse: using PSP, the current status and the performance over the last seven days are reported. In fact, we combined both the PSP feature and the Twitter timeline option discussed above, and created API-status (http://api-status.com/) and the Twitter profile: @api_status

- Stan

Twitter: @stannie

*) You may ask: why do you hide features? Well… We don’t hide features on purpose, but we selectively bring new features to the portal, and sometimes decide not to make the portal more complex when the feature may be of interest only to a small portion of our users. So if you think “wouldn’t it be convenient if …” we may already have it under the hood. And if not, we certainly would love to hear about it, so please open a ticket at our helpdesk to tell us!
**) Yes, we will add OAuth too. How many days left? See www.countdowntooauth.com

Pre-release of BadBoy available

Posted by Pieter Ennes on March 11th, 2010

While we were working on some improvements for Apache JMeter, Simon from BadBoy Software kindly made available a pre-release of his software. Next to many other improvements, it supports exporting scripts that use Variable Setter elements to JMeter, using functionality from JMeter’s similar Regular Expression Extractor. This is relevant in cases where your application exposes its session ID via the web page body, for example in a (hidden) form element, instead of using a cookie. This release will make the process of recording and uploading scripts directly from BadBoy much easier, which is terrific as we recommend BadBoy as a primary tool to record our script monitors. We’ve already got one customer using this enhancement!
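For readers unfamiliar with this kind of extraction, the idea can be illustrated with a few lines of Python. This is neither BadBoy nor JMeter code. The field name session_id and the HTML in the usage example are made up, and the pattern assumes that the name attribute comes before the value attribute.

```python
import re

# Hypothetical hidden form field; real applications use their own field names.
HIDDEN_FIELD = re.compile(r'<input[^>]*name="session_id"[^>]*value="([^"]*)"')


def extract_session_id(body):
    """Return the session ID found in the page body, or None if absent."""
    match = HIDDEN_FIELD.search(body)
    return match.group(1) if match else None
```

A script would run this on the response of one step, for example on a body such as <input type="hidden" name="session_id" value="abc123">, and send the extracted value along with the next request.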

The preview can be downloaded from: BadboyInstaller-2.1_beta_8_pre_1_wm.exe

On a related track: We’ve added a small island on our main dashboard reminding our scripting customers that if they use BadBoy to record and upload scripts in their WatchMouse account, they are in fact entitled to a free licence for BadBoy!

Thanks for the patch Simon!


Measuring connect times with Apache JMeter

Posted by Pieter Ennes on March 10th, 2010

I’m very happy to be able to publish another patch, this time for Apache’s Jakarta JMeter, that adds two new metrics (a connect time and a processing time) to all output channels. To cut right to the chase, this is how it looks:

JMeter connect time

Displayed is the breakdown of a typical HTTP redirect scenario, from google.com to google.co.uk in this case, in the JMeter GUI. Shown in the right pane are the timings of the selected parent element, summarising both the initial 302 and the 200 response from the server. The original Latency and Load time metrics are shown, and in yellow the metrics that I’ve added.

Connect time
This one simply measures the TCP connect time in HTTP samplers, including any host name lookups that may have to be done. So the printed time indicates the amount of time between the start of the test and the socket being ready. In case of a redirect (as above), the connect times to the individual hosts (google.com and google.co.uk) are summed into the parent element to indicate the total connect time that was involved.

Processing time
The processing time is defined as the time to first byte, just like the latency. But in contrast to the latency, the processing time from sub-requests is summed into the parent element, just like the connect time, page load time and page size.

Having both these numbers available can allow better analysis of a performance problem. For example: If the connect time stays the same but the processing time goes up, this may indicate a CPU-related problem on the server and not a network or DNS issue. But there are many other situations where a breakdown like this may help with performance research.
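The summing into the parent element can be sketched in Python like this. It is an illustration of the definitions above and not the patch itself (JMeter is written in Java). The class and field names are made up, and latency is left out because it is not summed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    label: str
    connect_ms: int     # TCP connect time, including host name lookups
    processing_ms: int  # time to first byte
    load_ms: int        # page load time
    size_bytes: int     # page size


def summarise(label: str, subs: List[Sample]) -> Sample:
    """Sum the metrics of all sub-requests into one parent sample."""
    return Sample(
        label=label,
        connect_ms=sum(s.connect_ms for s in subs),
        processing_ms=sum(s.processing_ms for s in subs),
        load_ms=sum(s.load_ms for s in subs),
        size_bytes=sum(s.size_bytes for s in subs),
    )
```

For a redirect with a 302 response followed by a 200 response, the parent then carries the total connect and processing time of both requests, which is what makes a comparison between the two metrics over time possible.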

Resolve time
Another performance metric that is missing in JMeter is the DNS resolve time. We are currently looking for a way to measure this, and I expect to be able to include it in the next version of the patch. Having the resolve time available as well will allow for even better analysis of a problem.

Feedback
At WatchMouse we use JMeter as a back-end for our functional monitors, and we are interested in these extra measurements to give our users a more detailed performance breakdown of their transactions. After deployment to our monitoring network we will start integration with our performance charts and Root Cause Analysis pages.

As always, we’re keen on your feedback regarding this patch! What’s your opinion on the definitions of these metrics? What else would you like to know regarding the performance of your scripts? And how would you like to see this information come back in our dashboard?

Pieter Ennes
WatchMouse
