Posted by Pieter Ennes on February 15th, 2011
A few weeks ago we formally launched Real Browser Monitoring after a soft introduction last December. We’re really excited about this new monitoring capability, and are working hard to add more meat to this type of monitoring. Currently we’re adding support for:
- Filling forms (POST) in a page
- Multi-step monitors with Real Browsers
- Flash (that’s actually a hard one to scale over all our customers)
In this post I’d like to share some of the implementation details, what additional improvements we’re working on, and a bit on other approaches we’re exploring.
Why Real Browser monitoring
To satisfy the demand for measurements from real browsers, many website monitoring companies have been competing to set up arrays of servers running Linux + Firefox, or Microsoft Windows + various flavours of Internet Explorer. For each real browser check, they need to fire up an instance of the browser (and in the case of IE, most likely the surrounding virtual machine too), load the web page, retrieve the results, and kill the instance. From an engineering point of view this scenario leaves room for optimisation, and from an operational point of view, with thousands of monitors that need to be executed every couple of minutes, scaling it up becomes key.
So at WatchMouse we looked at the problem from a different angle: we researched the main differences between browsers, asked clients and partners which of these differences they found most important, and finally built a monitoring solution that is both satisfactory in terms of browser profiling and offers some extra features that are hard to provide with the brute-force method mentioned before. Let me try to describe some of the gory details.
A lot of research on key inter-browser differences has been done by the great people involved in browserscope.org, a community-driven project for profiling web browsers. In fact, the BrowserScope network tab (which also contains Steve Souders’ excellent UA Profiler work) forms the basis of our solution: instead of firing up every browser executable on the market, on various platforms, and trying to puppeteer it to load a page for thousands of monitors, we embrace the BrowserScope network parameters, exploit the fine-grained control we have over a single engine, and simulate the different browsers.
Some of the key factors in browsers that affect the performance are:
- the number of parallel connections that the browser can open to the same host
- the way the browser handles concurrent downloads of scripts, images and CSS files
Using these parameters we built emulations of a number of actual browsers, which we fine-tuned by comparing the waterfall charts of different browsers on various test sites to the waterfall charts rendered by the emulation.
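The scheduling effect of such a parameter can be sketched with a toy simulation. The profile names and connection limits below are purely illustrative (not our actual values): resources from one host are assigned to the earliest free connection slot, and the finish time of the last download gives the total load time.

```python
import heapq

# Toy browser network profiles capturing only the "parallel
# connections per host" parameter discussed above. The names and
# numbers are illustrative, not WatchMouse's actual values.
PROFILES = {
    "ie7": {"max_conns_per_host": 2},
    "firefox3": {"max_conns_per_host": 6},
}

def simulate_waterfall(durations_ms, profile):
    """Assign each resource to the earliest free connection slot and
    return the finish time of the last download, in milliseconds."""
    slots = [0] * PROFILES[profile]["max_conns_per_host"]
    heapq.heapify(slots)  # min-heap of per-connection finish times
    finish = 0
    for d in durations_ms:
        start = heapq.heappop(slots)  # earliest connection to become free
        end = start + d
        finish = max(finish, end)
        heapq.heappush(slots, end)
    return finish

# Ten 100 ms resources from a single host:
print(simulate_waterfall([100] * 10, "ie7"))       # 500
print(simulate_waterfall([100] * 10, "firefox3"))  # 200
```

Even this crude model reproduces the headline effect: tripling the connection limit cuts the load time of many small resources by a factor of about three.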
More on the different parameters can be found on the site of the browserscope.org project. Some factors, however, I would like to discuss in more detail below.
Resolving host names
Most browsers nowadays handle DNS lookups in a similar way: by resolving host names asynchronously as soon as they are discovered, even before a TCP socket is free. The differences between browsers are therefore negligible when examining the full page load.
Some modern browsers (notably Chrome, Safari 5, and some versions of Firefox) also have a feature called DNS pre-fetching. This option is unrelated to the above and doesn’t affect loading times of single web pages. In multi-step transactions, however, the option enables resolving host names in links the user may click on in the near future. There are some privacy concerns with this and we will most probably leave it disabled in any future multi-step monitors. With multi-step transactions in mind, I propose to add a new BrowserScope network parameter, indicating whether DNS pre-fetching is enabled by default or not. In single-step monitors there should be no measurable influence on performance anyway.
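The asynchronous resolution described above can be sketched as follows. The `prefetch_dns` helper is hypothetical (not part of our engine): it starts lookups for every host name discovered in the page as soon as it is parsed, rather than waiting for a free connection slot.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# Hypothetical helper sketching asynchronous host resolution: all
# lookups run in parallel on worker threads, decoupled from the
# connection pool that will later fetch the resources.
def prefetch_dns(hostnames, max_workers=8):
    def resolve(host):
        try:
            return host, socket.gethostbyname(host)
        except socket.gaierror:
            return host, None  # unresolvable hosts are simply recorded as None
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(resolve, hostnames))

print(prefetch_dns(["localhost"]))
```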
But what about IPv6, or AAAA/A vs. A/AAAA lookups?
Still on the topic of resolve time: if one of our stations has IPv6 connectivity, it is configured to perform DNS lookups in accordance with RFC 3484. Early versions of Windows, on the other hand, seem to have had missing or non-standard IPv6 support. In other words: even with IPv6 connectivity available, Windows may prefer IPv4 over IPv6. More modern versions (i.e. Windows Vista and Server 2008 onwards) seem to adhere to RFC 3484, just like the nodes in our monitoring network.
Mac OS X also had its share of problems with RFC 3484, but the other way around: it always prefers IPv6. This situation doesn’t arise on our network, and unless someone is specifically interested in synthetic testing of this broken setup, we can ignore it.
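To see what ordering your own stack applies, the resolver can be inspected directly (a minimal sketch): on an RFC 3484 compliant system with IPv6 connectivity, `AF_INET6` entries should be listed before `AF_INET` ones.

```python
import socket

# Ask the local resolver for all addresses of a host, preserving the
# order in which the stack returns them. RFC 3484 governs this
# ordering; a compliant dual-stack host lists AF_INET6 results first.
def resolve_ordered(host, port=80):
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return [(family.name, sockaddr[0]) for family, _, _, _, sockaddr in infos]

for family, address in resolve_ordered("localhost"):
    print(family, address)
```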
Summarizing: We ignore any platform-induced differences with respect to name resolution, but as explained above, this should have minimal impact on the way a single page will be loaded.
TCP connect times and simultaneous connections
The overall connect times are significantly influenced by the number of connections a browser opens to each host, and to all hosts in total. Our engine sets the number of connections based on the previously mentioned research by Souders. We can therefore expect (and confirmed this in our experiments) that the connect times are very similar to those of the real browser.
Response times can be split into two components: the actual server response time and any network-induced latency. The latter depends only on geographical location, and hardly at all on OS or browser version. The former, the server response time, can be influenced by the number of concurrent connections the client opens to the server, or in theory by the ordering of the requests. However, since our engine sets the correct number of connections for each browser profile, only the ordering of the requests may differ. So far we have not seen any difference in practice due to this.
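This split can be illustrated with HAR-style timing fields (a rough sketch; treating the HAR `wait` phase as pure server time is an approximation, since it also contains one network round trip).

```python
# Decompose HAR-style timings (all values in milliseconds) into a
# network component and a server component. Field names follow the
# HAR timings object; the attribution of "wait" entirely to the
# server is an illustrative simplification.
def split_response_time(timings):
    network = timings["dns"] + timings["connect"] + timings["receive"]
    server = timings["wait"]  # time between request sent and first byte
    return {"network_ms": network, "server_ms": server}

entry = {"dns": 12, "connect": 35, "wait": 180, "receive": 22}
print(split_response_time(entry))  # {'network_ms': 69, 'server_ms': 180}
```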
Total pageload time
After obtaining a HAR file describing the pageload, we use many of the remaining BrowserScope network parameters to do the last step of fine-tuning the pageload profile to the selected browser. For example, the JS/CSS concurrency levels, combined with information from the HTML, tell us how the elements should be ordered.
For now, we don’t touch these at all, at least until we have done more research on how to heuristically predict the changes.
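As a minimal illustration of working with a HAR capture, the total page load time can be derived from the entries alone: the span from the earliest request start to the latest request's end. Field names follow the HAR 1.2 structure; this is a sketch, not our production code.

```python
from datetime import datetime

# Total page load time from a HAR capture, computed as the span from
# the earliest startedDateTime to the latest entry's end (start offset
# plus its "time" duration). Assumes HAR 1.2 field names.
def page_load_time_ms(har):
    entries = har["log"]["entries"]
    def start(entry):
        # HAR timestamps are ISO 8601 with a trailing "Z" for UTC
        return datetime.fromisoformat(entry["startedDateTime"].replace("Z", "+00:00"))
    t0 = min(start(e) for e in entries)
    return max((start(e) - t0).total_seconds() * 1000 + e["time"] for e in entries)

har = {"log": {"entries": [
    {"startedDateTime": "2011-02-15T10:00:00.000Z", "time": 120},
    {"startedDateTime": "2011-02-15T10:00:00.100Z", "time": 250},
]}}
print(page_load_time_ms(har))  # 350.0
```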
Pros and cons of the profiling approach
Pros:
- Accurate timings (using real browser events)
- Waterfall charts for all profiles, not just one
- HTTP authentication and SSL client certificates for all profiles
- Matching on dynamic texts for all profiles
- Snapshots for all profiles
- Scalability, immediacy
Cons:
- Rendering quirks and artefacts won’t show
- No interpretation of conditional comments (like <!--[if IE]>)
An independent expert opinion
Further considerations are pointed out by Aaron Peters, an independent Web Performance consultant:
It’s great to see WatchMouse acknowledging the added value of Real Browser Monitoring and adding this as a service to their customers. Measuring page load times with real browsers is key to understanding the end user experience.
he says, while adding:
WatchMouse takes an interesting and innovative approach by using a single platform and then applying algorithms based on browser profiles to calculate the ‘actual’ page load times. The results I have seen so far are impressive
Aaron does believe, however, that for certain web pages the above-mentioned drawbacks can have a significant impact on the results.
We believe that we have built a product that allows us to offer competitive, scalable and user-friendly real browser monitoring in a different way. We show that the profile approach has advantages as well as limitations, but when used in a synthetic monitoring environment, the benefits outweigh the imperfections by far.
If you feel you don’t get enough browser-specific detail from one of your checks, then it’s easy to do a single-shot measurement on the page in question using one of the many terrific tools out there.
When more insight is needed into different browsers on a continuous basis, passive (i.e. non-synthetic) methods may be a suitable alternative. Naturally, we’re working hard to get our version of Real User Monitoring (RUM) out there as soon as possible!
Having this product out in the open allows us to gather the larger volumes of data required to tune the profiles further. A current topic of interest is whether we can predict how the first-visual event translates across browsers. We’re also looking into installing real browsers on a small set of nodes to serve as a reference platform.
But that’s not all and we’re certainly not finished yet! So let us know in case you have any thoughts on this matter.
References:
- UA Profiler, http://stevesouders.com/ua/
- DNS Prefetching, http://www.chromium.org/developers/design-documents/dns-prefetching
- DNS Prefetching and Its Privacy Implications, http://www.usenix.org/event/leet10/tech/full_papers/Krishnan.pdf
- RFC 3484, http://www.ietf.org/rfc/rfc3484.txt
- RFC 3484 in Windows, http://support.microsoft.com/kb/969029
- Measuring and combating IPv6 brokenness, http://ripe61.ripe.net/presentations/162-ripe61.pdf