February 20, 2009

How do you measure Latency (RTT) in a network these days?

I had a debate with one of our customers the other day on how to measure latency in their network.

You may wonder why this even matters. Well, as I outlined in a previous blog, for almost all TCP-based applications (mail, web, FTP, custom socket applications…) latency matters just as much as bandwidth, or more. If bandwidth is the “speed of our road”, then latency is the “journey time”, which depends on “the length of the road” as well as all the “holdups” (competing traffic, delays at routers, switches, repeaters etc.) along the way. If we want to use a network emulator to test how an application will work in a real network, then we need to understand the latency, bandwidth etc. that it will experience.
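To put a rough number on why latency matters so much for TCP, here is a small sketch. It relies on the well-known rule of thumb that a single TCP connection cannot move more than one window of data per round trip. The function name and the example figures (a 100 Mbit/s link and a 65,535-byte window, the classic maximum without window scaling) are assumptions chosen purely for illustration.

```python
def max_tcp_throughput_bps(link_bps, window_bytes, rtt_seconds):
    """Rough upper bound on single-connection TCP throughput.

    At most one window of data can be in flight per round trip, so
    throughput is limited to window / RTT, and never exceeds the link rate.
    """
    window_limit_bps = (window_bytes * 8) / rtt_seconds
    return min(link_bps, window_limit_bps)


if __name__ == "__main__":
    LINK_BPS = 100_000_000   # assumed 100 Mbit/s link
    WINDOW_BYTES = 65_535    # assumed window, no window scaling
    for rtt_ms in (1, 10, 100):
        bps = max_tcp_throughput_bps(LINK_BPS, WINDOW_BYTES, rtt_ms / 1000)
        print(f"RTT {rtt_ms:>3} ms: at most {bps / 1e6:.2f} Mbit/s")
```

With these assumed figures, the same 100 Mbit/s link delivers at most about 5 Mbit/s to a single connection once the RTT reaches 100 ms, however much bandwidth is available.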

The simple, traditional way to measure latency is to use ping, or a tool based on ping or its underlying protocol, ICMP. These send an ICMP echo request to the destination, which turns it around and sends back a reply, and the round trip time (RTT) is calculated from the delay between the two. But this has problems in most modern networks!

Why? Most modern networks, including the Internet, corporate WANs and MPLS networks, implement some form of traffic prioritisation (often known as QoS, or Quality of Service). This has been done so that important applications, such as corporate VoIP, SAP etc., receive preferential handling, while others, such as access to iPlayer, receive a low priority. The problem with ping (ICMP) is that it receives no particular priority at all, and so is not really representative of any particular application.

I don’t mean to diss ping completely, as sometimes it’s all we’ve got. The way around the problem, though, is to measure the latency of “real” application traffic, which is subject to the appropriate network QoS. To do this we can take advantage of the fact that when a TCP connection is made, and before any HTTP, FTP etc. request is made or data is sent, a handshake (known as the three-way handshake) takes place between the client and the server.

By applying some simple maths to the timing of the handshake packets, we get to see the latency for precisely that application, as the handshake packets are subject to the same QoS treatment as the rest of the TCP session.
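As a rough illustration of the maths involved, the sketch below takes the timestamps at which the three handshake packets (SYN, SYN-ACK, ACK) pass a single observation point and works out the round trip time from them. The class and function names are hypothetical, and it assumes that the timestamps come from one capture point somewhere on the path and that the hosts' own processing delay is negligible. It is not a description of how any particular product does it.

```python
from dataclasses import dataclass


@dataclass
class Handshake:
    """Capture times (in seconds) of the three handshake packets,
    all recorded at the same observation point on the path."""
    t_syn: float      # client -> server SYN seen
    t_synack: float   # server -> client SYN-ACK seen
    t_ack: float      # client -> server ACK seen


def handshake_rtt(h):
    """Return (server_side, client_side, total) delays in seconds.

    server_side: observation point -> server -> observation point
    client_side: observation point -> client -> observation point
    total:       the full client <-> server round trip time
    """
    server_side = h.t_synack - h.t_syn
    client_side = h.t_ack - h.t_synack
    return server_side, client_side, server_side + client_side


if __name__ == "__main__":
    # Made-up timestamps, purely for illustration.
    example = Handshake(t_syn=0.000, t_synack=0.040, t_ack=0.055)
    server_side, client_side, total = handshake_rtt(example)
    print(f"server side: {server_side * 1000:.1f} ms")  # 40.0 ms
    print(f"client side: {client_side * 1000:.1f} ms")  # 15.0 ms
    print(f"total RTT:   {total * 1000:.1f} ms")        # 55.0 ms
```

If the observation point is the client itself, the client-side delay shrinks to almost nothing and the SYN to SYN-ACK gap is effectively the whole RTT.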

You could therefore try to create a TCP session to a target server, using the correct IP address and port, and time the handshake. However, this might cause problems for the server and, as we’re unlikely to complete the transaction, it could be regarded as a potential intrusion or DoS (Denial of Service) attack.
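For completeness, an active probe of this kind might look something like the sketch below. It is a minimal example only: the function name is hypothetical, it uses Python's standard socket module, and it treats the time taken by connect() as an approximation of the handshake RTT, since connect() returns once the SYN-ACK has arrived. Name resolution is done before the timer starts so that DNS delay is not counted.

```python
import socket
import time


def tcp_connect_rtt(host, port, timeout=3.0):
    """Approximate the RTT to host:port by timing a TCP connect.

    Returns the elapsed time in seconds. Raises OSError (including a
    timeout) if the connection cannot be established.
    """
    # Resolve first so the measurement excludes DNS lookup time.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        sock.connect(addr)  # returns when the handshake completes
        elapsed = time.perf_counter() - start
    # Leaving the "with" block closes the connection without sending data.
    return elapsed
```

Calling something like tcp_connect_rtt("server.example", 80) opens a connection and then drops it without ever making a request, and doing that repeatedly against somebody's production server is exactly the sort of behaviour that can set off alarms.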

Because of this, our INE Companion and iTrinegy AppQoS products provide the ability to “watch” the normal, existing transactions taking place and time their handshakes.