Wednesday, August 29, 2007

Beer in the Evening - Data Parallelisation and Functional Languages in Finance - September 25th 2007

John Barr, who has recently joined the 451 Group from Intel, has suggested that we get together again for another beer in the evening. John wrote an article called "Reinventing the future: parallelism and acceleration", which tracks the recent rise in accelerated computing and is a comprehensive review of the area.

I'll put up some reading material in due course. Watch this space.

So if you're up for a beer and a casual chat about HPC, HFF, low latency, parallelism, messaging vs events etc., we'll be meeting in the Red Lion, 8 Lombard Court, London from 6:30pm onwards on the 25th of September.

You can call me on 0791 505 5380 or email rgb at enhyper.com if you have any questions.






Micro and Macro Performance Analytics - a Log with Two Tales


Sometimes when we measure things we end up introducing latency into the system. I recently came across a neat scheme where a timing message was “stamped” on every hop then rerouted to the source. The average one-way trip time was calculated by dividing the elapsed round trip time in two. The overall times produced were adequate for what the guy required but there was an annoying latency spike which happened periodically and was a cause for concern.
The average trip time from source to receiver was 900 microseconds. Occasionally, a 9 millisecond time was reported which, if repeated in production, would have been unacceptable. To find the problem, all the messages above 1ms were tagged and grep’d out into a file. Analysing this log file in vi showed that the latency spikes had a definite periodicity of around 600ms.
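The filtering and eyeballing described above can also be done in a few lines of code. What follows is only a minimal sketch in Python, not the tooling actually used: the sample format (a send time in milliseconds paired with a round trip time in microseconds) and all function names are assumptions made for illustration, and the 1ms threshold is applied to the halved round trip time.

```python
from statistics import median

SPIKE_THRESHOLD_US = 1_000  # 1 ms, the cut-off mentioned in the post


def one_way_us(round_trip_us):
    """Estimate one-way latency by halving the round trip.

    This assumes the outbound and return paths are symmetric.
    """
    return round_trip_us / 2


def spike_intervals_ms(samples, threshold_us=SPIKE_THRESHOLD_US):
    """Return the gaps (ms) between consecutive latency spikes.

    samples: iterable of (send_time_ms, round_trip_us) tuples in time
    order. This record layout is hypothetical, not the real log format.
    """
    spike_times = [t for t, rtt in samples if one_way_us(rtt) > threshold_us]
    return [later - earlier for earlier, later in zip(spike_times, spike_times[1:])]


def typical_period_ms(samples, threshold_us=SPIKE_THRESHOLD_US):
    """Median gap between spikes, or None if fewer than two spikes were seen."""
    gaps = spike_intervals_ms(samples, threshold_us)
    return median(gaps) if gaps else None
```

Taking the median gap rather than the mean keeps one unusually long pause between spikes from distorting the estimated period.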

Visual inspection is a great technique as the brain is good at recognising patterns. However, now that we can generate millions of log entries per second, reading raw logs has a severe limitation: it only highlights micro trends. It’s therefore important to visualise the data too, so that macro trends can be recognised. Luckily I did, as there was one lurking in the data.
I took a minute’s worth of >1ms log file entries and manipulated them in good old Excel to produce the graph below:

As you can see from the graph, there's an obvious 10 second event which is highly regular in its form. On further investigation, it turned out to be a call to a summarisation routine which caused the spikes to cease momentarily. Now it's lucky this was found using a synthetic load - in production, it could be a lot more difficult to spot. On the plus side, it's nice to come across a well designed load simulation - this is seldom the case. People are prone to quote averages and pay little attention to the spikes - but in high frequency trading, it's the spike that will lose you big money.
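The same macro view can be had without a spreadsheet by bucketing the spike timestamps into one second bins and looking for the bins where spikes stop. This is only a rough sketch in Python, not the analysis behind the graph: the function names are made up for illustration, and it assumes the spike times have already been extracted as millisecond offsets from the start of the capture.

```python
from collections import Counter


def spikes_per_second(spike_times_ms, duration_s):
    """Count spikes in each one-second bucket of the capture.

    spike_times_ms: spike timestamps as millisecond offsets from the
    start of the capture (an assumed input format).
    """
    counts = Counter(int(t // 1000) for t in spike_times_ms)
    return [counts.get(second, 0) for second in range(duration_s)]


def quiet_seconds(per_second):
    """Seconds in which no spike was seen, i.e. the gaps in the pattern."""
    return [second for second, n in enumerate(per_second) if n == 0]


def text_chart(per_second):
    """Crude text histogram: one row per second, one '#' per spike."""
    return "\n".join(f"{second:3d}s {'#' * n}" for second, n in enumerate(per_second))
```

Printing `text_chart(spikes_per_second(times, 60))` for a minute of data gives a quick picture on a terminal, and `quiet_seconds` lists the bins worth investigating. With spikes roughly every 600ms, every one second bin should normally hold at least one spike, so an empty bin stands out.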

So, be careful what you measure and be careful that by measuring, you don't introduce side effects which could so easily be hidden in production.