Tuesday, November 22, 2011

Waters European Trading Architecture Summit 2011

Some feedback from this event which I attended today.

Event http://events.waterstechnology.com/etas
Infrastructure Management: Reducing Costs, Improving Performance, Professor Roger Woods, Queen's University Belfast
Prof Woods gave an impassioned talk about a tool he has developed that takes C++, lets you navigate the code, and identifies subsystems that you can target to run on hardware or on an emulation of hardware.
  • He worked on the JP Morgan collaboration with Maxeler and was bullish about the technology.
  • Two years from pilot to production.
  • Developed a tool that allows identification of sections that are suitable for FPGA
  • Key issue: programming FPGA bitstreams (http://en.wikipedia.org/wiki/Bitstream) - took six months
  • C++ is translated into C (manually) before being cross-compiled into Java, which is what the Maxeler compiler requires.
  • This is to remove C++ abstraction, which "kills parallelisation" (see slides); a rough sketch of what this means appears after these bullets
  • Focus was hardware FFT - all other logic in software - comms via FPGA bitstream
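To make the abstraction point concrete, here is a rough sketch of my own (not code from the talk; the names are invented). A virtual-dispatch loop hides the call target for each element, while the de-abstracted version does the same arithmetic every iteration, which is the shape of code that maps onto an FPGA pipeline:

    // Illustrative only: virtual dispatch hides the data flow, so the loop body
    // cannot easily be turned into a fixed hardware pipeline.
    #include <cstddef>
    #include <vector>

    struct Instrument {
        virtual ~Instrument() {}
        virtual double value(double spot) const = 0;  // call target unknown per element
    };

    double book_value_abstract(const std::vector<Instrument*>& book, double spot) {
        double total = 0.0;
        for (std::size_t i = 0; i < book.size(); ++i)
            total += book[i]->value(spot);            // indirect call inside the hot loop
        return total;
    }

    // De-abstracted form: plain arrays and the same arithmetic every iteration.
    double book_value_flat(const double* notional, const double* weight,
                           std::size_t n, double spot) {
        double total = 0.0;
        for (std::size_t k = 0; k < n; ++k)
            total += notional[k] * weight[k] * spot;
        return total;
    }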
In summary:
  • Ideal for risk calculation and Monte Carlo where the algorithm does not change (see the sketch after this list).
  • C++ legacy code does not parallelise easily and is not a candidate for FPGA
  • Three year dev cycle.
  • Complex, manual process
  • JPM owns 20% of Maxeler
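For context on the Monte Carlo point, here is a minimal sketch of my own (invented parameter names, not JPM's code) of the kind of fixed-algorithm path loop that suits offload: every path runs identical arithmetic, so the loop body can become a hardware pipeline while the host supplies inputs and averages the results:

    // Illustrative only: a fixed-algorithm Monte Carlo loop - identical work per
    // path, no data-dependent control flow in the hot loop.
    #include <algorithm>
    #include <cmath>
    #include <random>

    double mc_call_price(double spot, double strike, double rate,
                         double vol, double maturity, int paths) {
        std::mt19937_64 rng(42);
        std::normal_distribution<double> gauss(0.0, 1.0);
        const double drift = (rate - 0.5 * vol * vol) * maturity;
        const double diffusion = vol * std::sqrt(maturity);
        double payoff_sum = 0.0;
        for (int p = 0; p < paths; ++p) {             // identical work per path
            const double terminal = spot * std::exp(drift + diffusion * gauss(rng));
            payoff_sum += std::max(terminal - strike, 0.0);
        }
        return std::exp(-rate * maturity) * payoff_sum / paths;
    }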
This continued into a panel hosted by Chris Skinner.

Panel: The Parallel Paradigm Shift: are we about to enter a new chapter in the algorithmic arms race?
Moderator: Chris Skinner. Panel: Prof Woods; Steven Weston, Global Head of Analytics, JPM; Andre Nedceloux, Sales, Excelian.
  • The FPGA plant needs to be kept hot to achieve the best latency. To keep the FPGAs busy you need a few regular host cores loading work onto them (see the feeder sketch after this list).
  • Programming/debugging directly in VHDL is ‘worse than a nightmare’, don’t try.
  • Isolate the worst-performing pieces (Amdahl's law), de-abstract them and place them on the FPGA; they call each of the isolated units a 'kernel' (a worked Amdahl example follows this list).
  • Compile times are high for the Maxeler compiler to output VHDL: four hours for a model on a 4-core box.
  • Iterative approach to optimisation and implementation. They improved both the mathematics in the models and the implementation on the FPGA, i.e. they treat it not just as a programming problem but also as a maths modelling one.
  • They use Python to manage the interaction with the models (e.g. pulling reports).
  • Initially run a model on the FPGA hosts and then incrementally update it through the day as market data changes or announcements occur.
  • No separate report-running phase: it is included in the model run and the report is kept in memory. Data is only written out to a database at night; if it is destroyed it can be re-created.
  • Low latency is no longer a competitive advantage but now the status quo for investment banking.
  • Requires specialist (not generalist or outsourced) programmers who understand hardware and algorithms and who work alongside the business.
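A rough sketch of the "keep the plant hot" point, my own illustration: submit_to_fpga and Job are invented stand-ins for whatever the vendor runtime actually provides. A few host threads drain a queue and push work to the card so it never sits idle:

    // Illustrative only: a small pool of host threads keeps the accelerator fed.
    #include <condition_variable>
    #include <cstddef>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    struct Job { int model_id; };                       // inputs for one kernel invocation

    void submit_to_fpga(const Job&) { /* hand off to the vendor runtime here */ }

    class Feeder {
    public:
        explicit Feeder(int host_cores) : stopping_(false) {
            for (int i = 0; i < host_cores; ++i)
                workers_.push_back(std::thread(&Feeder::run, this));
        }
        void post(const Job& j) {
            { std::lock_guard<std::mutex> lock(mutex_); pending_.push(j); }
            ready_.notify_one();
        }
        ~Feeder() {
            { std::lock_guard<std::mutex> lock(mutex_); stopping_ = true; }
            ready_.notify_all();
            for (std::size_t i = 0; i < workers_.size(); ++i) workers_[i].join();
        }
    private:
        void run() {
            for (;;) {
                Job j;
                {
                    std::unique_lock<std::mutex> lock(mutex_);
                    while (!stopping_ && pending_.empty()) ready_.wait(lock);
                    if (stopping_ && pending_.empty()) return;
                    j = pending_.front();
                    pending_.pop();
                }
                submit_to_fpga(j);                      // keep the accelerator fed
            }
        }
        std::queue<Job> pending_;
        std::mutex mutex_;
        std::condition_variable ready_;
        bool stopping_;
        std::vector<std::thread> workers_;
    };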
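And a quick Amdahl's law check with illustrative numbers of my own (not figures from the panel), showing why it has to be the worst-performing pieces that get isolated:

    // Illustrative numbers: if the isolated kernel is 80% of runtime and the FPGA
    // makes it 50x faster, overall speedup is only 1 / ((1 - 0.8) + 0.8/50), about
    // 4.6x - the remaining 20% of un-accelerated code dominates.
    #include <cstdio>

    int main() {
        const double accelerated_fraction = 0.8;   // share of runtime moved onto the FPGA
        const double kernel_speedup = 50.0;        // speedup of that share
        const double overall =
            1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / kernel_speedup);
        std::printf("overall speedup: %.1fx\n", overall);  // prints ~4.6x
        return 0;
    }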
Panel: How low can you go? Ultra-low-latency trading

Moderator: David Berry. Panel: Jogi Narain, CTO, FGS Capital LLP; Benjamin Stopford, Architect, RBS; Chris Donan, Head of Electronic Trading, Barcap.

This was a well-run panel with some good insights from Chris Donan in particular:

  • Stock programmers don't understand the full path from network to NIC to kernel stack to application, or the underlying hardware operations
  • Small teams of experienced engineers produce the best results
  • Don't develop VHDL skills in-house - use external resources.
  • Latency gains correlate to profitability
  • FPGA is good for market data (i.e. a fixed problem) and for risk
  • Software parallelism is the future (a minimal host-side sketch follows).
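On that last point, a minimal sketch of plain host-side software parallelism (my own illustration; price_one is an invented stand-in for a real pricing function): split a loop across hardware threads with nothing more exotic than std::thread, each thread accumulating into its own slot so no locking is needed.

    // Illustrative only: data-parallel pricing loop split across CPU threads.
    #include <algorithm>
    #include <cstddef>
    #include <numeric>
    #include <thread>
    #include <vector>

    double price_one(double trade) { return trade * 1.0001; }   // stand-in model

    double parallel_total(const std::vector<double>& trades) {
        unsigned n_threads = std::thread::hardware_concurrency();
        if (n_threads == 0) n_threads = 1;
        std::vector<double> partial(n_threads, 0.0);
        std::vector<std::thread> pool;
        const std::size_t chunk = (trades.size() + n_threads - 1) / n_threads;
        for (unsigned t = 0; t < n_threads; ++t) {
            const std::size_t begin = t * chunk;
            const std::size_t end = std::min(trades.size(), begin + chunk);
            pool.push_back(std::thread([&partial, &trades, t, begin, end] {
                for (std::size_t i = begin; i < end; ++i)
                    partial[t] += price_one(trades[i]);         // thread-local accumulation
            }));
        }
        for (std::size_t i = 0; i < pool.size(); ++i) pool[i].join();
        return std::accumulate(partial.begin(), partial.end(), 0.0);
    }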