Tuesday, November 22, 2011

Waters European Trading Architecture Summit 2011

Some feedback from this event which I attended today.

Event http://events.waterstechnology.com/etas
Infrastructure Management: Reducing costs, Improving performance, Professor Roger Woods, Queen's University Belfast
Prof Woods gave an impassioned talk about a tool he has developed that takes C++, lets you navigate the code and identify subsystems which you can target to run on hardware or on an emulation of hardware.
  • He worked on the JP Morgan collaboration with Maxeler and was bullish about the technology.
  • Two years from pilot to production.
  • Developed a tool that allows identification of code sections that are suitable for FPGA
  • Key issue: programming FPGA bitstreams (http://en.wikipedia.org/wiki/Bitstream) - took six months
  • C++ is translated into C (manually) before being cross-compiled into Java, which is what the Maxeler compiler requires.
  • This is to remove C++ abstraction, which "kills parallelisation" (see slides)
  • Focus was hardware FFT - all other logic in software - comms via FPGA bitstream
In summary:
  • Ideal for risk calculation and Monte Carlo, where the algorithm does not change.
  • C++ legacy code does not parallelise easily and is not a candidate for FPGA
  • Three year dev cycle.
  • Complex, manual process
  • JPM own 20% of Maxeler
The talk continued into a panel hosted by Chris Skinner.
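
To make the "fixed algorithm" point from the summary a bit more concrete, here is a rough sketch of my own (not from the talk, and not how JPM or Maxeler do it) of a Monte Carlo pricer written as a flat function over plain numbers. The function names, the European call payoff and all the parameters are assumptions chosen for illustration. The thing to notice is that each batch of paths is independent of the others, which is the kind of structure that can be handed to parallel hardware once the surrounding abstraction has been stripped away.

```python
import math
import random


def payoff_sum(seed, n_paths, spot, strike, rate, vol, maturity):
    """Sum of call payoffs over one independent batch of simulated paths.

    Hypothetical 'kernel': plain numbers in, one number out, no shared state.
    """
    rng = random.Random(seed)  # each batch gets its own generator
    drift = (rate - 0.5 * vol * vol) * maturity
    diffusion = vol * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        # Terminal price under geometric Brownian motion (an assumed model).
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        total += max(terminal - strike, 0.0)
    return total


def price_call(n_batches, paths_per_batch, spot, strike, rate, vol, maturity):
    """Combine independent batches into one discounted price estimate."""
    # The batches do not depend on each other, so this loop is the part
    # that could be spread across cores or accelerator kernels.
    total = sum(
        payoff_sum(seed, paths_per_batch, spot, strike, rate, vol, maturity)
        for seed in range(n_batches)
    )
    mean_payoff = total / (n_batches * paths_per_batch)
    return math.exp(-rate * maturity) * mean_payoff


if __name__ == "__main__":
    # Illustrative inputs only.
    print(price_call(8, 12_500, 100.0, 100.0, 0.05, 0.2, 1.0))
```

The batches run one after another here; the sketch only shows that the work splits cleanly, not how to offload it.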

Panel: The Parallel Paradigm Shift: are we about to enter a new chapter in the algorithmic arms race?
Moderator: Chris Skinner. Panel: Prof Woods; Steven Weston, Global Head of Analytics, JPM; Andre Nedceloux, Sales guy, Excelian
  • FPGA plant needs to be kept hot to achieve best latency. To keep the FPGAs busy you need a few regular host cores loading work onto them.
  • Programming/debugging directly in VHDL is ‘worse than a nightmare’, don’t try.
  • Isolate the worst performing pieces (Amdahl’s law), de-abstract them and place them on FPGA; they call each of the isolated units a ‘kernel’.
  • Compile times are high for the Maxeler compiler to output VHDL: 4 hours for a model on a 4-core box.
  • Iterative model for optimisation and implementation. They improved both the mathematics in the models and the implementation onto FPGA, i.e. consider it not just a programming problem, but also a maths modelling one.
  • They use Python to manage the interaction with the models (e.g. pulling reports)
  • Initially run a model on the FPGA hosts and then incrementally update it through the day - when market data or announcements occur.
  • No separate report running phase: it is included in the model run and the report is kept in memory. Data is only written out to a database at night; if it is destroyed then it can be re-created.
  • Low latency is no longer a competitive advantage but now a status quo service for investment banking.
  • Requires specialist, non-general/outsourced programmers who can understand hardware and algorithms and who work alongside the business.
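
The Amdahl's law point above is worth a quick back-of-envelope calculation. This is my own sketch, not something shown by the panel: if a fraction p of the runtime is moved into a kernel that runs s times faster, the overall speedup is 1 / ((1 - p) + p / s). The function name and the numbers below are made up for illustration.

```python
def amdahl_speedup(kernel_fraction: float, kernel_speedup: float) -> float:
    """Overall speedup when `kernel_fraction` of the runtime is made
    `kernel_speedup` times faster and the rest is left unchanged."""
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    # Time left after acceleration, as a fraction of the original runtime.
    remaining = (1.0 - kernel_fraction) + kernel_fraction / kernel_speedup
    return 1.0 / remaining


if __name__ == "__main__":
    # Illustrative numbers only, not figures from the panel.
    for fraction in (0.50, 0.90, 0.99):
        print(f"{fraction:.0%} of runtime in kernel, kernel 100x faster: "
              f"{amdahl_speedup(fraction, 100.0):.1f}x overall")
```

Even a 100x kernel gives only about 9x overall if it covers 90% of the runtime, which is why picking the pieces that dominate the runtime matters more than how fast any single kernel gets.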

How low can you go? Ultra-low-latency trading

Moderator: David Berry

Members: Jogi Narain, CTO, FGS Capital LLP; Benjamin Stopford, Architect, RBS; Chris Donan, Head of Electronic Trading, Barcap.

This was a well-run panel with some good insights from Chris Donan in particular:

  • Stock programmers don't understand the stack from network to NIC to stack to application, or the underlying hardware operations
  • Small teams of experienced engineers produce the best results
  • Don't develop VHDL skills in house - use external resources.
  • Latency gains correlate with profitability
  • FPGA is good for market data (i.e. a fixed problem) and risk
  • Software parallelism is the future.