The Trading Mesh

How to Reduce Latency by Running Trading Algorithms in a Switch

Mon, 07 Oct 2013 17:06:00 GMT           


By Mike O'Hara


Last week, Argon Design, a design services company based in Cambridge, UK, that works in high-performance networking, processor architecture and digital logic design, announced some impressive results from tests running a trading algorithm directly on an Arista Networks 7124FX Application Switch:


“For the measured leg in the test harness, latency was reduced from a previous best of 4,600ns to 176ns for algorithmically generated trades executed to the simulated market.”


I spoke with Steve Barlow, CTO of Argon Design and asked him to provide some background behind the press release.


“We specialise in FPGA & manycore processor design and architecting high performance systems”, he said. “And for the past year or so, we’ve been getting more & more involved in the high performance trading space. As a design services firm, we can help with architecture, with detailed implementation, with testing and so on. And because we’re not traders or specialists in finance - we’re specialists in the technology and the architecture - we bring a different mindset. So what we’ve done here is to demonstrate what’s possible, something interesting that shows the sort of things that can be done with this technology and with the right sort of design”.


To conduct the test, Argon Design used the Finteligent test framework developed at Intel’s fasterLab, a simple trading technology framework that allows latency to be measured and compared with other published figures. It was run as two separate processes on an x86 machine: a market data feed simulator that generates data in FIX format over TCP, and an exchange simulator that accepts orders, matches the trades and sends acknowledgements and fills, again via FIX.


The specific leg of latency they measured was from the point where market data left the market data simulator to the point where the exchange simulator received an order. So the 176ns latency quoted in the press release for their trading device included the receipt of market data, the making of a trading decision, the generation of an order and the submission of that order to the market – or specifically, from the end of the incoming market data packet to the end of the outgoing order packet.
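To put the two quoted figures in proportion, the arithmetic can be sketched in a few lines of Python. The 4,600ns and 176ns values come from the press release; the 10 Gbit/s line rate is an assumption based on the 10Gig ports mentioned in this article, and the function names are purely illustrative.

```python
PREVIOUS_BEST_NS = 4_600     # previous best quoted in the press release
IN_SWITCH_NS = 176           # in-switch figure quoted in the press release
LINE_RATE_BITS_PER_NS = 10   # assumption: a 10 Gbit/s port carries 10 bits per nanosecond


def speedup(before_ns: float, after_ns: float) -> float:
    """How many times shorter the measured leg became."""
    return before_ns / after_ns


def bytes_on_wire(duration_ns: float, bits_per_ns: float = LINE_RATE_BITS_PER_NS) -> float:
    """How many bytes a port at the assumed line rate serialises in the given time."""
    return duration_ns * bits_per_ns / 8


print(f"speedup: {speedup(PREVIOUS_BEST_NS, IN_SWITCH_NS):.1f}x")
print(f"time saved: {PREVIOUS_BEST_NS - IN_SWITCH_NS} ns")
print(f"{IN_SWITCH_NS} ns of wire time: {bytes_on_wire(IN_SWITCH_NS):.0f} bytes")
```

Under that line-rate assumption, the whole measured leg lasts about as long as it takes the port to serialise a couple of hundred bytes, which is why waiting for a complete packet before acting would be a significant cost.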


In order to achieve the lowest latency, their approach was to put the trading device directly in the switch.


“We used the Arista 7124FX switch because it’s got an FPGA built directly into it, which can access some of the 10Gig ports directly. We ran the trading system on that FPGA”, said Barlow. “The trading logic was pretty basic, just a simple rule-matching engine, but there’s no reason why it couldn’t be made more complex. We have a design structure that allows a very expressive set of rules to be created, which we think would cover most cases”.


The engineers at Argon Design used a couple of other interesting techniques to bring latency down to an absolute minimum: inline parsing and pre-emption.


“Inline parsing means you don’t have to wait until you’ve got the whole packet”, explained Barlow. “But you do have to be a bit careful because the packet could be corrupted, so you can’t make anything irrevocable until you’ve checked the checksum at the end of the packet. But you can make tentative decisions and line stuff up so that it’s all ready to go when you do see the checksum.


“And pre-emption allows output to be pre-prepared. It assumes that the order is going to be sent and then poisons it if it doesn’t want to make an order after all (or the market feed checksum is no good), in which case the packet just disappears as soon as it gets to the next switch.”
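The two techniques Barlow describes can be illustrated with a small software model. This is only a sketch under stated assumptions, not Argon Design's implementation: the message is a simplified FIX-like string of `tag=value` fields ending in a tag 10 checksum (sum of the preceding bytes modulo 256), the trading rule is a hypothetical "buy at or below a limit price", and "poisoning" is modelled as sending a deliberately wrong CRC-32 trailer so that the next hop discards the frame. All function names are invented for illustration.

```python
import zlib

SOH = 0x01  # FIX field delimiter


def make_message(fields):
    """Build a simplified FIX-like message with a trailing tag 10 checksum."""
    body = b"".join(f"{tag}={value}".encode() + bytes([SOH]) for tag, value in fields)
    return body + f"10={sum(body) % 256:03d}".encode() + bytes([SOH])


def frame_trailer(payload: bytes, poison: bool) -> bytes:
    """CRC-32 trailer for the outgoing frame; inverted when the order is poisoned."""
    crc = zlib.crc32(payload)
    if poison:
        crc ^= 0xFFFFFFFF  # guaranteed mismatch, so the receiver drops the frame
    return crc.to_bytes(4, "little")


def next_switch_accepts(frame: bytes) -> bool:
    """Model of the next hop: forward only frames whose trailer matches."""
    return zlib.crc32(frame[:-4]).to_bytes(4, "little") == frame[-4:]


def trade_inline(stream: bytes, limit_price: float, order_payload: bytes) -> bytes:
    """Parse the feed byte by byte, decide tentatively, commit or poison at the end."""
    running_sum = 0        # checksum accumulated as fields arrive
    field = bytearray()    # bytes of the field currently being received
    tentative_buy = False  # decision made before the checksum is known
    claimed = None         # checksum value carried in tag 10
    for byte in stream:
        if byte != SOH:
            field.append(byte)
            continue
        tag, _, value = bytes(field).partition(b"=")
        text = value.decode("ascii", "replace")
        if tag == b"10":
            claimed = int(text) if text.isdigit() else None
        else:
            running_sum += sum(field) + SOH
            if tag == b"44":  # price field: decide as soon as it has arrived
                try:
                    tentative_buy = float(text) <= limit_price
                except ValueError:
                    tentative_buy = False
        field.clear()
    checksum_ok = claimed is not None and claimed == running_sum % 256
    # Pre-emption: the payload is treated as already on its way out;
    # the only thing left to decide is whether the trailer is valid or poisoned.
    return order_payload + frame_trailer(order_payload, poison=not (tentative_buy and checksum_ok))


feed = make_message([(35, "W"), (55, "ABC"), (44, "99.50")])
order = b"hypothetical-order-bytes"
corrupted = feed.replace(b"99.50", b"19.50")  # payload altered, checksum left stale

print(next_switch_accepts(trade_inline(feed, 100.0, order)))       # rule fires, feed intact
print(next_switch_accepts(trade_inline(feed, 99.0, order)))        # rule does not fire
print(next_switch_accepts(trade_inline(corrupted, 100.0, order)))  # feed checksum fails
```

In this model only the first frame survives the next hop. A Python loop cannot reproduce the timing, of course; in hardware the order bytes would be streaming out while the market data is still arriving, and the sketch only shows the ordering of the decisions.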


This video describes the whole process:



Wanting to find out more about the technology behind these results, I spoke with Paul Goodridge, Regional Director for Financial Services at Arista, who stressed the role of the switch in future low-latency trading environments.


“The real story here is that the actual switch itself can be as valuable a tool in the execution of trades as any niche appliance”, he said.


“With the 7124FX, what stands out is its FPGA capability, but equally important is the metadata coming out of the switch, the ability to take those metadata streams and factor them into the way that you traverse the switch, and the decisions you make as a result of that.


“The 7124FX provides a tremendous degree of flexibility, but we’re already moving on beyond that in that we’re continually improving our EOS operating system to give more feedback around the biometrics of the switch. This allows HFT firms, for example, to get a more granular view as to the role the switch plays in their end-to-end ecosystem and a stronger insight within the context of what they’re trying to achieve.”


Goodridge explained that the market has always had the attitude to use leading-edge technologies such as FPGAs, but that Argon Design is bringing the aptitude to take full advantage of them.


“What they’ve done, which is fantastic, is clearly define that this is easier than first thought. They’ve shown that it’s not that hard to leverage this sort of technology and that the rewards can be significant”, he said. “And they’ve shown that these are not just bland marketing statements, these are demonstrable results and the benefits associated with the investment are very evident”.


More details about Argon Design, including a downloadable white paper that documents the test and results, can be found at


Additional information about Arista and their 7124FX switch can be found at

