Saturday, June 16, 2007

Mining Massive Data Sets for Security

Semiophore points me to the forthcoming two-week workshop on the above, to be held in mid-September 2007 in Italy.

"It is the purpose of this workshop to review the various technologies available (data mining algorithms, social networks, crawling and indexing, text-mining, search engines, data streams) in the context of very large data sets."

I'd love to attend, as this is an area I think is crucial for High Frequency Finance. Whilst working on a high-performance trade order router for a tier 1, I did some research which I was allowed to present publicly at the Fiteclub, a forum which meets occasionally in London. I presented two papers of note. The first, Financial Data Infrastructure with HDF5, concentrated on high-performance data delivery and analysis. In it I proposed a machine, built from COTS components for around $25K, that could eat 20TB of data in 90 minutes. This was inspired by the seminal article on disk technology amusingly entitled "You don't know jack about disks", published by the ACM.
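To put the 20TB in 90 minutes figure in perspective, here is a rough back-of-envelope sketch of the aggregate read rate it implies. The per-disk sequential rate of 70 MB/s is purely an assumption for illustration, not a number from the presentation, and the function names are made up for this example.

```python
import math


def required_throughput_mb_s(total_tb: float, minutes: float) -> float:
    """Aggregate read rate in MB/s needed to scan total_tb in the given time.

    Uses decimal units: 1 TB = 1,000,000 MB.
    """
    total_mb = total_tb * 1_000_000
    return total_mb / (minutes * 60)


def disks_needed(aggregate_mb_s: float, per_disk_mb_s: float) -> int:
    """Minimum number of disks read in parallel to reach the aggregate rate."""
    return math.ceil(aggregate_mb_s / per_disk_mb_s)


# Figures from the post: 20TB scanned in 90 minutes.
rate = required_throughput_mb_s(20, 90)

# ASSUMPTION: 70 MB/s sustained sequential read per disk (illustrative only).
disks = disks_needed(rate, per_disk_mb_s=70.0)

print(f"aggregate rate: {rate:.0f} MB/s, disks at 70 MB/s each: {disks}")
```

The target works out to roughly 3.7 GB/s of sustained sequential reads, which is why the design leans on many commodity disks streaming in parallel rather than on one fast device.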

The second presentation, also at Fiteclub, was entitled Open Source Intelligence in Finance. It took the techniques used in open source intelligence and applied them to finance, building the case for news analysis applied to program trading.
