Wednesday, June 20, 2007

A Lesson in Simplification: Heuristic methods, Genetic Algos and NP Complete Problems

It's rare to find an intriguing and difficult task in investment banking - mostly our work involves straight forward computational work. I was asked to write a program for a guy who was manually sorting spreadsheets which contained totals of net asset values for exchange traded funds from Reuters. He was trying to find out which flags had been used by Reuters to calculate the total.

My initial reaction was that this would be a simple iterative algorithm which would be easy to implement. After consulting two quantitative analysts I know, I was informed that I'd come across the Postage Stamp problem (a.k.a. the Money Changing problem) which could only be solved using a brute force approach. Technically, the problem is N-P Complete. There was no known algorithm for this so he suggested I go through all the subsets as outlined in this article on Generating Partitions. At this point I felt like giving up but I persevered, and read some interesting articles which related the problem to Frobenious so I found some code which implements this though didnt get particulary far with this as it produced a (to me at least) meaningless series of numbers. I did find this McNuggets Problem analogy from Wolfram useful in explaining the problem in Joe six-pack terms. I also found a candidate algorithm which looked good for the job - The Diophantine Linear Equation.

A young turk here, Darryl Stein, was itching to help so we both worked on candidate solutions. He adopted a
GA approach based on a C# GA library from those lovely people at Google - Aforge.NET - this was efficient and reasonably accurate on the data we had - but being c# - it couldnt' work on the unix platform so was not really feasible as this had to be done server side (the data was coming out of Vhayu).

So we ended up coding a version of the Diophantine equation which did pretty much what the guy was doing with the spreadsheet - ie - starting with the biggest number, then the next until the limit was breached then sieving through the permutations until you get a hit or not. Darryl beat me to it with a neatly coded recursive solution which did the job.








No comments: