Programmers take the burden of writing parallel programs to increase speed and
memory. The aim of every parallel algorithm designer is to write code that scales linearly,
i.e. runs
times as fast on a
-processor machine. This clearly
constitutes an upper bound, if the sequential algorithm is already optimal.
Linear scalability is achieved by using good load-balancing, keeping all processors
busy all the time and communication costs are minimized.
Data dependency can make optimal speed-up impossible. It determines parallel complexity, the minimum number of steps an algorithm would need to run on a PRAM-computer. This constitutes an upper bound on the maximal speed-up that can be achieved.
There are many different and more sophisticated layouts of parallel implementations possible. The right choice
depends on the size of the portfolio and available hardware. For the sake of simplicity in
this article we have chosen the brute force approach.
Sketch of the scalability for a parallel brute-force GS: To avoid local extrema, one has to start many times from different grid-points:
Sketch of the scalability for SI:
One typical schoolbook error in this context is not to use a high quality random number generator, assuring independent random number streams on all processors (cf. Mascagni, Ceperley and Srinivasan (1998, 1999)).
Large clusters as well as the rise of grid-computing requires analytic forecastig of run-times to chose the appropriate hardware for the task. There are many potential trade-offs (cf. Jarvis et al. (2002, 2003) and Roehrl (1998)): time versus money, etc. Our paper has shown that a pragmatic approach can take advantage of developments in computerscience to enable the exploration of new portfolio optimization techniques using parallel computing techniques.