GPGPUs in computational finance: massive parallel computing for American style options
Author(s) - Pagès Gilles, Wilbertz Benedikt
Publication year - 2011
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.1774
Subject(s) - computer science , cuda , computational finance , dynamic programming , parallel computing , general purpose computing on graphics processing units , valuation of options , mathematical finance , computational economics , embarrassingly parallel , autoregressive model , parallel algorithm , finance , algorithm , mathematics , graphics , computer graphics (images) , economics , macroeconomics , econometrics
SUMMARY The pricing of American-style and multiple-exercise options is a very challenging problem in mathematical finance. One usually employs a least squares Monte Carlo approach (Longstaff–Schwartz method) to evaluate the conditional expectations that arise in the backward dynamic programming principle for such optimal stopping or stochastic control problems in a Markovian framework. Unfortunately, these least squares MC approaches are rather slow and, because of the dependency structure in the backward dynamic programming principle, do not allow a parallel implementation either on the MC level or on the time-layer level of this problem. We therefore present in this paper a quantization method for the computation of the conditional expectations that allows a straightforward parallelization on the MC level. Moreover, for first-order autoregressive processes we develop a further parallelization in the time domain, which makes use of faster memory structures and therefore maximizes parallel execution. Furthermore, we discuss the generation of random numbers in parallel on a GPGPU architecture, which is the crucial tool for the application of massively parallel computing architectures in mathematical finance. Finally, we present numerical results for a CUDA implementation of these methods. It turns out that such an implementation leads to an impressive speed-up compared with a serial CPU implementation. Copyright © 2011 John Wiley & Sons, Ltd.
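The backward dynamic programming principle mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: it prices a Bermudan put by backward induction on a recombining binomial tree, used here only as a toy stand-in for the paper's quantization grids, and all parameter values (spot, strike, rate, volatility, step count) are illustrative assumptions.

```python
import numpy as np

def bermudan_put_quantized(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0, n=50):
    """Backward dynamic programming for a Bermudan put on a recombining
    binomial tree (a toy stand-in for a quantization grid): at each time
    layer, compare the exercise payoff with the discounted conditional
    expectation of the next layer's value."""
    dt = T / n
    u = np.exp(sigma * np.sqrt(dt))        # up factor per step
    d = 1.0 / u                            # down factor per step
    p = (np.exp(r * dt) - d) / (u - d)     # risk-neutral up probability
    disc = np.exp(-r * dt)                 # one-step discount factor

    # Terminal layer: option value equals the exercise payoff at maturity.
    S = S0 * u ** np.arange(n, -1, -1) * d ** np.arange(0, n + 1)
    V = np.maximum(K - S, 0.0)

    # Backward induction: V_k = max(payoff_k, E[disc * V_{k+1} | node]).
    for k in range(n - 1, -1, -1):
        S = S0 * u ** np.arange(k, -1, -1) * d ** np.arange(0, k + 1)
        cont = disc * (p * V[:-1] + (1 - p) * V[1:])
        V = np.maximum(K - S, cont)
    return V[0]

print(bermudan_put_quantized())
```

Note the dependency the summary describes: each time layer needs the whole next layer before it can be computed, so the loop over `k` is inherently sequential, while the work inside each layer (here vectorized by NumPy) is the part that parallelizes well.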
