Learn about pricing Asian options using the binomial model, with CUDA algorithms for faster computation. Understand how the average price tree and the option price tree are generated efficiently, and see the limitations and results of the approach.
Da-Yoon Chung, Daniel Lu: Pricing Asian Options with the Binomial Model
Options • An option is a contract that can be bought or sold; its value is a function of the value of the underlying stock • An Asian option is an option whose terminal value is based on the average price of the stock at certain points in time
Pricing Options • Recombinant Binomial Tree • Problem with Asian Options: • Non-recombinant: the payoff depends on the whole price path, so we have to consider all 2^N paths
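To see why the average does not recombine, the following minimal Python sketch enumerates every path of a small tree and groups the running averages by terminal node. The function name `path_averages` and the parameter values (`s0 = 100`, `u = 1.1`, `d = 1/u`) are illustrative assumptions, not taken from the slides.

```python
from itertools import product

def path_averages(s0, u, d, n):
    """Group the arithmetic average of each path by its terminal node.

    A terminal node is identified by its number of up moves. The average
    is taken over all n + 1 prices on the path, including s0.
    """
    by_node = {}
    for moves in product((True, False), repeat=n):  # all 2^n paths
        price, total, ups = s0, s0, 0
        for up in moves:
            price *= u if up else d
            total += price
            ups += up
        by_node.setdefault(ups, set()).add(total / (n + 1))
    return by_node

# Hypothetical parameters for illustration only.
avgs = path_averages(100.0, 1.1, 1 / 1.1, 2)
for ups in sorted(avgs):
    print(ups, sorted(avgs[ups]))
```

With two steps, the up-down and down-up paths end at the same stock price but carry different averages, so the middle terminal node holds two values rather than one. That is the reason the recombinant tree alone is not enough for an Asian option.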
(Serial) Algorithm Step 1: generate the average price tree • For each node (i,j), store 2*N of the C(i,j) possible values of the running average up to that point • Requires 2*N random paths of length O(N), each a sequence of up or down moves, for each (i,j) (O(N^3) storage overhead) • Step 2: generate the option price tree • Each level of the tree depends only on the next level • Use a backward inductive approach starting at the leaves of the tree • Use linear interpolation to estimate the option price at each node
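The two steps can be sketched in plain Python as follows. This is a rough illustration under stated assumptions, not the authors' implementation: it assumes Cox-Ross-Rubinstein parameters for `u`, `d`, and `p`, an average-price call payoff `max(A - K, 0)`, an average taken over all prices from time 0, and interpolation that clamps to the nearest stored average when the target falls outside the sampled range. All names (`asian_call_binomial`, `interp`, `node_price`) are hypothetical.

```python
import math
import random

def interp(x, xs, ys):
    """Linear interpolation on sorted xs, clamped at both ends (assumption)."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    lo, hi = 0, len(xs) - 1
    while hi - lo > 1:  # binary search for the bracketing interval
        mid = (lo + hi) // 2
        if xs[mid] <= x:
            lo = mid
        else:
            hi = mid
    w = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + w * (ys[hi] - ys[lo])

def asian_call_binomial(s0, strike, r, sigma, t, n, seed=0):
    rng = random.Random(seed)
    dt = t / n
    u = math.exp(sigma * math.sqrt(dt))  # assumed CRR up factor
    d = 1.0 / u
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    disc = math.exp(-r * dt)

    def node_price(i, j):
        return s0 * u ** j * d ** (i - j)

    # Step 1: for each node (i, j), sample 2*N random paths with j up moves
    # and keep the distinct running averages, sorted for interpolation.
    avgs = {}
    for i in range(n + 1):
        for j in range(i + 1):
            samples = set()
            for _ in range(2 * n):
                moves = [True] * j + [False] * (i - j)
                rng.shuffle(moves)
                s, total = s0, s0
                for up in moves:
                    s *= u if up else d
                    total += s
                samples.add(total / (i + 1))
            avgs[i, j] = sorted(samples)

    # Step 2: backward induction from the leaves; each level reads only
    # the level after it.
    values = {(n, j): [max(a - strike, 0.0) for a in avgs[n, j]]
              for j in range(n + 1)}
    for i in range(n - 1, -1, -1):
        for j in range(i + 1):
            node_vals = []
            for a in avgs[i, j]:
                # Running average after one more up or down move.
                a_up = (a * (i + 1) + node_price(i + 1, j + 1)) / (i + 2)
                a_dn = (a * (i + 1) + node_price(i + 1, j)) / (i + 2)
                v_up = interp(a_up, avgs[i + 1, j + 1], values[i + 1, j + 1])
                v_dn = interp(a_dn, avgs[i + 1, j], values[i + 1, j])
                node_vals.append(disc * (p * v_up + (1 - p) * v_dn))
            values[i, j] = node_vals
    return values[0, 0][0]

# Hypothetical inputs for illustration only.
print(asian_call_binomial(100.0, 95.0, 0.05, 0.2, 1.0, 8))
```

Because the stored averages are a random sample, the result is an estimate that depends on the seed; only for very small trees, where every node has a single average, is it exact for this model.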
CUDA Algorithm (global memory) Step 1: generate the average price tree • Compute all random paths required using Thrust • For each (i,j), the 2*N average values are computed in parallel (one thread per node) • Write all updates immediately to the global tree • Step 2: generate the option price tree • Compute each level of the tree in parallel • Write all updates immediately to the global tree
CUDA Algorithm (shared memory) Step 1: generate the average price tree • Again, for each (i,j), the 2*N average values are computed in parallel (one thread per node) • Store all intermediate values in shared memory to minimize global memory accesses • Use a hash function to generate the random path within the kernel (reduces memory overhead) Step 2: generate the option price tree • Divide the tree into subtrees at the same depth in the original tree, which can be computed independently • Compute one level of subtrees per kernel call • Store all computations for subtrees in shared memory
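The slides do not say which hash function is used or how a hashed value becomes a path. As one possible reading, the Python sketch below derives a path for node (i,j) from the node and sample indices alone, so no path needs to be stored or read from memory. It assumes the 32-bit finalizer from MurmurHash3 as the mixing function and a hash-driven Fisher-Yates shuffle to place the j up moves; the names `fmix32` and `hashed_path` and the seed-mixing constants are arbitrary illustrative choices.

```python
def fmix32(h):
    """32-bit finalizer from MurmurHash3, used here as a stateless mixer."""
    h &= 0xFFFFFFFF
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h

def hashed_path(i, j, sample):
    """Return a path of i moves with exactly j ups (True), derived only
    from (i, j, sample). The same inputs always give the same path."""
    moves = [True] * j + [False] * (i - j)
    # Arbitrary odd multipliers to combine the three indices (assumption).
    state = fmix32((i * 73856093) ^ (j * 19349663) ^ (sample * 83492791))
    # Fisher-Yates shuffle driven by the hash instead of a stored RNG stream.
    # The modulo introduces a small bias, which this sketch ignores.
    for k in range(i - 1, 0, -1):
        state = fmix32(state + k)
        swap = state % (k + 1)
        moves[k], moves[swap] = moves[swap], moves[k]
    return moves

path = hashed_path(10, 4, 3)
print(path)
```

In a kernel, each thread could call the equivalent of `hashed_path` with its own node and sample indices, which trades a little arithmetic for not having to load precomputed paths.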
Limitations • Size of shared memory • For N = 64, the tree occupies N * N * (2 * N) * sizeof(float) = 2^6 * 2^6 * (2 * 2^6) * 4 = 2^21 bytes = 2 MB (vs. 48 KB of shared memory) • Increasingly sequential as N increases • Nature of the algorithm • Step 2 of the algorithm is inherently sequential
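The memory arithmetic above can be checked with a few lines of Python. The sketch assumes 4-byte floats and a 48 KB shared memory budget, as stated on the slide; the helper names `tree_bytes` and `largest_n_that_fits` are hypothetical.

```python
FLOAT_BYTES = 4                   # sizeof(float)
SHARED_MEMORY_BYTES = 48 * 1024   # 48 KB, as quoted on the slide

def tree_bytes(n):
    """Bytes for an N x N tree with 2*N float values per node."""
    return n * n * (2 * n) * FLOAT_BYTES

def largest_n_that_fits(budget):
    """Largest N whose whole tree fits in the given byte budget."""
    n = 0
    while tree_bytes(n + 1) <= budget:
        n += 1
    return n

print(tree_bytes(64))                            # 2097152 bytes = 2^21 = 2 MB
print(largest_n_that_fits(SHARED_MEMORY_BYTES))  # 18
```

Under these assumptions a full tree fits in shared memory only up to N = 18, which is why larger trees have to be split into subtrees and processed over several kernel calls.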