Assignment 2: Matrix Multiplication using shared memory

Fabian Prada


1) Experiment 1:

In this experiment I compared three matrix multiplication strategies. The first strategy takes the original matrices M and N and pads them with zero entries to construct rectangular matrices whose sides are multiples of TILE_WIDTH (I took TILE_WIDTH=16). It then runs a kernel where matrix multiplication uses shared memory. The second strategy does not apply any preprocessing to the initial matrices; instead, it includes conditional statements to deal with matrices of arbitrary dimensions. This second strategy also uses shared memory. Finally, the third strategy does not use shared memory at all.
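A kernel along the lines of the second strategy might look like the following sketch (the parameter names heightM, widthM, and widthN are my own, and row-major storage is assumed; the original implementation may differ in detail):

```
#define TILE_WIDTH 16

// Tiled matrix multiplication P = M * N using shared memory.
// M is heightM x widthM, N is widthM x widthN, P is heightM x widthN.
__global__ void MatMulTiled(const float *M, const float *N, float *P,
                            int heightM, int widthM, int widthN)
{
    __shared__ float Ms[TILE_WIDTH][TILE_WIDTH];
    __shared__ float Ns[TILE_WIDTH][TILE_WIDTH];

    int row = blockIdx.y * TILE_WIDTH + threadIdx.y;
    int col = blockIdx.x * TILE_WIDTH + threadIdx.x;

    float acc = 0.0f;
    int numTiles = (widthM + TILE_WIDTH - 1) / TILE_WIDTH;

    for (int t = 0; t < numTiles; ++t) {
        int mCol = t * TILE_WIDTH + threadIdx.x;
        int nRow = t * TILE_WIDTH + threadIdx.y;

        // Conditional loads guard against out-of-range accesses,
        // so no zero-padding of the input matrices is required.
        Ms[threadIdx.y][threadIdx.x] =
            (row < heightM && mCol < widthM) ? M[row * widthM + mCol] : 0.0f;
        Ns[threadIdx.y][threadIdx.x] =
            (nRow < widthM && col < widthN) ? N[nRow * widthN + col] : 0.0f;
        __syncthreads();

        for (int k = 0; k < TILE_WIDTH; ++k)
            acc += Ms[threadIdx.y][k] * Ns[k][threadIdx.x];
        __syncthreads();
    }

    if (row < heightM && col < widthN)
        P[row * widthN + col] = acc;
}
```

The two __syncthreads() calls are what make the shared-memory tiling correct: the first ensures a tile is fully loaded before any thread reads it, and the second ensures all threads are done reading before the tile is overwritten in the next iteration.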

In the following figure I show the results of my implementation on a GTX 560 Ti:

[Figure: newWidth.png]

Observations:

2) Experiment 2:

In this experiment I compared the performance of my implementation for TILE_WIDTH values in the range [1,32]. The following figure shows the results obtained for the second multiplication strategy (the conditional kernel, which uses shared memory) and matrices M, N of dimensions 1024x1024. The device is again a GTX 560 Ti:

[Figure: newTile.png]

Observations: