12/03/2008: the latency/ directory has been updated to work with SDK 3.0 and libspe2.

Note: most of the code is outdated (pre-SDK 3.0). The directories I keep updating are libtask/ and tasktest/.
If you find errors in the code, please send email to Guochun Shi.

bandwidth:
    Measures the bandwidth between the PPU and the SPUs. The bandwidth can reach ~24 GB/s.
    For an unknown reason the result is not consistent: the same code on the same machine
    may give 24 GB/s one day and only 15 GB/s another day.

complex_matrix:
    4x4 complex matrix implementation.

complex_matrix3x3:
    Computes a 3x3 complex matrix stored in the form of a 4x4 complex matrix, so bandwidth
    and computation are wasted (9/16 = 56% efficiency).

complex_matrix3x3_native:
    Computes a 3x3 complex matrix in a more efficient way; no bandwidth or computation
    power is wasted.

dmalist:
    DMA list example.

dotproduct:
    Dot product example.

dotproduct_v:
    Another dot product example (see the README in the directory).

eib_bandwidth:
    Measures the Cell bus (EIB) bandwidth.

event:
    Event (interrupt mailbox) sample code.

latency:
    Measures the round-trip latency between the PPU and an SPU, which is about 7~8 us.

libtask:
    A simple library that takes a task_t struct and does the computation on an SPU; see tasktest/.

tasktest:
    Sample code that uses the libtask library.

matrix_mult:
    Runs a big loop of 4x4 float matrix multiplications. The speedup is ~150 with 8 SPUs
    compared with running the code on the PPU only.

specomm:
    Demonstrates communication between SPUs, including DMA, mailbox and signal.

FAQ:

1. What DMA sizes will work?
    The DMA transfer size must be 1, 2, 4, 8 or a multiple of 16 bytes (16, 32, ..., 16n,
    where n is a positive integer).
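
The size rule in FAQ 1 can be checked before issuing a transfer. The following is a minimal sketch in plain C, not code from this repository: the names dma_size_is_valid, dma_size_round_up, dma_size_self_check and DMA_MAX_TRANSFER_SIZE are hypothetical, and the 16 KB upper limit for a single (non-list) transfer is an assumption that is not stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a single (non-list) DMA transfer is limited to 16 KB. */
#define DMA_MAX_TRANSFER_SIZE ((size_t)16384)

/* Returns 1 if size is 1, 2, 4, 8 or a nonzero multiple of 16 bytes
 * that does not exceed the assumed maximum; returns 0 otherwise. */
static int dma_size_is_valid(size_t size)
{
    if (size == 1 || size == 2 || size == 4 || size == 8)
        return 1;
    return size != 0 && size % 16 == 0 && size <= DMA_MAX_TRANSFER_SIZE;
}

/* Rounds size up to the next size allowed by the rule above, e.g. to pad
 * a buffer. Does not enforce the assumed 16 KB limit; returns 0 for 0. */
static size_t dma_size_round_up(size_t size)
{
    if (size == 0)
        return 0;
    if (size <= 8) {
        size_t s = 1;
        while (s < size)        /* next of 1, 2, 4, 8 */
            s <<= 1;
        return s;
    }
    return (size + 15) & ~(size_t)15;   /* next multiple of 16 */
}

/* Small self-check covering the cases listed in the FAQ. */
static void dma_size_self_check(void)
{
    assert(dma_size_is_valid(8));
    assert(dma_size_is_valid(32));
    assert(!dma_size_is_valid(3));
    assert(!dma_size_is_valid(24));
    assert(dma_size_round_up(24) == 32);
}
```

A buffer of 24 bytes, for example, is not a valid transfer size and would have to be padded to 32 bytes (or split into a 16-byte and an 8-byte transfer).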