This code performs a 4x4 complex matrix computation; the speedup is around 440.