@eloj: I've had solutions that resemble Duff's device, but you quickly run into two problems: no relative jumps (only jumps to explicit labels) and code size. Most puzzles have a limit of 50 instructions, some allow 75, and labels count against that :(
@df-1: Most SPU code, if not written straight in C/C++, had a C/C++ reference implementation, which was a lot easier than trying to port assembly. A lot of the super math-intensive stuff now lives as compute jobs on the GPU.
cowbs's comments