|I'm just curious, but can the Muon1 app make use of NVIDIA Tesla workstations?|
|The current version of Muon1 only uses the GPU for plotting graphics.|
|Yeah, though I think he's thinking of using CUDA to try to accelerate the functions, Stephen. I assume the performance boost you could get would be pretty massive.|
[Edited by CloverField at 2011-08-13 15:07:23]
|I've tried using the GPU and found you can only get single-precision floating point, which I think isn't really enough for Muon1. Some interfaces say they support double precision but it turns out to be some kind of emulation or kludge and the speed reduces 4x. Very few cards support double precision in hardware. "Tesla" cards might, but the installed base is tiny, since most people don't need it. Until your common mid-range GPU can do double precision calculations it's not worth me supporting it.|
Also don't say "CUDA", say "OpenCL". CUDA is a proprietary NVIDIA interface, whereas OpenCL is an industry standard. (In fact I got single-precision calculations working using just OpenGL and pixel shaders, though that's a bit of a hack.)
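As a rough illustration of why single precision can fall short for long tracking runs, here is a minimal sketch in Python. It is not Muon1 code: the step size, step count, and the `f32` and `track` helpers are assumptions made up for this example. It rounds every intermediate result to IEEE 754 single precision (via the standard `struct` module) and compares the accumulated position against the same loop in double precision.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def track(steps, dt, velocity, single_precision):
    """Advance a 1-D position by velocity*dt per step.

    Hypothetical toy tracker: when single_precision is True, every
    intermediate value is rounded to float32, mimicking a GPU that
    only offers single-precision arithmetic.
    """
    rnd = f32 if single_precision else (lambda v: v)
    dt = rnd(dt)
    velocity = rnd(velocity)
    x = 0.0
    for _ in range(steps):
        x = rnd(x + rnd(velocity * dt))
    return x

# Assumed numbers, chosen only to make the effect visible.
steps, dt, velocity = 1_000_000, 1e-3, 1.0
exact = steps * dt * velocity

err32 = abs(track(steps, dt, velocity, True) - exact)
err64 = abs(track(steps, dt, velocity, False) - exact)

print("single-precision error:", err32)
print("double-precision error:", err64)
```

With these assumed numbers the single-precision run drifts by many orders of magnitude more than the double-precision one, because each small increment is rounded against an ever larger running total.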
|Yeah, I just looked at it. It seems all the GTX series cards get only 1/8 of their total speed with double precision, while the Tesla and Quadro cards get half of their speed in double precision. So I assume the payoff wouldn't be worth the time then.|
|I could live with "half speed" double precision (it is twice as long, after all), but the cards that support it would have to extend outside of the high-end workstation world.|
Once the hardware looks ready, this is going to be worth considering. It would require an almost complete rewrite of Muon1's core logic, though. Each component type would require its own shader to generate electromagnetic fields and then I'd link them into a shader that does particle tracking. The difficult bit is the fact that particles are affected by differing sets of components at any given time, so that tracking shader isn't fixed, or will have to know something about the layout of the accelerator.
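The structure described above (one field routine per component type, linked into a tracker that has to know which components affect a particle at a given moment) can be sketched on the CPU in Python. This is only an illustration under stated assumptions, not Muon1's design: the `Component` class, the `solenoid` and `quadrupole` field functions, the z-range test for "which components are active", the magnetic-only fields, and the explicit Euler step are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class Component:
    """Hypothetical accelerator component with a field active over a z-range."""
    name: str
    z_min: float
    z_max: float
    b_field: Callable[[Vec3], Vec3]  # stands in for a per-component shader

    def contains(self, pos: Vec3) -> bool:
        return self.z_min <= pos[2] <= self.z_max

def solenoid(bz: float) -> Callable[[Vec3], Vec3]:
    """Idealised uniform axial field (assumption for the example)."""
    return lambda pos: (0.0, 0.0, bz)

def quadrupole(gradient: float) -> Callable[[Vec3], Vec3]:
    """Idealised quadrupole field: Bx = g*y, By = g*x."""
    return lambda pos: (gradient * pos[1], gradient * pos[0], 0.0)

def active(components: List[Component], pos: Vec3) -> List[str]:
    """Names of the components that affect a particle at pos."""
    return [c.name for c in components if c.contains(pos)]

def total_field(components: List[Component], pos: Vec3) -> Vec3:
    """Sum the fields of only those components that contain pos."""
    bx = by = bz = 0.0
    for c in components:
        if c.contains(pos):
            fx, fy, fz = c.b_field(pos)
            bx, by, bz = bx + fx, by + fy, bz + fz
    return (bx, by, bz)

def step(components, pos, vel, q_over_m, dt):
    """One explicit Euler step under the magnetic force a = (q/m) v x B."""
    bx, by, bz = total_field(components, pos)
    vx, vy, vz = vel
    ax = q_over_m * (vy * bz - vz * by)
    ay = q_over_m * (vz * bx - vx * bz)
    az = q_over_m * (vx * by - vy * bx)
    new_pos = (pos[0] + vx * dt, pos[1] + vy * dt, pos[2] + vz * dt)
    new_vel = (vx + ax * dt, vy + ay * dt, vz + az * dt)
    return new_pos, new_vel

# Two overlapping components: the active set changes along z.
lattice = [
    Component("solenoid", 0.0, 1.0, solenoid(2.0)),
    Component("quad", 0.5, 1.5, quadrupole(3.0)),
]
for z in (0.25, 0.75, 1.25, 2.0):
    p = (0.1, 0.2, z)
    print(z, active(lattice, p), total_field(lattice, p))
```

The loop at the end shows the difficulty mentioned above: the set of components contributing to the field differs from point to point, so a GPU version could not hard-code a single field expression and would need some representation of the layout.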
|The other option could be to follow the BOINC method and have a separate 'client' that runs almost entirely on the GPU. So the regular Muon1 client runs on the CPU, and a secondary module effectively runs a second instance on the GPU.|