5 comments
0 u/Wmb102er 23 Jun 2015 13:21
Interesting, but it can be quite inefficient. Can't you just do Dijkstra on the GPU? I guess it would be hard with plain OpenGL, but OpenCL should do the trick.
1 u/skeeto [OP] 23 Jun 2015 15:03
GPUs get their parallelism via SIMD: a single instruction operating on an array of many elements all at once. If a kernel/shader branches a lot, so that pieces of data must be acted upon individually, the GPU can't parallelize the work effectively. For example, Dijkstra's algorithm is queue-oriented and isn't really amenable to parallel operation, so a different approach must be taken to map the problem onto the target hardware.
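To make the bottleneck concrete, here's a minimal sketch of the serial inner loop, in plain C++ rather than anything from the article. Each iteration has to pop the single closest vertex before the next iteration can start, so there's never a wide array of independent work items for SIMD lanes to process.

```cpp
#include <queue>
#include <utility>
#include <vector>

// Textbook serial Dijkstra over an adjacency list of (neighbor, weight)
// pairs. The priority-queue pop is the serial dependency.
std::vector<int> dijkstra(const std::vector<std::vector<std::pair<int,int>>> &adj,
                          int src)
{
    const int INF = 1 << 30;
    std::vector<int> dist(adj.size(), INF);
    using QItem = std::pair<int,int>;  // (distance, vertex)
    std::priority_queue<QItem, std::vector<QItem>, std::greater<QItem>> pq;
    dist[src] = 0;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top();  // each pop must finish before the
        pq.pop();                // next closest vertex can be chosen
        if (d > dist[u]) continue;    // skip stale queue entries
        for (auto [v, w] : adj[u]) {  // relax edges out of u
            if (d + w < dist[v]) {
                dist[v] = d + w;
                pq.push({dist[v], v});
            }
        }
    }
    return dist;
}
```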
0 u/Wmb102er 23 Jun 2015 15:13
But constructing a queue isn't entirely impossible in a GPU framework; it's just complicated and carries some overhead. I'm just wondering how that would compare to the posted approach. Or maybe the unnecessary work can somehow be culled in the cellular-automata approach.
0 u/Unleeb 24 Jun 2015 09:06
How is this any different from a straightforward breadth-first search? OK, it has been implemented on a GPU, but otherwise?
Disclaimer: It's been ages since I did anything related to "algorithms and data structures". It's also quite possible I'm stupid O_O
1 u/skeeto [OP] 24 Jun 2015 19:54
The difference is that the breadth-first horizon is searched in parallel, exploiting the GPU's architecture, rather than pushed out one square at a time as a CPU normally would.
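One step of that horizon expansion looks roughly like the kernel below. This is a hedged CUDA sketch of the idea, not the posted code, and the names (`dist_in`, `dist_out`, `walls`, `changed`) are mine: every cell simultaneously takes the cheapest of its four neighbors' distances plus one step.

```cuda
// One parallel expansion step over a w*h grid. Assumed setup: dist_in
// and dist_out hold per-cell distances (a large sentinel such as
// 1 << 20 means unvisited, chosen so adding 1 can't overflow), walls
// marks impassable cells, and changed flags whether anything improved.
__global__ void expand_step(const int *dist_in, int *dist_out,
                            const unsigned char *walls,
                            int w, int h, int *changed)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    int i = y * w + x;
    int best = dist_in[i];
    if (!walls[i]) {
        // Each cell independently takes the cheapest neighbor plus one.
        if (x > 0     && dist_in[i - 1] + 1 < best) best = dist_in[i - 1] + 1;
        if (x < w - 1 && dist_in[i + 1] + 1 < best) best = dist_in[i + 1] + 1;
        if (y > 0     && dist_in[i - w] + 1 < best) best = dist_in[i - w] + 1;
        if (y < h - 1 && dist_in[i + w] + 1 < best) best = dist_in[i + w] + 1;
    }
    dist_out[i] = best;
    if (best != dist_in[i])
        *changed = 1;  // benign race: every writer stores the same value
}
```

On the host you'd ping-pong the two buffers and relaunch until `changed` stays zero, so the number of launches scales with the length of the longest path rather than the number of cells.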