User Documentation
  Introduction
  Abstraction
    Thread
    Block
    Warp
    Element
  Implementation
    Library Interface
      Structure
      Usage
      Rationale
      Details
    Mapping onto Specific Hardware Architectures
      CUDA GPUs
      x86 CPUs
      Accelerators
Developer Documentation
  Code Formatting
  Publishing Doxygen Documentation on gh-pages