Sometimes it is useful to look at the intermediate and assembly code for GPU programs. This can lead to some interesting performance insights, especially for compiler writers. Unfortunately, the AMD APP SDK is somewhat limited on Linux: the AMD APP KernelAnalyzer, which conveniently dumps the AMDIL and Device ISA for an OpenCL kernel, is not available there. However, digging through the AMD APP OpenCL Programming Guide turns up an environment variable that serves the same purpose: GPU_DUMP_DEVICE_KERNEL.
According to the programming guide, this environment variable can take one of three values:
| Value | Effect |
|-------|--------|
| 1 | Save intermediate IL files in local directory. |
| 2 | Disassemble ISA file and save in local directory. |
| 3 | Save both the IL and ISA files in local directory. |
Therefore, if you run your OpenCL program with:
$ GPU_DUMP_DEVICE_KERNEL=3 ./my-program
you will get two files in your local directory: [kernel-name]_[device-name].il and [kernel-name]_[device-name].isa, which contain the AMDIL and the Device ISA disassembly, respectively.
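If you do this often, the run-and-collect step can be wrapped in a small script. The following is only a minimal sketch: the helper name `dump_kernels` and its parameters are made up for illustration, and it assumes the runtime writes the dumps into the directory the program is started from. It sets GPU_DUMP_DEVICE_KERNEL for the child process only, then lists every .il and .isa file found in that directory afterwards, including any left over from earlier runs.

```python
import os
import subprocess
from pathlib import Path


def dump_kernels(command, workdir=".", mode=3):
    """Run `command` with GPU_DUMP_DEVICE_KERNEL set and list the dump files.

    `dump_kernels` is a hypothetical helper, not part of the AMD APP SDK.
    mode: 1 = IL only, 2 = ISA only, 3 = both (values from the table above).
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1 (IL), 2 (ISA), or 3 (both)")

    workdir = Path(workdir)

    # Copy the current environment and add the variable for the child only.
    env = dict(os.environ, GPU_DUMP_DEVICE_KERNEL=str(mode))

    # Assumption: the dumps land in the directory the program runs from.
    subprocess.run(command, cwd=workdir, env=env, check=True)

    # Collect every .il/.isa file now present (stale files are included).
    dumps = [p for p in workdir.iterdir() if p.suffix in (".il", ".isa")]
    return sorted(dumps)
```

For example, `dump_kernels(["./my-program"])` would run the program from the current directory with the variable set to 3 and return the paths of the .il and .isa files found there.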