To learn or not to learn TRITON, that is the question

Apr 13, 2025 - 06:32

I have been thinking recently:
As a performance ML engineer who is fairly confident writing CUDA kernels, is it worth learning Triton?
If you find yourself in a similar situation, here are my 2 cents:

1) If you don't know CUDA, start with Triton and learn CUDA after that. Triton is very Python-like and abstracts away most of the GPU hardware details that you have to think about when writing CUDA code.

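2) If you know CUDA, it is still worth learning Triton. It's a different programming paradigm, and it would be fun to think about problems from a different perspective. In CUDA, the program is scalar and the threads are blocked: you write code for a single thread, and threads are grouped into blocks. In Triton, the program is blocked and the threads are scalar: you write code that operates on whole blocks of data, and the compiler decides how to map that work onto threads.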
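
To make the "scalar program vs. blocked program" contrast concrete, here is a rough sketch of vector addition written both ways. It is a CPU-only emulation in plain Python, not real CUDA or Triton code. The function names (cuda_style_add, triton_style_add) and the block size of 4 are my own illustrative assumptions, and the loops stand in for what the GPU would run in parallel.

```python
# CPU-only emulation of the two programming models for vector addition.
# Plain Python, NOT real CUDA or Triton; all names here are hypothetical.

def cuda_style_add(x, y, block_dim=4):
    """Scalar program, blocked threads: the kernel body handles ONE element."""
    n = len(x)
    out = [0] * n
    grid_dim = (n + block_dim - 1) // block_dim  # number of thread blocks

    def kernel(block_idx, thread_idx):
        i = block_idx * block_dim + thread_idx  # this thread's global index
        if i < n:  # bounds guard for the last, partially filled block
            out[i] = x[i] + y[i]

    # A GPU would run these threads in parallel; here we simply loop.
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx)
    return out


def triton_style_add(x, y, block_size=4):
    """Blocked program, scalar threads: the kernel body handles a whole BLOCK."""
    n = len(x)
    out = [0] * n
    num_programs = (n + block_size - 1) // block_size

    def kernel(pid):
        # One program instance owns a contiguous block of offsets.
        offsets = [pid * block_size + k for k in range(block_size)]
        mask = [off < n for off in offsets]  # mask replaces the per-thread `if`
        # Masked block load, block-wide add, masked block store.
        xs = [x[off] if m else 0 for off, m in zip(offsets, mask)]
        ys = [y[off] if m else 0 for off, m in zip(offsets, mask)]
        zs = [a + b for a, b in zip(xs, ys)]
        for off, m, z in zip(offsets, mask, zs):
            if m:
                out[off] = z

    for pid in range(num_programs):
        kernel(pid)
    return out
```

Both functions compute the same result. The difference is what the kernel body "sees": a single index in the CUDA-style version, and a whole block of offsets plus a mask in the Triton-style version.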

Conclusion:
I am going for it. Over the next few weeks, I will be implementing some common problems (e.g., matrix multiplication, matrix transpose, convolution) in Triton and timing them against equivalent CUDA implementations to get a sense of the power of Triton!
Stay tuned!