I am seeking a seasoned CUDA developer to design a custom CUDA kernel for machine learning, specifically a variant of FlashAttention.
Key Responsibilities:
- Develop a CUDA-based kernel for deep learning.
Ideal candidates should have experience in algorithm development, a deep understanding of Transformers, and previous exposure to CUDA programming. Candidates should be able to demonstrate a track record of success in similar roles.