Role Description

As an AI HPC Engineer at ACL, you will contribute to ACL's mission of delivering advanced parallel computing research by:

Building theoretical models that break down KLA's AI-based image processing algorithms into computing terms such as memory bandwidth, computational FLOPS, etc.
Bridging the gap between achieved performance and the theoretical peak performance of current and next-gen hardware, such as GPUs and AI accelerators, by enhancing the algorithms.
Porting and optimizing algorithms on current and next-gen CPUs, GPUs, and AI accelerators by leveraging constructs in modern high-performance programming languages such as C++14/C++17, and low-level programming models such as SIMD extensions (SSE/AVX), CUDA, OpenVINO, etc.
Exploring paths to achieve price-optimized performance in next-generation devices that implement revolutionary new solutions to accelerate AI algorithms for training and inference.

Expected Background

3-7 years of experience in GPU programming using CUDA is required.
Ph.D. or MS in EE/CS/CSE.
Bachelor's graduates with an exceptional background and prior experience in the HPC field will also be considered.
Strong foundation in computer architecture, with an interest in high-performance parallel processing at the device level (GPUs or CPUs/SIMD).
Strong mental model of computational loads and of how to map different algorithms to parallel architectures.
Proficient in programming in C/C++/Python.
Good understanding of and exposure to the Linux operating system at the user level.
Exposure to multiprocessor and multithreading concepts.
A self-motivated individual with good communication skills.

Hands-on experience with GPU programming using CUDA, OpenCL, or SYCL, and modern CPU programming constructs such as those in C++14/C++17.
Exposure to profiling tools such as Nsight or VTune.
Experience with large-scale distributed HPC systems.
Familiarity with AI frameworks like TensorFlow.
Hands-on work in developing and optimizing computer vision algorithms at scale.