AI Model Compiler Engineer

PebbleSquare (페블스퀘어)
Job Type
Full Time
Job Category
Etc
Work Model
On-site
Working Days
Mon, Tue, Wed, Thu, Fri
Work Time
Salary
Per company policy
Location
331 Pangyo-ro, Bundang-gu, Seongnam-si, Gyeonggi-do, Republic of Korea
Job Description
- Model Compilation Pipeline: Design and implement compilers that translate AI models (ONNX, TensorFlow, PyTorch, etc.) into executable formats for AI accelerators and edge devices.
- Graph Optimization: Apply operator fusion, pruning, quantization, and memory optimizations to improve model performance.
- Hardware Acceleration: Optimize AI model execution on CPUs, GPUs, DSPs, TPUs, or custom AI chips (e.g., NPUs, FPGAs).
- Intermediate Representations (IRs): Work with MLIR, TVM, XLA, Glow, or custom IRs for model transformation.
- Performance Tuning: Profile and analyze models using LLVM, Halide, CUDA, OpenCL, or Metal.
- Kernel Optimization: Develop low-level math libraries (SIMD, vectorized ops, matrix multiplications, tensor ops) for efficient AI inference.
- Custom Operator Support: Implement new AI operators and optimize their execution on target hardware.
- Cross-Platform Deployment: Enable model portability across multiple architectures and backends.
- AI/ML Framework Integration: Extend compiler functionality for PyTorch, TensorFlow, ONNX Runtime, and other ML frameworks.
- Debugging & Benchmarking
Qualifications
- Education: Bachelor's, Master's, or Ph.D. in Computer Science, Electrical Engineering, or a related field.
- Experience: 2+ years in model compilation, AI frameworks, or deep learning accelerators.
- Programming Languages: C, C++, Python, and LLVM IR or MLIR.
- Compiler Development: Experience with LLVM, TVM, XLA, Halide, Glow, or custom ML compilers.
- Graph Transformations: Knowledge of operator fusion, loop unrolling, constant folding, quantization, and tiling techniques.
- Hardware Optimization: Experience with SIMD, CUDA, OpenCL, ROCm, or low-level tensor operations.
- AI Frameworks: Hands-on experience with TensorFlow, PyTorch, ONNX, TensorRT, TFLite, or OpenVINO.
- Parallel Computing: Experience with multi-threading, vectorization (SSE/AVX), and heterogeneous computing.
Preferred
- Neural Network Compression: Quantization-aware training (QAT), weight pruning, distillation.
- Cloud & Edge AI: Deploying models on AWS Inferentia, NVIDIA Jetson, Intel Movidius, or Qualcomm AI chips.
- Formal Methods & Verification: Model validation, correctness proofs, and fuzz testing for compiler robustness.
Etc
- All recruitment processes are ongoing and may close early once positions are filled.
- Applicants wishing to apply for multiple positions can discuss their options during the interview process. To ensure a smooth recruitment process, please submit your application only once.
- International applicants must indicate their Korean language proficiency level and visa type.
- Recruitment steps may be added or omitted depending on circumstances.
- All interviews will be conducted one-on-one or in groups and will last approximately 30 minutes to 1 hour.
- If any false information is discovered in the submitted materials or at any point during the recruitment process, your application will be cancelled.
Preferred Visas
Employment Visa (E1~E7)
Residence (F2)
Overseas Korean (F4)
Permanent Residence (F5)
International Marriage (F6)
PebbleSquare (페블스퀘어)
Industry
C. Manufacturing
info@pebble-square.com
Company Location
Suite 402, 331 Pangyo-ro, Bundang-gu, Seongnam-si, Gyeonggi-do, Republic of Korea
This job posting must not be copied, distributed, or modified without permission from 코워크위더스(주). Any unauthorized use, including for non-recruitment purposes, is strictly prohibited.