CV
Curriculum vitae of an AI inference engine expert specializing in on-device LLM deployment and high-performance computing.
General Information
- Full Name: Zhaode Wang
- Languages: Chinese, English
Education
- 2017 - 2020: M.S. in Computer Architecture
  State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences
- 2013 - 2017: B.S. in Computer Science and Technology
  Shandong University
Experience
- 2020 - Present: Inference Engine Technology Expert, Alibaba (Taotian Group)
  - Responsible for the architecture design and optimization of the MNN inference engine (13K+ GitHub stars), including operator performance optimization, static memory allocation, and automatic operator fusion.
  - Led the development of model conversion tools for TorchScript, TensorFlow, and ONNX, with a focus on supporting control flow for advanced speech models (e.g., ASR, NLU).
  - Developed NPU backends (CoreML, NNAPI) to accelerate algorithms such as super-resolution and beauty filters in major apps including Taobao and Taobao Live.
  - Initiated and led the MNN-LLM project (1.5K+ GitHub stars) for on-device deployment of large language models, with deep optimization of Transformer models (quantization, memory, computation) to support over 100 open-source LLMs.
  - Enhanced the MNN ecosystem by developing cv/audio processing modules, refactoring the Python interfaces, and building CI/CD automation pipelines; related demo projects have earned 2K+ GitHub stars.
- 2018 - 2020: Compiler Engineer Intern, Cambricon Technologies
  - Focused on operator optimization for Cambricon chips, compiler development, and linker construction.
Publications
- MNN-LLM: A Generic Inference Engine for Fast Large Language Model Deployment on Mobile Devices. MM Asia 2024.
- Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning. OSDI 2022.
Competitions
- 2025: IEEE AICAS 2025 Grand Challenge: LLM Software and Hardware System Co-optimization
  First Prize
- 2024: IEEE AICAS 2024 Grand Challenge: LLM Software and Hardware System Co-optimization
  First Prize
- 2024: Tecorigin Operator Development Competition
  First Prize
- 2022: SOPHGO TPU Program Competition
  First Prize
Skills
- C, C++, Python, Assembly
- Compilers, AI Inference Engines, CPU/GPU/NPU Operator Optimization, Large Language Models