CV

Curriculum vitae of an AI inference engine expert specializing in on-device LLM deployment and high-performance computing.

General Information

Full Name: Zhaode Wang
Languages: Chinese, English

Education

  • 2017 - 2020
    M.S. in Computer Architecture
    State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences
  • 2013 - 2017
    B.S. in Computer Science and Technology
    Shandong University

Experience

  • 2020 - Present
    Inference Engine Technology Expert
    Alibaba (Taotian Group)
    • Responsible for the architecture design and optimization of the MNN inference engine (13K+ GitHub stars), including operator performance optimization, static memory allocation, and automatic operator fusion.
    • Led the development of model conversion tools for TorchScript, TensorFlow, and ONNX, with a focus on supporting control flow for advanced speech models (e.g., ASR, NLU).
    • Developed NPU backends (CoreML, NNAPI) to accelerate algorithms like super-resolution and beauty filters in major apps such as Taobao and Taobao Live.
    • Initiated and led the MNN-LLM project (1.5K+ GitHub stars) for on-device deployment of large language models, with deep optimization of Transformer models (quantization, memory, and computation) to support over 100 open-source LLMs.
    • Enhanced the MNN ecosystem by developing CV/audio processing modules, refactoring the Python interfaces, and building CI/CD automation pipelines; related demo projects have earned 2K+ GitHub stars.
  • 2018 - 2020
    Compiler Engineer Intern
    Cambricon Technologies
    • Focused on operator optimization for Cambricon chips, compiler development, and linker construction.

Publications

  • 2024
    MNN-LLM: A Generic Inference Engine for Fast Large Language Model Deployment on Mobile Devices
    • MM Asia 2024
  • 2022
    Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning
    • OSDI 2022

Competitions

  • 2025
    IEEE AICAS 2025 Grand Challenge: LLM Software and Hardware System Co-optimization
    • First Prize
  • 2024
    IEEE AICAS 2024 Grand Challenge: LLM Software and Hardware System Co-optimization
    • First Prize
  • 2024
    Tecorigin Operator Development Competition
    • First Prize
  • 2022
    SOPHGO TPU Program Competition
    • First Prize

Skills

  • C, C++, Python, Assembly
  • Compilers, AI Inference Engines, CPU/GPU/NPU Operator Optimization, Large Language Models