Zhaode's blog
On-Device AI & LLM & Open Source Contributor
I am a technology expert and researcher specializing in the high-performance deployment of AI models on edge devices. My passion lies at the intersection of computer architecture, compiler technology, and artificial intelligence, with a core focus on making powerful AI, especially Large Language Models (LLMs), run efficiently on the devices we use every day.
Currently, I am an Inference Engine Technology Expert at Alibaba (Taotian Group), where I am one of the core architects and developers of MNN, a leading deep learning inference engine with over 13,000 stars on GitHub. My work spans everything from low-level operator optimization across CPUs, GPUs, and NPUs to advanced model conversion and memory management.
Recognizing the future of generative AI on mobile, I initiated and now lead the MNN-LLM project. This open-source initiative is dedicated to optimizing and deploying LLMs on-device, and I’m proud that it has already enabled support for dozens of open-source models and garnered significant community interest.
My academic background is in computer architecture, with a Master’s degree from the Institute of Computing Technology, Chinese Academy of Sciences. This foundation drives my approach to solving software challenges with a deep understanding of the underlying hardware.
Through this blog, I share my findings and insights on AI deployment, performance optimization, and the latest trends in on-device machine learning. Welcome, and feel free to connect with me via my social links below.
latest posts
| Date | Title |
|---|---|
| Sep 10, 2025 | Qwen3-Next: An Architecture Analysis of the Next-Generation MoE Model |
| Sep 02, 2025 | On-Device LLM Hardware Series (1): Memory Bandwidth |
| Aug 18, 2025 | CoreML Pitfalls: Use Conv1D with Caution |
| Aug 08, 2025 | Inside the gpt-oss-20b Architecture: MNN Mobile Performance in Practice |
| Aug 05, 2025 | An Analysis of Hunyuan's On-Device Model |