Zhaode's blog
On-Device AI & LLM & Open Source Contributor
I am a technology expert and researcher specializing in the high-performance deployment of AI models on edge devices. My passion lies at the intersection of computer architecture, compiler technology, and artificial intelligence, with a core focus on making powerful AI, especially Large Language Models (LLMs), run efficiently on the devices we use every day.
Currently, I am an Inference Engine Technology Expert at Alibaba (Taotian Group), where I am one of the core architects and developers of MNN, a leading deep learning inference engine with over 13,000 stars on GitHub. My work spans everything from low-level operator optimization across CPUs, GPUs, and NPUs to advanced model conversion and memory management.
Recognizing the future of generative AI on mobile, I initiated and now lead the MNN-LLM project. This open-source initiative is dedicated to optimizing and deploying LLMs on-device, and I’m proud that it has already enabled support for dozens of open-source models and garnered significant community interest.
My academic background is in computer architecture, with a Master’s degree from the Institute of Computing Technology, Chinese Academy of Sciences. This foundation drives my approach to solving software challenges with a deep understanding of the underlying hardware.
Through this blog, I share my findings and insights on AI deployment, performance optimization, and the latest trends in on-device machine learning. Welcome, and feel free to connect with me via my social links below.
latest posts
| Date | Title |
|---|---|
| Sep 10, 2025 | Qwen3-Next: An Architecture Analysis of the Next-Generation MoE Model |
| Sep 02, 2025 | On-Device LLM Hardware Series (1): Memory Bandwidth |
| Aug 18, 2025 | CoreML Pitfalls: Use Conv1D with Caution |
| Aug 08, 2025 | Inside the gpt-oss-20b Architecture: MNN Mobile Performance in Practice |
| Aug 05, 2025 | An Analysis of Hunyuan's On-Device Model |