Zhaode's blog

On-Device AI & LLM & Open Source Contributor

I am a technology expert and researcher specializing in the high-performance deployment of AI models on edge devices. My passion lies at the intersection of computer architecture, compiler technology, and artificial intelligence, with a core focus on making powerful AI, especially Large Language Models (LLMs), run efficiently on the devices we use every day.

Currently, I am an Inference Engine Technology Expert at Alibaba (Taotian Group), where I am one of the core architects and developers of MNN, a leading deep learning inference engine with over 13,000 stars on GitHub. My work ranges from low-level operator optimization across CPUs, GPUs, and NPUs to advanced model conversion and memory management.

Recognizing the future of generative AI on mobile, I initiated and now lead the MNN-LLM project. This open-source initiative is dedicated to optimizing and deploying LLMs on-device, and I’m proud that it already supports dozens of open-source models and has drawn significant community interest.

My academic background is in computer architecture, with a Master’s degree from the Institute of Computing Technology, Chinese Academy of Sciences. This foundation drives my approach to solving software challenges with a deep understanding of the underlying hardware.

Through this blog, I share my findings and insights on AI deployment, performance optimization, and the latest trends in on-device machine learning. Welcome, and feel free to connect with me via my social links below.

latest posts