Learn how to effectively integrate large language models (LLMs) with the Intel neural processing unit (NPU), one of the compute engines available in the new Intel AI PC.
Model size and the limited hardware resources of client devices (e.g., disk, RAM, CPU) make it increasingly challenging to deploy LLMs on laptops compared to cloud-based solutions.
The Intel AI PC addresses this challenge by combining a CPU, a GPU, and an NPU in a single device.
This session focuses on the NPU, showcasing how to prototype and deploy LLM applications on it locally.
Key learnings:
- How the NPU architecture works, including its features, advantages, and capabilities for accelerating neural network computations on Intel® Core™ Ultra processors (the backbone of Intel's AI PCs).
- Practical aspects of deploying performant LLM apps on the Intel NPU, from initial setup to optimization and system partitioning, using the OpenVINO™ toolkit and its NPU plugin (see the sketch after this list).
- Large language models: what they are and the advantages and challenges of local inference.
- Fast LLM prototyping on Intel Core Ultra processors using the Intel NPU Acceleration Library (see the chatbot sketch below).
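For a flavor of what targeting the NPU with the OpenVINO runtime looks like, here is a minimal Python sketch. The model path is a placeholder, and it assumes the LLM has already been converted to OpenVINO IR format; the session covers the full workflow.

```python
# Minimal sketch: loading a model and compiling it for the NPU device
# with the OpenVINO runtime. "llm.xml" is a hypothetical path to a
# model already converted to OpenVINO IR format.
import openvino as ov

core = ov.Core()
# The NPU plugin registers itself as the "NPU" device; list the
# devices available on this machine to confirm it is present.
print(core.available_devices)  # e.g. ['CPU', 'GPU', 'NPU']

model = core.read_model("llm.xml")           # hypothetical IR model on disk
compiled = core.compile_model(model, "NPU")  # target the NPU plugin

# A compiled model is callable; inputs must match the model's input
# signature (names, shapes, and dtypes).
# outputs = compiled(example_inputs)
```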
The session includes real-world examples and case studies (such as chatbots and retrieval-augmented generation [RAG]) that show how LLM applications integrate with NPUs and how this pairing can unlock performance and efficiency.
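As a taste of the chatbot use case, the sketch below pairs a small Hugging Face model with the Intel NPU Acceleration Library. The model choice and the int8 quantization dtype are illustrative assumptions, not the session's exact setup.

```python
# Minimal chatbot-style sketch with the Intel NPU Acceleration Library.
# The model name and int8 dtype are illustrative; any Hugging Face
# causal LM that fits in memory can be substituted.
import torch
import intel_npu_acceleration_library
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # hypothetical choice
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# compile() offloads supported layers to the NPU; int8 quantization
# reduces memory footprint at a small accuracy cost.
model = intel_npu_acceleration_library.compile(model, dtype=torch.int8)

prompt = "What is an NPU, and why would I run an LLM on one?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```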
Sign up today.
Skill level: All
Featured software
- OpenVINO™ toolkit
- Intel® NPU Acceleration Library
Intel, the Intel logo, Intel Core, OpenVINO, and the OpenVINO logo are trademarks of Intel Corporation or its subsidiaries.
Alessandro Palla
Machine Learning Engineer, Intel Corporation