Business Challenge
Situation
A global enterprise set out to evaluate how well endpoint devices can run Generative AI workloads, focusing on advanced use cases such as Retrieval-Augmented Generation (RAG). The objective was to assess whether current hardware, such as Intel NPU-enabled laptops, can support enterprise AI applications.
Challenges
Performance Limitations
Hardware Compatibility
Optimization Needs
Autonomous Operation
Solution
To address these challenges, an LLM-based, RAG-driven chatbot application was developed:
Technical Design
Deployment Testing
Secure Processing
Optimization Techniques
Technology & Tools
Hardware:
Intel Meteor Lake (Intel Core Ultra 7 155U, 32 GB), Alder Lake-U (Intel Core i7, 64 GB), and legacy Intel Core i5 laptops.
Framework:
Intel OpenVINO for model optimization, hosting, and querying.
Model:
Llama 3 with 8B parameters, quantized to INT4 to reduce memory footprint and speed up on-device inference.
Workload:
RAG-driven chatbot that retrieves answers' supporting context from domain-specific datasets.
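
The retrieve-then-generate flow behind a RAG chatbot of this kind can be illustrated with a minimal sketch. It is not the application's actual implementation: the source does not describe the retriever, so a simple term-overlap cosine score stands in for an embedding model, and the function names (tokenize, cosine, retrieve, build_prompt) and the sample DOCS are hypothetical. In a real deployment the assembled prompt would be passed to the quantized model for generation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, top_k=2):
    """Return up to top_k documents ranked by similarity to the query.

    Assumption: term overlap is a stand-in for a real embedding model.
    Documents with zero similarity are dropped.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:top_k] if score > 0]


def build_prompt(query, passages):
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical domain snippets, for illustration only.
DOCS = [
    "VPN access requests are approved by the IT service desk within two business days.",
    "Expense reports must be submitted before the fifth day of each month.",
    "Laptops are refreshed every four years under the hardware lifecycle policy.",
]

if __name__ == "__main__":
    question = "How often are laptops refreshed?"
    passages = retrieve(question, DOCS, top_k=1)
    # The resulting prompt is what would be sent to the local LLM.
    print(build_prompt(question, passages))
```

Keeping retrieval and prompt assembly separate from generation means the same pipeline can be pointed at different devices or model variants when comparing laptops.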