<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Nelson-Bighetti | 2025 Conference on Predictive Inference in Sports</title><link>https://predinfsports.netlify.app/author/nelson-bighetti/</link><atom:link href="https://predinfsports.netlify.app/author/nelson-bighetti/index.xml" rel="self" type="application/rss+xml"/><description>Nelson-Bighetti</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Mon, 01 Apr 2024 00:00:00 +0000</lastBuildDate><image><url>https://predinfsports.netlify.app/media/logo_hu18d0b29f94f4a496c38e2169fb1d271f_138914_300x300_fit_lanczos_3.png</url><title>Nelson-Bighetti</title><link>https://predinfsports.netlify.app/author/nelson-bighetti/</link></image><item><title>Sparsity In Deep Neural Nets</title><link>https://predinfsports.netlify.app/event/on-device-llm/</link><pubDate>Mon, 01 Apr 2024 00:00:00 +0000</pubDate><guid>https://predinfsports.netlify.app/event/on-device-llm/</guid><description>&lt;p>Large Language Models (LLMs) have captured the attention of the tech world with their remarkable common-sense reasoning and generalizability. However, their large size and the need to transfer data to servers make them resource-intensive and slow, which is problematic for mobile or wearable devices such as smart glasses and smartwatches. Moreover, on-device computing could address privacy concerns by keeping sensitive data, such as text messages or photos, on the device itself. To tackle these challenges, we’ve developed a more compact language model, ranging from 0.5B to 1.4B parameters. This model is designed to run on-device, providing competitive performance on grounded conversational tasks while managing latency and memory usage effectively.&lt;/p></description></item></channel></rss>