Towards specialized, efficient LLMs: Data Scaling Laws and Sparse Adapters

Welcome to AI Research Bites. This series of short, informative talks showcases cutting-edge research from the ServiceNow AI Research team. AI Research Bites is open to all, especially those interested in keeping up with the fast-paced AI research community.

This presentation explores two complementary paths to model specialization: choosing the right data sources during pre-training and adapting models efficiently after training. We introduce a scaling-law framework that predicts the utility of each data source across compute budgets, revealing that source rankings often shift at larger scales; these insights can prevent costly misallocation in domain-specific pre-training. We then present sparse adapters as a powerful post-training method, showing that they can outperform LoRA and, through model merging, enable data-private knowledge transfer across tasks. Together, these approaches point toward more efficient and adaptable domain-specific models.
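To make the scaling-law idea concrete, here is a minimal, hypothetical sketch (not the papers' actual methodology): for each candidate data source we fit a simple power law, loss(C) = E + A * C^(-alpha), to losses measured at small compute budgets, then extrapolate to a larger budget to check whether the ranking of sources flips. The function names and toy numbers below are illustrative assumptions.

```python
# Toy sketch: fit a per-source power law and compare source rankings
# across compute budgets. All numbers are fabricated for illustration.
import numpy as np
from scipy.optimize import curve_fit

def power_law(compute, E, A, alpha):
    # Irreducible loss E plus a term that decays with compute.
    return E + A * compute ** (-alpha)

# Hypothetical validation losses for two data sources, measured at
# small compute budgets (arbitrary units, e.g. multiples of 1e17 FLOPs).
budgets = np.array([1.0, 3.0, 10.0, 30.0])
losses = {
    "web_crawl":   np.array([3.10, 2.96, 2.84, 2.76]),
    "domain_docs": np.array([3.25, 3.04, 2.86, 2.73]),
}

fits = {}
for source, y in losses.items():
    params, _ = curve_fit(power_law, budgets, y, p0=(2.0, 1.0, 0.3), maxfev=10000)
    fits[source] = params

# Extrapolate: a source that looks worse at small scale can win at a
# larger budget if its loss decays faster (larger alpha).
for target in (1.0, 100.0):
    ranking = sorted(fits, key=lambda s: power_law(target, *fits[s]))
    print(f"budget {target:g}: best source -> {ranking[0]}")
```

In this toy example "web_crawl" wins at small budgets while "domain_docs" overtakes it at the larger one, which is exactly the kind of ranking shift the framework is designed to anticipate before committing a full pre-training budget.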
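Similarly, a sparse adapter can be pictured as a fixed binary mask over a weight matrix: only the masked entries receive a trainable delta, and adapters trained on different tasks can later be merged by summing their sparse deltas, without exchanging either task's training data. The sketch below is an illustrative assumption, not the paper's implementation; in particular, real methods typically pick the mask by an importance score rather than at random, and may use a more careful merge rule.

```python
# Minimal sketch of a sparse adapter: a frozen base weight plus a
# trainable delta restricted to a fixed binary mask. Illustrative only.
import torch
import torch.nn as nn

class SparseAdapterLinear(nn.Module):
    def __init__(self, base: nn.Linear, density: float = 0.01):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # the base model stays frozen
        # Fixed random mask selecting which weights may change
        # (a stand-in for an importance-based selection).
        self.register_buffer("mask", (torch.rand_like(base.weight) < density).float())
        self.delta = nn.Parameter(torch.zeros_like(base.weight))

    def forward(self, x):
        # Only masked entries of delta contribute to the effective weight.
        w = self.base.weight + self.mask * self.delta
        return nn.functional.linear(x, w, self.base.bias)

def merge_sparse_deltas(layer: nn.Linear, adapters):
    # Each adapter is just a sparse delta over the same base layer, so
    # merging reduces to summing the masked deltas into the base weights.
    merged = sum(a.mask * a.delta for a in adapters)
    layer.weight.data.add_(merged)
    return layer
```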

Papers: https://arxiv.org/abs/2507.07140 and https://www.arxiv.org/pdf/2507.22250
ServiceNow AI Research team: https://www.servicenow.com/research/