StatLLaMA: Multi-stage training for domain-optimized statistical large language models

Jing-Yi Zeng; Guan-Hua Huang

doi:10.52933/jdssv.v6i4.171

Authors

Jing-Yi Zeng, National Yang Ming Chiao Tung University
Guan-Hua Huang, National Yang Ming Chiao Tung University, https://orcid.org/0000-0002-1802-3855

DOI:

https://doi.org/10.52933/jdssv.v6i4.171

Keywords:

Foundation model, supervised fine-tuning, continual pretraining, instruction tuning, reinforcement learning from human feedback.

Abstract

This study investigates how to efficiently build a domain-specialized large language model (LLM) for statistics using the lightweight LLaMA-3.2-3B family as the foundation model (FM). We systematically compare three multi-stage training pipelines—starting from a base FM with no instruction-following capability, a base FM augmented with post-hoc instruction tuning, and an instruction-tuned FM with strong general reasoning abilities—across continual pretraining, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) preference alignment. Results show that pipelines beginning with a base FM fail to develop meaningful statistical reasoning, even after extensive instruction tuning, SFT, or RLHF alignment. In contrast, starting from LLaMA-3.2-3B-Instruct enables effective domain specialization. A comprehensive evaluation of SFT variants reveals clear trade-offs between domain expertise and general reasoning ability. We further demonstrate that direct preference optimization provides stable and effective RLHF preference alignment. The final model, StatLLaMA, achieves strong and balanced performance on benchmarks of mathematical reasoning, common-sense reasoning, and statistical expertise, offering a practical blueprint for developing resource-efficient statistical LLMs. The code is available at https://github.com/HuangDLab/StatLLaMA.
