Advancing SAS to R Translation with Generative AI
Conference
Regional Statistics Conference 2026
Format: IPS Abstract - Malta 2026
Keywords: advance, artificial intelligence (ai), code, genai
Session: IPS 1249 - Using Artificial Intelligence for Official Statistics
Wednesday 3 June 11:20 a.m. - 1 p.m. (Europe/Malta)
Abstract
Integrating generative artificial intelligence into statistical practice opens new opportunities to improve how we work with data and migrate code. First developed in 2023, our SAS-to-R code translator has grown into a production-level application in active use within our organization. It combines multiple large language models, a multi-agent system, test-time compute, and advanced prompt engineering to automatically convert SAS code into clean, high-quality R code that preserves the statistical logic and meets internal standards. Ongoing enhancements include running LLMs locally, with careful evaluation of hardware requirements and open-source model options, to reduce costs and enable offline use. The tool has proven highly effective at streamlining legacy code migration while remaining reliable. Statisticians and programmers always perform the final review and validation, keeping human oversight central to the process. The code and implementation details are publicly available on GitHub, and we invite the statistical community to explore, use, and contribute to the project. This work demonstrates how generative AI can be applied practically and responsibly in statistics, balancing innovation with the accuracy, security, and professional standards our field demands.
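
As a rough illustration of the test-time compute idea mentioned in the abstract, the minimal Python sketch below generates several candidate translations and keeps the highest-scoring one for human review. It is not the project's implementation: the translator functions stand in for LLM calls, the scoring heuristic is a toy, and every name in it (`best_of_n`, `toy_translator_a`, `toy_score`, and so on) is hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical, deliberately tiny keyword list used only by the toy scorer.
SAS_KEYWORDS = {"data", "set", "where", "run"}

SAS_EXAMPLE = "data out; set input; where age > 30; run;"


@dataclass
class Candidate:
    r_code: str
    score: float
    needs_human_review: bool = True  # a person always validates the result


def toy_translator_a(sas_code: str) -> str:
    """Stand-in for one LLM call (assumption: returns R code as text)."""
    return "df <- subset(input, age > 30)"


def toy_translator_b(sas_code: str) -> str:
    """Stand-in for a second model or prompt variant."""
    return "df <- input"


def identifiers(code: str) -> set:
    """Lower-cased identifier-like tokens found in a code string."""
    return set(re.findall(r"[A-Za-z_]\w*", code.lower()))


def toy_score(sas_code: str, r_code: str) -> float:
    """Fraction of non-keyword SAS identifiers that reappear in the R code.

    A real system would more plausibly run both programs and compare
    outputs; this heuristic only keeps the sketch self-contained.
    """
    wanted = identifiers(sas_code) - SAS_KEYWORDS
    if not wanted:
        return 0.0
    return len(wanted & identifiers(r_code)) / len(wanted)


def best_of_n(
    sas_code: str,
    translators: List[Callable[[str], str]],
    score: Callable[[str, str], float],
) -> Candidate:
    """Generate one candidate per translator and return the best-scoring one."""
    if not translators:
        raise ValueError("at least one translator is required")
    candidates = []
    for translate in translators:
        r_code = translate(sas_code)
        candidates.append(Candidate(r_code, score(sas_code, r_code)))
    return max(candidates, key=lambda c: c.score)


if __name__ == "__main__":
    best = best_of_n(SAS_EXAMPLE, [toy_translator_a, toy_translator_b], toy_score)
    print(best.r_code, best.needs_human_review)
```

The selected candidate is still flagged for review, mirroring the abstract's point that statisticians and programmers perform the final validation.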