Large Language Models for Serverless Function Generation: An Investigation of FaaS
The rapidly expanding use of large language models (LLMs) for code generation presents both promising opportunities and new challenges in cloud computing, particularly regarding runtime performance. Previous research has concentrated primarily on LLMs in non-cloud environments, emphasizing the largest, most resource-intensive models while neglecting mid-range LLMs (1.5–8 billion or 32–70 billion parameters) that can operate on consumer-grade hardware. In this work, we address this gap by investigating both large-scale publicly accessible LLMs and smaller locally hosted LLMs (i.e., those that can be deployed on small clusters or local machines) for serverless function code generation. We examine both the runtime performance of serverless functions and the cloud hosting costs of the code generated by different LLMs. Our study evaluates LLM-generated solutions across three code generation prompt categories: common benchmarks, interview questions, and custom problems that the LLMs are unlikely to have seen before. For each solution, we analyze how prompt characteristics, such as token count and metadata, affect serverless function runtime performance in the cloud.