
Study of Performance Improvement Techniques for Code Generation in Large Language Models


International Research Journal of Engineering and Technology (IRJET) | e-ISSN: 2395-0056 | p-ISSN: 2395-0072

Volume: 11 Issue: 10 | Oct 2024 | www.irjet.net

Swapna Shingade1, Pratik Bhilore2, Juie Pachupate2, Sumeet Gaikwad2, Akash Pandit2

1Assistant Professor, Department of Artificial Intelligence and Data Science, PVGCOET, Maharashtra, India.
2Student, PVGCOET, Maharashtra, India

-----------------------------------------------------------***------------------------------------------------------------

Abstract - Advancements in AI and Machine Learning

have led to the creation of Large Language Models (LLMs) capable of generating human-like text and code. Despite this, generating accurate and efficient code remains a challenge. This study surveys performance improvement techniques for LLM code generation, focusing on Fine-Tuning, Prompt Design, and Context Awareness. Fine-tuning enhances LLMs by adapting them to specific coding datasets, improving their adaptability to different coding tasks. Prompt Design involves crafting clear and precise prompts to guide LLMs, enhancing the quality and accuracy of generated code. Context Awareness equips LLMs with the ability to maintain and utilize context, ensuring coherence and consistency in code generation.

Through empirical analysis and case studies, the survey evaluates the impact of these techniques on code generation performance. The findings aim to provide guidelines for optimizing LLMs, contribute to more reliable and efficient code generation, and improve software development processes.

Key Words: Large Language Models (LLMs), Fine-Tuning, Prompt Design, Context Awareness, Code Generation

1. INTRODUCTION

The rapid advancements in Artificial Intelligence and Machine Learning have led to the development of Large Language Models (LLMs) capable of generating human-like text. Among their many applications, code generation has emerged as a particularly promising area, offering the potential to significantly enhance software development processes. However, the performance of these models in generating accurate and efficient code remains a critical challenge. This survey paper aims to explore performance improvement techniques for code generation in LLMs. By examining the latest research and methodologies, the survey seeks to identify strategies that can enhance the accuracy, efficiency, and overall effectiveness of code generation models.

The paper examines three main techniques: Fine-tuning, Prompt Design, and Context Awareness. Fine-tuning involves adapting pre-trained models to specific tasks or domains, thereby improving their performance on those tasks. Prompt Design focuses on crafting effective prompts that guide the model to generate more relevant and accurate code. Context Awareness emphasizes the importance of incorporating both in-file and cross-file contexts to provide the model with a comprehensive understanding of the codebase. By examining the effects of these techniques on the quality of generated code, the study provides a comprehensive overview of the current state of the art, highlighting key techniques and their impact on the performance of LLMs in code generation tasks.

2. INTRODUCTION TO FINE-TUNING

Fine-tuning in large language models (LLMs) is the process of adapting a pre-trained model by optimizing its parameters on a smaller, task-specific dataset. The procedure leverages the pre-existing knowledge encoded in the model to allow specialization in particular tasks, such as text classification, translation, or sentiment analysis. During fine-tuning, the model's weights are adjusted through gradient descent, refining its ability to perform the new task while retaining general language understanding. Fine-tuning typically requires balancing between preserving the model's generalization capabilities and adapting it to the specific characteristics of the new data, ensuring optimal performance on the target task without overfitting.

Retraining an LLM on task-specific data not only increases its ability to encode contextual knowledge but also significantly improves its performance in generating more relevant content. A minimal set of human-labeled seed tasks can be used to generate additional training data, maintaining the quality and relevance of the tasks in the dataset while minimizing manual effort. Over time, various traditional fine-tuning approaches have been utilized, including supervised fine-tuning on labeled datasets, transfer learning where pre-trained models are adapted to new tasks, and domain adaptation techniques aimed at enhancing model performance in specific contexts. More recently, both full fine-tuning and various parameter-efficient fine-tuning methods have been proposed.
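The balance described above — adapting pre-trained weights to a new task by gradient descent while avoiding overfitting — can be illustrated with a minimal sketch. This is not an LLM or any specific framework: it fine-tunes a toy linear model in NumPy, where `W_pre` stands in for pre-trained weights and the regularization term pulls the weights toward their pre-trained values rather than toward zero, mirroring the preservation of general capabilities. All names and values here are illustrative assumptions.

```python
import numpy as np

def fine_tune(W_pretrained, X_task, y_task, lr=0.1, epochs=200, l2=0.01):
    """Adapt pre-trained weights to a small task-specific dataset
    via gradient descent on a mean-squared-error objective."""
    W = W_pretrained.copy()
    for _ in range(epochs):
        preds = X_task @ W
        grad = X_task.T @ (preds - y_task) / len(X_task)
        # Regularize toward the pre-trained weights (not toward zero),
        # so adaptation does not fully discard the original "knowledge".
        grad += l2 * (W - W_pretrained)
        W -= lr * grad
    return W

# Toy setup: "pre-trained" weights are close to, but not exactly,
# the weights the new task requires.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))        # small task-specific dataset
W_true = np.array([1.0, -2.0, 0.5]) # ideal weights for the new task
y = X @ W_true
W_pre = W_true + rng.normal(scale=0.5, size=3)  # stand-in for pre-training
W_ft = fine_tune(W_pre, X, y)
```

After fine-tuning, `W_ft` is substantially closer to the task-optimal weights than the pre-trained starting point, while the regularizer kept the update anchored to `W_pre` — a crude analogue of the trade-off between specialization and generalization discussed above.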

© 2024, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal

