
International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 11 Issue: 01 | Jan 2024

p-ISSN: 2395-0072

www.irjet.net

Comparative Analysis of GANs and VAEs in Generating High-Quality Images: A Case Study on the MNIST Dataset

Srija Chaturvedula¹, Yaswanth Battineedi²

¹Master's in Computer Science, University of Florida, Gainesville, United States
²Master's in Computer Science, University of Florida, Gainesville, United States

---------------------------------------------------------------------***---------------------------------------------------------------------

Abstract - Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) are two of the most widely used methods in generative modeling, a growing area of machine learning research. This paper presents a detailed comparison of GANs and VAEs on the MNIST dataset, examining their advantages, disadvantages, and prospective uses. We implement both methods in Python with TensorFlow and assess them on several measures, including network size, training time, and the quality of the generated data. We also examine the underlying mathematics and relate our findings to the theoretical foundations of GANs and VAEs. The findings indicate that both methods can generate high-quality data, with GANs being particularly good at capturing the high-level features of the input data and VAEs being better suited to modeling the underlying probability distribution.

1. INTRODUCTION

This study focuses on generative models, which are machine learning algorithms that generate new data resembling a given dataset. Generative models have a wide range of applications in areas such as natural language processing, image synthesis, and anomaly detection. The two most commonly used generative models are Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs).

GANs were introduced by Goodfellow et al. in 2014 and involve training two neural networks, a generator and a discriminator, to compete against each other in a minimax game. The generator tries to produce data that can fool the discriminator, while the discriminator tries to distinguish between real and fake data. VAEs were introduced by Kingma and Welling in 2013 and involve training a neural network to learn a probability distribution over the input data by modeling the distribution of a latent variable that captures the underlying structure of the data.

The focus of this study is to compare the performance of GANs and VAEs on the MNIST dataset, a benchmark dataset of handwritten digits that has been widely used in the literature on generative models. The goal is to explore the strengths, limitations, and potential applications of both techniques, and to identify the key factors that influence their performance. The study implements both techniques using TensorFlow and Python, and evaluates their performance based on metrics such as network size, training time, and the quality of the generated data. Additionally, the underlying mathematics are investigated and related to the theoretical foundations of GANs and VAEs.

2. RELATED WORK

Generative models have been the subject of significant research in machine learning in recent years, with Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) being two of the most widely used techniques. Several studies have compared the performance of GANs and VAEs on different datasets and applications, with some reporting better results for GANs (Karras et al., 2019) and others reporting better results for VAEs (Bowman et al., 2019). Some of the most influential papers in this area include Goodfellow et al.’s (2014) introduction of the GAN framework and Kingma and Welling’s (2014) introduction of the VAE framework, both of which have been extensively cited in subsequent work. Salimans et al. (2016) proposed techniques for stabilizing the training of GANs, such as using different learning rates for the generator and discriminator, while Chen et al. (2016) proposed a modification to the GAN framework that allows interpretable representations to be learned.

Impact Factor value: 8.226

Mescheder et al. (2017) proposed a hybrid model that combines the strengths of VAEs and GANs, and Arjovsky et al. (2017) proposed a modification to the GAN framework that uses the Wasserstein distance as the objective function, leading to more stable training. Kumar et al. (2019) proposed a modification to the GAN framework that introduces a bottleneck in the discriminator, leading to improved performance, while Shen et al. (2020) proposed a method for discovering interpretable directions in the latent space of GANs, allowing control over specific attributes of generated images.
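For reference, the objectives behind the frameworks cited above are commonly written as shown below. This is a sketch in the standard notation of the cited papers, which this section does not define itself: D is the discriminator, G the generator, p_data the data distribution, p_z the latent prior, q_φ(z|x) the VAE encoder, p_θ(x|z) the VAE decoder, and f a 1-Lipschitz critic. All of these symbols are assumptions introduced here for illustration.

```latex
% GAN minimax game (Goodfellow et al., 2014)
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

% VAE evidence lower bound, maximized over \theta and \phi (Kingma and Welling, 2014)
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big)

% Wasserstein GAN objective (Arjovsky et al., 2017)
\min_G \max_{\|f\|_L \le 1}
  \mathbb{E}_{x \sim p_{\text{data}}}\big[f(x)\big]
  - \mathbb{E}_{z \sim p_z}\big[f(G(z))\big]
```

The first expression is the minimax game between generator and discriminator; the second is the bound a VAE optimizes, with a reconstruction term and a KL regularizer; the third replaces the log-based GAN objective with a difference of critic scores.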
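To make the quantities behind these frameworks concrete, the following is a minimal pure-Python sketch of sample estimates for the GAN value function, the Wasserstein critic objective, and the closed-form Gaussian KL term used in VAEs. It is not the implementation used in this study, which relies on TensorFlow; the function names `gan_value`, `wasserstein_critic_gap`, and `gaussian_kl` are hypothetical, and the inputs are assumed to be plain lists of network outputs.

```python
import math


def gan_value(d_real, d_fake):
    """Sample estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs in (0, 1) on real samples.
    d_fake: discriminator outputs in (0, 1) on generated samples.
    The discriminator tries to raise this value; the generator tries to lower it.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def wasserstein_critic_gap(c_real, c_fake):
    """Sample estimate of E[f(x)] - E[f(G(z))] for unbounded critic scores."""
    return sum(c_real) / len(c_real) - sum(c_fake) / len(c_fake)


def gaussian_kl(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), the VAE regularization term."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


if __name__ == "__main__":
    # A discriminator that outputs 0.5 everywhere cannot tell real from fake.
    print(gan_value([0.5, 0.5], [0.5, 0.5]))  # approximately -1.386, i.e. -log 4
    # A posterior equal to the standard normal prior incurs zero KL penalty.
    print(gaussian_kl([0.0, 0.0], [0.0, 0.0]))  # 0.0
```

The value of -log 4 at D = 0.5 is the equilibrium value of the minimax game reported by Goodfellow et al. (2014); a discriminator that separates real from generated samples pushes the estimate above it.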

3. METHODOLOGY

3.1 DATASET

We used the 60,000 training images and 10,000 test images of handwritten digits from the MNIST dataset. Each

ISO 9001:2008 Certified Journal

Page 485