Text-to-Image Generation using Generative Adversarial Network
Authors: Chitvan Jamdagni, Vasu Sharma, Jatin Goyal, Divyansh, Baddam Nikhil Kumar Reddy and Payal Thakur
Publishing Date: 27-02-2024
ISBN: 978-81-955020-7-3
Abstract
Text-to-image generation with Generative Adversarial Networks (GANs) is a deep learning approach that synthesizes images from natural-language descriptions. It has a significant impact on a wide range of applications, including photo searching, photo editing, art creation, computer-aided design, image reconstruction, captioning, and portrait drawing. The central challenge is consistently producing realistic images under the given conditions, and images produced by current text-to-image algorithms often fail to accurately reflect their text descriptions. The proposed model was trained on the Caltech-UCSD Birds-200-2011 dataset, and its performance was assessed using the Inception Score and PSNR. The proposed StackGAN architecture consists of two stages. Stage-I GAN generates a low-resolution image from the input text description, roughing out the basic shape and colors of the object. Stage-II GAN takes the Stage-I result and the text description as inputs, correcting defects and adding detail to produce a high-resolution, photo-realistic image.
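The two-stage pipeline described in the abstract can be sketched in PyTorch. The layer widths, the 128-dimensional text embedding, and the class names below are illustrative assumptions rather than the chapter's exact architecture; the original StackGAN also uses conditioning augmentation and residual blocks, which are omitted here for brevity.

```python
# Minimal sketch of a StackGAN-style two-stage generator, assuming a
# 128-d text embedding and 100-d noise vector (hypothetical sizes).
import torch
import torch.nn as nn

class StageIGenerator(nn.Module):
    """Roughs out a low-resolution (64x64) image from text embedding + noise."""
    def __init__(self, text_dim=128, noise_dim=100, base_ch=64):
        super().__init__()
        self.base_ch = base_ch
        self.fc = nn.Linear(text_dim + noise_dim, base_ch * 8 * 4 * 4)
        self.net = nn.Sequential(
            nn.BatchNorm2d(base_ch * 8), nn.ReLU(True),
            nn.ConvTranspose2d(base_ch * 8, base_ch * 4, 4, 2, 1),  # 4 -> 8
            nn.BatchNorm2d(base_ch * 4), nn.ReLU(True),
            nn.ConvTranspose2d(base_ch * 4, base_ch * 2, 4, 2, 1),  # 8 -> 16
            nn.BatchNorm2d(base_ch * 2), nn.ReLU(True),
            nn.ConvTranspose2d(base_ch * 2, base_ch, 4, 2, 1),      # 16 -> 32
            nn.BatchNorm2d(base_ch), nn.ReLU(True),
            nn.ConvTranspose2d(base_ch, 3, 4, 2, 1),                # 32 -> 64
            nn.Tanh(),
        )

    def forward(self, text_emb, noise):
        h = self.fc(torch.cat([text_emb, noise], dim=1))
        h = h.view(-1, self.base_ch * 8, 4, 4)
        return self.net(h)

class StageIIGenerator(nn.Module):
    """Refines the Stage-I image to 256x256, reconditioning on the same
    text embedding to correct defects and add fine detail."""
    def __init__(self, text_dim=128, base_ch=64):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(3, base_ch, 3, 1, 1), nn.ReLU(True),
            nn.Conv2d(base_ch, base_ch * 2, 4, 2, 1), nn.ReLU(True),  # 64 -> 32
        )
        self.fuse = nn.Sequential(  # merge image features with the text code
            nn.Conv2d(base_ch * 2 + text_dim, base_ch * 2, 3, 1, 1),
            nn.BatchNorm2d(base_ch * 2), nn.ReLU(True),
        )
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(base_ch * 2, base_ch, 4, 2, 1),  # 32 -> 64
            nn.BatchNorm2d(base_ch), nn.ReLU(True),
            nn.ConvTranspose2d(base_ch, base_ch, 4, 2, 1),      # 64 -> 128
            nn.BatchNorm2d(base_ch), nn.ReLU(True),
            nn.ConvTranspose2d(base_ch, 3, 4, 2, 1),            # 128 -> 256
            nn.Tanh(),
        )

    def forward(self, low_res_img, text_emb):
        feat = self.encode(low_res_img)                           # (B, 2C, 32, 32)
        txt = text_emb[:, :, None, None].expand(-1, -1, 32, 32)   # replicate spatially
        return self.decode(self.fuse(torch.cat([feat, txt], dim=1)))

if __name__ == "__main__":
    g1, g2 = StageIGenerator(), StageIIGenerator()
    text_emb = torch.randn(2, 128)   # stand-in for a learned caption embedding
    noise = torch.randn(2, 100)
    low = g1(text_emb, noise)        # (2, 3, 64, 64): rough shape and color
    high = g2(low, text_emb)         # (2, 3, 256, 256): refined output
    print(low.shape, high.shape)
```

In this sketch, Stage-II re-reads the text by spatially replicating the caption embedding and concatenating it with encoded image features, which mirrors how the second stage uses both the Stage-I result and the textual description as inputs.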
Keywords
GAN, Generative Adversarial Networks, Deep Learning, Text-to-Image Generation (T2I).
Cite as
Chitvan Jamdagni, Vasu Sharma, Jatin Goyal, Divyansh, Baddam Nikhil Kumar Reddy and Payal Thakur, "Text-to-Image Generation using Generative Adversarial Network", In: Ashish Kumar Tripathi and Vivek Shrivastava (eds), Advancements in Communication and Systems, SCRS, India, 2024, pp. 209-217. https://doi.org/10.56155/978-81-955020-7-3-19