In this paper we present a system that generates a video of a person from a single input image, complete with facial animation and speech synthesized from a text message. The purpose of this project is to create a talking head that (a) replicates the facial movements of the person whose image is given and (b) synchronizes those movements with the speech generated from the text the user writes to deliver the message their digital human should convey. The system requires only two inputs, viz., an image of the user and the text message to be delivered. This paper presents a method that uses a single base model to create a personalized video, eliminating the need to retrain the model for every new user. In a nutshell, the system's output is a two-dimensional clone of an individual, created entirely from a single still image, that conveys the given text message to the target audience through replicated facial expressions.
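To make the two-input interface concrete, the sketch below outlines the image-plus-text pipeline described above. It is a minimal illustration only: the function names (synthesize_speech, animate_face, generate_talking_head) and the toy frame count are hypothetical placeholders, not the actual components of our implementation.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical sketch of the two-input talking-head pipeline.
# All component names are illustrative placeholders, not the
# paper's actual implementation.

@dataclass
class Frame:
    """One rendered video frame (placeholder for real image data)."""
    index: int

def synthesize_speech(text: str) -> bytes:
    """Text-to-speech stage: convert the message into an audio waveform.
    Placeholder: a real system would invoke a TTS model here."""
    return text.encode("utf-8")  # stand-in for audio samples

def animate_face(portrait_path: str, audio: bytes, fps: int = 25) -> List[Frame]:
    """Facial-animation stage: drive the single portrait image with lip
    and expression motion synchronized to the audio.
    Placeholder: a real system would run the shared base model here,
    with no per-user training required."""
    n_frames = max(1, len(audio) // 10)  # toy duration estimate
    return [Frame(index=i) for i in range(n_frames)]

def generate_talking_head(portrait_path: str, message: str) -> List[Frame]:
    """End-to-end pipeline: one image plus one text message in,
    animated talking-head frames out."""
    audio = synthesize_speech(message)
    return animate_face(portrait_path, audio)

if __name__ == "__main__":
    frames = generate_talking_head("user_portrait.png", "Hello, world!")
    print(f"Rendered {len(frames)} frames of the talking head.")
```

The key design point the sketch reflects is that the animation stage is identity-agnostic: a single base model consumes any portrait, so adding a new user changes only the inputs, never the model weights.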