Speech Recognition System


International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056 | p-ISSN: 2395-0072

Volume: 04 Issue: 04 | Apr-2017

www.irjet.net

Pragati Gupta¹, Shubham Kumar¹, Ravi Prakash¹

¹ B.Tech students, Department of Computer Science, IMS Engineering College, Ghaziabad-201009, Uttar Pradesh

Abstract - This paper demonstrates how audio signals can be converted into text and used to perform tasks. Speech recognition is one of the fastest-growing technologies today. In this paper, we aim to develop a speech recognition system as an assistive tool for differently abled people. The system converts speech into English text; the conversion is performed by the speech recognizer. It can be used in many settings. Around 20% of people are affected by disabilities. The system could be very helpful for people who are blind, who cannot use their hands effectively, or who are illiterate. It will also be helpful in enterprises where most of the work is typing. The system recognizes audio signals, converts them into text, and can then perform operations such as opening the calculator or Google Chrome; it also lets a user save, open, or exit a file by voice input. At this initial stage the effort is to support the basic operations described above; the software can be updated and enhanced further to perform more operations. This paper presents a method for designing a speech-to-text system that then performs a task accordingly, using the .NET Framework and Visual Studio.

Key Words: Speech, text, recognition, desktop application, .NET Framework

1. INTRODUCTION

Speech is the primary means of communication between people and the most common way humans exchange thoughts. Clarity of speech and accent are important for conveying a message correctly in real conversation. Speech can be processed in two forms: speech synthesis and speech recognition. Speech can be produced artificially; the artificial production of speech is called speech synthesis. The dictionary defines 'synthesis' as the combining of the constituent elements of separate material or abstract entities into a single or unified entity (as opposed to analysis, the separating of any material or abstract entity into its constituent elements). The recognition of speech signals by a system, that is, the process of mapping an acoustic speech signal to text, is called speech recognition. The computer receives the user's speech and interprets what is said. This allows the user to control the computer (or some aspects of it) by voice rather than with the mouse and keyboard, or alternatively just to dictate the contents of a document.

Speech recognition can take one of two approaches. The first is Command and Control (abbreviated CnC, or simply SR), in which the application can understand and follow simple commands that it has been taught in advance. The second is Dictation (abbreviated DSR), in which the engine has to identify arbitrary spoken words. Dictation is therefore more complex, and the engine must also decide which spelling of similar-sounding words is required. Context information derived from the preceding and following words helps it decide. CnC is sometimes referred to as context-free recognition because this context analysis is not required.

Dictation speech recognition is speaker-dependent: enunciation, accent, pitch, and many other factors vary from person to person. For decent results, the recognizer requires a speaker profile to be set up. Command and control speech recognition, on the other hand, is usually speaker-independent.

This paper describes a speech recognition system that is specifically a CnC (Command and Control) application. The system is developed on the .NET Framework using Microsoft Visual Studio 2015. Microsoft provides an API that allows developers to use speech recognition and speech synthesis engines in Windows applications. Speech-to-text conversion is done by the speech recognition engine, while the speech synthesis engine provides text-to-speech conversion. SAPI (the Speech API) can be seen as an interface between the application and the SR/TTS engines.
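The command-and-control idea described above, a fixed set of phrases known in advance with no context analysis, can be sketched as a simple lookup table. The Python sketch below is illustrative only: it is not the authors' .NET implementation, it assumes the recognizer has already produced a text string, and the names COMMANDS, normalize, dispatch, and the action labels are hypothetical.

```python
# Hypothetical command table: the phrases the application has been
# taught in advance, mapped to illustrative action labels.
COMMANDS = {
    "open calculator": "launch_calculator",
    "open google chrome": "launch_chrome",
    "save": "save_file",
    "open": "open_file",
    "exit": "exit_file",
}


def normalize(utterance: str) -> str:
    """Lowercase and collapse whitespace so matching ignores case and spacing."""
    return " ".join(utterance.lower().split())


def dispatch(utterance: str):
    """Return the action label for a known command, or None if unrecognized.

    Matching is exact against the fixed phrase list; no surrounding
    context is consulted, which is what makes this "context-free".
    """
    return COMMANDS.get(normalize(utterance))
```

Because only the listed phrases can match, anything outside the table is rejected rather than guessed, which is the main difference from dictation, where arbitrary words must be identified.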

© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1337

