Skip to main content

Recognition of music genres using deep learning.

Page 1

International Research Journal of Engineering and Technology (IRJET) Volume: 09 Issue: 05 | May 2022

www.irjet.net

e-ISSN: 2395-0056 p-ISSN: 2395-0072

Recognition of music genres using deep learning. Vishal Phulmante1, Ambar Bidkar2, Yashkumar Mundada3, Mrs. Prutha P. Kulkarni4 1,2,3Department

of Electronics & Tele-Communication Engineering, Vishwakarma Institute of Information Technology Pune, Maharashtra, India. 4Asst. Professor, Department of Electronics & Tele-Communication Engineering, Vishwakarma Institute of Information Technology Pune, Maharashtra, India. ---------------------------------------------------------------------***---------------------------------------------------------------------

The authors of the paper “Music genre classification looking for the Perfect Network?”[11] worked on different architectures like CNN, CRNN and LSTM. They got the highest accuracy of 52% with a CNN model using Melspectrograms as an input.

Abstract - The paper encapsulates recognition of music

genres by using convolution neural networks (CNNs). Three different approaches were considered for implementing the solution to the problem. The first approach is to extract Melspectrograms , second one is to extract MFCC plots and the last one is by plotting chroma STFT features of the audio files. The aim of this project work is to test the different audio features which are best suitable for such kinds of tasks.

Athulya K M and Sindhu S in their work titled “In Deep learning-based music genre classification using spectrogram”[12] got an impressive accuracy of around 94% using a CNN model having 5 convolution 2D layers with Mel-spectrogram features.

Key Words: Music Genre Recognition, Audio Processing, Deep Learning, Convolution Neural Networks, Mel-spectrograms, Mel Frequency Cepstral Coefficients, Chroma STFT.

The authors of “Deep attention based music genre classification” proposed the GRU based Bidirectional Recurrent Neural Network architecture[13]. They got an overall accuracy of 92.7% on the GTZAN dataset.

1. INTRODUCTION

3. METHODOLOGY

Music information retrieval (MIR) is a field that incorporates components of machine learning, signal processing and music theory to study the musical content present in the audio sample. MIR allows machine algorithms to smartly analyze and process data present in the given music sample[3]. The music consumption is increasing day by day due to the ever increasing platforms for music and music stores i.e databases and new music creation. Users find it difficult to organize songs which they listen to. Genre, which is determined by various aspects in the music sample such as rhythms, harmonic information, and instruments which are used in that particular music, is a way to differentiate and group songs together[1].

There are mainly four different steps are involved in the proposed work: 1. 2. 3. 4.

Developing a system capable of segregating music genres, indirectly through audio is very challenging. The basic objective is to recognize music genres on the basis of audio provided and perform relatively well. The model can identify music genres with higher accuracy on the unseen data.

Fig-1: Process Workflow

3.1 Data Collection GTZAN genre dataset has been used in this process. The GTZAN dataset is the widely used public dataset for evaluation in deep listening research for music genre recognition (MGR) tasks[4].

2. LITERATURE STUDY In the paper titled “Convolutional Neural Network Achieves Human-level Accuracy in Music Genre Classification''[10] authors used CNN model having two convolution layers with Mel-spectrogram features which in turn gave them an accuracy of around 70%. They split the audio files into smaller chunks of 3sec length.

© 2022, IRJET

|

Impact Factor value: 7.529

Data collection Data preprocessing Feature extraction Genre Classification

The dataset consists of 100 audio tracks for each of 10 genres. The audio files are 30 sec long 22050Hz Mono 16bit in .wav format. The 10 different genres can be seen in table-1.

|

ISO 9001:2008 Certified Journal

|

Page 936


Turn static files into dynamic content formats.

Create a flipbook