Interweaving Convolutions: An Application to Audio Classification

Abstract

The monumental success of Convolutional Neural Networks (CNNs) in image classification has motivated their application to auditory data. Prior works have demonstrated the performance of Hidden Markov Models (HMMs) and Deep Neural Networks (DNNs) for content-based audio classification. This paper presents a novel concatenating strategy for a CNN-based neural architecture. The proposed methodology was evaluated on the audio classification task using the UrbanSound8K (US8K) dataset as a benchmark. The proposed architecture achieves an average recognition accuracy of 97.55% and an average equal error rate (EER) of 0.14% on the US8K dataset. A small-footprint variant of the architecture is also presented.
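The abstract does not spell out the concatenating strategy itself, so the sketch below is only one hedged illustration of what a concatenation-based CNN block for spectrogram input might look like: parallel convolutional branches whose feature maps are concatenated channel-wise before classification into the ten UrbanSound8K classes. The branch widths, kernel sizes, and the assumed 1 x 128 x 128 log-mel input are illustrative assumptions, not the paper's actual configuration.

```python
# Hypothetical sketch of a concatenation-based CNN for audio classification.
# Layer sizes, kernel sizes, and input shape are assumptions for illustration only.
import torch
import torch.nn as nn


class ConcatConvBlock(nn.Module):
    """Parallel conv branches whose outputs are concatenated along the channel axis."""

    def __init__(self, in_channels: int, branch_channels: int = 32):
        super().__init__()
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_channels, branch_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(branch_channels),
            nn.ReLU(inplace=True),
        )
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_channels, branch_channels, kernel_size=5, padding=2),
            nn.BatchNorm2d(branch_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Concatenate the two branch outputs channel-wise.
        return torch.cat([self.branch3(x), self.branch5(x)], dim=1)


class AudioCNN(nn.Module):
    """Toy classifier over assumed 1 x 128 x 128 log-mel spectrogram patches,
    predicting the 10 UrbanSound8K classes."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            ConcatConvBlock(1, 32),       # -> 64 channels
            nn.MaxPool2d(2),
            ConcatConvBlock(64, 64),      # -> 128 channels
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x).flatten(1)
        return self.classifier(h)


if __name__ == "__main__":
    model = AudioCNN()
    dummy = torch.randn(4, 1, 128, 128)   # batch of 4 spectrogram patches
    print(model(dummy).shape)              # torch.Size([4, 10])
```

Concatenating branches with different kernel sizes is one common way to mix receptive fields in a single block; the paper's actual interweaving scheme and training setup should be taken from the publication itself.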

Publication
24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Harsh Sinha
Graduate Student

My research interests include computer vision, biometrics, domain adaptation, and machine learning.