South-Asian-Sounds_Audio-Classification

This repository contains the code for the research paper "South Asian Sounds: Audio Classification," published at the 4th International Conference on Computer, Communication, Control & Information Technology (C3IT) in 2024.

The project focuses on classifying diverse South Asian urban and cultural sounds by combining Mel-Frequency Cepstral Coefficients (MFCCs) with a custom 1D Convolutional Neural Network (1D-CNN).

🏗️ Model Architecture

The architecture of the proposed SAS-CNN model used in this study is shown in model_architecture.png.

✨ Key Highlights

- Audio recordings are preprocessed and segmented into standardized 4-second clips.
- MFCC features are extracted to capture the spectral characteristics of each sound.
- Classification uses a custom 1D-CNN (SAS-CNN), which is compared against baseline models such as ResNet50V2.
- The model is evaluated on two datasets: the newly introduced SAS-KIIT dataset and the UrbanSound8K benchmark.
- Under 10-fold cross-validation, the reported classification accuracy is:
  - 99.78% on the SAS-KIIT dataset.
  - 94.26% on the UrbanSound8K dataset.
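
The first two highlights can be illustrated with a minimal, standard-library-only sketch. It is not the code from the notebooks. The function names are hypothetical, zero-padding the last clip is an assumption (the recordings could equally be trimmed), and the mel filterbank energies are assumed to be computed already, so framing, windowing, the FFT, and the filterbank itself are omitted. In practice a library routine such as `librosa.feature.mfcc` would typically handle the whole chain.

```python
import math

def segment_into_clips(samples, sample_rate, clip_seconds=4.0):
    """Split a mono signal into fixed-length clips.

    Assumption: the final, shorter clip is zero-padded rather than dropped.
    """
    clip_len = int(round(sample_rate * clip_seconds))
    clips = []
    for start in range(0, len(samples), clip_len):
        clip = list(samples[start:start + clip_len])
        clip.extend([0.0] * (clip_len - len(clip)))  # no-op for full clips
        clips.append(clip)
    return clips

def cepstral_coefficients(mel_energies, n_coeffs):
    """Last MFCC step for one frame: log mel energies, then an unnormalized DCT-II."""
    log_energies = [math.log(e) for e in mel_energies]
    n_bands = len(log_energies)
    return [
        sum(
            value * math.cos(math.pi * n * (k + 0.5) / n_bands)
            for k, value in enumerate(log_energies)
        )
        for n in range(n_coeffs)
    ]
```

For example, `segment_into_clips(samples, sample_rate=22050)` would yield clips of 88,200 samples each; the 22,050 Hz rate is only an illustrative value, not one stated in this README.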

🏆 Contribution

This research makes the following contributions:

- Proposes a robust and effective pipeline for classifying South Asian urban sounds.
- Introduces the SAS-KIIT dataset, a new public resource for the research community.
- Demonstrates that combining MFCC features with a 1D-CNN is a highly effective approach for multi-class sound recognition.
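
To give a feel for what a 1D convolutional block does with an MFCC vector, here is a rough, framework-free sketch of one convolution, a ReLU, and a max-pooling step. It is not the SAS-CNN architecture: the layer count, filter sizes, and pooling of the actual model are described in the paper and are not reproduced here. The input values and the kernel are made up, and in a real model the kernel weights would be learned.

```python
def conv1d_valid(signal, kernel, bias=0.0):
    """Slide `kernel` over `signal` (cross-correlation, no padding, stride 1)."""
    width = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(width)) + bias
        for i in range(len(signal) - width + 1)
    ]

def relu(values):
    """Zero out negative activations."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, pool_size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(values[i:i + pool_size])
        for i in range(0, len(values) - pool_size + 1, pool_size)
    ]

# Toy stand-in for one MFCC vector (values are made up).
mfcc_vector = [1.0, 2.0, 0.0, -1.0, 3.0, 1.0]
# Hypothetical hand-picked filter; a trained model learns these weights.
difference_kernel = [1.0, -1.0]

features = max_pool1d(relu(conv1d_valid(mfcc_vector, difference_kernel)))
print(features)  # [2.0, 1.0]
```

A full classifier would stack several such blocks with many filters each and end in a dense softmax layer over the sound classes.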

🔗 Links

Published Paper (IEEE Xplore): https://ieeexplore.ieee.org/document/10829485
SAS-KIIT Dataset: https://sas-kiit.netlify.app/

💻 Code Overview

This repository contains the Python scripts for the complete pipeline:

- Audio preprocessing and feature extraction (MFCCs).
- Implementation of the 1D-CNN (SAS-CNN) model.
- Scripts for model training, evaluation, and inference.
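
The 10-fold evaluation loop can be sketched as follows, using only the standard library. This is an illustration under assumptions, not the notebooks' code: `stratified_k_folds` and `cross_validate` are hypothetical names, stratifying by class label is an assumption, and `train_and_score` is a placeholder callback the caller would supply to train a model and return its test accuracy. UrbanSound8K also ships with ten predefined folds, which can be used in place of a random split.

```python
import random
from collections import defaultdict

def stratified_k_folds(labels, k=10, seed=0):
    """Return k lists of sample indices with roughly equal class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

def cross_validate(labels, train_and_score, k=10, seed=0):
    """Average the score over k held-out folds.

    `train_and_score(train_idx, test_idx)` is supplied by the caller and
    must return an accuracy in [0, 1].
    """
    folds = stratified_k_folds(labels, k, seed)
    scores = []
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / k
```

Each sample is held out exactly once, so the averaged score reflects performance on every clip in the dataset.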

📄 Citation

If you find this work useful in your research, please cite the paper.

BibTeX

@INPROCEEDINGS{10829485,
  author={Chatterjee, Rajdeep and Bishwas, Pappu and Chakrabarty, Sudip and Bandyopadhyay, Tathagata},
  booktitle={2024 4th International Conference on Computer, Communication, Control \& Information Technology (C3IT)},
  title={South Asian Sounds: Audio Classification}, 
  year={2024},
  volume={},
  number={},
  pages={1-6},
  keywords={Translation; Pollution; Smart cities; Biological system modeling; Surveillance; Noise; Urban planning; Feature extraction; Rail transportation; Mel frequency cepstral coefficient; Audio classification; CNN; MFCC; Sound recognition},
  doi={10.1109/C3IT60531.2024.10829485}
}

