Not logged in : Login
(Sponging disallowed)

About: Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification     Goto   Sponge   Distinct   Permalink

An Entity of Type : bibo:Thesis, within Data Space : linkeddata.uriburner.com:28898 associated with source document(s)

AttributesValues
type
seeAlso
sameAs
http://www.loc.gov...erms/relators/THS
http://eprints.org/ontology/hasDocument
dcterms:issuer
Title
  • Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification
described by
Date
  • 2021-09
Creator
status
abstract
  • Although research on Acoustic Scene Classification (ASC) is very close to, or even overshadowed by different popular research areas known as Automatic Speech Recognition (ASR), Speaker Recognition (SR) or Image Processing (IP), this field potentially opens up several distinct and meaningful application areas based on environment context detection. The challenges of ASC mainly come from different noise resources, various sounds in real-world environments, occurring as single sounds, continuous sounds or overlapping sounds. In comparison to speech, sound scenes are more challenging mainly due to their being unstructured in form and closely similar to noise in certain contexts. Although a wide range of publications have focused on ASC recently, they show task-specific ways that either explore certain aspects of an ASC system or are evaluated on limited acoustic scene datasets. Therefore, the aim of this thesis is to contribute to the development of a robust framework to be applied for ASC, evaluated on various recently published datasets, and to achieve competitive performance compared to the state-of-the-art systems. To do this, a baseline model is firstly introduced. Next, extensive experiments on the baseline are conducted to identify key factors affecting final classification accuracy. From the comprehensive analysis, a robust deep learning framework, namely the Encoder-Decoder structure, is proposed to address three main factors that directly affect an ASC system. These factors comprise low-level input features, high-level feature extraction methodologies, and architectures for final classification. Within the proposed framework, three spectrogram transformations, namely Constant Q Transform (CQT), gammatone filter (Gamma), and log-mel, are used to convert recorded audio signals into spectrogram representations that resemble two-dimensional images. These three spectrograms used are referred to as low-level input features. To extract high-level features from spectrograms, a novel Encoder architecture, based on Convolutional Neural Networks, is proposed. In terms of the Decoder, also referred as to the final classifier, various models such as Random Forest Classifier, Deep Neural Network and Mixture of Experts, are evaluated and structured to obtain the best performance. To further improve an ASC system's performance, a scheme of two-level hierarchical classification, replacing the role of Decoder classification recently mentioned, is proposed. This scheme is useful to transform an ASC task over all categories into multiple ASC sub-tasks, each spanning fewer categories, in a divide-and- conquer strategy. At the highest level of the proposed scheme, meta-categories of acoustic scene sounds showing similar characteristics are classified. Next, categories within each meta-category are classified at the second level. Furthermore, an analysis of loss functions applied to different classifiers is conducted. This analysis indicates that a combination of entropy loss and triplet loss is useful to enhance performance, especially with tasks that comprise fewer categories. Further exploring ASC in terms of potential application to the health services, this thesis also explores the 2017 Internal Conference on Biomedical Health Informatics (ICBHI) benchmark dataset of lung sounds. A deep-learning frame- work, based on our novel ASC approaches, is proposed to classify anomaly cycles and predict respiratory diseases. The results obtained from these experiments show exceptional performance. This highlights the potential applications of using advanced ASC frameworks for early detection of auditory signals. In this case, signs of respiratory diseases, which could potentially be highly useful in future in directing treatment and preventing their spread.
Is Part Of
Subject
list of authors
degree
is topic of
is primary topic of
Faceted Search & Find service v1.17_git144 as of Jul 26 2024


Alternative Linked Data Documents: iSPARQL | ODE     Content Formats:   [cxml] [csv]     RDF   [text] [turtle] [ld+json] [rdf+json] [rdf+xml]     ODATA   [atom+xml] [odata+json]     Microdata   [microdata+json] [html]    About   
This material is Open Knowledge   W3C Semantic Web Technology [RDF Data] Valid XHTML + RDFa
OpenLink Virtuoso version 08.03.3331 as of Aug 25 2024, on Linux (x86_64-ubuntu_noble-linux-glibc2.38-64), Single-Server Edition (378 GB total memory, 16 GB memory in use)
Data on this page belongs to its respective rights holders.
Virtuoso Faceted Browser Copyright © 2009-2024 OpenLink Software