About: Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification

An Entity of Type : bibo:Thesis, within Data Space : linkeddata.uriburner.com:28898 associated with source document(s)

Attributes	Values
type	http://eprints.org/ontology/EPrint http://eprints.org/ontology/ThesisEPrint Article Thesis
seeAlso	HTML Summary of #90835 Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification
sameAs	Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification
http://www.loc.gov...erms/relators/THS	Palaniappan Ramaswamy
http://eprints.org/ontology/hasDocument	Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification (PDF) Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification (Other) Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification (Other) Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification (Other) Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification (Other) Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification (Other)
dcterms:issuer	University of Kent, School of Computing, University of Kent,
Title	Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification
described by	https://demo.openlinksw.com/about/id/entity/https/raw.githubusercontent.com/annajordanous/CO644Files/main/export_kar_RDFN3.n3
Date	2021-09
Creator	Lam Dang Pham
status	published
abstract	Although research on Acoustic Scene Classification (ASC) is closely related to, and often overshadowed by, popular research areas such as Automatic Speech Recognition (ASR), Speaker Recognition (SR) or Image Processing (IP), this field potentially opens up several distinct and meaningful application areas based on environment context detection. The challenges of ASC mainly come from diverse noise sources and the variety of sounds in real-world environments, which occur as single, continuous or overlapping sounds. In comparison to speech, sound scenes are more challenging, mainly because they are unstructured in form and closely resemble noise in certain contexts. Although a wide range of publications have focused on ASC recently, they take task-specific approaches that either explore certain aspects of an ASC system or are evaluated on limited acoustic scene datasets. Therefore, the aim of this thesis is to contribute to the development of a robust framework to be applied for ASC, evaluated on various recently published datasets, and to achieve competitive performance compared to the state-of-the-art systems. To do this, a baseline model is first introduced. Next, extensive experiments on the baseline are conducted to identify key factors affecting final classification accuracy. From the comprehensive analysis, a robust deep learning framework, namely the Encoder-Decoder structure, is proposed to address three main factors that directly affect an ASC system. These factors comprise low-level input features, high-level feature extraction methodologies, and architectures for final classification. Within the proposed framework, three spectrogram transformations, namely Constant Q Transform (CQT), gammatone filter (Gamma), and log-mel, are used to convert recorded audio signals into spectrogram representations that resemble two-dimensional images. These three spectrograms are referred to as low-level input features.
To extract high-level features from spectrograms, a novel Encoder architecture, based on Convolutional Neural Networks, is proposed. For the Decoder, also referred to as the final classifier, various models such as Random Forest Classifier, Deep Neural Network and Mixture of Experts are evaluated and structured to obtain the best performance. To further improve an ASC system's performance, a two-level hierarchical classification scheme, replacing the Decoder classification described above, is proposed. This scheme transforms an ASC task over all categories into multiple ASC sub-tasks, each spanning fewer categories, in a divide-and-conquer strategy. At the highest level of the proposed scheme, meta-categories of acoustic scene sounds showing similar characteristics are classified. Next, categories within each meta-category are classified at the second level. Furthermore, an analysis of loss functions applied to different classifiers is conducted. This analysis indicates that a combination of entropy loss and triplet loss is useful to enhance performance, especially with tasks that comprise fewer categories. Further exploring ASC in terms of potential application to the health services, this thesis also explores the 2017 International Conference on Biomedical and Health Informatics (ICBHI) benchmark dataset of lung sounds. A deep-learning framework, based on our novel ASC approaches, is proposed to classify anomaly cycles and predict respiratory diseases. The results obtained from these experiments show exceptional performance. This highlights the potential of advanced ASC frameworks for early detection of auditory signals, in this case signs of respiratory diseases, which could be highly useful in future for directing treatment and preventing their spread.
Is Part Of	https://kar.kent.ac.uk/id/repository
Subject	QA 76 Software, computer programming,
list of authors	https://kar.kent.ac.uk/id/eprint/90835#authors
degree	PhD degree
is topic of	https://raw.githubusercontent.com/annajordanous/CO644Files/main/export_kar_RDFN3.n3
is primary topic of	HTML Summary of #90835 Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification
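
The two-level hierarchical scheme described in the abstract (classify a meta-category first, then a category within it) can be illustrated with a minimal sketch. This is not the thesis's implementation: the nearest-centroid classifier stands in for the learned classifiers the abstract mentions, and the category names, meta-category names, feature vectors and function names (`fit_hierarchy`, `predict`) are all hypothetical, chosen only for illustration.

```python
import math

def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def nearest(centroids, x):
    """Return the label whose centroid is closest (Euclidean) to x."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

def fit_hierarchy(samples, meta_of):
    """Build a two-level model from (vector, category) pairs.

    meta_of maps each category to its meta-category (hypothetical grouping).
    Level 1 holds one centroid per meta-category; level 2 holds, for each
    meta-category, one centroid per category inside it.
    """
    by_meta, by_cat = {}, {}
    for x, cat in samples:
        by_meta.setdefault(meta_of[cat], []).append(x)
        by_cat.setdefault(cat, []).append(x)
    level1 = {meta: centroid(vecs) for meta, vecs in by_meta.items()}
    level2 = {}
    for cat, vecs in by_cat.items():
        level2.setdefault(meta_of[cat], {})[cat] = centroid(vecs)
    return level1, level2

def predict(model, x):
    """Pick the meta-category first, then a category within it only."""
    level1, level2 = model
    meta = nearest(level1, x)
    return meta, nearest(level2[meta], x)

# Toy data: 2-D stand-ins for high-level features (assumed, not from the thesis).
META_OF = {"office": "indoor", "cafe": "indoor",
           "park": "outdoor", "street": "outdoor"}
SAMPLES = [
    ([0.0, 0.0], "office"), ([0.2, 0.1], "office"),
    ([1.0, 0.0], "cafe"), ([1.1, 0.2], "cafe"),
    ([5.0, 5.0], "park"), ([5.2, 5.1], "park"),
    ([6.0, 5.0], "street"), ([6.1, 5.2], "street"),
]
model = fit_hierarchy(SAMPLES, META_OF)
print(predict(model, [0.1, 0.0]))
print(predict(model, [5.9, 5.1]))
```

Each second-level decision only compares categories inside the chosen meta-category, which is the divide-and-conquer property the abstract describes: every sub-task spans fewer categories than the original task.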
