This HTML5 document contains 32 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any processor of HTML5 Microdata.

Namespace Prefixes

Prefix    IRI
dcterms   http://purl.org/dc/terms/
n2        https://kar.kent.ac.uk/id/eprint/
n4        doi:10.1017/
wdrs      http://www.w3.org/2007/05/powder-s#
dc        http://purl.org/dc/elements/1.1/
n17       http://purl.org/ontology/bibo/status/
rdfs      http://www.w3.org/2000/01/rdf-schema#
n14       https://kar.kent.ac.uk/id/subject/
n21       https://demo.openlinksw.com/about/id/entity/https/raw.githubusercontent.com/annajordanous/CO644Files/main/
n5        http://eprints.org/ontology/
n16       https://kar.kent.ac.uk/87091/
bibo      http://purl.org/ontology/bibo/
n19       https://kar.kent.ac.uk/id/publication/
n18       https://kar.kent.ac.uk/id/org/
rdf       http://www.w3.org/1999/02/22-rdf-syntax-ns#
owl       http://www.w3.org/2002/07/owl#
n6        https://kar.kent.ac.uk/id/document/
n8        https://kar.kent.ac.uk/id/
xsdh      http://www.w3.org/2001/XMLSchema#
n11       https://demo.openlinksw.com/about/id/entity/https/www.cs.kent.ac.uk/people/staff/akj22/materials/CO644/
n13       https://kar.kent.ac.uk/id/eprint/87091#
n9        https://kar.kent.ac.uk/id/person/
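
Each prefixed name in the statements that follow is shorthand for a full IRI: the prefix is replaced by its namespace IRI from the table above and the local part is appended. A minimal sketch of that expansion, assuming plain string concatenation and using only a subset of the prefixes (the `expand` helper is hypothetical and not part of this page):

```python
# Subset of the prefix table above (prefix -> namespace IRI).
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "bibo": "http://purl.org/ontology/bibo/",
    "n2": "https://kar.kent.ac.uk/id/eprint/",
    "n4": "doi:10.1017/",
    "n16": "https://kar.kent.ac.uk/87091/",
}


def expand(name: str) -> str:
    """Expand 'prefix:local' into a full IRI (hypothetical helper)."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {name!r}")
    # An empty local part (e.g. 'n16:') yields the namespace IRI itself.
    return PREFIXES[prefix] + local


print(expand("n2:87091"))
print(expand("n4:S0269888920000417"))
print(expand("n16:"))
```

With these inputs the subject `n2:87091` expands to `https://kar.kent.ac.uk/id/eprint/87091`, and `n16:` expands to the namespace IRI `https://kar.kent.ac.uk/87091/` itself.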

Statements

Subject Item
n2:87091
rdf:type
    n5:ArticleEPrint n5:EPrint bibo:AcademicArticle bibo:Article
rdfs:seeAlso
    n16:
owl:sameAs
    n4:S0269888920000417
n5:hasAccepted
    n6:3234295
n5:hasDocument
    n6:3234324 n6:3234321 n6:3234322 n6:3234323 n6:3234295 n6:3234303
dc:hasVersion
    n6:3234295
dcterms:title
    A study on the statistical evaluation of classifiers
wdrs:describedby
    n11:export_kar_RDFN3.n3 n21:export_kar_RDFN3.n3
dcterms:date
    2020-11-27
dcterms:creator
    n9:ext-d1e06992e4335d01f7c6e4f5c84d8a54 n9:ext-5d3fb5e28a383444503110ae4fe02ec4 n9:ext-a.a.freitas@kent.ac.uk n9:ext-214944abfbc5dd52b993e5db0cb34544
bibo:status
    n17:peerReviewed n17:published
dcterms:publisher
    n18:ext-7dc6ac206349427818537421ac9815ec
bibo:abstract
    Statistical significance analysis, based on hypothesis tests, is a common approach for comparing classifiers. However, many studies oversimplify this analysis by simply checking the condition p-value < 0.05, ignoring important concepts such as the effect size and the statistical power of the test. This problem is so worrying that the American Statistical Association has taken a strong stand on the subject, noting that although the p-value is a useful statistical measure, it has been abusively used and misinterpreted. This work highlights problems caused by the misuse of hypothesis tests and shows how the effect size and the power of the test can provide important information for better decision-making. To investigate these issues, we perform empirical studies with different classifiers and 50 datasets, using the Student’s t-test and the Wilcoxon test to compare classifiers. The results show that an isolated p-value analysis can lead to wrong conclusions and that the evaluation of the effect size and the power of the test contributes to a more principled decision-making.
dcterms:isPartOf
    n8:repository n19:ext-02698889
dcterms:subject
    n14:Q335
bibo:authorList
    n13:authors
bibo:issue
    e1
bibo:volume
    36
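
The listing above pairs each predicate with one or more object values, and every (subject, predicate, object) combination counts as one RDF statement. A minimal sketch of that reading, assuming the values are copied as plain strings and with the bibo:abstract literal replaced by a placeholder for brevity (the variable names are illustrative only):

```python
SUBJECT = "n2:87091"

# predicate -> object values, copied from the listing above.
# Assumption: the long bibo:abstract literal is abbreviated to a placeholder.
DESCRIPTION = {
    "rdf:type": ["n5:ArticleEPrint", "n5:EPrint",
                 "bibo:AcademicArticle", "bibo:Article"],
    "rdfs:seeAlso": ["n16:"],
    "owl:sameAs": ["n4:S0269888920000417"],
    "n5:hasAccepted": ["n6:3234295"],
    "n5:hasDocument": ["n6:3234324", "n6:3234321", "n6:3234322",
                       "n6:3234323", "n6:3234295", "n6:3234303"],
    "dc:hasVersion": ["n6:3234295"],
    "dcterms:title": ["A study on the statistical evaluation of classifiers"],
    "wdrs:describedby": ["n11:export_kar_RDFN3.n3", "n21:export_kar_RDFN3.n3"],
    "dcterms:date": ["2020-11-27"],
    "dcterms:creator": ["n9:ext-d1e06992e4335d01f7c6e4f5c84d8a54",
                        "n9:ext-5d3fb5e28a383444503110ae4fe02ec4",
                        "n9:ext-a.a.freitas@kent.ac.uk",
                        "n9:ext-214944abfbc5dd52b993e5db0cb34544"],
    "bibo:status": ["n17:peerReviewed", "n17:published"],
    "dcterms:publisher": ["n18:ext-7dc6ac206349427818537421ac9815ec"],
    "bibo:abstract": ["<abstract literal omitted>"],
    "dcterms:isPartOf": ["n8:repository", "n19:ext-02698889"],
    "dcterms:subject": ["n14:Q335"],
    "bibo:authorList": ["n13:authors"],
    "bibo:issue": ["e1"],
    "bibo:volume": ["36"],
}

# One triple per (subject, predicate, object) combination.
triples = [(SUBJECT, pred, obj)
           for pred, objs in DESCRIPTION.items()
           for obj in objs]

print(len(triples))  # 32
```

Counted this way, the 18 predicates yield 32 triples, which matches the statement count given at the top of the page.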