This HTML5 document contains 28 embedded RDF statements represented using HTML+Microdata notation.

The embedded RDF content will be recognized by any HTML5 Microdata processor.
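
As an illustration, such markup can be consumed with any off-the-shelf Microdata extractor. The sketch below uses the Python extruct library; the landing-page URL it fetches (the n19 IRI listed below) is an assumption about where this document is served:

    # A minimal sketch, assuming this document is served at the eprint's
    # landing page (the n19 IRI in the prefix table below). extruct and
    # requests are third-party libraries chosen for illustration.
    import extruct
    import requests

    url = "https://kar.kent.ac.uk/65147/"  # assumed location of this document
    html = requests.get(url).text

    # Extract only HTML5 Microdata, the notation used for the 28 statements.
    data = extruct.extract(html, base_url=url, syntaxes=["microdata"])
    for item in data["microdata"]:
        print(item.get("type"), sorted(item.get("properties", {})))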

Namespace Prefixes

Prefix   IRI
dcterms  http://purl.org/dc/terms/
n2       https://kar.kent.ac.uk/id/eprint/
wdrs     http://www.w3.org/2007/05/powder-s#
n17      http://purl.org/ontology/bibo/status/
dc       http://purl.org/dc/elements/1.1/
rdfs     http://www.w3.org/2000/01/rdf-schema#
n7       https://kar.kent.ac.uk/id/subject/
n4       https://demo.openlinksw.com/about/id/entity/https/raw.githubusercontent.com/annajordanous/CO644Files/main/
n19      https://kar.kent.ac.uk/65147/
n9       http://eprints.org/ontology/
n16      https://kar.kent.ac.uk/id/event/
bibo     http://purl.org/ontology/bibo/
n14      https://kar.kent.ac.uk/id/org/
rdf      http://www.w3.org/1999/02/22-rdf-syntax-ns#
n10      https://kar.kent.ac.uk/id/document/
n15      https://kar.kent.ac.uk/id/
xsdh     http://www.w3.org/2001/XMLSchema#
n12      https://demo.openlinksw.com/about/id/entity/https/www.cs.kent.ac.uk/people/staff/akj22/materials/CO644/
n13      https://kar.kent.ac.uk/id/eprint/65147#
n6       https://kar.kent.ac.uk/id/person/
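
Consumers that want to keep these compact names can re-bind the stable prefixes before serialising. A minimal sketch with Python's rdflib, restating one statement from the listing below (the choice of rdflib, and the name "ep" for the n9 namespace, are assumptions, not part of the export):

    # Bind the listing's stable prefixes in rdflib so serialised output
    # uses the same compact names as this page.
    from rdflib import Graph, Literal, Namespace, URIRef

    DCTERMS = Namespace("http://purl.org/dc/terms/")
    BIBO = Namespace("http://purl.org/ontology/bibo/")
    EP = Namespace("http://eprints.org/ontology/")  # listed as n9 above

    g = Graph()
    g.bind("dcterms", DCTERMS)
    g.bind("bibo", BIBO)
    g.bind("ep", EP)  # "ep" is our name; the export auto-generates "n9"

    # One of the statements from the listing, restated as a triple.
    eprint = URIRef("https://kar.kent.ac.uk/id/eprint/65147")
    g.add((eprint, DCTERMS.title,
           Literal("Improving Language Modelling with Noise Contrastive Estimation")))

    print(g.serialize(format="turtle"))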

Statements

Subject Item
n2:65147
rdf:type
bibo:Article n9:ConferenceItemEPrint n9:EPrint bibo:AcademicArticle
rdfs:seeAlso
n19:
n9:hasAccepted
n10:287753
n9:hasDocument
n10:2990459 n10:2990460 n10:2990461 n10:2990462 n10:287753 n10:287775
dc:hasVersion
n10:287753
dcterms:title
Improving Language Modelling with Noise Contrastive Estimation
wdrs:describedby
n4:export_kar_RDFN3.n3 n12:export_kar_RDFN3.n3
dcterms:date
2018-11-18
dcterms:creator
n6:ext-fl207@kent.ac.uk n6:ext-m.grzes@kent.ac.uk
bibo:status
n17:peerReviewed n17:published
dcterms:publisher
n14:ext-b7d3403b8846bd942cc3a0ce54b9d44d
bibo:abstract
Neural language models do not scale well when the vocabulary is large. Noise contrastive estimation (NCE) is a sampling-based method that allows for fast learning with large vocabularies. Although NCE has shown promising performance in neural machine translation, its full potential has not been demonstrated in the language modelling literature. A sufficient investigation of the hyperparameters in the NCE-based neural language models was clearly missing. In this paper, we showed that NCE can be a very successful approach in neural language modelling when the hyperparameters of a neural network are tuned appropriately. We introduced the 'search-then-converge' learning rate schedule for NCE and designed a heuristic that specifies how to use this schedule. The impact of the other important hyperparameters, such as the dropout rate and the weight initialisation range, was also demonstrated. Using a popular benchmark, we showed that appropriate tuning of NCE in neural language models outperforms the state-of-the-art single-model methods based on the standard LSTM recurrent neural networks.
dcterms:isPartOf
n15:repository
dcterms:subject
n7:QA76.87 n7:QA276
bibo:authorList
n13:authors
bibo:presentedAt
n16:ext-a338877a1fb3689eefd168647adc2eba
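
Loaded into a triple store, these statements can be queried directly. A sketch using rdflib and SPARQL; it assumes the N3 export referenced under wdrs:describedby is still reachable at the raw GitHub URL wrapped inside the n4 prefix:

    # A sketch of querying the statements above. The N3 file fetched here
    # is the URL embedded in the n4 prefix; its continued availability is
    # an assumption.
    from rdflib import Graph

    g = Graph()
    g.parse(
        "https://raw.githubusercontent.com/annajordanous/CO644Files/main/"
        "export_kar_RDFN3.n3",
        format="n3",
    )

    # n2:65147 expands to the full eprint IRI used as the subject here.
    q = """
        PREFIX dcterms: <http://purl.org/dc/terms/>
        PREFIX bibo:    <http://purl.org/ontology/bibo/>
        SELECT ?title ?status WHERE {
            <https://kar.kent.ac.uk/id/eprint/65147> dcterms:title ?title ;
                                                     bibo:status   ?status .
        }
    """
    for row in g.query(q):
        print(row.title, row.status)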