dc.date.accessioned 2018-04-03T16:27:36Z
dc.date.available 2018-04-03T16:27:36Z
dc.date.issued 2017
dc.identifier.uri http://sedici.unlp.edu.ar/handle/10915/65941
dc.description.abstract This work explores the use of word embeddings, also known as word vectors, trained on Spanish corpora, as features for Spanish verb sense disambiguation (VSD). This type of learning technique is named disjoint semisupervised learning [1]: an unsupervised algorithm is first trained on unlabeled data as a separate step, and its results (i.e. the word embeddings) are then fed to a supervised classifier. Throughout this paper we test two hypotheses: (i) representations of training instances based on word embeddings improve the performance of supervised models for VSD, in contrast to more standard feature engineering techniques based on information taken from the training data; (ii) using word embeddings trained on a specific domain, in this case the same domain the labeled data is gathered from, has a positive impact on the model's performance when compared to general-domain word embeddings. The performance of a model is measured not only with standard metrics (e.g. accuracy or precision/recall) but also by the model's tendency to overfit the available data, assessed by analyzing the learning curve. Measuring this overfitting tendency is important because only a small amount of data is available, so we need models that generalize better on the VSD problem. For the task we use SenSem [2], a corpus and lexicon of Spanish and Catalan disambiguated verbs, as our base resource for experimentation. en
dc.format.extent 26-34 es
dc.language en es
dc.subject word embeddings en
dc.subject disjoint semisupervised learning en
dc.subject verb sense disambiguation en
dc.title Disjoint Semi-supervised Spanish Verb Sense Disambiguation Using Word Embeddings en
dc.type Conference object es
sedici.identifier.uri http://www.clei2017-46jaiio.sadio.org.ar/sites/default/files/Mem/ASAI/asai-05.pdf es
sedici.identifier.issn 2451-7585 es
sedici.creator.person Cardellino, Cristian es
sedici.creator.person Alonso i Alemany, Laura es
sedici.subject.materias Computer Science es
sedici.description.fulltext true es
mods.originInfo.place Sociedad Argentina de Informática e Investigación Operativa es
sedici.subtype Conference object es
sedici.rights.license Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
sedici.rights.uri http://creativecommons.org/licenses/by-sa/4.0/
sedici.date.exposure 2017-09
sedici.relation.event XVIII Simposio Argentino de Inteligencia Artificial (ASAI) - JAIIO 46 (Córdoba, 2017). es
sedici.description.peerReview peer-review es