
dc.date.accessioned 2017-05-04T13:27:42Z
dc.date.available 2017-05-04T13:27:42Z
dc.date.issued 2017-04
dc.identifier.uri http://sedici.unlp.edu.ar/handle/10915/59979
dc.description.abstract Featured Articles (FA) are considered the best articles that Wikipedia has to offer, and in recent years researchers have found it interesting to analyze whether and how they can be distinguished from “ordinary” articles. Likewise, identifying which issues must be fixed or improved in ordinary articles to raise their quality is a recent key research trend. Most of the approaches developed to address these information quality problems have been proposed for the English Wikipedia. However, few efforts have been made for the Spanish Wikipedia, despite Spanish being one of the most spoken languages in the world by number of native speakers. In this respect, we present a breakdown of Spanish Wikipedia’s quality flaw structure. In addition, we carry out studies with three different corpora to automatically assess information quality in Spanish Wikipedia, where FA identification is evaluated as a binary classification task. Our evaluation in a unified setting allows the performance achieved by our approach on the Spanish version to be compared with that on the English version. The best results show that FA identification in Spanish can be performed with an F1 score of 0.88 using a document model consisting of only twenty-six features and a Support Vector Machine as the classification algorithm. en
dc.format.extent 29-36 es
dc.language en es
dc.subject featured article identification en
dc.subject information quality en
dc.subject quality flaws prediction en
dc.subject Wikipedia en
dc.title Towards Information Quality Assurance in Spanish Wikipedia en
dc.type Articulo es
sedici.identifier.uri http://journal.info.unlp.edu.ar/wp-content/uploads/2017/05/JCST-44-Paper-4.pdf es
sedici.identifier.issn 1666-6038 es
sedici.creator.person Ferretti, Edgardo es
sedici.creator.person Soria, Matías es
sedici.creator.person Pérez Casseignau, Sebastián es
sedici.creator.person Pohn, Lian es
sedici.creator.person Urquiza, Guido es
sedici.creator.person Gómez, Sergio Alejandro es
sedici.creator.person Errecalde, Marcelo Luis es
sedici.subject.materias Ciencias Informáticas es
sedici.description.fulltext true es
mods.originInfo.place Facultad de Informática es
sedici.subtype Articulo es
sedici.rights.license Creative Commons Attribution 3.0 Unported (CC BY 3.0)
sedici.rights.uri http://creativecommons.org/licenses/by/3.0/
sedici.description.peerReview peer-review es
sedici.relation.journalTitle Journal of Computer Science & Technology es
sedici.relation.journalVolumeAndIssue vol. 17, no. 1 es


Except where otherwise noted, this item is published under the following license: Creative Commons Attribution 3.0 Unported (CC BY 3.0)