dc.date.accessioned 2019-09-03T17:16:09Z
dc.date.available 2019-09-03T17:16:09Z
dc.date.issued 2019 es
dc.identifier.uri http://sedici.unlp.edu.ar/handle/10915/80384
dc.description.abstract Addressing the huge amount of data continuously generated is an important challenge in the Machine Learning field. The need to adapt the traditional techniques or create new ones is evident. To do so, distributed technologies have to be used to deal with the significant scalability constraints imposed by the Big Data context. In many Big Data applications for classification, some classes are highly underrepresented, leading to what is known as the imbalanced classification problem. In this scenario, learning algorithms are often biased towards the majority classes, treating minority ones as outliers or noise. Consequently, preprocessing techniques to balance the class distribution were developed. This can be achieved by suppressing majority instances (undersampling) or by creating minority examples (oversampling). Regarding the oversampling methods, one of the most widespread is the SMOTE algorithm, which creates artificial examples according to the neighborhood of each minority class instance. In this work, our objective is to analyze the behavior of SMOTE in Big Data as a function of some key aspects such as the oversampling degree, the neighborhood value and, especially, the type of distributed design (local vs. global). en
dc.format.extent 75-85 es
dc.language en es
dc.subject big data es
dc.subject imbalanced classification es
dc.subject preprocessing techniques es
dc.subject SMOTE es
dc.subject scalability es
dc.title An analysis of local and global solutions to address Big Data imbalanced classification: a case study with SMOTE preprocessing en
dc.type Conference object es
sedici.identifier.isbn 978-3-030-27713-0 es
sedici.creator.person Basgall, María José es
sedici.creator.person Hasperué, Waldo es
sedici.creator.person Naiouf, Marcelo es
sedici.creator.person Fernández, Alberto es
sedici.creator.person Herrera, Francisco es
sedici.subject.materias Computer Science es
sedici.description.fulltext true es
mods.originInfo.place Instituto de Investigación en Informática es
sedici.subtype Conference object es
sedici.rights.license Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
sedici.rights.uri http://creativecommons.org/licenses/by-nc-sa/4.0/
sedici.date.exposure 2019-06
sedici.relation.event VII Conference Cloud Computing and Big Data (La Plata, 2019) es
sedici.description.peerReview peer-review es
sedici.relation.isRelatedWith http://doi.org/10.1007/978-3-030-27713-0 es
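
The abstract above describes SMOTE as creating artificial minority examples from the neighborhood of each minority instance, and contrasts local and global distributed designs. The following is a minimal single-machine sketch of that idea in Python with NumPy, not the authors' implementation. The function names `smote_sketch` and `local_smote` are hypothetical, and the reading of "local" as "each data partition is oversampled using only its own neighbors" is an assumption, since the abstract does not define the two designs.

```python
import numpy as np


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    instance and one of its k nearest minority neighbors."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority instances")
    k = min(k, n - 1)

    # Brute-force pairwise Euclidean distances (fine for a small illustration).
    diffs = minority[:, None, :] - minority[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]

    # Pick a base instance and one of its neighbors for each synthetic point.
    base = rng.integers(0, n, size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position on the segment base -> neighbor.
    gaps = rng.random((n_new, 1))
    return minority[base] + gaps * (minority[picked] - minority[base])


def local_smote(minority, n_new, n_partitions, k=5, seed=0):
    """Assumed 'local' design: split the data into partitions and oversample
    each one independently, so neighbors never cross partition boundaries."""
    chunks = np.array_split(np.asarray(minority, dtype=float), n_partitions)
    quotas = [len(q) for q in np.array_split(np.arange(n_new), n_partitions)]
    parts = [
        smote_sketch(chunk, quota, k=k, seed=seed + i)
        for i, (chunk, quota) in enumerate(zip(chunks, quotas))
    ]
    return np.vstack(parts)
```

Under this reading, a "global" design would correspond to calling `smote_sketch` on the whole minority set, so that neighborhoods are computed across all partitions; in a real distributed setting that requires cross-node neighbor search, which this sketch does not attempt.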


Except where otherwise noted, this item is published under the following license: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)