dc.date.accessioned | 2019-09-16T13:02:39Z | |
dc.date.available | 2019-09-16T13:02:39Z | |
dc.date.issued | 2016-03-31 | |
dc.identifier.uri | http://sedici.unlp.edu.ar/handle/10915/81217 | |
dc.description.abstract | Handling faults is a growing concern in HPC; greater varieties, higher error rates, larger detection intervals and silent faults are expected in the future. It is projected that, in exascale systems, errors will occur several times a day and will propagate to produce failures ranging from process crashes to corrupted results, including undetected errors in applications that are still running. In this article, we analyze a methodology for transient fault detection (called SMCV) for MPI applications. The methodology is based on software replication, and it assumes that data corruption becomes apparent when replicas produce different messages. SMCV yields reliable executions with correct results or, at least, leads the system to a safe stop. This work presents a complete characterization, formally defining the behavior in the presence of faults and experimentally validating it in order to show its efficacy and viability for detecting transient faults in HPC systems. | en |
dc.format.extent | 77-90 | es |
dc.language | en | es |
dc.publisher | Editorial de la Universidad Nacional de La Plata (EDULP) | es |
dc.subject | transient faults | es |
dc.subject | detection | es |
dc.subject | scientific parallel applications | es |
dc.subject | silent data corruption | es |
dc.subject | HPC | es |
dc.subject | fault injection | es |
dc.title | Characterizing a Detection Strategy for Transient Faults in HPC | en |
dc.type | Book | es |
sedici.identifier.isbn | 978-987-4127-00-6 | es |
sedici.creator.person | Montezanti, Diego Miguel | es |
sedici.creator.person | Rexachs del Rosario, Dolores | es |
sedici.creator.person | Rucci, Enzo | es |
sedici.creator.person | Luque Fadón, Emilio | es |
sedici.creator.person | Naiouf, Marcelo | es |
sedici.creator.person | De Giusti, Armando Eduardo | es |
sedici.subject.materias | Computer Science | es |
sedici.description.fulltext | true | es |
mods.originInfo.place | Red de Universidades con Carreras en Informática (RedUNCI) | es |
sedici.subtype | Book chapter | es |
sedici.rights.license | Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) | |
sedici.rights.uri | http://creativecommons.org/licenses/by-nc/4.0/ | |
sedici.contributor.compiler | Feierherd, Guillermo Eugenio | es |
sedici.contributor.compiler | Pesado, Patricia Mabel | es |
sedici.contributor.compiler | Russo, Claudia Cecilia | es |
sedici.relation.isRelatedWith | http://sedici.unlp.edu.ar/handle/10915/58554 | es |
sedici.relation.bookTitle | Computer Science & Technology Series. XXI Argentine Congress of Computer Science. Selected papers | es |