dc.date.accessioned | 2021-12-06T15:35:36Z | |
dc.date.available | 2021-12-06T15:35:36Z | |
dc.date.issued | 2017 | |
dc.identifier.uri | http://sedici.unlp.edu.ar/handle/10915/129169 | |
dc.description.abstract | Handling faults is a growing concern in HPC; higher error rates, larger detection intervals and silent faults are expected in the future. It is projected that, in exascale systems, errors will occur several times a day, and they will propagate to generate errors that will range from process crashes to corrupted results because of undetected errors. In this article, we propose a methodology that improves system reliability against transient faults, when running parallel message-passing applications. The proposed solution, based on process replication, has the goal of helping programmers and users of parallel scientific applications to achieve reliable executions with correct results. This work presents a characterization of the strategy, defining its behavior in the presence of faults and modeling the temporal costs of employing it. As a result, we show its efficacy and viability to tolerate transient faults in HPC systems. | en |
dc.format.extent | 434-441 | es |
dc.language | en | es |
dc.subject | Soft error detection | es |
dc.subject | Automatic recovery | es |
dc.subject | System-level checkpoint | es |
dc.subject | User-level checkpoint | es |
dc.title | A methodology for soft errors detection and automatic recovery | en |
dc.type | Conference object | es |
sedici.identifier.other | https://doi.org/10.1109/HPCS.2017.71 | es |
sedici.identifier.isbn | 978-1-5386-3250-5 | es |
sedici.creator.person | Montezanti, Diego Miguel | es |
sedici.creator.person | De Giusti, Armando Eduardo | es |
sedici.creator.person | Naiouf, Marcelo | es |
sedici.creator.person | Villamayor, Jorge | es |
sedici.creator.person | Rexachs del Rosario, Dolores | es |
sedici.creator.person | Luque Fadón, Emilio | es |
sedici.subject.materias | Computer Science | es |
sedici.description.fulltext | true | es |
mods.originInfo.place | Instituto de Investigación en Informática | es |
sedici.subtype | Conference object | es |
sedici.rights.license | Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) | |
sedici.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | |
sedici.date.exposure | 2017-07 | |
sedici.relation.event | 2017 International Conference on High Performance Computing & Simulation (HPCS) (Italy, July 17-21, 2017) | es |
sedici.description.peerReview | peer-review | es |
sedici.relation.bookTitle | 2017 International Conference on High Performance Computing & Simulation (HPCS) | es |