
dc.date.accessioned 2023-04-19T14:57:38Z
dc.date.available 2023-04-19T14:57:38Z
dc.date.issued 2022
dc.identifier.uri http://sedici.unlp.edu.ar/handle/10915/151735
dc.description.abstract In recent years, transformer architectures have been growing in popularity. Modulated Detection Transformer (MDETR) is an end-to-end multi-modal understanding model that performs tasks such as phrase grounding, referring expression comprehension, referring expression segmentation, and visual question answering. One remarkable aspect of the model is its capacity to infer over classes that it was not previously trained on. In this work we explore the use of MDETR in a new task, action detection, without any previous training. We obtain quantitative results using the Atomic Visual Actions dataset. Although the model does not report the best performance in the task, we believe that it is an interesting finding. We show that it is possible to use a multi-modal model to tackle a task that it was not designed for. Finally, we believe that this line of research may lead to the generalization of MDETR to additional downstream tasks. en
dc.format.extent 6-10 es
dc.language en es
dc.subject Multi-modal transformers es
dc.subject Action detection es
dc.subject Model generalization es
dc.title Exploring modulated detection transformer as a tool for action recognition in videos en
dc.type Conference object es
sedici.identifier.uri https://publicaciones.sadio.org.ar/index.php/JAIIO/article/download/388/326 es
sedici.identifier.issn 2451-7496 es
sedici.creator.person Crisol, Tomás es
sedici.creator.person Ermantraut, Joel es
sedici.creator.person Rostagno, Adrián es
sedici.creator.person Aggio, Santiago L. es
sedici.creator.person Iparraguirre, Javier es
sedici.subject.materias Computer Science es
sedici.description.fulltext true es
mods.originInfo.place Sociedad Argentina de Informática e Investigación Operativa es
sedici.subtype Conference object es
sedici.rights.license Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
sedici.rights.uri http://creativecommons.org/licenses/by-nc-sa/4.0/
sedici.date.exposure 2022-10
sedici.relation.event Simposio Argentino de Imágenes y Visión (SAIV 2022) - JAIIO 51 (virtual and in-person format (UAI), October 2022) es
sedici.description.peerReview peer-review es



Except where explicitly stated otherwise, this item is published under the following license: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)