

dc.date.accessioned 2023-04-19T14:57:38Z
dc.date.available 2023-04-19T14:57:38Z
dc.date.issued 2022
dc.identifier.uri http://sedici.unlp.edu.ar/handle/10915/151735
dc.description.abstract In recent years, transformer architectures have been growing in popularity. Modulated Detection Transformer (MDETR) is an end-to-end multi-modal understanding model that performs tasks such as phrase grounding, referring expression comprehension, referring expression segmentation, and visual question answering. One remarkable aspect of the model is its capacity to perform inference on classes that it was not previously trained on. In this work we explore the use of MDETR in a new task, action detection, without any previous training. We obtain quantitative results using the Atomic Visual Actions dataset. Although the model does not achieve the best performance in the task, we believe that it is an interesting finding. We show that it is possible to use a multi-modal model to tackle a task that it was not designed for. Finally, we believe that this line of research may lead to the generalization of MDETR to additional downstream tasks. en
dc.format.extent 6-10 es
dc.language en es
dc.subject Multi-modal transformers es
dc.subject Action detection es
dc.subject Model generalization es
dc.title Exploring modulated detection transformer as a tool for action recognition in videos en
dc.type Conference object es
sedici.identifier.uri https://publicaciones.sadio.org.ar/index.php/JAIIO/article/download/388/326 es
sedici.identifier.issn 2451-7496 es
sedici.creator.person Crisol, Tomás es
sedici.creator.person Ermantraut, Joel es
sedici.creator.person Rostagno, Adrián es
sedici.creator.person Aggio, Santiago L. es
sedici.creator.person Iparraguirre, Javier es
sedici.subject.materias Computer Science es
sedici.description.fulltext true es
mods.originInfo.place Sociedad Argentina de Informática e Investigación Operativa es
sedici.subtype Conference object es
sedici.rights.license Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
sedici.rights.uri http://creativecommons.org/licenses/by-nc-sa/4.0/
sedici.date.exposure 2022-10
sedici.relation.event Simposio Argentino de Imágenes y Visión (SAIV 2022) - JAIIO 51 (virtual and in-person (UAI), October 2022) es
sedici.description.peerReview peer-review es



Except where otherwise noted, this item's license is described as Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)