dc.date.accessioned | 2011-06-29T15:52:51Z | |
dc.date.available | 2011-06-29T03:00:00Z | |
dc.date.issued | 2011 | |
dc.identifier.uri | http://sedici.unlp.edu.ar/handle/10915/5529 | |
dc.description.abstract | Digital repositories acting as resource aggregators typically face challenges that fall roughly into three main categories: extraction, improvement and storage. The first category comprises issues related to handling different resource collection protocols (OAI-PMH, web crawling, web services, etc.) and their representations (XML, HTML, database tuples, unstructured documents, etc.). The second category comprises information improvements based on controlled vocabularies, specific date formats, correction of malformed data, etc. Finally, the third category deals with the destination of downloaded resources: unification into a common database, sorting by certain criteria, etc. This paper proposes an ETL architecture for designing a software application that provides a comprehensive solution to the challenges posed by a digital repository acting as a resource aggregator. Design and implementation aspects considered during the development of this tool are described, with particular focus on architecture highlights. | en |
dc.language | en | es |
dc.subject | repositories | en |
dc.subject | Búsqueda y recuperación de información | es |
dc.subject | aggregation | en |
dc.subject | Aplicaciones de los Sistemas de Información | es |
dc.subject | harvesting | en |
dc.subject | datawarehousing | en |
dc.subject | data integration | en |
dc.title | Extract, transform and load architecture for metadata collection | en |
dc.title.alternative | Arquitectura ETL para la recolección de metadatos | es |
dc.type | Objeto de conferencia | es |
sedici.creator.person | De Giusti, Marisa Raquel | es |
sedici.creator.person | Lira, Ariel Jorge | es |
sedici.creator.person | Oviedo, Néstor Fabián | es |
sedici.subject.materias | Ciencias Informáticas | es |
sedici.subject.materias | Bibliotecología | es |
sedici.description.fulltext | true | es |
mods.originInfo.place | Servicio de Difusión de la Creación Intelectual (SEDICI) | es |
sedici.subtype | Objeto de conferencia | es |
sedici.rights.license | Creative Commons Attribution 3.0 Unported (CC BY 3.0) | |
sedici.rights.uri | http://creativecommons.org/licenses/by/3.0/ | |
sedici.date.exposure | 2011-05-17 | |
sedici.relation.event | VI Simposio Internacional de Bibliotecas Digitales (Brasil, 2011) | es |
sedici.description.peerReview | peer-review | es |
sedici2003.identifier | ARG-UNLP-DIS-0000001240 | es |