We present an extension to the GF framework for Ontology-Based Data Access that aims to determine the functional dependencies that hold in a spreadsheet. Spreadsheets are restricted to a single table stored as a CSV text file. An initial set of tentative functional dependencies is computed with the TANE data mining algorithm. This set is then presented to the user, who acts as an oracle to revise it. Given a functional dependency, the user can inspect the tuples from the spreadsheet that justify it. The user can then test the validity of the functional dependency with the help of our system, which generates tuples not present in the dataset by combining values already present in the table. The user can add the new records they consider feasible to the table and rerun the miner to see whether the functional dependency still holds. We present a running example along with a downloadable Java-based application, the source code of the miner in the C programming language, and the files used in our experiments to support the reproducibility of our results.
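The revision loop described above can be illustrated with a minimal sketch. It is not the TANE algorithm and not the authors' implementation: it only checks one given functional dependency by brute force and enumerates tuples absent from the table by combining values already in it. The function names (`fd_holds`, `candidate_tuples`), the column names, and the sample rows are all hypothetical and chosen for illustration.

```python
from itertools import product


def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts (one per record); lhs and rhs are lists of
    column names. This is a naive check, not TANE.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first value seen for a key is stored; any later mismatch
        # is a counterexample to the dependency.
        if seen.setdefault(key, val) != val:
            return False
    return True


def candidate_tuples(rows, attributes):
    """Yield records built from values already in the table but not present in it."""
    domains = [sorted({row[a] for row in rows}) for a in attributes]
    existing = {tuple(row[a] for a in attributes) for row in rows}
    for combo in product(*domains):
        if combo not in existing:
            yield dict(zip(attributes, combo))


# Hypothetical table, standing in for a spreadsheet loaded from CSV.
attributes = ["city", "zip", "country"]
rows = [
    {"city": "Lujan", "zip": "6700", "country": "AR"},
    {"city": "La Plata", "zip": "1900", "country": "AR"},
]

holds_before = fd_holds(rows, ["city"], ["zip"])
new_records = list(candidate_tuples(rows, attributes))
# Simulate the user accepting the first generated record as feasible.
holds_after = fd_holds(rows + [new_records[0]], ["city"], ["zip"])
```

In this sketch the dependency `city -> zip` holds on the original two rows, and accepting a generated record that pairs an existing city with a different existing zip code invalidates it. In the workflow described above, the user would only add such a record if it is feasible in the real domain.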
General information
Presentation date: October 2023
Publication date: 2024
Document language: English
Event: XXIX Congreso Argentino de Ciencias de la Computación (CACIC) (Luján, October 9 to 12, 2023)
Institution of origin: Red de Universidades con Carreras en Informática
Except where explicitly stated otherwise, this item is published under the following license: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)