Audio identification consist in the ability to pair audio signals of the same perceptual nature. In other words, the aim is to be able to compare an audio signal with a modified versions perceptually equivalent. To accomplish that, an audio fingerprint is extracted from the signals and only the fingerprints are compared to asses the similarity.
Some guarantee have to be given about the equivalence between comparing audio fingerprints and perceptually comparing the signals.
In designing AFPs, a dense representation is more robust than a sparse one. A dense representation also imply more compute cycles and hence a slower processing speed.
To speedup the computing of a very dense audio fingerprint, able to stand stable under noise, re-recording, low-pass filtering, etc., we propose the use of a massive parallel architecture based on the Graphics Processing Unit (GPU) with the CUDA programming kit. We prove experimentally that even with a relatively small GPU and using a single core in the GPU, we are able to obtain a notable speedup per core in a GPU/CPU model. We compared our FFT implementation against state of the art CUFFT obtaining impressive results, hence our FFT implementation can help other areas of application.