Brain-computer interfaces (BCIs) can partially restore communication for severely compromised patients. Although advances in deep learning have significantly improved brain pattern recognition, training these deep architectures requires large amounts of data. In recent years, the inner speech paradigm has attracted considerable attention, as it could potentially allow natural control of different devices. However, at the time of writing, only a small amount of data is available for this paradigm. In this work we show that, through transfer learning and domain adaptation methods, it is possible to make the most of the scarce data, enhancing the training of a deep learning architecture used in brain-computer interfaces.
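
In practice, the transfer step can be as simple as pretraining a network on a larger source dataset (e.g., recordings from other subjects or paradigms) and then fine-tuning only the final layers on the scarce inner speech trials. The following is a minimal PyTorch sketch of that idea, not the exact architecture or procedure used in this work; the network layout, checkpoint name, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EEGClassifier(nn.Module):
    """Hypothetical compact convolutional EEG classifier."""
    def __init__(self, n_channels=128, n_samples=512, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            # Temporal convolution over samples, then spatial convolution over channels.
            nn.Conv2d(1, 8, kernel_size=(1, 64), padding=(0, 32), bias=False),
            nn.BatchNorm2d(8),
            nn.Conv2d(8, 16, kernel_size=(n_channels, 1), bias=False),
            nn.BatchNorm2d(16),
            nn.ELU(),
            nn.AdaptiveAvgPool2d((1, 16)),
            nn.Flatten(),
        )
        self.classifier = nn.Linear(16 * 16, n_classes)

    def forward(self, x):
        # x: (batch, 1, channels, samples)
        return self.classifier(self.features(x))

def fine_tune(model, loader, epochs=20, lr=1e-4, freeze_features=True):
    """Fine-tune a pretrained model on scarce target-domain data."""
    if freeze_features:
        # Keep the feature extractor learned on the source domain fixed;
        # only the classification head adapts to the new subject/paradigm.
        for p in model.features.parameters():
            p.requires_grad = False
    opt = torch.optim.Adam(
        [p for p in model.parameters() if p.requires_grad], lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model

model = EEGClassifier()
# Hypothetical checkpoint pretrained on the larger source dataset:
# model.load_state_dict(torch.load("pretrained_source_subjects.pt"))
# fine_tune(model, target_loader)  # target_loader holds the scarce inner speech trials
```

Freezing the feature extractor is one common design choice when target data is very limited; with slightly more data, unfreezing all layers at a reduced learning rate is a standard alternative.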