Information plays a crucial role in shaping today's socio-political space, and the relevance of data in decision making has increased exponentially. Techniques such as content examination, social network analysis, information propagation (including epidemic and statistical modelling), and sentiment analysis are currently used to classify and curate information. Nonetheless, mis- and disinformation remain among the major current cybersecurity challenges, as they undermine the very health of our democratic systems. As a result, there is an urgent need to devise and implement technical solutions to detect and deter the propagation of unreliable information.
In this work, we consider a specific case within the taxonomy of mis- and disinformation phenomena: the so-called fake news. In short, we use a labelled dataset containing fake news, which we detect by means of both traditional natural language processing techniques and advanced deep learning approaches. Our aim is to compare the accuracy of simple methods (namely, traditional natural language processing) against modern, more complex techniques from the deep learning family. The study of the aforementioned dataset suggests that adopting complex techniques does not always guarantee better classification performance.
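To make the "simple baseline" side of this comparison concrete, the following is a minimal sketch of a traditional text-classification pipeline: a bag-of-words Multinomial Naive Bayes classifier implemented with only the Python standard library. The toy headlines and labels below are hypothetical illustrations, not drawn from the paper's dataset, and the tokenizer is deliberately naive.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data (NOT the paper's labelled dataset).
TRAIN = [
    ("scientists confirm vaccine passes clinical trial", "real"),
    ("government report shows economy grew last quarter", "real"),
    ("official statistics released on unemployment rate", "real"),
    ("miracle cure doctors dont want you to know", "fake"),
    ("shocking secret celebrity scandal exposed click now", "fake"),
    ("you wont believe this one weird trick", "fake"),
]

def tokenize(text):
    # Naive whitespace tokenizer; real systems would normalize further.
    return text.lower().split()

def train_nb(samples):
    """Fit a Multinomial Naive Bayes model: per-class word counts,
    class priors, and the shared vocabulary."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for w in tokenize(text):
            word_counts[label][w] += 1
            vocab.add(w)
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the label maximizing log prior + smoothed log likelihood."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        # Add-one (Laplace) smoothing over the vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokenize(text):
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_nb(TRAIN)
print(predict(model, "weird trick doctors dont want you to know"))
print(predict(model, "report shows unemployment rate statistics"))
```

Baselines of this kind (often with TF-IDF features and a linear classifier instead of raw counts) are what deep learning approaches must outperform to justify their additional complexity.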
Acknowledgements
This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 872855 (TRESCA project); from Ministerio de Economía, Industria y Competitividad (MINECO), Agencia Estatal de Investigación (AEI), and Fondo Europeo de Desarrollo Regional (FEDER, EU) under project COPCIS, reference TIN2017-84844-C2-1-R; and from the Comunidad de Madrid (Spain) under project CYNAMON (P2018/TCS-4566), co-financed with FSE and FEDER EU funds.