Santiago Palmero Muñoz, Christian Oliva, Luis F Lago-Fernández, David Arroyo
4th Multidisciplinary International Symposium on Disinformation in Open Online Media. MISDOOM2022
October 11 and 12, 2022, Idaho, USA.
Detecting unreliable information in social media is an open challenge, in part because of the difficulty of associating a piece of information with known and trustworthy actors. Identifying where information originates can help society deal with unverified, incomplete, or even false information. In this work we tackle the problem of attributing a piece of information to a particular politician. The use of inaccurate information is especially relevant in the case of politicians, since it affects social perception and voting behavior. Moreover, misquotation can be weaponized to damage an adversary's reputation. We apply a compression-based metric to authorship attribution in social media, specifically on Twitter. In particular, we use the Normalized Compression Distance (NCD) to compare an author's text with other authors' texts. We show that this methodology performs well, obtaining 80.3% accuracy in a scenario with 6 different politicians.
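To make the idea concrete, the following is a minimal sketch of NCD-based attribution, not the implementation evaluated in this work. It uses the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length. The choice of zlib as the compressor, the nearest-author decision rule, and the names `compressed_size`, `ncd`, and `attribute` are all assumptions made for illustration; the abstract does not specify them.

```python
import zlib
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compressor is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized Compression Distance between two texts.

    Values near 0 suggest the texts share much structure;
    values near 1 suggest they share little.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def attribute(text: str, corpora: Mapping[str, str]) -> str:
    """Return the author whose reference text is closest to `text` under NCD.

    `corpora` maps a (hypothetical) author name to that author's known text.
    Picking the minimum-distance author is an assumed decision rule.
    """
    return min(corpora, key=lambda author: ncd(text, corpora[author]))
```

A call such as `attribute(tweet, {"author_a": text_a, "author_b": text_b})` would return the key of the reference text with the smallest distance to `tweet`. With a general-purpose compressor like zlib, very short inputs are dominated by fixed compression overhead, so distances computed on single tweets may be noisy unless each author's texts are aggregated.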