National Formosa University (國立虎尾科技大學)

Linguistic and Emotion-Based Identification of Tweets with Fake News: A Case Study.

Record type:	Bibliographic - language material, manuscript : Monograph/item
Title proper/Author:	Linguistic and Emotion-Based Identification of Tweets with Fake News
Other title:	A Case Study.
Author:	Bernardes, Vitor Sexto.
Extent:	1 online resource (101 pages)
Note:	Source: Masters Abstracts International, Volume: 84-01.
Contained By:	Masters Abstracts International 84-01.
Subject:	Sentiment analysis.
Electronic resource:	click for full text (PQDT)
ISBN:	9798835535798

Thesis (M.Sc.)--Universidade do Porto (Portugal), 2021.

Includes bibliographical references

Since the popularization of the term in 2016, we have observed a proliferation of so-called "fake news" content, assisted by the widespread use of social media platforms. The dissemination of fake content has serious political, economic, and health-related real-world impacts, which makes it imperative to find ways to mitigate this problem. In this dissertation we propose a machine learning-based approach to tackle it by automatically identifying tweets associated with questionable content. To that end, we employ a practical approach with a case study using newly collected data from Twitter related to the 2020 US presidential election. In order to create a sizable annotated data set, we use an automatic labeling process based on the factual reporting level of links contained in tweets, as classified by human experts, resulting in a labeled data set containing 150 thousand tweets representative of the real-world scenario of fake news distribution in social media. We derive relevant features from that data, based on a combination of text content representation, linguistic attributes, user profile, and post metadata. We compare different variations of these features and identify the methods most applicable to the problem at hand, with an approach generally based on data obtained from an individual tweet, enabling classification as soon as a tweet is posted and therefore helping to prevent the harm inevitably caused by fake news diffusion. We additionally demonstrate the specific contribution of features derived from named entity and emotion recognition techniques, including a novel approach using sequences of prevalent emotions in sentences, analogous to n-grams. We conclude the dissertation by evaluating and comparing the performance of several machine learning models on a number of test sets, created with different contexts and time periods, and show they are applicable to addressing the issue of fake news dissemination.
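
The emotion-sequence features mentioned in the abstract can be pictured with a minimal sketch. It is an illustration only, not the thesis's implementation: the abstract does not state which emotion recognizer, emotion set, or encoding was used, so the toy lexicon TOY_LEXICON, the helper names, and the naive sentence splitter below are all assumptions made for this example. The idea shown is simply to reduce each sentence to its prevalent emotion and then count n-grams over the resulting sequence of emotions.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration); a real system
# would rely on an emotion lexicon or a trained emotion classifier.
TOY_LEXICON = {
    "fraud": "anger",
    "stolen": "anger",
    "afraid": "fear",
    "threat": "fear",
    "win": "joy",
    "great": "joy",
    "lost": "sadness",
}


def split_sentences(text):
    """Rough sentence splitter on '.', '!' and '?' (an assumption, not a real tokenizer)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def prevalent_emotion(sentence, lexicon=TOY_LEXICON, default="neutral"):
    """Return the most frequent emotion among the words of one sentence."""
    words = [w.strip(".,!?;:\"'").lower() for w in sentence.split()]
    counts = Counter(lexicon[w] for w in words if w in lexicon)
    if not counts:
        return default
    best = max(counts.values())
    # Break ties alphabetically so the result is deterministic.
    return min(e for e, c in counts.items() if c == best)


def emotion_ngrams(text, n=2):
    """Count n-grams over the sequence of per-sentence prevalent emotions."""
    emotions = [prevalent_emotion(s) for s in split_sentences(text)]
    return Counter(
        tuple(emotions[i:i + n]) for i in range(len(emotions) - n + 1)
    )


if __name__ == "__main__":
    tweet = "The vote was stolen! I am afraid of this threat. We will still win."
    print(emotion_ngrams(tweet, n=2))
```

The resulting counts could then be used as sparse features alongside ordinary word n-grams; whether the thesis combines them this way is not stated in the abstract.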

Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2024

Mode of access: World Wide Web

ISBN: 9798835535798

Subjects--Topical Terms:
1403973 Sentiment analysis.

Index Terms--Genre/Form:
554714 Electronic books.

Linguistic and Emotion-Based Identification of Tweets with Fake News: A Case Study.
LDR:05440ntm a22003377 4500
001 1150606
005 20241028051809.5
006 m o d
007 cr bn ---uuuuu
008 250605s2021 xx obm 000 0 eng d
020 $a 9798835535798
035 $a (MiAaPQ)AAI29139235
035 $a (MiAaPQ)Portugal10216139434
035 $a AAI29139235
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Bernardes, Vitor Sexto. $3 1477134
245 1 0 $a Linguistic and Emotion-Based Identification of Tweets with Fake News : $b A Case Study.
264 0 $c 2021
300 $a 1 online resource (101 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Masters Abstracts International, Volume: 84-01.
500 $a Advisor: Figueira, Alvaro.
502 $a Thesis (M.Sc.)--Universidade do Porto (Portugal), 2021.
504 $a Includes bibliographical references
520 $a Since the popularization of the term in 2016, we have observed a proliferation of so-called "fake news" content, assisted by the widespread use of social media platforms. The dissemination of fake content has serious political, economic, and health-related real-world impacts, which makes it imperative to find ways to mitigate this problem. In this dissertation we propose a machine learning-based approach to tackle it by automatically identifying tweets associated with questionable content. To that end, we employ a practical approach with a case study using newly collected data from Twitter related to the 2020 US presidential election. In order to create a sizable annotated data set, we use an automatic labeling process based on the factual reporting level of links contained in tweets, as classified by human experts, resulting in a labeled data set containing 150 thousand tweets representative of the real-world scenario of fake news distribution in social media. We derive relevant features from that data, based on a combination of text content representation, linguistic attributes, user profile, and post metadata. We compare different variations of these features and identify the methods most applicable to the problem at hand, with an approach generally based on data obtained from an individual tweet, enabling classification as soon as a tweet is posted and therefore helping to prevent the harm inevitably caused by fake news diffusion. We additionally demonstrate the specific contribution of features derived from named entity and emotion recognition techniques, including a novel approach using sequences of prevalent emotions in sentences, analogous to n-grams. We conclude the dissertation by evaluating and comparing the performance of several machine learning models on a number of test sets, created with different contexts and time periods, and show they are applicable to addressing the issue of fake news dissemination.
520 $a Since 2016, when the expression became popular, we have observed a proliferation of so-called "fake news", amplified by the use of social media platforms. The dissemination of false content has serious real-world impacts, whether in politics, the economy, or public health, which makes it essential to find ways to mitigate this problem. In this dissertation, we propose a machine learning-based approach to address it through the automated identification of tweets associated with questionable content. We adopt a practical approach, carrying out a case study with data collected from Twitter exclusively for this purpose, related to the 2020 US presidential election. To obtain a data set of considerable size, we employ an automated labeling process based on the factual reporting level of links present in tweets, as assessed by experts. This process resulted in a labeled data set of 150 thousand tweets that is representative of the real distribution of fake news on social media. From these data, we create relevant features based on a combination of text representation, linguistic properties, the user profile, and post metadata. We compare variations of these features and identify the most suitable methods for the problem at hand, with an approach based on data that can be obtained from each tweet. This allows a tweet to be classified as soon as it is published, which in turn can help prevent the harm inevitably caused by the spread of fake news. In addition, we demonstrate the specific contribution of features derived from named entity and emotion recognition techniques, with a new approach that uses sequences of dominant emotions in sentences, analogous to n-grams. We conclude the dissertation by evaluating and comparing the performance of several machine learning models on a series of test sets, created in different contexts and time periods, with which we demonstrate the suitability of the models for addressing the problem of fake news dissemination.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2024
538 $a Mode of access: World Wide Web
650 4 $a Sentiment analysis. $3 1403973
650 4 $a Emotions. $3 560966
650 4 $a Web studies. $3 1148502
655 7 $a Electronic books. $2 local $3 554714
690 $a 0646
710 2 $a ProQuest Information and Learning Co. $3 1178819
710 2 $a Universidade do Porto (Portugal). $3 1188642
773 0 $t Masters Abstracts International $g 84-01.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=29139235 $z click for full text (PQDT)