Page 785 - Emerging Trends and Innovations in Web-Based Applications and Technologies

International Journal of Trend in Scientific Research and Development (IJTSRD) @ www.ijtsrd.com eISSN: 2456-6470
technical contributions, this research seeks to enhance our comprehension of the fake news landscape and offer valuable insights for policymakers, digital platform developers, and researchers dedicated to addressing misinformation. The overarching objective is to build a well-informed and resilient society that can accurately differentiate between genuine and misleading news content in today's digital era.

2.  Literature Review

The detection of fake news has garnered significant attention in recent years, leading to a growing body of research exploring various techniques and approaches. This section reviews the existing literature, focusing on three main areas: traditional machine learning methods, deep learning-based approaches, and challenges in fake news detection.

Early research in fake news detection predominantly relied on traditional machine learning techniques, leveraging textual and metadata features. Techniques such as Naïve Bayes, Support Vector Machines (SVM), Logistic Regression, and Decision Trees were widely applied due to their simplicity and interpretability. Rubin et al. (2015) [7] explored linguistic cues such as writing style, syntax, and readability to classify news articles, showing the effectiveness of feature engineering in distinguishing fake news from legitimate content. Similarly, Potthast et al. (2017) [8] utilized content-based features, including word frequency and sentiment analysis, combined with SVM for fake news detection, achieving promising results. While these methods demonstrated moderate success, their reliance on manual feature extraction posed limitations in handling the complex and evolving nature of fake news. Moreover, traditional approaches often struggled with generalization across datasets, as fake news tactics and narratives varied widely across different contexts.

The advent of deep learning has revolutionized fake news detection by enabling models to automatically learn features from data. Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks have been extensively used to capture contextual and sequential information in news text. Wang et al. (2022) [9] introduced a hybrid model combining convolutional neural networks (CNNs) and LSTMs to extract spatial and temporal features, significantly improving classification accuracy. Transformer-based models, such as BERT (Bidirectional Encoder Representations from Transformers), have further advanced the field by capturing deeper contextual relationships in text. Devlin et al. (2019) [10] demonstrated the superior performance of BERT in text classification tasks, including fake news detection. Researchers such as Zhou et al. (2021) [11] have fine-tuned transformer models on fake news datasets, achieving state-of-the-art results. Moreover, multi-modal approaches that incorporate textual, visual, and social network data have gained traction. Qi et al. (2021) [12] proposed a model that combines textual analysis with image recognition and user engagement patterns to detect fake news on social media platforms, highlighting the importance of integrating diverse data sources for robust detection.

Despite significant advancements, several challenges persist in the domain of fake news detection. One of the primary issues is the lack of standardized and balanced datasets. Horne and Adali (2017) [13] noted that many publicly available datasets are biased toward specific topics or languages, limiting the generalizability of machine learning models. Additionally, fake news creators continuously evolve their strategies, making it difficult for static models to adapt to new patterns and narratives. Another critical challenge is addressing the propagation of fake news through social networks. Vosoughi et al. (2018) [14] highlighted that fake news spreads faster and more widely than true news due to its sensational nature, necessitating the development of real-time detection systems. Furthermore, ethical considerations, such as ensuring user privacy and avoiding censorship, must be carefully addressed to maintain public trust.

The existing body of work demonstrates that both traditional and deep learning methods have significantly contributed to fake news detection. However, traditional methods often require extensive manual effort for feature engineering, while deep learning approaches demand large datasets and substantial computational resources. The integration of multi-modal data and the use of advanced models such as transformers hold promise for improving detection performance. Nonetheless, addressing the challenges of dataset bias, evolving fake news tactics, and ethical considerations remains crucial for developing effective and trustworthy solutions.

This study builds upon the existing literature by exploring a range of machine learning algorithms, including traditional, deep learning, and hybrid methods, to identify the most effective approaches for fake news detection. Additionally, we aim to address some of the challenges highlighted in the literature by employing diverse datasets and evaluating model performance across different scenarios.

3.  Methodology

This study introduces a structured approach to detecting fake news through machine learning techniques and natural language processing. The methodology consists of multiple interrelated phases, starting with data collection and preprocessing, followed by feature extraction, model implementation, and performance evaluation. The dataset used comprises both authentic and fabricated news articles, maintaining an almost equal distribution (50.4% real, 49.6% fake) to ensure balanced binary classification. The data preprocessing workflow includes several essential steps to enhance text quality and uniformity. Initially, special characters and numerical values are eliminated using regular expressions, followed by converting text to lowercase and removing frequently used stopwords in English. These preprocessing steps help minimize noise while preserving the core meaning of the content, leading to a 21.96% reduction in average text length and a 32.6% decrease in word count (from 423.04 to 285.13 words on average).

Figure 1 Text Analysis Visualization: Length Distribution and Word Clouds of Fake vs Real News

To extract features, this study utilizes Term Frequency-Inverse Document Frequency (TF-IDF) vectorization, selecting a maximum of 5000 features to effectively capture both word significance within individual documents and their relevance across the entire dataset. The dataset is then divided into training (80%) and testing (20%) sets, ensuring stratified sampling to maintain an even class distribution. Five different machine learning models are applied: Logistic Regression (configured for a maximum of 1000 iterations), Random Forest Classifier, Support Vector Machine (SVM) with probability estimates enabled, Multinomial Naïve Bayes, and a Neural Network (MLP Classifier) set for 300 iterations.
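
The preprocessing steps described in the Methodology (regular-expression removal of special characters and digits, lowercasing, and English stopword removal) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `preprocess` and `percent_reduction`, the sample headline, and the very small `STOPWORDS` set are assumptions made for illustration, since the paper does not state which stopword list or regular expression was used. The second helper only re-derives the reported word-count reduction from the two averages given in the text.

```python
import re

# Hypothetical, deliberately tiny stopword set for illustration only;
# the paper does not say which English stopword list was used.
STOPWORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to", "was"}


def preprocess(text: str) -> str:
    """Remove digits and special characters, lowercase, and drop stopwords."""
    # Replace anything that is not an ASCII letter or whitespace with a space.
    letters_only = re.sub(r"[^A-Za-z\s]", " ", text)
    tokens = letters_only.lower().split()
    return " ".join(tok for tok in tokens if tok not in STOPWORDS)


def percent_reduction(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Assumed sample headline, not taken from the study's dataset.
sample = "BREAKING: 3 Senators are RESIGNING in the wake of the scandal!!!"
print(preprocess(sample))  # breaking senators resigning wake scandal

# Average word counts reported in the text: 423.04 before, 285.13 after.
print(round(percent_reduction(423.04, 285.13), 1))  # 32.6
```

The computed 32.6% agrees with the word-count decrease reported above. The 21.96% reduction in average text length cannot be checked the same way, because the underlying character counts are not given.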
             IJTSRD | Special Issue on Emerging Trends and Innovations in Web-Based Applications and Technologies   Page 775