International Journal of Trend in Scientific Research and Development (IJTSRD)
Special Issue on Emerging Trends and Innovations in Web-Based Applications and Technologies
Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
FakeAlert: An Innovative Machine Learning Framework
for Identifying and Combatting Falsified News
Jyoti Tiwari¹, Tushar Mahajan², Aditya Kathalkar³, Prof. Usha Kosarkar⁴
¹,²,³,⁴Department of Science and Technology, G H Raisoni College of Engineering and Management, Nagpur, Maharashtra, India
ABSTRACT
The spread of misinformation has become a serious global concern, impacting public trust and information integrity. This study investigates the use of advanced machine learning techniques to detect fraudulent news, utilizing a dataset containing both legitimate and false news articles. Preprocessing techniques such as text cleaning and TF-IDF vectorization enhance data quality and model efficiency. Five machine learning algorithms, namely Random Forest, Support Vector Machine (SVM), Neural Networks, Logistic Regression, and Naïve Bayes, are evaluated based on accuracy, precision, recall, and F1-score. The Random Forest Classifier achieves the highest accuracy of 99.95%, demonstrating superior reliability in distinguishing fake news from authentic articles. While SVM and Neural Networks also perform well, Logistic Regression and Naïve Bayes, though computationally efficient, show relatively lower effectiveness. This research underscores the significance of ensemble models and advanced preprocessing in developing robust fake news detection systems, offering valuable insights for automated misinformation mitigation strategies.

KEYWORDS: Fake news detection, Machine learning, Text classification, Natural language processing, Misinformation prevention

1. INTRODUCTION
The rapid rise of online misinformation presents a significant challenge, distorting public perception and undermining trust in media and institutions. The unrestricted spread of unverified information on digital platforms has exacerbated this issue, particularly during critical events such as elections and global crises. Traditional verification methods, such as manual fact-checking, are often inadequate given the speed at which fake news proliferates. As a solution, machine learning-based approaches provide scalable and efficient ways to analyze text data and detect deceptive content through linguistic patterns and contextual analysis.

This study adopts a systematic approach to fake news detection, comprising data acquisition, preprocessing, feature extraction, model implementation, and evaluation. The dataset includes a near-equal distribution of real and fake news articles. Preprocessing steps involve removing special characters, normalizing text, and eliminating stopwords, leading to a significant reduction in data noise. TF-IDF vectorization is applied to extract meaningful features for classification. The dataset is split into training (80%) and testing (20%) subsets, ensuring balanced representation.

Detecting fake news is a complex task influenced by several factors, such as the advanced techniques used by misinformation creators, the subjective nature of truth, and the fast-changing digital communication landscape. Many fabricated stories incorporate factual elements alongside false information, making them challenging to differentiate from legitimate news. Furthermore, technologies like deepfake media and AI-generated content have made identification more difficult, requiring more advanced analytical methods. This study evaluates the effectiveness of different machine learning algorithms in identifying fake news by considering both textual and contextual characteristics. Specifically, it assesses the performance of models such as Random Forest, Support Vector Machine (SVM), Neural Networks, Logistic Regression, and Naïve Bayes. Each algorithm provides distinct strengths in analyzing language patterns, semantic structures, and contextual indicators. Additionally, the research explores key challenges in fake news detection, including dataset bias, evolving misinformation strategies, and the absence of universal evaluation standards.

This study explores the impact of feature engineering and selection on enhancing the accuracy of fake news detection. Various textual attributes, such as syntactic structures, semantic connections, and writing style patterns, are examined alongside metadata factors like source reliability, dissemination trends, and audience interaction metrics. Combining these diverse elements aims to develop a more resilient and adaptable detection framework capable of addressing evolving misinformation tactics. By systematically evaluating accuracy, precision, recall, and computational efficiency across different machine learning models, this research seeks to determine the most effective methodologies for identifying fake news. Additionally, the study assesses the balance between model complexity and performance, considering the practical challenges of real-world applications. Extensive experimentation on multiple datasets, covering varied misinformation types and linguistic contexts, ensures the broad applicability of the findings. The research also highlights the necessity of creating unbiased and well-structured datasets to improve the reliability and effectiveness of machine learning-based detection systems. This involves tackling challenges such as dataset labeling, class imbalance, and ensuring data relevance over time. Furthermore, ethical concerns and potential biases in automated detection tools are examined, leading to recommendations for the responsible design and implementation of misinformation detection technologies. This study's findings are anticipated to play a crucial role in the advancement of more refined and adaptable tools for curbing the spread of misinformation. In addition to its