Domain
This section contains a summary of the project paper content.
The study explores the application of new technologies,
particularly Natural Language Processing (NLP), to automate
policy creation and enforcement in cybersecurity. Various
researchers have investigated using NLP to translate written
policies into formal access control frameworks and automate
compliance verification processes, with advances in large
language models (LLMs) expanding these capabilities. However,
challenges remain, such as interpreting ambiguous language and
integrating policies into existing systems. Additionally, human
factors in cybersecurity, including user awareness and behavior,
remain critical areas of focus. Studies have shown that while
security awareness training can raise knowledge, long-term
behavioral changes require personalized, ongoing interventions.
Research on network intrusion detection and phishing detection
using artificial neural networks (ANN) and machine learning has
demonstrated promising results, though evolving threats continue
to challenge the effectiveness of these models. Overall, a
comprehensive, multi-faceted approach combining technology and
human-centered strategies is necessary to address the
complexities of modern cybersecurity.
Current cybersecurity measures focus heavily on technological
solutions but often overlook the human element, which remains
the weakest link. While many organizations invest in advanced
cyber defenses, they fail to address security hygiene, user
awareness, and behavior analytics adequately. There is also a
lack of comprehensive frameworks that integrate AI-based
intrusion detection, NLP-driven policy enforcement, and behavior
profiling to bridge this gap between human vulnerabilities and
cybersecurity solutions.
The research addresses the critical issue of user-related
vulnerabilities in corporate environments, which are exploited
through phishing, poor security hygiene, and browser-based
attacks. Specifically, the problem revolves around how
organizations struggle to maintain security due to:
• Limited user awareness and poor online behavior hygiene.
• Inefficient enforcement of browser-based security policies.
• Ineffective real-time phishing detection and user profiling
mechanisms.
• Reliance on traditional Network Intrusion Detection Systems.
The study aims to find a solution that connects human behavior
with advanced cybersecurity mechanisms, minimizing the risks
posed by human error.
The primary objective is to develop a centralized framework that
monitors, evaluates, and improves employees’ online presence
within a corporate environment. Specific objectives include:
1. Implementing behavior analytics for profiling user activities.
2. Automating the generation and enforcement of browser security
policies using NLP.
3. Designing a real-time intrusion detection system powered by
Artificial Neural Networks (ANN).
4. Improving phishing detection through URL and visual analysis
to identify high-risk activities.
5. Promoting better security hygiene by bridging the gap between
user awareness and the evolving threat landscape.
The methodology includes the following components:
1. User Behavior Profiling:
• Identify and analyze risk factors affecting browser hygiene
using Principal Component Analysis (PCA) and Bayesian Network
Analysis.
• Create a Browser Hygiene Risk Assessment Model to quantify
risks based on various factors like outdated browsers, malicious
extensions, and unsafe protocols.
2. Intrusion Detection and Prevention System:
• Develop an ANN-based solution trained on datasets like NSL-KDD
and updated with real-time threat intelligence.
• Employ a binary classification model that flags malicious
traffic on each host system for faster detection and prevention.
3. NLP-Assisted Policy Generation and Enforcement:
• Use BERT models for classifying and extracting policy intents
from natural language inputs.
• Implement scripts for applying browser security policies across
multiple Chromium-based browsers.
4. Phishing Detection Mechanism:
• Use a browser plugin to compare website visuals and URLs with a
predefined safe list.
• Redirect users to safety when phishing attempts are detected,
adding malicious domains to a blacklist.
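The URL half of the phishing check in component 4 can be sketched
as follows. This is a minimal illustration, not the plugin's actual
logic: the safe-list entries, the 0.8 similarity threshold, and the
use of difflib string similarity to spot look-alike domains are all
assumptions, and the visual comparison is not covered.

```python
from difflib import SequenceMatcher
from urllib.parse import urlsplit

# Hypothetical safe list; a real deployment would load its own.
SAFE_DOMAINS = {"example.com", "example.org"}
# Domains flagged so far (the "blacklist" from the text above).
blacklisted_domains = set()


def hostname(url):
    """Return the lower-cased host of a URL without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify_url(url, threshold=0.8):
    """Classify a URL as 'safe', 'blocked', 'suspicious' or 'unknown'.

    A host that is not on the safe list but closely resembles a safe
    domain is treated as a likely phishing look-alike and blacklisted.
    """
    host = hostname(url)
    if host in blacklisted_domains:
        return "blocked"
    if host in SAFE_DOMAINS:
        return "safe"
    for safe in SAFE_DOMAINS:
        if SequenceMatcher(None, host, safe).ratio() >= threshold:
            blacklisted_domains.add(host)
            return "suspicious"
    return "unknown"
```

A caller would redirect the user whenever the result is "suspicious"
or "blocked". Plain string similarity is only a stand-in here; it
misses homoglyph and subdomain tricks that a production check would
need to handle.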
Artificial Neural Networks are used to analyze real-time
network traffic and make predictions on it.
Natural Language Processing is used to identify security
requirements expressed in natural-language input.
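Once a requirement has been classified into an intent, turning it
into a browser policy can be sketched as below. The intent labels
and the mapping table are hypothetical and stand in for the output
of the NLP stage. The policy keys (IncognitoModeAvailability,
PasswordManagerEnabled, URLBlocklist) are Chromium enterprise policy
names, but they should be checked against the policy list of each
target browser before use.

```python
import json

# Hypothetical intent labels mapped to Chromium enterprise policies.
INTENT_TO_POLICY = {
    "disable_incognito": {"IncognitoModeAvailability": 1},
    "disable_password_manager": {"PasswordManagerEnabled": False},
}


def build_policy(intents, blocked_urls=()):
    """Merge the policy fragments for each intent into one policy dict."""
    policy = {}
    for intent in intents:
        if intent not in INTENT_TO_POLICY:
            raise KeyError(f"no policy mapping for intent: {intent}")
        policy.update(INTENT_TO_POLICY[intent])
    if blocked_urls:
        # De-duplicate and sort so the generated file is stable.
        policy["URLBlocklist"] = sorted(set(blocked_urls))
    return policy


def render_policy(policy):
    """Serialize the policy as JSON for a managed-policy file."""
    return json.dumps(policy, indent=2, sort_keys=True)
```

The rendered JSON would then be written to the managed-policy
location of each browser, which differs by browser and platform.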
A mathematical model was developed to calculate a per-user
security score based on customizable coefficients.
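One way such a score could be computed is a normalized weighted sum
of risk factors. The sketch below is an assumption for illustration
only: the formula, the factor names, and the default coefficients
are hypothetical and are not the model developed in the study.

```python
# Hypothetical default coefficients, one per risk factor.
DEFAULT_COEFFICIENTS = {
    "outdated_browser": 0.3,
    "malicious_extensions": 0.5,
    "unsafe_protocols": 0.2,
}


def security_score(risks, coefficients=DEFAULT_COEFFICIENTS):
    """Return a 0-100 score, where 100 means no observed risk.

    `risks` maps factor names to a risk level in [0, 1]; missing
    factors count as 0 and out-of-range values are clamped.
    """
    total_weight = sum(coefficients.values())
    if total_weight <= 0:
        raise ValueError("coefficients must sum to a positive value")
    weighted_risk = 0.0
    for name, weight in coefficients.items():
        level = min(max(risks.get(name, 0.0), 0.0), 1.0)
        weighted_risk += weight * level
    return round(100.0 * (1.0 - weighted_risk / total_weight), 1)
```

Passing a different coefficients dictionary changes how much each
factor contributes, without touching the scoring function itself.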
Browser plugins were used both for NLP-based browser policy
capture and enforcement and for browser-based phishing
detection.
We used Elasticsearch for indexing the data gathered on user
digital hygiene and applying calculations to it.
GitLab was used as the main version control system.
Python was used in data processing, training and driving
both Artificial Neural Networks and Natural Language
Processing engines.
Bash scripts were used to handle small automations and
integrations in the backend.
Kibana was used for data visualization and querying.
A MySQL database was used to store information related to
phishing detection.
React was used to implement the frontends.
JavaScript was heavily used to implement web application
functions in both frontends and backends.