Read the attached three documents and write a summary of Chapter 2 (literature review) for ANY ONE of the three attached papers, in 300 words minimum.
Note: Write a "summary of the Literature review" of ANY ONE of the three papers attached. NO need to write a summary for all papers attached.
Artificial Intelligence in Cybersecurity: Concentration on the Effectiveness of Machine
Learning
By: Preston Pham
December 7, 2021
Submitted in Partial Fulfillment of the Requirements for the Doctor of Education degree.
St. Thomas University
Miami Gardens, Florida
Copyright© 2021 by Preston Pham
All Rights Reserved
Copyright Acknowledgement Form
St. Thomas University
I, the writer’s full name, understand that I am solely responsible for the content of this
dissertation and its use of copyrighted materials. All copyright infringements and issues
are solely the responsibility of myself as the author of this dissertation and not St. Thomas
University, its programs, or libraries.
________________________ _____________________
Signature of Author Date
________________________ _____________________
Witness (Martin Nguyen) Date
St. Thomas University Library Release Form
Artificial Intelligence in Cybersecurity – Concentration on the Effectiveness of Machine
Learning
Preston Pham
I understand that US Copyright Law protects this dissertation against unauthorized use. By
my signature below, I am giving permission to St. Thomas University Library to place this
dissertation in its collections in both print and digital forms for open access to the wider
academic community. I am also allowing the Library to photocopy and provide a copy of
this dissertation for the purpose of interlibrary loans for scholarly purposes and to migrate
it to other forms of media for archival purposes.
________________________ _____________________
Signature of Author Date
________________________ _____________________
Witness (Martin Nguyen) Date
St. Thomas University Dissertation Manual Acknowledgement Form
Artificial Intelligence in Cybersecurity – Concentration on the Effectiveness of Machine
Learning
Preston Pham
By my signature below, I, Preston Pham, assert that I have read the dissertation publication
manual, that my dissertation complies with the University’s published dissertation
standards and guidelines, and that I am solely responsible for any discrepancies between
my dissertation and the publication manual that may result in my dissertation being
returned by the library for failure to adhere to the published standards and guidelines
within the dissertation manual. The Dissertation Publication Manual may be found:
________________________ _____________________
Signature of Author Date
________________________ _____________________
Signature of Chair Date
Abstract
Modern networks drive for ubiquitous connectivity and digitalization in support
of globalization but also, inadvertently and unavoidably, create a fertile ground for the
rise in scale and volume of cyberattacks. Countermeasures to these advanced attacks have
never been more crucial than at present; Artificial Intelligence (AI) is a technological
breakthrough that can help augment protective techniques on the defensive
side of cybersecurity. AI improves its knowledge by detecting patterns and
relationships among data and learns from the data to build self-learning algorithms. It
analyzes relationships between threats like malicious network traffic, suspicious internet
protocol (IP) addresses, or malware files within minutes or even seconds to provide the
intelligence to the organization for quicker response to a threat event than traditional
labor-intensive methods. This paper is intended to explore the phenomenon of AI in
cybersecurity and determine whether the present stage of AI technology and in particular
Machine Learning can help improve cybersecurity. The paper has two main objectives:
testing AI’s threat classification ability against a human cybersecurity analyst, and testing
AI’s ability to predict future threat events against a renowned time-series data-forecasting
model, the autoregressive integrated moving average (ARIMA) statistical model.
Keywords: cybersecurity, artificial intelligence, classification, prediction, ARIMA
Acknowledgments
It is with genuine pleasure that I would like to express my deep sense of gratitude
and give my warmest thanks to my former professors and committee members,
Dr. Lisa J. Knowles, Dr. Joseph M. Pogodzinski, and Dr. Jose G. Rocha. Their
dedication, meticulous scrutiny, and scholarly advice have helped me to
accomplish my dissertation paper.
I would like to profoundly acknowledge Dr. Knowles, my dissertation chair, for
her kindness, enthusiasm, positivity, and dynamism. Dr. Knowles relentlessly helped me
manage every step along the way to ensure I completed all my chapters and the work
overall during the time of a global pandemic in 2020-2021. I also thank my former
manager Mr. Raymond Lee, Director of Information Security, for suggesting necessary
technological advice during my research pursuit in writing about the topic of
cybersecurity.
Dedication
The study serves as time-capsule literature to show how a doctoral student at St.
Thomas University performed a study on Artificial Intelligence within the domain of
cybersecurity using the current technologies of the era. The study seeks to be a reference
guide for both academia and industry leaders to further extend the research and
applications of Artificial Intelligence in cybersecurity. The paper is also dedicated to the
men and women in the cybersecurity industry around the globe who are actively fighting
against cybercriminals to protect their organizations, institutions, or government
agencies.
Table of Contents
Copyright Acknowledgement Form ……………………………………………………………………. iii
St. Thomas University Library Release Form………………………………………………………… iv
St. Thomas University Dissertation Manual Acknowledgement Form ………………………… v
List of Tables…………………………………………………………………………………………………..xii
List of Figures ………………………………………………………………………………………………. xiii
List of Formulas …………………………………………………………………………………………….. xiv
CHAPTER ONE. INTRODUCTION……………………………………………………………………. 1
Introduction to the Problem ……………………………………………………………………….1
Background, Context, and Theoretical Framework ………………………………………..2
Statement of the Problem ………………………………………………………………………….5
Purpose of the Study ………………………………………………………………………………..5
Research Question …………………………………………………………………………………..6
Rationale, Relevance, and Significance of the Study ……………………………………..6
Nature of the Study ………………………………………………………………………………….7
Definition of Terms ………………………………………………………………………………….7
Assumptions, Limitations, and Delimitations ……………………………………………….8
Organization of the Remainder of the Study …………………………………………………9
Chapter One Summary …………………………………………………………………………… 10
CHAPTER TWO. LITERATURE REVIEW ……………………………………………………….. 12
Introduction to the Literature Review ……………………………………………………….. 12
Review of Research Literature ………………………………………………………………… 14
Chapter Two Summary ………………………………………………………………………….. 22
CHAPTER THREE METHODOLOGY ……………………………………………………………… 25
Introduction to Methodology …………………………………………………………………… 25
Purpose of Study …………………………………………………………………………………… 27
Research Questions ……………………………………………………………………………….. 27
Research Design …………………………………………………………………………………… 27
Data Collection and Data Analysis Procedures …………………………………………… 30
Target Population, Sampling Method, and Related Procedures ……………………… 33
Instrumentation …………………………………………………………………………………….. 34
Limitations of the Research Design ………………………………………………………….. 37
Data Validity Test …………………………………………………………………………………. 38
Expected Findings …………………………………………………………………………………. 41
Ethical Issues ……………………………………………………………………………………….. 42
Conflict of Interest Assessment ……………………………………………………………….. 42
Chapter Three Summary ………………………………………………………………………… 42
CHAPTER FOUR DATA ANALYSIS AND RESULTS ……………………………………….. 44
Introduction to Data Analysis and Results …………………………………………………. 44
AI vs. Human Analysis in Classification of Threat Events ……………………………. 44
Detailed Analysis (AI vs. Human Analysis in Classification of Threat Events) … 47
AI vs. ARIMA Statistical Computation in Prediction of Threat Events …………… 48
Detailed Analysis (AI vs. ARIMA Statistical Computation to Predict Threat
Events) ………………………………………………………………………………………………… 52
Chapter Four Summary ………………………………………………………………………….. 53
CHAPTER FIVE. CONCLUSIONS AND DISCUSSION ……………………………………… 56
Introduction to Conclusions and Discussion ………………………………………………. 56
Summary of the Results …………………………………………………………………………. 57
Discussion of the Results ……………………………………………………………………….. 58
Discussion of the Results in Relation to the Literature …………………………………. 60
Limitations …………………………………………………………………………………………… 61
Implication of the Results for Practice ………………………………………………………. 62
Recommendations for Further Research ……………………………………………………. 64
Conclusion …………………………………………………………………………………………… 65
APPENDIX A. INSTITUTIONAL REVIEW BOARD (IRB) …………………………………. 67
REFERENCES ……………………………………………………………………………………………….. 68
List of Tables
Table 1. 10 Intrusion Categories with Depiction of Training and Testing Samples …. 30
Table 2. Results of Precision, Recall, and F1-Score for Classifiers ……………………….. 40
Table 3. Results of P-Values Stationary Test…………………………………………………….. 41
Table 4. Ordinary Least Squares Regression: AI vs. Cybersecurity Analyst Results … 46
Table 5. Number of Intrusion Events Detected and Average Time of AI vs.
Cybersecurity Analyst …………………………………………………………………………………… 47
Table 6. Spearman’s Rank Correlation Estimation Results ………………………………….. 50
Table 7. Trend to Month Translog Estimation Results………………………………………… 52
List of Figures
Figure 1. Branches of Artificial Intelligence ……………………………………………………….. 14
Figure 2. Methodological Literature Used …………………………………………………………… 22
Figure 3. Honeypot Network Architecture Diagram ……………………………………………… 28
Figure 4. Data Collection and Analysis Workflow Diagram …………………………………… 29
Figure 5. Configuration of Log and Netflow Forwarding ………………………………………. 30
Figure 6. Example of a Firewall Log ………………………………………………………………….. 31
Figure 7. Example of Raw Netflow Data ……………………………………………………………. 31
Figure 8. Display of Log Highlights when Unstructured in Corelight ………………………. 32
Figure 9. Display of Log Parsing when Structured in Elastic ………………………………….. 33
Figure 10. AI vs. Cybersecurity Analyst Intrusion Detection Regression Graph ………… 45
Figure 11. AI vs. Cybersecurity Analyst Intrusion Prediction Regression Graph ……….. 49
List of Formulas
Formula 1. Precision, Recall, and F1-Score Calculation Model ……………………………. 38
Formula 2. Ordinary Least Squares Regression for Severity Score ………………………… 46
Formula 3. Spearman’s Rank Correlation Estimation Model ………………………………… 50
Formula 4. Trend to Month Translog Estimation Model …………………………………….. 51
CHAPTER ONE. INTRODUCTION
Introduction to the Problem
Today’s world is highly interconnected through networks, with pervasive small
personal devices (e.g., smartphones) as well as large computing devices or services (e.g.,
cloud computing). Each passing day, millions of bytes of data are generated,
processed, exchanged, and consumed by various applications within cyberspace.
Thus, securing data and users’ privacy on the World Wide Web has become an utmost
concern for individuals, business organizations, and national governments (Benavente-Peces
& Bartolini, 2019). The massive amount of data that travels over the Internet
also gives cyber criminals a great opportunity to
attack various organizations’ networks. An ever-growing percentage of cyberattacks is
explicitly targeted at specific organizations to steal intellectual property or sensitive
data, perform espionage, and execute industrial sabotage or denial of service
(Apruzzese et al., 2018). Although organizations can employ human analysts to detect
threat agents on their networks, a human analyst may need hours, days, or even months
to triage malicious activities, correlating
multiple data points to identify true positive threat events (Benavente-Peces & Bartolini,
2019). Thus, organizations are now looking to a new technological discipline:
Artificial Intelligence (AI), which can gather knowledge by detecting the patterns and
relationships among data, then learn through data architectures to build self-learning
algorithms (Virmani et al., 2020). AI can analyze relationships between threats like
malicious network traffic, suspicious IP addresses, or malware files in seconds or minutes
and provide intelligence to organizations for a quicker response to threat events
(Apruzzese et al., 2018).
Background, Context, and Theoretical Framework
With their rapid expansion in support of globalization, modern networks drive toward
ubiquitous connectivity and digitalization but also, simultaneously and unavoidably,
create fertile ground for the rise in scale and volume of cyberattacks. Using increasingly
diversified and sophisticated tactics, cyber criminals and nation-state
attackers target the systems that run our day-to-day lives as well as easily exposed targets (Al
Qahtani, 2020). Countermeasures to these advanced attacks have never been more crucial
than at present; AI, by learning new cyberattack vectors, can help
augment protective techniques on the defensive side of cybersecurity. Defense in
cybersecurity is a set of technologies and processes designed to protect systems,
networks, applications, and data from unauthorized access, alteration, or destruction
(Tyugu, 2011). A cybersecurity defense system consists of a network-based security
system and a host (computer-based) security system. Each of these systems includes
firewalls, antivirus software, and intrusion detection and prevention systems (Al Qahtani,
2020). These systems are intended to block unwanted traffic; identify
unauthorized system or user behaviors; distinguish the everyday
baseline from anomalous events; and, lastly, eradicate or contain the malicious agent
to prevent further execution.
Calderon and Floridi (2019) believe that AI can improve cybersecurity and
defense measures, allowing for greater system robustness, resilience, and recognition.
First, AI can improve systems’ robustness, that is, the ability of a system to maintain its
stable configuration and settings even when it has processed erroneous inputs. Second,
AI can strengthen systems’ resilience, that is, the ability of a system to resist and tolerate
an attack without fatal failure or shutdown. Third, AI can be used to enhance system
recognition or detection, that is, the capacity of a system to autonomously discover
intrusion behaviors and self-identify vulnerabilities (Calderon & Floridi, 2019).
According to Banoth et al. (2017), three driving forces are boosting the use of AI in
cybersecurity. (1) Speed of impact: In some major attacks, the
time of impact on an organization is unpredictable. Today’s attacks do not just target
one specific system or vulnerability; attackers can maneuver and change their
targets once they have penetrated the network. These attacks occur incredibly
quickly, and human intervention can rarely counteract the velocity of impact. (2)
Operational complexity: Cloud computing has proliferated,
and its platforms and services are operationalized and delivered very
quickly, in the millisecond range. This level of complexity overwhelms human
operators; therefore, these actions can only be performed by machines that match
other machines’ prowess. (3) Skills gaps: The cybersecurity workforce remains an
ongoing challenge, as there is a global shortage of cybersecurity experts. This
scarcity has pushed industry to automate processes at a faster pace (Banoth et al., 2017).
Recognizing the crucial impact of AI today, this research focuses on AI and, in particular,
Machine Learning in cybersecurity.
AI is the science that enables computers and machines to learn, judge, and predict
based on their own logic (Virmani et al., 2020). As technology becomes more sophisticated,
the demand for AI is growing because of its ability to solve complex problems within a
limited amount of time. AI equips a machine with the technical expertise
to learn and deploy new theories, methods, and techniques that aim to simulate and
extend human intelligence (Conner-Simons, 2016). There has been a big
breakthrough in the field of AI due to advances in big data and graphics processing units
(GPUs), which have helped AI to grow exponentially in the last two decades (Sarker,
2020). Organizations can now benefit from AI’s cognitive ability to become a
subject-matter expert in a relatively short time through self-training. Through repeated
use, the system will provide increasingly accurate responses, eventually eclipsing the
accuracy of human expertise (Mittal et al., 2019). As the intelligence of machines and
the use of digital sensor data improve, various fields of science can use AI to understand
a wide range of collective information (Hussain et al., 2020). AI is now being applied in
a variety of business industries, with underlying technological subsets such as natural
language processing, robotics, and computer vision. With particular regard to
cybersecurity, Truve (2017) considers AI techniques most useful for
their ability to classify and predict entities and events. Automated classification of
events helps analysts prioritize where they should focus their attention. Instead of
spending significant amounts of time deciding what to focus on, cybersecurity
analysts can improve their forensic work with already categorized and sorted threats. In
addition, Truve believes cyber defenders today are almost always one step behind, trying
to defend or patch systems where attacks and threats already exist. With predictive
information, defenders might instead become proactive and protect their systems
against future threats. Predictive threat intelligence is therefore important, given AI’s
capability to predict future events from historical and current data. Prediction generation
is an example of a task that is hard or even impossible for a human analyst to carry out,
due to the complexity and large volume of data needed. AI algorithms operating at machine
scale can generate predictive models that can be used to forecast events and solve such
problems (Williams & McGregor, 2020).
In this paper, the first task is to determine which branch of AI best applies to
cybersecurity. The overall objective is to apply the most popular branch of AI, Machine
Learning, to classify cybersecurity intrusion events and to compare its results against
those of a human cybersecurity analyst. Next, AI is tested on predicting future cybersecurity
intrusion events from time-series datasets to determine its effectiveness in comparison
to a popular time-series data-prediction model, the autoregressive integrated moving
average (ARIMA) statistical model.
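For reference, an ARIMA(p, d, q) model is conventionally written with the lag operator L as shown below. This is the general textbook form only, offered as an orientation aid; it does not describe the specific orders or fitted coefficients used in this study, and all symbols are generic notation rather than the author's.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i L^{i}\right) (1 - L)^{d}\, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j L^{j}\right) \varepsilon_t
```

Here y_t is the observed time series (for example, a count of intrusion events per period, which is an assumption made for illustration), p is the autoregressive order with coefficients \phi_i, d is the number of differences applied to make the series stationary, q is the moving-average order with coefficients \theta_j, c is a constant, and \varepsilon_t is a white-noise error term.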
Statement of the Problem
AI is adopted in a wide range of domains where it shows its superiority over
traditional rule-based algorithms and manual human knowledge analysis (Benavente-Peces
& Bartolini, 2019), and the complete automation of detecting and analyzing
cybersecurity threats and predicting future attacks is an enticing goal. Yet the efficacy and
accuracy of AI in cybersecurity must be evaluated with due diligence based on real-life
data.
Purpose of the Study
With the cognitive data-processing capability of Machine Learning, AI is a great
complement to defensive cybersecurity systems, helping them better detect and defend
against modern cyberwarfare (Truve, 2017). The purpose of the study is to examine the
phenomenon of AI in cybersecurity, research its implications in the business world, and
determine whether the present stage of AI technology—in particular, Machine
Learning—can help improve cybersecurity.
Research Question
This research paper focuses on the basic question: What branch of AI is most
applicable to cybersecurity? From this main question emerges the following sub-
questions: How accurate is Machine Learning currently, and can it be beneficial to
cybersecurity? What is the accuracy rate for AI to classify intrusion events versus a
human cybersecurity analyst? When AI is used to predict future intrusion events, what is
the accuracy rate when compared to a time-series prediction statistical ARIMA model?
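The accuracy questions above can be made concrete with the standard binary-classification measures: accuracy, precision, recall, and F1-score. The following is a minimal sketch under stated assumptions, not the study's implementation: the function name `classification_metrics`, the label strings, and the sample verdicts are all hypothetical and chosen only for illustration.

```python
def classification_metrics(actual, predicted, positive="intrusion"):
    """Compute accuracy, precision, recall, and F1 for one positive class.

    `actual` holds ground-truth labels and `predicted` holds a classifier's
    (or an analyst's) verdicts. The label names are hypothetical.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical sample: eight events with made-up labels and verdicts.
actual = ["intrusion", "intrusion", "benign", "benign",
          "intrusion", "benign", "benign", "benign"]
predicted = ["intrusion", "benign", "benign", "intrusion",
             "intrusion", "benign", "benign", "benign"]
metrics = classification_metrics(actual, predicted)
```

Running the same function once on a model's verdicts and once on an analyst's verdicts, against the same ground truth, would give directly comparable scores for the two.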
Rationale, Relevance, and Significance of the Study
The world of information technology has undergone
massive shifts in recent years; the power of high-performance
computers and big data analytics has been the driving factor behind these changes
(Sarker, 2020). With cyberattacks trending upward at the frontier of many
organizations, cybersecurity is arguably the discipline that could benefit most from the
introduction of AI (Calderon & Floridi, 2019). Hence, this research is significant and
relevant to the cybersecurity community. The research is a technical paper that uniquely
designs a small Machine Learning engine with threat-detection algorithms based on
data collected from a dedicated honeypot network environment. Additionally, the
Machine Learning engine is trained with a large amount of data and is integrated
with threat intelligence feeds, from which the machine self-learns and then provides
analysis and predictive results from the data.
Nature of the Study
AI has become a hot topic and keyword in recent years; it is being adopted and
widely used in various fields of science (Parrend et al., 2018). Since AI itself has many
subsets of technology, literature on multiple historical AI-related case studies
was reviewed to determine which AI branch was the most widely used and applicable to
cybersecurity. From there, AI’s ability to classify network threats was compared against a
human cybersecurity analyst, then AI’s ability to forecast future threats was compared
against the well-known ARIMA statistical model. To accomplish this, a dedicated
honeypot network environment was set up to collect firewall logs and netflow data,
providing real-life examples to train and test the Machine Learning engine hosted on
Microsoft’s Azure Artificial Intelligence Web Service.
Definition of Terms
Anomaly: An activity that deviates from what is standard or expected in
the normal behaviors of systems, network traffic, and system resources (National
Initiative for Cybersecurity Careers and Studies [NICCS], 2018).
Big Data: Extremely large data sets or data points that may be analyzed
computationally to gain insights on patterns, trends, behaviors and interactions (NICCS,
2018).
Cloud: On-demand availability of computer system resources accessible over the
internet, especially networking or computing power, without direct or physical
management by the user (NICCS, 2018).
Cloud Services: Software or program services that are accessible over the
internet (NICCS, 2018).
Cyberattack: An act of assault by which an entity intends to evade security
services in order to damage or destroy a computer network or system (NICCS, 2018).
Intelligence Source: A reliable information source where cybersecurity defensive
systems can absorb information about the latest malware algorithms, attack patterns, etc.
(NICCS, 2018).
Intrusion: A security incident in which an entity attempts to circumvent security
services in order to gain access to a system or system resource without having proper
authorization (NICCS, 2018).
Netflow: Data on network protocols and IP traffic, collected by a network device as
packets enter or exit its interface (NICCS, 2018).
Network Traffic: Data transmissions in the form of packets sent over the
network from a sender host to a recipient host (NICCS, 2018).
System Log: Data of informational, error, or warning events related to the
behaviors of a computer system and its resources (NICCS, 2018).
Threat: A potential entity that has the intention to cause adversely affect th