The benefits of artificial intelligence (AI) are now broadly acknowledged as a result of the increasing complexity of contemporary information systems and the resulting ever-increasing volume of big data. Particularly with the emergence of deep learning, machine learning (ML) technologies are already being used to address various real-world problems. Machine translation, travel and holiday recommendations, object identification and tracking, and a range of applications in healthcare are notable examples of the practical successes of ML. Additionally, ML is rightly regarded as an enabling technology because of the significant potential it has demonstrated when applied to autonomous vehicles or telecommunication networks (Zhang et al., 2022).
Machine learning is a key technology for both present and future information systems, and it is already used in many different fields. Its application in cyber security, however, is still in its infancy, and there is a huge gap between research and practice. This gap originates in the current state of the art, which makes it difficult to recognise the role of ML in cyber security. Unless its benefits and drawbacks are understood by a broad audience, ML’s full potential will never be realised.
Anomaly-based detection approaches call for developing a concept of “normality” and seek to identify events that deviate from it, under the presumption that such deviations correspond to security incidents. Misuse-based and anomaly-based detection complement one another: misuse-based approaches are very accurate but can only identify known threats, whereas anomaly-based approaches tend to raise more false alarms but are more effective against new attacks (Elsisi et al., 2021).
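The contrast between the two detection styles can be illustrated with a minimal Python sketch. It is not taken from the cited work: the signature list, the request-size feature, and the threshold of three standard deviations are all hypothetical choices made for illustration only.

```python
from statistics import mean, stdev

# Hypothetical signature list: byte patterns of threats that are already known.
KNOWN_SIGNATURES = [b"cmd.exe /c", b"' OR 1=1"]


def misuse_detect(payload: bytes) -> bool:
    """Misuse-based check: flag a payload only if it contains a known signature."""
    return any(sig in payload for sig in KNOWN_SIGNATURES)


class AnomalyDetector:
    """Anomaly-based check: learn 'normality' from benign values, flag large deviations."""

    def __init__(self, benign_values, threshold=3.0):
        self.mu = mean(benign_values)
        self.sigma = stdev(benign_values)
        self.threshold = threshold  # assumed cut-off, in standard deviations

    def is_anomalous(self, value: float) -> bool:
        # z-score: distance from the benign mean, in standard deviations
        z = abs(value - self.mu) / self.sigma
        return z > self.threshold


# Hypothetical benign request sizes (bytes) used to model normal behaviour.
benign_request_sizes = [480, 510, 495, 505, 520, 490, 500, 515]
detector = AnomalyDetector(benign_request_sizes)

print(misuse_detect(b"GET /?id=' OR 1=1 --"))  # known pattern is present
print(detector.is_anomalous(5000))             # far outside the benign range
```

The signature check cannot flag anything absent from its list, while the z-score check can flag unseen behaviour at the cost of occasionally flagging unusual but harmless values.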
The choice between supervised and unsupervised ML algorithms is a distinguishing feature of ML applications for cyber threat detection (schematically represented in Fig. 1). The former can serve as full detection systems but call for labelled data produced under some degree of human oversight. The latter do not need a human in the loop but can only carry out auxiliary tasks. How easily labels can be obtained depends on the sort of data being analysed; for example, any layperson can tell a legitimate website from a phishing website, while it is more difficult to tell benign network traffic from malicious traffic.
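As a rough illustration of this distinction, the sketch below trains a nearest-centroid classifier (supervised) and runs a plain k-means clustering (unsupervised) on the same toy data. The two numeric features, the sample values, and the label names are hypothetical; real systems would use far richer data and established libraries.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def fit_supervised(samples, labels):
    """Nearest-centroid classifier: requires a label for every training sample."""
    by_label = {}
    for x, y in zip(samples, labels):
        by_label.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_label.items()}


def predict(centroids, x):
    """Return the label whose centroid lies closest to x."""
    return min(centroids, key=lambda y: dist(centroids[y], x))


def cluster_unsupervised(samples, k=2, iterations=10):
    """Plain k-means: groups samples without labels; the groups carry no meaning by themselves."""
    centers = list(samples[:k])  # naive initialisation: first k samples
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x in samples:
            nearest = min(range(k), key=lambda i: dist(centers[i], x))
            groups[nearest].append(x)
        # keep the old centre if a group ends up empty
        centers = [centroid(g) if g else centers[i] for i, g in enumerate(groups)]
    return groups


# Hypothetical features per sample: (number of links, share of suspicious words)
samples = [(1.0, 0.1), (1.2, 0.2), (9.0, 0.9), (8.5, 0.8)]
labels = ["benign", "benign", "phishing", "phishing"]

model = fit_supervised(samples, labels)
print(predict(model, (8.0, 0.7)))     # the supervised model outputs a label
print(cluster_unsupervised(samples))  # clustering outputs unlabelled groups
```

The supervised model can act as a detector because its output is a label, whereas the clusters still need interpretation before they say anything about maliciousness, which is why such methods suit auxiliary tasks.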
Machine Learning in Malware Detection
One of the most recognisable challenges in cyber security is the fight against malware. Because malware needs a host device to operate, it can only be found by examining data at the host level, that is, through a host-based intrusion detection system (HIDS). Antivirus software can, in fact, be viewed as a subset of HIDS. A particular malware variant is designed for a certain operating system (OS). For more than 20 years, malware has targeted Windows the most because of its widespread use. Attackers are currently focusing their efforts on mobile devices running operating systems such as Android (Annamalai, 2022).
Malware can be detected through either static or dynamic analysis. The former seeks to identify malware by examining a given file alone, without running any code. The latter concentrates on examining a piece of software’s behaviour while it is running, typically by deploying it in a controlled environment and monitoring its operations. Both static and dynamic analyses, shown schematically in Fig. 2, can benefit from ML.
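To make the static side concrete, the following Python sketch extracts a few features from a file's raw bytes without executing it. The chosen features (size, byte entropy, count of printable strings) are common illustrative examples and an assumption here, not the feature set of any specific detector; dynamic analysis is not shown because it requires a sandboxed execution environment.

```python
import math
import re
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0); packed or encrypted content tends to score high."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def printable_strings(data: bytes, min_len: int = 4):
    """Runs of printable ASCII of at least min_len characters, similar to the Unix `strings` tool."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def static_features(data: bytes) -> dict:
    """Feature dictionary computed from the bytes alone; nothing is executed."""
    return {
        "size": len(data),
        "entropy": byte_entropy(data),
        "num_strings": len(printable_strings(data)),
    }


# Hypothetical file content, used only to demonstrate the calls.
sample = b"MZ\x00\x00hello world"
print(static_features(sample))
```

A feature dictionary of this kind could then be fed to an ML classifier trained on labelled benign and malicious files.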
Machine Learning in Phishing Detection
Phishing is one of the most frequent ways to infiltrate a target network, and it remains a serious threat to online security. Modern enterprises must prioritise the early identification of phishing attempts, a task that ML can support considerably. We specifically differentiate between two uses of ML to detect phishing attempts: detection of phishing websites, where the aim is to identify web pages that are disguised to look like a legitimate website; and identification of phishing emails, which either point to a compromised website or solicit a response that includes sensitive information (Geetha & Thilagam, 2021).
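For the website case, one simple input to an ML model is a set of lexical features computed from the URL itself. The sketch below shows what such a feature extractor might look like in Python; the particular features and the example URLs are assumptions chosen for illustration and are not drawn from the cited study.

```python
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Lexical features of a URL; a classifier would be trained on many such vectors."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,  # '@' can hide the real host from the reader
        "host_is_ip": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "uses_https": parsed.scheme == "https",
    }


# Hypothetical URLs used only to demonstrate the extractor.
print(url_features("https://www.example.com/"))
print(url_features("http://192.168.0.1/login"))
```

None of these features is conclusive on its own; the point of using ML is to learn from labelled examples which combinations tend to indicate a phishing page.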
Beyond Detection: Additional Roles of Machine Learning in Cybersecurity
In addition to threat detection, ML can fill numerous other roles in cyber security. Modern environments produce enormous amounts of data on a regular basis, and these data may originate from a variety of sources, including ML models themselves. By using (additional) ML to analyse this data, it is possible to gain insights that improve the security of digital systems. Without loss of generality, these complementary ML roles can be grouped into four tasks: alert management, raw data analysis, risk exposure assessment, and cyber threat intelligence (Hameed et al., 2021). A schematic representation of machine learning and threat detection is given in Fig. 7.
The Future of Machine Learning in Cybersecurity
The state of the art can be advanced in countless ways, including by improving current performance, mitigating known problems (such as the lack of explainability), and creating new ML-based cyber security applications (for example, by integrating quantum computing).
Data Availability (executives and legislative authorities) – To address the shortage of adequate data, companies should be more willing to share data originating in their environments, whereas regulatory authorities should promote such disclosure by defining proper policies and incentives.
Usable Security Research (scientific community) – The peer-review process should facilitate and enforce the inclusion of the material needed to replicate ML experiments. At the same time, such material should be evaluated to ensure its correctness, potentially by a separate set of reviewers with more technical expertise.
Modern society relies more and more on information technology (IT) systems, including autonomous ones, and these systems are actively exploited by hostile actors. Cyber threats are constantly changing, and in the near future attackers will have the means to seriously hurt or even kill people. To establish the groundwork for a greater deployment of ML solutions to safeguard present and future systems, this work aims to stimulate significant improvements of ML in the field of cyber security.