EDF-2021-CYBER-R-CDAI: Improving cyber defence and incident management with artificial intelligence
The Commission will pay particular attention to the other R&D and dual-use on-going initiatives at Union level to avoid unnecessary duplication.
The ability to detect and respond to security incidents suffers from several challenges, including: the ever-increasing amount of data that needs to be analysed in order to detect and fully understand security incidents; the number of false alarms generated, resulting in, for instance, erroneous prioritisation and alarm fatigue amongst operators and analysts; lack of (human) resources to sufficiently analyse all potentially malicious activity; the decreasing effectiveness of traditional defence measures based on a known set of rules (e.g. a priori known signatures and/or network traffic profiles) due to the increase of encrypted network traffic and their inadequacy against advanced persistent threats and zero-day attacks (including malware that exploits unknown vulnerabilities, targeted phishing attacks, low-rate data exfiltration, abnormal user behaviour, etc.); choosing appropriate measures in response to attacks in a timely manner, when the scope is uncertain and the situation develops faster than a human being may follow without advanced decision-making support, and while the compromise may already have extended, or may yet extend, over weeks, months or years.
The use of Artificial Intelligence (AI) seems promising in order to address many of these challenges – and AI has recently shown great results in areas such as playing strategic games and analysing text.
This call seeks proposals that help increase the level of automation in incident management and cyber defence activities through the use of AI. In this setting, state-of-the-art AI methods should be used to automate incident management and cyber defence activities, including incident detection and response, carried out by security operation centres (SOCs) and cyber defence teams (or similar entities) when they detect and analyse events and determine what actions to take.
Modern SOCs are sometimes equipped with security orchestration, automation and response (SOAR) capabilities that allow human operators to respond to attacks with predefined “playbooks” designed to mitigate ongoing attacks, e.g. by disabling user accounts or reconfiguring firewalls, an area where AI-based solutions seem applicable.
AI can, for instance, be used to complement rule-based detection methods (e.g. through deep learning), to enhance alarms from detection systems using threat intelligence feeds, extract actionable intelligence from the enormous amount of monitoring data and events, correlate alarms with other information to identify attack patterns, automatically respond to events based on the analysis, and recommend actions to human operators. Recent studies report that more than two thirds of the organisations surveyed acknowledge that they are not able to respond to critical threats without AI.
Creating an AI-based solution that automates larger parts of incident management and cyber defence processes involves several technical challenges. These include (among others): the selection and pre-processing of appropriate data sources; creating and applying models and techniques for analysing the output of sensors to assess if an attack may be happening, including selection and tuning of algorithms and parameters; mapping ongoing attacks to known threats (e.g. using threat intelligence); assessing if the consequences of implementing a particular response outweigh the risks associated with not doing so; and creation/selection of appropriate datasets for training and testing the models. Moreover, transparency and configurability are key requirements, especially in the military domain, and both are lacking in most commercial Security Information and Event Management platforms (SIEMs).
While AI-based solutions may be required to address these challenges in this ecosystem, incorporating AI also introduces a new set of technical and non-technical challenges. Questions addressing technical challenges include: At which point should an alarm be raised and when should it be escalated? How can AI reason about trust with respect to the use of external information (e.g. threat intelligence)? How can the possible consequences of a cyber-attack, and the consequences of implementing different mitigating means, be assessed before and during response, whether in real time, in near real time/the short term, or as part of a medium- or long-term defensive strategy? Many incidents will (at some level) require human interaction and human decision-making – how does an AI-based system communicate the results and the underlying explanation and reasoning leading to the result? Non-technical challenges include: Which decision rights and processes can, and should, be delegated to an AI-based system and which should remain manual? How should AI be utilised at strategic, tactical and more technical levels? What is the difference between communication at a technical, tactical, operational and strategic/political level? How can humans work together with AI-based systems at the different levels? Military systems increasingly use, and depend on, the private sector and civilian infrastructure – how does incident response differ between military and civilian sectors and what are the challenges in a combined military/civilian setting? What are the implications for AI?
Addressing the identified challenges will require inter- and multidisciplinary approaches, where teams conduct work of both a technical and a non-technical nature. Analysis of technical, tactical, operational, strategic and political considerations is required. On a technical level, proposals should provide proof-of-concept solutions for AI-based incident management and cyber defence, including detection, mitigation and response. Capable intrusion detection systems (IDS) could form a starting point for proposals. However, proposals must not seek to further the analysis capabilities of IDS alone, but in the context of an automated or semi-automated system for handling incidents.
In addition to purely technical solutions, processes and actors of selected enterprises may need to be mapped, modelled and understood to ensure fit-for-purpose solutions and answer questions of a more conceptual nature. Proposals are further expected to consider the interaction between human operators, analysts and decision makers and the automated or semi-automated incident management and response system.
A suitable methodology for building contextual understanding is expected through case studies of selected processes, incidents and cyber-attacks of selected enterprises, and case studies of successful detection approaches and resilience-oriented success stories where technical and non-technical challenges can be studied and addressed at different levels. For the development of technical proof-of-concept prototypes, an appropriate development approach, which includes user and stakeholder involvement, should be leveraged.
The proposals must cover the following activities as referred to in Article 10.3 of the EDF Regulation:
Activities aiming to create, underpin and improve knowledge, products and technologies, including disruptive technologies, which can achieve significant effects in the area of defence;
Activities aiming to increase interoperability and resilience, including secured production and exchange of data, to master critical defence technologies, to strengthen the security of supply or to enable the effective exploitation of results for defence products and technologies;
Studies, such as feasibility studies to explore the feasibility of new or improved technologies, products, processes, services and solutions;
Design of defence products, tangible or intangible component or technology as well as the definition of the technical specifications on which such design has been developed which may include partial tests for risk reduction in an industrial or representative environment.
All proposed activities ultimately support the creation of fit-for-purpose proof-of-concept prototypes of an automated or semi-automated incident management and cyber defence system, for select phases in the incident management cycle including detection and response. The prototypes may support human operators, analysts and decision-makers at all levels (technical, tactical, operational, strategic and political) and are expected to contribute to enhanced cyber situational awareness, increased military infrastructure resilience and improved protection against advanced cyber threats.
The activities are sorted into three types of tasks:
1. Enhancing contextual knowledge of the enterprises, processes and decision-making where AI should be utilised
2. Developing AI-based techniques supporting specific human operator/analyst tasks
3. Exploring and developing AI as a decision-maker given limited authority in incident management and cyber defence
Feasibility studies drawing upon real-world scenarios should be utilised to ensure that developed proofs of concept and techniques are fit-for-purpose. It may also be necessary to create reference systems and appropriate test cases to generate training data and evaluate the efficacy of different solutions, both with and without human operators interacting with the system.
The proposals must include at least one activity from task 1 and one activity from task 2 and must be coherent with the defined scope as described above.
1. Enhancing contextual knowledge of the enterprises, processes and decision-making where AI should be utilised
1.1. Knowledge building through analysis of real-life situations, use cases and incidents, in order to sufficiently understand and model the enterprise processes and decision-making processes that AI-based incident management and cyber defence systems will interact with. This includes understanding the relevant actors, their enterprises and business/missions, and the threat environment they operate in.
Proposals may include processes and work flows involving human operators, analysts and decision makers at all levels. Proposals may also cover elements such as information requirements, dealing with uncertainty, strategic objectives, mission objectives, the role of ICT, risk analysis, risk appetite, incident and crisis communication to different stakeholders etc.
1.2. Exploring the boundaries for AI-based autonomous or semi-autonomous response. The playbooks of many SOCs and similar entities describe how to respond to given attacks. However, in-depth understanding of the broader context is necessary in order to avoid inappropriate measures. To that end, at least the following questions may be addressed: Can such playbooks be automated and can AI reasoning capabilities automate such intuition, experience and contextual understanding? Given that most machine learning and deep learning algorithms require a vast amount of data to learn from, which will not be available for incident response, can one-shot/few-shot learning (e.g. human-style learning) be utilised in this setting to learn how operators respond to incidents? Can symbolic approaches work in conjunction with machine learning (e.g. neuro-symbolic AI) to automate playbooks?
2. Developing AI-based techniques supporting specific human operator/analyst tasks
2.1. Creation of AI-based techniques for detecting and understanding adversarial activity. This may include analysing and triaging alarms, conducting forensics, utilising external information with varying levels of trust (e.g. threat intelligence), leveraging behavioural analytics, performing kill-chain detection and analysis, assessing potential attacker intentions, monitoring applications and communication activities, analysing malware, etc.
The techniques may be intended for both real-time and non-real-time detection and analysis, involve multi-disciplinary approaches, use data from endpoints, networks and the cloud, and leverage distributed computing and data processing for real-time scalability.
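As an illustration of alarm triage using external information with varying levels of trust, the following is a minimal sketch only. The class names, the `source_trust` field and the linear weighting are assumptions made for the example and are not prescribed by this call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    alarm_id: str
    indicator: str   # e.g. an IP address or file hash reported by a sensor
    severity: float  # the detector's own score in [0, 1]


@dataclass(frozen=True)
class IntelEntry:
    indicator: str
    source_trust: float  # hypothetical trust placed in the feed, in [0, 1]


def triage(alarms, intel, intel_weight=0.5):
    """Return alarms sorted by priority, highest first."""
    # Keep the most trusted report for each indicator.
    best_trust = {}
    for entry in intel:
        current = best_trust.get(entry.indicator, 0.0)
        best_trust[entry.indicator] = max(current, entry.source_trust)

    def priority(alarm):
        # Illustrative linear blend of detector severity and intel trust.
        trust = best_trust.get(alarm.indicator, 0.0)
        return (1 - intel_weight) * alarm.severity + intel_weight * trust

    return sorted(alarms, key=priority, reverse=True)
```

In this sketch an alarm with a modest detector score can outrank a higher-scoring one when a trusted feed corroborates its indicator; a real system would need a far richer trust model.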
2.2. Creation of AI-based techniques for building knowledge about own protected ICT systems (e.g., a “cyber record” with current and historical information). This must include collecting, linking and fusing different kinds of information about the system hardware, software, and the relationship between them. Information that may be collected is, for instance, architecture and configuration data, hardware location and specifications, installed applications, network information, services, protocols in use, connected peripheral devices etc. A variety of sources may be leveraged for acquiring information, including hardware and software configuration management systems, documentation, vulnerability scanners, SIEMs, asset discovery tools, etc., and techniques such as reverse engineering may be utilised.
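The linking and fusing of information into a “cyber record” could, under simple assumptions, look like the sketch below. The record layout (`asset_id`, `source`, `observed_at`, `attributes`) is hypothetical, and observations are assumed to carry ISO 8601 timestamps so that they sort chronologically as strings.

```python
def fuse(records):
    """Fuse observations from several sources into one record per asset.

    Each input record is a dict with the hypothetical keys
    'asset_id', 'source', 'observed_at' (ISO 8601 string) and 'attributes'.
    """
    cyber_record = {}
    for rec in sorted(records, key=lambda r: r["observed_at"]):
        entry = cyber_record.setdefault(
            rec["asset_id"], {"current": {}, "history": []}
        )
        for key, value in rec["attributes"].items():
            # Record only actual changes, so the history stays readable.
            if entry["current"].get(key) != value:
                entry["history"].append(
                    (rec["observed_at"], rec["source"], key, value)
                )
                entry["current"][key] = value
    return cyber_record
```

The later observation wins for the current view, while the history keeps which source reported each change and when.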
2.3. Creation of AI-based techniques for analysing enterprise systems to appraise the value of assets and the potential consequences of different responses (e.g. configuration changes). This must include both static values manually assigned or derived from fixed factors, and dynamic values that must be seen in relation to ongoing and changing business operation or military missions.
2.4. Creation of explainable AI-based techniques. Many of the most promising machine learning systems are not considered to be “intelligible” or “explainable”, which has resulted in a sub-field coined explainable AI (XAI). This task should address how XAI can be utilised to explain detection, analysis and responses at different levels to different actors. Can, for instance, machine learning (e.g. deep learning) be combined with more traditional symbolic AI to make the analysis transparent?
3. Exploring and developing AI as a decision-enabler given limited authority in incident management and cyber defence
3.1. Creation of AI-based information collection and storage systems that dynamically adapt their collection and storage strategy to the situation as continuously analysed and perceived by the system. This includes what is collected, where it is collected and the granularity (e.g. increasing the level of detail of collected information, such as full packet capture, after an initial compromise is detected). As it is not feasible to collect everything and everywhere, such dynamic big data analytics and data lake systems could help address the issue of insufficient data due to limited data collection.
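The idea of adapting collection granularity to the perceived situation can be sketched as a simple policy. The levels, thresholds and the `suspicion` score below are illustrative assumptions; in a real system the suspicion value would come from the system's own continuous analysis.

```python
from enum import IntEnum


class Granularity(IntEnum):
    FLOW_METADATA = 1
    EXTENDED_LOGS = 2
    FULL_PACKET_CAPTURE = 3


def choose_granularity(suspicion, storage_free_fraction,
                       escalate_at=0.5, capture_at=0.8, min_free=0.1):
    """Map the perceived situation to a collection level for one segment.

    suspicion: hypothetical score in [0, 1] that a compromise is ongoing.
    storage_free_fraction: share of storage still available, in [0, 1].
    """
    # Full packet capture only when suspicion is high and storage allows it.
    if suspicion >= capture_at and storage_free_fraction > min_free:
        return Granularity.FULL_PACKET_CAPTURE
    if suspicion >= escalate_at:
        return Granularity.EXTENDED_LOGS
    return Granularity.FLOW_METADATA
```

For example, `choose_granularity(0.9, 0.05)` falls back to extended logs because storage is nearly exhausted, which shows why the storage strategy has to be part of the same decision.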
3.2. Creation of AI-based decision systems which are risk and impact aware. They should be able to analyse and understand the impact of security incidents on desired mission performance, identify associated risks, generate different response options to maintain requisite cyber resilience and mission assurance, and potentially select and execute a response option if permitted. The analysis of impact, risks, different response options and potential execution should be explainable.
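A risk and impact aware decision step might, in its simplest form, compare the expected loss of each response option and act only within its delegated authority. This is a sketch under stated assumptions: the additive loss model, the field names and the `autonomous_allowed` flag are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseOption:
    name: str
    residual_attack_impact: float  # expected mission impact remaining after the response
    response_cost: float           # mission impact caused by the response itself
    autonomous_allowed: bool       # whether execution without a human is permitted


def expected_loss(option):
    # Assumed additive model: what the attack still costs plus what the response costs.
    return option.residual_attack_impact + option.response_cost


def decide(options):
    """Return (best_option, action, explanation) for a list of options."""
    ranked = sorted(options, key=expected_loss)
    best = ranked[0]
    # Execute only if permitted; otherwise recommend to a human operator.
    action = "execute" if best.autonomous_allowed else "recommend"
    explanation = [f"{o.name}: expected loss {expected_loss(o):.2f}" for o in ranked]
    return best, action, explanation
```

The returned explanation lists every option with its estimated loss, a very basic form of the explainability asked for above.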
Personnel development is one of the key requirements for effective cyber defence. Extensive trainings and exercises constitute the best means to enhance and validate the skills of the cyber defence workforce. For this, Member States have invested in establishing cyber ranges that provide controlled artificial environments where, among others, malicious activities can be simulated without negative impact on live systems in an organization. However, the existing cyber ranges can be developed further to achieve their full personnel development potential. In turn, this would support cyber operators in improving their skillset and help military commanders understand cyber as a cross-domain challenge. This includes addressing threats and opportunities arising from emerging disruptive technologies.
The proposals are expected to address the following challenges:
First, the maturity level of user simulation running on cyber ranges is low. The user simulation is often limited to traffic generators and tools for testing user interfaces for a well-defined purpose. User simulation that leaves a footprint and produces logs indistinguishable from those of real human users is needed for providing more meaningful, realistic and life-like scenarios for exercises and training sessions. Additionally, scenarios that rely on the actions of simulated users, e.g. because of phishing emails, require a solution that gives the training or exercise organizer control over the user simulation to make sure simulated users act in accordance with the scenario.
Second, cyber ranges lack capabilities to assess and reflect the decision-making process of cyber operators during an exercise or training. Current systems fail to provide insight into the particular actions cyber operators perform to achieve the objectives. This leaves questions unanswered, such as which tools and commands were used, which options were selected in the graphical interface, who was communicated with (both online and in the war-room), and what was searched online (and whether that was useful). As a result, systematic monitoring and assessment of skill gaps is not possible, especially for more complex exercises. Automated performance assessment and analysis of the participants would, however, allow the training and exercise instructors to monitor performance in more detail at either individual or team level.
Third, scenarios involving user simulation and systems enabling analysis and assessment of the decision-making process of cyber operators should be accessible and interoperable across different cyber ranges. This can be enabled through a scenario development language. Besides potential cost-efficiency, sharing scenarios in this way allows them to be improved and upgraded over time based on feedback from many users. Hence, the simulated users, the scoring system and the analysis of the operators’ performance can be enhanced accordingly, and training and exercises continuously improved.
Fourth, so far, many cyber ranges focus only on one domain and its functionalities, but the impact of cyber-attacks must be considered as a cross-domain challenge. Therefore, a multi-domain cyber range simulation must support and simulate land, air, sea and space domains. This includes, for example, military systems (e.g. battle management systems), radio and operational technology. Especially the integration of the electromagnetic spectrum (EMS) and the common understanding of cyber and EMS should be a key factor of future cyber ranges. The challenge is to support a highly realistic simulation of multiple domains, the interconnection of systems and to assess the impact of the inter-dependencies between those systems. Such simulations would support the training and evaluation of multi-domain common operating pictures and its operations as well as development and testing of new military approaches and doctrines to cyber/EMS threats.
Last, conducting large-scale cyber exercises or simulating real-life modern ICT environments requires a unique and complex set of capabilities and infrastructure (such as specialized hardware to simulate cellular networks, industrial controllers and other parts of critical infrastructure). The most practical way to create such complex environments is through cooperation among Member States and federating cyber range infrastructure and exercise content. Such an approach requires development of common standards, protocols, and software solutions to allow federated scoring and situational awareness throughout the federated environment.
The objective of this topic is to create a toolset that allows significantly increased efficiency in the cyber trainings and exercises process while also enhancing cyber range interoperability and cost-efficiency, taking into account the challenges described above.
The aim is to develop technological demonstrator modules that can be easily configured and interfaced with existing systems used to conduct cyber trainings and exercises. Integration of the technologies must be demonstrated within TRL 4-8, but the specific TRL may vary depending on the work package.
Development of agents capable of using common software applications in a similar manner to human users. Actions in scope for the user simulation are benign uses of common software applications (e.g. word processors, web browsers, file management applications, and email clients). The solution is able to replay sequences of such actions obtained from the automated performance analysis of human operators using the systems, and allows custom behaviours to be crafted from sequences of such actions. Furthermore, the solution also allows custom modifications of action sequences obtained from the automated performance analysis, e.g. to modify a sequence of actions to create an alternative scenario. Simulated users can be used to, among other things, simulate social engineering incidents (e.g. phishing), the generation of system logs, and the generation of network logs. The footprint of the simulated users is indistinguishable from that of real users in logs and must be visible in graphical user interfaces.
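The replay and modification of recorded action sequences can be illustrated with a small sketch. The `Action` fields and the `perform` hook, which stands in for whatever mechanism actually drives the application, are hypothetical; the log format is likewise only an example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Action:
    offset_s: float   # seconds since the start of the recorded session
    application: str  # e.g. "email_client"
    operation: str    # e.g. "open_message"
    target: str


def replay(actions, user, perform):
    """Run actions in time order through perform() and return log lines."""
    log = []
    for action in sorted(actions, key=lambda a: a.offset_s):
        perform(user, action)  # hypothetical hook that drives the real application
        log.append(
            f"t+{action.offset_s:.1f}s user={user} app={action.application} "
            f"op={action.operation} target={action.target}"
        )
    return log


def modify(actions, index, **changes):
    """Return a copy of the sequence with one action altered."""
    edited = list(actions)
    edited[index] = replace(edited[index], **changes)
    return edited
```

A recorded sequence in which a user deletes a suspicious message could, for instance, be turned into an alternative phishing scenario by changing that one step to a click on the link, leaving the original recording untouched.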
Automated performance analysis
Development of a system capable of collecting data about the cyber operators’ activities during trainings and exercises, and automatically analysing tasks of low to medium complexity while also providing supporting data and insight to evaluate advanced situations which are often difficult to fully assess through automated techniques. This should be based on combining already well-researched and documented methods (e.g. application programming interfaces (APIs)) with big data analysis, image analysis and natural language processing to collect and analyse cyber operators’ behaviour during a training or an exercise.
Scenario development language
A scenario development language enables scenario sharing with other cyber ranges and ensures interoperability between different cyber ranges. With interoperable cyber ranges, more common activities (training, exercises etc.) can be held and more complex scenarios can be implemented, raising the overall preparedness of cyber operators utilizing the capabilities. The language itself should be described based on research of existing cloud and virtualization topologies and should be extended with specific components (simulation, scoring, federation etc. attributes) that are needed in the cyber range environments.
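To make the idea of a topology description extended with simulation, scoring and federation attributes more concrete, the following sketch models a scenario as plain data that can be validated and exported as JSON. All type names, fields and the JSON layout are assumptions for illustration; the actual language is to be defined by the proposals.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Host:
    name: str
    image: str      # hypothetical name of a virtual machine image
    networks: list


@dataclass
class SimulatedUser:
    name: str
    host: str       # must refer to a Host.name in the same scenario
    behaviour: str  # hypothetical behaviour profile identifier


@dataclass
class ScoringRule:
    objective: str
    points: int


@dataclass
class Scenario:
    title: str
    hosts: list = field(default_factory=list)
    simulated_users: list = field(default_factory=list)
    scoring: list = field(default_factory=list)
    federation: dict = field(default_factory=dict)


def validate(scenario):
    """Return a list of problems; an empty list means the references resolve."""
    host_names = {h.name for h in scenario.hosts}
    return [f"user {u.name} refers to unknown host {u.host}"
            for u in scenario.simulated_users if u.host not in host_names]


def export(scenario):
    """Serialise the scenario to JSON so another cyber range could import it."""
    return json.dumps(asdict(scenario), sort_keys=True)
```

Keeping the scenario as declarative data, rather than range-specific scripts, is one way such a description could be shared between ranges with different back ends.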
Development of enhanced multi-domain cyber range simulations for at least 2 domains (e.g., land, sea and/or space) and the standards and interfaces to interconnect relevant systems and environments (e.g., battle management systems, EMS or other systems) in order to allow simulation of realistic joint cross-domain scenarios and situational awareness.
Situational awareness and scoring
The activities include developing standards and protocols for a federated scoring system and for the exchange of situational awareness information, including federated cyber range operation.
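As a rough illustration of what a federated scoring exchange has to handle, the sketch below merges score reports from several ranges into one scoreboard. The report structure (`range_id`, `events`, `event_id`, `team`, `points`) is a hypothetical message format, not an existing standard.

```python
from collections import defaultdict


def merge_scores(reports):
    """Combine per-range score reports into one federated scoreboard.

    Each report is a dict with a 'range_id' and a list of 'events';
    each event has an 'event_id', a 'team' and 'points'.
    """
    totals = defaultdict(int)
    seen = set()
    for report in reports:
        for event in report["events"]:
            key = (report["range_id"], event["event_id"])
            if key in seen:  # ignore events that a range has re-sent
                continue
            seen.add(key)
            totals[event["team"]] += event["points"]
    # Highest score first; ties broken by team name for a stable order.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))
```

Even this small example shows why common identifiers for ranges, events and teams would need to be agreed before scores can be combined reliably.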
The proposals must cover the following activities as referred to in Article 10.3 of the EDF Regulation, not excluding upstream or downstream activities eligible for development actions if deemed useful to reach the objectives:
Studies, such as feasibility studies to explore the feasibility of new or improved technologies, products, processes, services and solutions;
The design of a defence product, tangible or intangible component or technology as well as the definition of the technical specifications on which such design has been developed which may include partial tests for risk reduction in an industrial or representative environment;
The development of a model of a defence product, tangible or intangible component or technology, which can demonstrate the element’s performance in an operational environment (system prototype);
The testing of a product, tangible or intangible component or technology.
The proposals must address in particular the following:
Study the methods and technologies to develop a simulated user;
Develop a prototype of a simulated user capable of using common software applications while producing realistic logs and footprints on the machines;
Develop a method for converting recorded behavioural data of cyber operators into a form that can be used in the user simulation;
Proof of concept testing and validation of the proposed toolset to demonstrate that the user simulation produces a footprint similar to that of normal users when it is applied.
Automated performance analysis
Study the methods and technologies that provide information about the decision-making process during a training or exercise (i.e. consoles, graphical interfaces (including videos from the environment), network traffic, audio between team members);
Study the technologies to gather, store and process information produced during a training or exercise;
Design a methodology to correlate cyber-attack data with collected cyber operators’ behavioural data, and ways to improve the individual and collective performance of cyber operators. The methodology should also provide a feedback loop for exercise designers to improve the learning effect of the exercises;
Design methods to ensure the integrity of the automated analysis process and/or consider classification of military environments;
Develop a prototype for the automated analysis that includes at least logging, data parsing and performance evaluation functionalities;
Proof of concept testing and validation of the proposed toolset.
Scenario development language
Study existing cloud and virtualization topologies, including defining a data format, in order to define a common scenario language that can be used by different cyber ranges;
Develop an extendable scenario development language in coherence with the other capabilities described in this call;
Proof of concept testing and validation of the scenario development language on an existing cyber range.
Develop standards and interfaces for interconnecting multi-domain cyber ranges (e.g., land, air, sea and space) for cyber trainings and exercises;
Develop multi-domain scenarios for capacity building;
Proof of concept testing and validation of the proposed toolset.
Situational awareness and scoring
Study the existing technology solutions used by Member States for cyber ranges’ situational awareness and implement the solution in a federated environment;
Design common standards for cyber ranges’ situational awareness.