In 2019, Nick Bostrom, a well-known author and philosopher focused on AI risks, published a research article presenting the Vulnerable World Hypothesis (VWH), which likens the process of human creativity to pulling balls from a giant urn. Bostrom explains that the balls in the urn represent ideas and innovations. White balls represent those considered globally beneficial or benign in nature. Gray balls represent those that are dangerous but require extensive resources, such as building nuclear weapons. And finally there are black balls, which represent innovations that are very dangerous yet require few resources. To further explain the idea of the black ball, Bostrom presents a counterfactual history of what could have happened if atomic bombs had been easy to create: the destruction of civilization. To date, while technological innovation has produced a great deal of beneficial white balls, along with many gray ones that are a mixed bag of good and bad depending on your perspective, we have yet to pull a black ball from the urn. Or have we? Could AI be the black ball that materializes as AI-powered malware?
For decades, the idea of Artificial Intelligence (AI) going rogue and destroying civilization has been a common theme of the sci-fi genre. And while that idea has mostly remained fantasy, the technological advances of the last several years have pushed autonomous AI beyond mere science fiction and into the realm of reality. There is no question that threat actors leverage AI to create and spread malware, but how close are we to a world in which AI-powered malware autonomously invades a network the way Tesla's autonomous driving system navigates a city? This post discusses the potential threat of AI-powered malware and how it differs from AI-enabled attacks.
AI has become a double-edged sword: it can be used to detect and prevent cyberattacks, but it also aids in creating and spreading the malware used in those attacks.
AI-enabled attacks occur when threat actors take advantage of AI as a tool to assist in creating a piece of malware or in conducting an attack. AI-enabled attacks are already seen in the wild and include techniques such as deepfakes, data poisoning, and AI-assisted reverse engineering. More recently, advanced conversational chatbots like ChatGPT, which use Large Language Models (LLMs) for Natural Language Understanding (NLU), are enhancing the potential to automate and maximize the effectiveness of AI-enabled attacks. For example, an attacker might use a chatbot to craft more convincing phishing messages that lack the most obvious red flags, such as errors in grammar, syntax, or spelling, that would otherwise quickly give them away.
When it comes to ChatGPT in particular, its ability to write code highlights the growing potential threat of AI-powered malware. Earlier this month, a security researcher at Forcepoint showcased a zero-day virus with undetectable exfiltration built using only ChatGPT prompts. While ChatGPT has demonstrated an ability to write functions, it remains weak on production-style error checking and prevention. At present, ChatGPT also seems to lack the sort of antagonistic reasoning a malware writer needs: What will my adversary do to prevent this next action, and how can I thwart their efforts while advancing my goals?
As AI and Machine Learning technology continue to advance, there is growing concern that we will see more sophisticated, AI-powered malware that can think for itself and adapt to its environment.
AI-powered malware would be autonomous in nature and trained to adapt to its environment so as to avoid detection, essentially designed to 'think' for itself and then act accordingly without human intervention. While there have been multiple instances demonstrating the potential for AI-powered malware, it has not yet progressed beyond the proof-of-concept stage or been seen in the wild.
ChaosGPT gives a great example of what AI-powered malware could look like. If you haven't read about ChaosGPT, it is an autonomous agent, created as an experiment using the Auto-GPT open-source autonomous AI project, that runs on the same GPT models behind ChatGPT. When given the command to 'destroy humanity', ChaosGPT reportedly acted as instructed: it accessed the internet, researched global destruction, recruited another GPT AI tool for assistance, and then tweeted its plan to destroy humanity.
AI-Powered Malware: Potential vs Current Reality
While the ChaosGPT experiment does highlight a number of concerning capabilities, like manipulating other AI tools and communicating with the outside world via social media, it's important to keep in mind that, at present, the AI tools that would be used to create fully autonomous malware are still relatively unsophisticated and have significant limitations. Additionally, despite recent headlines that allude to the near-term destruction of humanity by some form of malicious superhuman intelligence, ChaosGPT or otherwise, it is useful to note what AI is actually good at today: coping with fuzzy data and ambiguous situations.
When looking at how AI is currently being used, certain areas get a bigger boost than others. To compromise a target, there are three basic phases: get past the defenses, interact with systems (install software, get data, encrypt data for ransom, etc.), and get the data out. Getting past an organization's frontline defenses typically comes in two flavors: bypassing security tools or tricking a person into giving you access. In the first case, deterministic programming is the tool of choice. In the second, threat actors utilize AI-enabled tools to maximize the effectiveness of their attacks.

For example, deepfakes weaponize AI by enabling attackers to impersonate high-profile targets in videos or phone calls, which can then be used in highly targeted social engineering and phishing attacks, or to spread disinformation. A fairly common example is an attacker using a deepfake to impersonate a person's coworker or boss over a video or voice call. This is exactly what happened in 2019, when the CEO of a U.K.-based energy company complied with an urgent phone request from his supposed boss to immediately transfer funds to one of the company's suppliers. Other examples of deepfakes are plentiful on social media platforms like YouTube, where there are always 'live streamed' presentations of Elon Musk pitching cryptocurrency, even though they're actually old presentations that scammers have edited.
There are other areas, such as steganography, where attackers leverage AI with malicious intent. Steganography is the practice of hiding data within other data in a way that is difficult to detect, and attackers can use it to infiltrate or exfiltrate data by making it look like something else. For example, critical documents can be smuggled out as noise in an image, or an entire database can be exfiltrated over time as DNS queries. While steganography itself isn't new, AI techniques can be used to produce unique, invisible steganographic encodings in the same way they can produce photorealistic images from noise. The tools and technology used in these AI-enabled attacks make it very difficult for security tools to differentiate between legitimate content and malicious content.
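To make the image-noise idea more concrete, here is a minimal, purely illustrative sketch of classic least-significant-bit (LSB) steganography, the textbook, non-AI form of the technique. The function names, the random stand-in carrier image, and the NumPy-based approach are assumptions chosen for illustration; they are not drawn from any specific tool or attack described above.

```python
# Toy LSB (least-significant-bit) steganography sketch.
# A short text string is hidden in the lowest bit of each pixel of an 8-bit
# grayscale image array, where a 1-bit change is indistinguishable from noise.
import numpy as np

def embed(image: np.ndarray, message: str) -> np.ndarray:
    """Hide `message` in the least significant bit of the leading pixels."""
    data = message.encode("utf-8") + b"\x00"          # null byte marks the end
    bits = np.unpackbits(np.frombuffer(data, dtype=np.uint8))
    flat = image.flatten()                             # flatten() returns a copy
    if bits.size > flat.size:
        raise ValueError("message too long for this carrier image")
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits  # overwrite only bit 0
    return flat.reshape(image.shape)

def extract(image: np.ndarray) -> str:
    """Recover the hidden message by reading each pixel's lowest bit."""
    bits = image.flatten() & 1
    data = np.packbits(bits).tobytes()
    return data.split(b"\x00", 1)[0].decode("utf-8")

carrier = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)  # stand-in image
stego = embed(carrier, "hidden payload")
assert extract(stego) == "hidden payload"
# Each pixel changes by at most 1 gray level, so the carrier looks untouched.
print(np.abs(stego.astype(int) - carrier.astype(int)).max())
```

Even this naive version is hard to spot by eye because the statistical footprint is tiny; AI-generated encodings push that same property much further, which is what makes the defender's job so difficult.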
For now, most of the current risk comes from AI-enabled attacks and generative AI (faked audio clips, videos, images, etc.), not autonomous AI-powered malware. And while there is always the potential for that reality to shift in the near term, factors like payload size and computing power present significant hurdles, one of the biggest being simple cost. It's far more economical in terms of memory, CPU, and transfer rates to create a lot of 'if-then' style code, a style often referred to as GOFAI (Good Old-Fashioned AI), than to embed an artificial brain that makes attack choices on the fly.
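As a rough, hypothetical illustration of that cost argument (the names and numbers below are invented for the comparison, not measurements of any real malware or model), compare the footprint of a hard-coded rule table with even a small neural network:

```python
# Back-of-the-envelope comparison: GOFAI-style rules vs. an on-board model.
import sys

# GOFAI style: behavior is a small, fixed mapping of condition -> action.
RULES = {
    "condition_a": "action_1",
    "condition_b": "action_2",
    "default": "action_3",
}

def gofai_decide(observation: str) -> str:
    # Deterministic if-then logic: the same input always yields the same output.
    return RULES.get(observation, RULES["default"])

# Model style: even a tiny multilayer perceptron carries millions of parameters
# that would have to ship with the payload and run at inference time.
input_dim, hidden_units, hidden_layers = 512, 1024, 4
params = input_dim * hidden_units + hidden_units * hidden_units * (hidden_layers - 1)
model_bytes = params * 4  # stored as 32-bit floats

print(f"rule table : ~{sys.getsizeof(RULES)} bytes")           # a couple hundred bytes
print(f"tiny model : ~{model_bytes / 1e6:.1f} MB of weights")   # ~14.7 MB before any runtime
```

A few hundred bytes of deterministic rules versus tens of megabytes of weights, plus the runtime to execute them, is exactly the economics that keeps GOFAI-style code attractive for now.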
For the time being, AI could arguably be considered one of those gray balls from the urn of creativity: AI-enabled attacks still require quite a bit of human intervention and guidance, keeping the reality of autonomous malware at bay. However, AI experts believe the next 5 years will be explosive in terms of AI cognition, so even 6 months from now the reality may be quite different. As much benefit as we may derive from advances in AI and Machine Learning for cybersecurity, threat actors are certain to leverage those same advances for their own gain.