Friday, November 25, 2022

From zero to Zanubis

A few weeks ago we stumbled upon a suspicious Android sample in a tweet by @malwrhunterteam that was detected by only four antivirus engines:

Antivirus verdicts didn’t provide specifics about the malware family beyond suggesting it might be either a banking trojan or spyware. A good first step to continue the investigation is therefore to find similar samples that help build a picture of this malware family. We start by checking the “Relations” tab in the VirusTotal report to find other related IOCs (Indicators of Compromise). In this case we can observe a few interesting “Contacted URLs” obtained during sandbox detonation:

Although both URLs are interesting, the one used for socket communication (port 8000) uses Spanish strings in the endpoint (“instalado” meaning installed). By clicking on this URL in VirusTotal, we immediately find four additional samples sharing the networking behavior described above:

These results show that the sample used to start the investigation (first seen 2022-08-27) was not the first one submitted to VirusTotal from the set of samples contacting the suspicious URL. A sample with similar behavior, named “preso.apk”, was submitted two weeks earlier (2022-08-11). The “Network Communications” section in the “Behavior” tab shows how it uses the same URL pattern:

We decided to create a YARA rule to monitor freshly submitted samples that show this same pattern.

Finding patterns for hunting

Writing YARA rules for APK files might be a bit complex. However, we have two powerful tools on our side: the “vt” YARA module and VTDiff.

Using VTDiff 

VTDiff finds common patterns between a set of files and excludes the “too common to be useful” ones. You can also provide an “exclusion list” based on a set of hashes. As a result, we get a shortlist of patterns to consider for our YARA rule. Since an APK is basically a ZIP file, we need to choose which of the bundled files to submit to VTDiff. In this case, “classes4.dex” seems a good candidate based on the number of detections:


In our new VTDiff session we add classes4.dex hashes in the first box, and in the second box (exclusion list), a random DEX file from a clean Android file:
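Since an APK is a ZIP archive, the hashes of its bundled DEX files can be collected with a few lines of Python. The following is a minimal sketch, not part of the original workflow: the helper names `dex_hashes` and `build_demo_apk` and the demo archive contents are made up for illustration.

```python
import hashlib
import io
import zipfile


def dex_hashes(apk_file):
    """Return {entry_name: sha256_hex} for every .dex entry in an APK (ZIP) archive.

    apk_file may be a filesystem path or a binary file-like object.
    """
    hashes = {}
    with zipfile.ZipFile(apk_file) as apk:
        for name in apk.namelist():
            if name.endswith(".dex"):
                hashes[name] = hashlib.sha256(apk.read(name)).hexdigest()
    return hashes


def build_demo_apk():
    """Build a tiny in-memory ZIP standing in for an APK (hypothetical contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        apk.writestr("AndroidManifest.xml", b"<manifest/>")
        apk.writestr("classes.dex", b"dex\n035\x00 demo one")
        apk.writestr("classes4.dex", b"dex\n035\x00 demo four")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, digest in dex_hashes(build_demo_apk()).items():
        print(name, digest)
```

Passing the path of a real APK instead of the demo buffer gives the SHA-256 of each DEX file, which can then be looked up in VirusTotal or pasted into a VTDiff session.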

After VTDiff does its magic, we get several patterns matching most or all of the files submitted:

We can explore how useful these patterns are for hunting by doing a content search (just by clicking on the pattern). This returns how many files in VirusTotal match a pattern:

In this case, the chosen pattern returns 20+ additional dex files. We can find their corresponding APK parents through their relationships.
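To make the idea behind this kind of diffing more tangible, here is a deliberately naive Python sketch: it keeps the byte sequences shared by every sample and drops those that also appear in the exclusion list. This is our own illustration and not VTDiff's actual algorithm, which also weighs prevalence across the whole VirusTotal collection. The function names and sample bytes are hypothetical.

```python
def ngrams(data, size):
    """Return the set of all byte substrings of length `size` in `data`."""
    return {data[i:i + size] for i in range(len(data) - size + 1)}


def common_patterns(samples, excluded, size=8):
    """Byte n-grams present in every sample and absent from every excluded file."""
    if not samples:
        return set()
    shared = ngrams(samples[0], size)
    for sample in samples[1:]:
        shared &= ngrams(sample, size)
    for clean in excluded:
        shared -= ngrams(clean, size)
    return shared


if __name__ == "__main__":
    # Hypothetical stand-ins for two malicious DEX files and one clean file.
    samples = [b"\x01\x02header-A/instalado/x1", b"\x09other-B/instalado/y2"]
    excluded = [b"lib/instala-foo"]
    for pattern in sorted(common_patterns(samples, excluded)):
        print(pattern)
```

Building every n-gram in memory only works for small inputs, but it shows why an exclusion list matters: any sequence that also occurs in the clean file is removed from the candidates.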

VT YARA module

The VT YARA module enhances YARA rules in many ways, including the possibility of including behavioral patterns in the rules’ conditions. This includes matching URLs extracted during sandbox detonation. In this particular case, we will take advantage of two facts to create our rule:
  • Samples connect to an endpoint called instalado.
  • Samples create and delete a temporary backup file with the .xml.bak extension.

With these very simple ideas, we can create the following rule:

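import "vt"

rule Zanubis_Behaviour_VT
{
    meta:
      author = "@entdark_"
      email = ""
      created = "2022.09.1"
    condition:
      // Any HTTP conversation seen in the sandbox whose URL contains "instalado"
      for any http_req in vt.behaviour.http_conversations : (
        http_req.url contains "instalado")
      and
      // Any file deleted during detonation whose path contains ".xml.bak"
      for any deleted_file in vt.behaviour.files_deleted : (
        deleted_file contains ".xml.bak")
      and vt.metadata.file_type == vt.FileType.ANDROID
}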

In case you are not familiar with YARA, the previous rule:
  • Imports the VT module to obtain access to VirusTotal’s metadata and behavioral data.
  • Iterates all the aggregated HTTP conversations in the behavioral sandbox report and finds patterns contained within the URL field.
  • Iterates all aggregated deleted files on detonation, and searches for the .xml.bak substring.
  • Verifies that the file is an Android file.
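For readers more comfortable with Python than YARA, the same three checks can be sketched over a behavior report held in a dictionary. This is only an illustration of the logic: the dictionary layout, the `file_type` key and the example values are assumptions loosely modelled on the field names used in the rule, not an official report format.

```python
def matches_zanubis_behaviour(report):
    """Mirror the rule's three checks over a behaviour report given as a dict."""
    contacts_instalado = any(
        "instalado" in conversation.get("url", "")
        for conversation in report.get("http_conversations", [])
    )
    deletes_backup = any(
        ".xml.bak" in path for path in report.get("files_deleted", [])
    )
    is_android = report.get("file_type") == "ANDROID"
    return contacts_instalado and deletes_backup and is_android


if __name__ == "__main__":
    # Hypothetical report; URL and path are invented for the example.
    demo_report = {
        "file_type": "ANDROID",
        "http_conversations": [{"url": "http://c2.example:8000/instalado"}],
        "files_deleted": ["/data/data/com.example.app/shared_prefs/cfg.xml.bak"],
    }
    print(matches_zanubis_behaviour(demo_report))
```

All three conditions must hold at the same time, which is what the `and` operators in the rule's condition express.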

To test this YARA rule, we can manually rescan the samples used to generate it and check if they trigger the rule. However, it is more interesting to run a RetroHunt job and observe how many additional samples show up from VirusTotal’s collection:

We started from a single sample that allowed us to find 3 more through pivoting. Now RetroHunt provided us with 14 files thanks to our behavior-based YARA rule. That’s great, isn’t it?

Please note that, for the moment, the “vt” YARA module can only be used for LiveHunts (availability for RetroHunts is in late Beta, very soon available for everyone!).

Analysis of this Android malicious family

Now that we have several samples to work with, we can move to a more manual analysis phase to confirm they are all related and identify their common capabilities. In this case, the samples belong to a family called Zanubis, which is a “work in progress” banking trojan. It uses accessibility services to overlay its victims’ apps with its own login screen. So far, it seems to exclusively target financial entities from Peru, and it retrieves data from the victim’s device including contacts, device details and SMS data. All the retrieved data is sent back to the remote Command and Control server via websockets.

For each iteration, Zanubis’ actors have been adding additional functionality, sometimes completing features where placeholders were found in previous versions. For the last iteration, analyzed samples seem to focus on improving their social engineering capabilities, in this case including links to Peru’s government sites.


The process described in this post is the one many researchers follow when finding an interesting sample. One of the first things to do is to find something unique enough to let us take a step back and look for additional samples belonging to the same family. In this particular case, networking indicators based on behavioral analysis helped identify the malicious infrastructure, and from there we were able to find some patterns to continue hunting. With a small set of samples in our hands, we have several ways to quickly create exploratory YARA rules. In this case we used the “vt” module to add behavior to our rule.

There are many additional ways to reach the same results. As always, we are happy to hear any ideas from your side.

Happy hunting!

Thursday, November 17, 2022

Stopping Cobalt Strike with YARA

VirusTotal was born with the idea of community in mind - an ecosystem where everybody contributes and benefits. This helped grow our product around the concept of crowdsourced intelligence, where the whole security community can contribute in different ways to provide more actionable tools for our users, including researchers and analysts, for detection and threat hunting.

Sometimes we have beautiful success stories on how VirusTotal’s users give back to the community what they get from the platform. In this case, our colleagues from Uppercase created a precise set of YARA rules to detect Cobalt Strike components. You can read more about it here.

Unfortunately, Cobalt Strike has become one of the main components in any attacker’s toolset. Albeit a legitimate tool for pentesting, different versions have been leaked in recent years and abused in many different ways. The first step to create a robust set of YARA rules is to have a consistent set of samples, in this case including all the different Cobalt Strike versions we want to be able to detect. VirusTotal was the platform of choice to gather all the samples needed, and thanks to our new Collections, these samples can easily be grouped in a single set - actually, you can find this Cobalt Strike collection here.

Once all the samples are available to the researcher, there is no single way to create the YARA rules. A first approach could be checking Commonalities among the samples, in case we find any interesting characteristic or metadata shared by all of them that we could use for our rule. Below you can see an example of finding commonalities in a collection of suspicious documents with more than 2000 samples:

Another possibility would be using VTDiff to find the particular bytes these samples have in common that, at the same time, have low prevalence in VirusTotal’s collection, which qualifies them as significant for creating a YARA rule. Independently of using VTDiff, checking the prevalence of any byte sequence or string in VirusTotal’s collection with a quick search is always a great idea to understand how useful it would be in your rule. Remember you can combine different byte sequences (using the “content” modifier) in your VTIntelligence search. If your rule is purely based on strings and byte sequences, you can mostly test its effectiveness with a few searches in VirusTotal.

There is a more technical approach consisting of reversing the samples and finding something interesting and unique for the detection, which is what was used in this case. But how to know if your rules are good enough? Usually this is an iterative process where we want the first versions of the rules to be a bit loose so we can find more suspicious samples. This is a way for us to understand if a rule can be used for hunting. Once we are satisfied with the results (we are finding all the Cobalt Strike samples we wanted), we want to make sure we don’t detect anything else (avoid false positives), especially when it comes to legitimate software.

For the process described above, you usually want to use RetroHunts, as they check your rules against the whole VirusTotal collection. When launching a RetroHunt, you can also specify the collection of samples you want your rules to be checked against. There is a goodware collection we can use to make sure our rules don’t detect any of these samples by mistake.
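The bookkeeping behind this iteration can be sketched in a few lines of Python: given the hashes a rule matched, the set of samples we expect it to catch and a goodware set, we count hits, misses and false positives. This is a rough illustration of the reasoning only. The function name, the result keys and the placeholder identifiers are hypothetical, and nothing here talks to VirusTotal.

```python
def evaluate_rule(matched, known_malicious, known_goodware):
    """Summarise a rule's matches against labelled sets of sample hashes."""
    matched = set(matched)
    known_malicious = set(known_malicious)
    known_goodware = set(known_goodware)
    true_positives = matched & known_malicious
    return {
        "true_positives": true_positives,
        "missed": known_malicious - matched,
        "false_positives": matched & known_goodware,
        # Matches we have no label for: candidates for manual review.
        "unlabelled": matched - known_malicious - known_goodware,
        "recall": len(true_positives) / len(known_malicious) if known_malicious else 0.0,
    }


if __name__ == "__main__":
    # Placeholder identifiers standing in for real file hashes.
    summary = evaluate_rule(
        matched={"cs_1", "cs_2", "good_1", "new_1"},
        known_malicious={"cs_1", "cs_2", "cs_3"},
        known_goodware={"good_1", "good_2"},
    )
    for key, value in summary.items():
        print(key, value)
```

A loose first version of a rule would typically show high recall and a large unlabelled set. Later versions aim to keep recall while driving false positives to zero.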

There are different ways to check whether the results obtained from your rules are true positives. VirusTotal usually offers plenty of data points for double-checking, including verdicts, other crowdsourced rules, community comments, presence in other collections, signatures, etc. Unfortunately, sometimes the results from your rules (or VT Intelligence searches) can be huge. In that case we really encourage creating a collection with them and using the Commonalities feature to get a better understanding of how your rule did and to discriminate among your results.

Once we are satisfied with the results, we suggest deploying your rules in LiveHunt for some time, which basically will execute your rule against anything uploaded to VirusTotal from the moment you deploy it. This way you can monitor its effectiveness and do some final polishing if needed. 

And voila! Your rules are ready to be deployed. In this case, the rules can be found here, and now they are part of our set of crowdsourced YARA rules, so everyone in the VirusTotal community will collectively benefit from this effort.

We really appreciate the effort and generosity of our colleagues from Uppercase, and we hope these ideas will help everyone understand a bit more about the creation and deployment of YARA rules. As usual, we are happy to hear from you.

Happy hunting!

Tuesday, November 15, 2022

Deception at scale: How attackers abuse governmental infrastructure

Continuing our initiative of sharing VirusTotal’s visibility to help researchers, security practitioners and the general public better understand the nature of malicious attacks, we are proud to announce our “Deception at scale: How attackers abuse governmental infrastructure” report. Here are some of the main ideas presented there:

  • Governmental domains are among the top categories used by attackers in 2022 to distribute malicious content. 

  • We found dozens of government-related domains hosting many kinds of malware, including trojans, ransomware, phishing, coin miners, banking malware, and lateral movement tools.

  • Although some affected domains seem to be victims of opportunistic attacks, there are indicators that some of them were targeted by sophisticated attackers who abused their infrastructure to deploy their toolsets.

  • Using legitimate government domains for malware hosting can enable an attacker to improve the efficiency of social engineering attacks and avoid defenses and alerts based on deny/allow lists.

  • We also found traces of various webshells hosted in dozens of governmental domains. 

  • More generally, we observed an increase in phishing levels in 2022 along with a large distribution of suspicious PDFs. Recently created XLSX files seem to replace DOCX as the preferred mechanism to distribute malware.

For full details, you can download the report here

In this blog post we will focus on technical hunting and monitoring ideas you can use to prevent such cyberattacks. We also provide additional technical details for some of the most interesting cases covered in the report.

Domain categories abused by attackers 

Domains and URLs processed by VirusTotal are categorized by third-party solutions. This data can typically be found in the “Details” tab:

You can use the following simple query to get domains categorized as, for example, military:

Unfortunately, categories might vary a lot among vendors, as they use different criteria and spelling. Roughly speaking, categories can be grouped in two sets: one related to the business activity of the domain and a second one describing the type of threat detected. This is an important point to keep in mind while working on your own VT Intelligence queries.

We can add additional modifiers to our query. For instance the following query gets URLs (instead of domains) following a particular pattern:

We can add search modifiers for HTML content, metadata, and much more. The following links provide the whole list of search keywords for domains and URLs.

Each entity can be tagged with multiple categories at the same time, so you can combine them in your VT Intelligence searches. Below we provide some examples:

  • Malicious activity on CDN domains:

  • Suspicious phishing/malicious URLs/domains:

  • Suspicious Command and Control login URLs:

  • Suspicious waterhole attacks:

  • Sites distributing cracks for video games:

We did our best to unify the different criteria to understand which categories distribute the most malicious content. We were surprised to see Government among the top results by number of domains:

Government-related infrastructure

We found several interesting representative examples when analyzing suspicious activity on governmental infrastructure. The following infographic shows top TLDs for government-related suspicious domains we found in VirusTotal in 2022. We decided to exclude non-specific TLDs (such as .com, .net, .org, etc) from this list.
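The counting behind such a ranking is simple to reproduce on your own list of domains. The sketch below is an illustration under assumptions: the exclusion set only contains the three TLDs named above (the list used for the report may be longer), and the domains are invented examples.

```python
from collections import Counter

# Only the non-specific TLDs named in the text; the real list may be longer.
GENERIC_TLDS = {"com", "net", "org"}


def top_tlds(domains, excluded=GENERIC_TLDS, limit=10):
    """Count top-level domains, skipping the excluded ones, most common first."""
    counts = Counter()
    for domain in domains:
        tld = domain.strip().lower().rstrip(".").rsplit(".", 1)[-1]
        if tld and tld not in excluded:
            counts[tld] += 1
    return counts.most_common(limit)


if __name__ == "__main__":
    # Invented domains used only to exercise the function.
    demo_domains = [
        "portal.example.gob.pe",
        "salud.example.gob.pe",
        "city.example.go.id",
        "agency.example.com",
        "office.example.gov.bd",
    ]
    print(top_tlds(demo_domains))
```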

We manually double-checked different cases we found interesting, described below.

Malware hosting

We found all kinds of malicious content hosted by governmental domains, including phishing, downloaders and trojans, ransomware, lateral movement tools, cryptominers and bankers. To obtain a first list to start working with, we used queries similar to the ones described before to find potentially compromised governmental infrastructure, followed by additional filtering and manual checking. 

Below we describe a few cases we found interesting.  

  • A sample of the Coper Android banking trojan was hosted on an Indonesian governmental entity website.

  • Malware with keylogger and screenshot capabilities was hosted on a government office website in Bangladesh for around three months, according to telemetry.

  • A Peruvian governmental site hosted a sample of the dangerous njRAT. This particular sample was first seen in November 2021, and the site had already cleaned it up at the time of writing the report. However, we found more samples active in the same timeframe and using the same C2 server.

We also found traces of targeted attacks and lateral movement tools hosted in some victims.

  • Traces of Mimikatz hosted in a subdomain of a (likely compromised) public hospital in Indonesia.

  • A Cobalt Strike sample hosted in a Sri Lankan governmental entity in July 2022 under a non-suspicious name.

As for ransomware, a regional governmental domain in the Philippines was found hosting an AgentTesla sample in mid-2021. Attackers seem to have abused a vulnerability in the CMS to deploy their sample under a URL clearly used for social engineering, greatly increasing its potential to spread.

Suspected webshells

We tried to find (potentially compromised) government-related sites hosting webshells. This is not an easy task, so we played with some ideas. For instance, we searched for common names used by webshells in governmental domains detected as suspicious by antivirus engines. We also combined searching for common content in webshell files with antivirus verdicts. Unfortunately, these queries still provided too many false positives, so we had to manually double check them. Below we detail some interesting findings, all found in governmental infrastructure: 

  • This JPG file embeds PHP code inside the comment section. This is an old, well-known technique that still seems to be effective to avoid antivirus detection.

Most of the URLs distributing this malware ITW listed in the Relations tab have 0 detections, probably because they are legitimate sites that were compromised.
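Checking a JPG for this trick does not require any special tooling. The following Python sketch walks the JPEG segments that precede the image data, collects the comment (COM, marker 0xFFFE) segments and looks for a PHP opening tag. It is a simplified parser written for illustration, and the function names and the demo file are our own assumptions.

```python
import struct

# Markers that carry no length field (TEM and RST0..RST7).
STANDALONE_MARKERS = {0x01} | set(range(0xD0, 0xD8))


def jpeg_comments(data):
    """Return the payloads of COM segments found before the image scan data."""
    if data[:2] != b"\xff\xd8":  # missing SOI marker: not a JPEG
        return []
    comments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # not positioned on a marker; stop rather than guess
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before the real marker
            pos += 1
            continue
        if marker in STANDALONE_MARKERS:
            pos += 2
            continue
        if marker in (0xD9, 0xDA):  # end of image / start of scan
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:
            break  # malformed segment
        if marker == 0xFE:
            comments.append(data[pos + 4:pos + 2 + length])
        pos += 2 + length
    return comments


def has_php_in_comments(data):
    """True if any JPEG comment segment contains a PHP opening tag."""
    return any(b"<?php" in comment.lower() for comment in jpeg_comments(data))


def build_demo_jpeg(comment):
    """Minimal JPEG-like byte string: SOI, one COM segment, EOI (demo only)."""
    segment = b"\xff\xfe" + struct.pack(">H", len(comment) + 2) + comment
    return b"\xff\xd8" + segment + b"\xff\xd9"


if __name__ == "__main__":
    print(has_php_in_comments(build_demo_jpeg(b"<?php echo 'demo'; ?>")))
    print(has_php_in_comments(build_demo_jpeg(b"Created with a camera")))
```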

  • This PHP webshell, presented as a PDF file, was first seen in VirusTotal in 2013. Since then, the malware has been distributed ITW by at least 29 different domains. Interestingly, when crawled by any search engine bot it returns a 404 error to avoid being indexed.

  • PHP webshells camouflage themselves as images using different techniques. In this case, this webshell adds the GIF89GHZ string at the beginning of the file to mislead filetype detection. The ITW list of URLs distributing this sample is pretty impressive. It also shows that most of the time this webshell is hosted as a GIF image. Its deobfuscated code shows a very simple upload functionality.
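A stricter header check is one way to spot this kind of camouflage: a real GIF starts with exactly GIF87a or GIF89a, so a file that only resembles that signature, or that carries a PHP tag, deserves a closer look. The Python sketch below is a heuristic written for this post, with hypothetical function names and demo bytes. It is not how any particular filetype detector works.

```python
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")


def inspect_gif_like(data):
    """Collect simple indicators for a file that presents itself as a GIF."""
    return {
        "claims_gif": data.startswith(b"GIF8"),
        "valid_signature": data[:6] in GIF_SIGNATURES,
        "has_php_tag": b"<?php" in data.lower(),
    }


def is_suspicious_gif(data):
    """Flag GIF look-alikes with a bogus signature or embedded PHP code."""
    info = inspect_gif_like(data)
    return info["claims_gif"] and (not info["valid_signature"] or info["has_php_tag"])


if __name__ == "__main__":
    fake_gif = b"GIF89GHZ<?php /* demo payload */ ?>"  # invented example
    real_header = b"GIF89a\x01\x00\x01\x00\x00\x00\x00;"
    print(is_suspicious_gif(fake_gif))
    print(is_suspicious_gif(real_header))
```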

  • A simple PHP uploader sometimes hosted as "favicon.ico". The content tab for this webshell in VirusTotal shows the password it expects from operators.

When webshell content is available, we can use it to pivot (by clicking) to other files with the same content (likely more webshells!). By doing this we easily find nine additional files which also provide new password strings to keep pivoting. 

We also found a second encrypted webshell (first seen in 2016 and embedding two base64 Perl scripts) where the encryption password can be found in clear text in the content tab, which is always a great resource for pivoting. 

  • This file contains a trojanized version of AuraCMS, an Indonesian Content Management Service. The webshell can be found under the "files/siswa/be.php" path.

Interestingly, the file contains a disclaimer visible in the content tab


Nevertheless, it contains an obfuscated and encrypted block of code, which can be used to pivot to find other similar samples. The deobfuscated content shows this webshell is capable of file system navigation, local command execution, read/write files, download/upload, mysql interface, list processes, etc. It also includes four different ciphered blocks of code. Two of them create ordinary backdoor connections: $port_bind_bd_pl (perl) and $port_bind_bd_c (C code to compile locally). The other two blocks of code implement reverse shells: $back_connect (perl) and $back_connect_c (C).

  • This webshell is based on the WSO Webshell project (currently removed from Github, but there's a copy of the original repository here). Unfortunately, this is not the only WSO-based webshell we found in governmental domains (another one here). The content tab shows a password and email that will help us find other samples. We can use this information for a content-based VTI query:

content:{3d2022323132333266323937613537613561373433383934613065346138303166633322} or content:{406d61696c2827686172645f6c696e7578406d61696c2e7275272c2027726f6f74272c2024746d7029}
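The braces in a content search hold the hex encoding of the bytes to look for, so these terms are easy to build and to read back. The small Python helpers below are our own (the names `to_content_term` and `from_content_term` are hypothetical). They only convert strings to and from the syntax shown in the query above.

```python
PREFIX = "content:{"


def to_content_term(text):
    """Encode a string as a hex content search term."""
    return PREFIX + text.encode("utf-8").hex() + "}"


def from_content_term(term):
    """Decode a hex content search term back into readable text."""
    if not (term.startswith(PREFIX) and term.endswith("}")):
        raise ValueError("not a hex content term: " + term)
    return bytes.fromhex(term[len(PREFIX):-1]).decode("utf-8", errors="replace")


if __name__ == "__main__":
    query = (
        "content:{3d2022323132333266323937613537613561373433383934613065346138303166633322}"
        " or "
        "content:{406d61696c2827686172645f6c696e7578406d61696c2e7275272c2027726f6f74272c2024746d7029}"
    )
    for term in query.split(" or "):
        print(from_content_term(term))
```

Decoding the two terms of the query gives `= "21232f297a57a5a743894a0e4a801fc3"` and `@mail('hard_linux@mail.ru', 'root', $tmp)`, which are the password and email strings mentioned above.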


The compromise of government-related infrastructure poses a potentially major threat given the implicit trust it carries. All the examples above show traces of both opportunistic and targeted attacks. The lack of regular maintenance seems to be a fundamental factor in many of the observed attacks.

We suggest several ideas to minimize the most common risks:
  • Regularly update and maintain government web sites, especially content management systems (CMS), to address vulnerabilities. 

  • Actively monitor government infrastructure for anomalies, such as malware actively communicating with them or subdomains hosting files with malicious verdicts.

  • Regularly scan all hosted files in government infrastructure, especially in subdomains and personal sites. Do not dismiss phishing, as it can be used in social engineering schemes.

  • Assume traffic from trusted domains might be malicious, as the infrastructure can be used to host lateral movement tools or other advanced malicious toolsets.

  • In case of finding anything suspicious, but especially in case of finding webshells or lateral movement tools in the infrastructure, assume compromise and consider a full investigation. 

We hope the examples provided will serve as a heads-up towards better security practices when it comes to sensitive infrastructure. We also hope some ideas presented in this blog post will help defenders implement monitoring for their own infrastructure. As usual, we are happy to hear from you.

Happy hunting!