Monday, December 19, 2022

VT Intelligence Cheat Sheet

Monday, December 19, 2022 Anonymous

Many of you asked for this, and today we are happy to announce the release of our VTI Cheat Sheet with hints and examples on the most useful VT Intelligence queries and modifiers. Instead of providing a list of already documented search modifiers, we created something more specific and close to the real life cases, such as searching for files signed with leaked Nvidia certificates or recent samples from collections attributed to CozyBear.

You can find the PDF version of the Cheat Sheet here. In this post we are providing some of the clickable examples with additional explanations.

Entities

One of the basics of VT Intelligence is using the “entity” search keyword to directly specify the type of output you want to get. There are specific modifiers for every entity, here you can find direct links to documentation for file, URL, IP and Domain (Collection will be available soon, stay tuned). Here there are quick examples for each of them:

Specific group activities

There are a number of different ways to explore the latest footprints of certain threat actor in VT Intelligence. 
In case you don’t have any other inputs except the campaign or malware family name, you can leverage AV verdict VTI search:
If you want to search for a verdict from a certain vendor, you can specify it explicitly:
Instead of getting file hashes as your search output, you can list all collections related to a specific actor/campaign:


You can also search within a specific collection, which is very handy when dealing with collections containing a large number of entities:
You can get the collection ID from the browser address bar when navigating a specific collection or simply click “Share the collection” when there.
Another approach for getting files related to a specific threat actor is by leveraging crowdsourced detection rules: Yara, Sigma and IDS. We are always looking for solid and active repositories constantly updated with the latest malware signatures. You can find more details in our Contributors list
For example, the following query provides files matching YARA and IDS rules containing “APT29” or “CozyBear” in their names, as well as files detected by a specific Sigma rules:
At the moment the only way to perform Sigma rules search is specifying the rule hash explicitly, you can find here the full list.
Finally, searching for specific comments can bring valuable results. It’s important to note that there are many third-party solutions contributing with useful comments in VirusTotal. Some of the most popular and reputed ones include THOR APT scanner and Intezer:

Comments on suspected APT29 malware

The following query searches for files containing some APT actor aliases in their comments:
This also works for IPs, domains and URLs: 

Documents

Any query can specify the document format for the results. For example, the following query provides recently created (using “generated” modifier) documents (thanks to the “type” modifier) with macros embedded (by using “tag”), detected at least by 5 AVs (“p” modifier):
A second example retrieves Excel files bundled with powershell scripts and uploaded to VT for the last 10 days:
There is plenty of cool stuff you can do with the “tag:” search modifier, here is the full list with dozens of different tags supported, and here you can find descriptions for the most common ones. For example, you can search for documents with obfuscated VBA code executing other files:
You can also use “type:document” to search for all the document formats (office, pdf, text, rtf, latex, etc). The following query returns all documents having “invoice” in its file name and used as email attachments:
Or documents exploiting any vulnerability published in 2022:
It's important to highlight that “tag” is one of the few modifiers supporting wildcards.
Many of you asked about language specific document searches during our Threat Hunting with VirusTotal session. The “lang:” keyword uses Exif language property to find files matching any language:

Non-Windows samples

We have dedicated keywords to perform searches for platforms such as Android, MacOS or even Symbian. 
For Android, which is one of the most popular non-Windows platforms by number of malware samples, we use Androguard to process all ingested samples. You can query for any Androguard output which is stored under the umbrella “androguard:” keyword:
This includes all sorts of different types of processed data: code strings, manifest entities, certificate signatures, etc. For example, this example looks for APKs that mimic a legitimate app by using the same icon (“main_icon_dhash”), but signed with a different certificate:
However when looking for something specific it is better using “androguard_package:” to search for APKs with explicit package names (please note this  only works for newly indexed files since March 2022):
Since this is a new search modifier, please feel free to share our feedback with us.
In some cases you can skip using any special keyword and build your query using APK internal paths only:
Here are some more examples for different platforms.
tag:iphone tag:signed p:5+ - signed iOS app packages detected by at least 5 AVs.
(type:apple OR type:mac) itw:cdn.discordapp.com - iOS/macOS files served from a given URL.
type:symbian name:"*spy*" - Symbian files (.sis) containing “spy” substring in its name.

Network

First of all, we encourage you to check the full list of network-related search modifiers for URLs, IPs and domains, as the list is too large to fit them all in this post.


However, network modifiers can save the day for different specific cases. Here are a bunch of practical examples.
If you are looking for botnet admin panels within a certain TLD (Top Level Domain) knowing only specific HTML meta information from the response, you can use this query:
In cases when you extracted the C2 endpoint from a malware sample and want to search for other servers using the same backend path:
You can even search for specific HTML body content with the “NOT parent_domain:” syntax to filter out legitimate results:
There are plenty of different ways to detect malicious domains mimicking legitimate ones, including favicon similarity. To get a specific dhash you can just click on the original file/website icon in Virustotal and it will produce the query for you. Then you can check for additional domains using this favicon but detected by different AVs:
Another approach is to use the “fuzzy_domain” keyword, which is based on Levenshtein distance and will get you domain names similar to a given one. The best use case for this is typosquatting attack detection. In the following example, we filter results using “urls_max_detections” to get only domains with URLs detected as malicious:
Instead of an IP range, you can specify the whole ASN:

Summarizing, the number of different options to query VT Intelligence can be overwhelming, but hopefully the above examples and our VTI Cheat Sheet will make your life easier.  We will continuously update the VTI Cheat Sheet to keep it as fresh and useful as possible, we will be announcing every time there is any major update. In the meantime, we hope this will be useful for you, and as usual if you have any suggestions or just want to share feedback please feel free to reach out here.
Happy hunting!

Monday, November 28, 2022

Threat Hunting with VirusTotal

We recently conducted our first “Threat Hunting with VirusTotal” open training session, providing some ideas on how to use VT Intelligence to hunt for in-the-wild examples of modern malware and infamous APT campaigns. In case you missed it, here you can find the video recording available on Brighttalk and Youtube.  We also created a PDF version of the slides with all the queries covered during the session and direct links to the documentation.

We received lots of questions during the session that we decided to answer in this Q&A blog post.

1.  How can we search for “have:itw” with a specific URL?
“have:itw” is a search modifier you can include in your VT Intelligence queries to get all samples we found being distributed in the wild. You can specify any particular domain in your query, for instance the following example finds samples being distributed itw through discord:  

2.  How can we convert the search queries to monitoring alerts?
Good question, at the moment we are working on a solution to do this automatically, hopefully available very soon. In the meantime, there are two workarounds: execute your query through the API or, in some file-related cases, you can rely on the Yara VT module to create and deploy a Livehunt rule.

3.  Is there any documentation on the VT website for all this info?
Yes, here you can find general manuals and dedicated documentation for the API. Another good resource is our getting started site. You can find more resources linked in the training slides.

4.  Can we have sessions on hunting part of VT like this.
Yes, we will be having quarterly “Hunting with VT” sessions (at a minimum).
 
5.  Can we use regex in queries?
You can use regex in VT Grep queries. The following is an example using wildcards for a hexadecimal sequence of bytes: content:{686f6c61 ?? 6d756e646f}
There are no wildcards for most of the regular VT Intelligence queries with the exceptions of “name:” and “domain_regex:”, because we use full text search. In some cases you can achieve the same effect by combining search terms with the “AND” keyword.

6.  Can you point out a location for all the most useful queries?
We are working on a Cheat Sheet which will be available very soon, stay tuned.

7.  Can you see the content tab in the free version?
This is only available to VT Enterprise customers.
 
8.  Can you use wildcards in date notations when searching?
Date notations are quite flexible even without wildcards. For example, for malware submitted to VT in January this year you can use the following: 
You can get malware submitted for the last 5 days with the following query:

9.  What is your keyboard? Super nice sound.
DROP CTRL + Kaihua Speed Silver + T0mb3ry SA Carbon

10.  Does “crowdsourced...:malware_name” will give all the rules sigma/yara written for that malware?
Not really. Rules are not assigned to any malware or actor in particular, so we need to rely on the name of the rule. For instance:

provides you with files detected by the Yara or IDS rules with “Sofacy” in its name.

11.  Does Virustotal collect samples from sources other than user submitted files? For example, does it passively scrape the Android App store to check for new apps created by APTs?
Yes! Just as an example, you can submit a file to VirusTotal from Process Explorer. Also, there are different research groups and other volunteers (thanks again to all of you!) who share new samples with the VT community.

12.  How to fetch all the matched samples for a query by script instead of going through all pages?
You can use this API endpoint.

13.  Sometimes I tried to perform behavior search using powershell commands, but it doesn’t work for me. For example, this one. When clicking to the powershell command itself, returns no results. Also, doing something like behavior:"PAAjAGcAeABwACMAPgAgAFIAZQBnAGkAcwB" does not work.
Thanks for the heads up! There is an issue when transforming super-long strings into a search query, we are already working on a fix. 
Regarding the second question, when doing a full text search unfortunately it is not possible to use substrings (unless it is a separate word). For example:

14.  If I am trying to hunt a certain ransomware, and the actual PE Files are present on VT as per YARA Rules. How can I find the initial vector?
Searching for the initial infection vector is tricky and not always possible as it depends on VirusTotal’s visibility, but you can try the following ideas:

  • To find suspicious (p:5+ for 5 or more AV detections, as rule of thumb) files distributed as email attachment you can use tag:attachment p:5+. Then you can get full details in the “Relationship” tab for every resulting sample:


  • To find suspicious files known to be distributed in-the-wild you can use have:itw p:5+. Details, again, in the “Relationship” tab:




15.  Do you have a link to all available VT "entity:" types?
That would be “file”, “url”, “ip”, “domain” and “collection”. You can find extensive information with documentation links on the slide #5 of PDF slides. This will also be available soon in VirusTotal’s documentation.

16.  How can we utilize the VT API to upload our own PE files to sandboxes for dynamic analysis? I have been running it on some executables and most of the files error out during the sandbox execution (e.g. tries to open a file in the file system that doesn’t exist, etc).
Sandboxing is part of the executable processing pipeline, you can submit your files using this API endpoint. Regarding any technical issues including unexpected sandboxing behavior you can reach out to Virustotal Bot at the bottom of the website.

VirusTotal Support Bot

17.  Can we have the option of moving back to the old color scheme in Retrohunts?
Thanks for your request, we’ll think about how to make the Yara editor pleasant for everybody.

18.  Can you show an example of using language with word document search? How can I search for Follina samples with specific languages?
The “lang:” keyword finds all files whose Exif language property matches the language provided. In Follina case, following our training instructions, we can use the crowdsourced Yara rule with language clarification:

19.  Is there a thinking on future development for GodMode to be able to label deployed malwares by sectors? like Healthcare?

20.  Is there Mitre Attack type T. identifier search? 
Yes, you can use the following keywords and specify any technique or tactic.

21.  Is there a way to run custom tools on top of what is available?
You can always organize a custom post-processing of VirusTotal data by using our API. This is exactly what the “APT dashboard” we showed during the training does.

22.  Were we able to search for a specific email and see all other files distributed by the email? 
You can leverage VT Grep to search for email files with specific content and then pivot to the children entities as it appears under the Relationship tab, for example:

23.  How can we combine a domain and a file seen together?
If you want to get files downloaded from a specific domain (drive.google.com in the example below), you can use:

24.  What about wildcards in androguard on receivers/services/activities?
Unfortunately, this data is indexed as full-text search, so no wildcards are allowed. However, if you feel that this is critical to you, please feel free to reach out.

25.  What encodings are accepted by the "content" query parameter? Is it only plaintext and hexadecimal?
You can also use escaped UTF-8, check the VT Grep manual for details.

26.  Why isn't it possible to combine the "have" and "content" modifiers? I have the following error: "have can't be combined with content modifier", example: have:email_parents AND content:"Hello World!"
Content search is a different querying tool by design, that's why it is not possible to combine with all the other keywords by default. We are collecting the feedback and adding compatibility with the most popular ones (tag, type, p, fs, etc). For more details please refer to VT Grep documentation.

27.  How can I submit a yara or a sigma rule to VT for community use? Are they generally just obtained from curated open source?
Yes, we are always looking for solid and active repositories constantly updated with the latest malware signatures. You can find the number of such repositories in our Contributors list. If you do want to contribute, please let us know.

28.  Will the APT dashboard be downloadable from GitHub? Would it be possible to share one of your jupyter notebooks, please?
At the moment we are not openly sharing such notebooks, which are mostly designed for demoing and internal research, and we are not really sure if they are solid enough for production environments. We are currently studying if this is something we will be doing in the near future. However, we are happy to help you build your own! Please feel free to let us know in case you are interested in this or any other ideas.

If you have any other questions, please feel free to reach out.

Happy hunting!


Friday, November 25, 2022

From zero to Zanubis

 A few weeks ago we stumbled upon a suspicious Android sample from a tweet from @malwrhunterteam which was only detected by four antivirus engines:


Antivirus verdicts didn’t provide specifics about the malware family other than it might be either a banking trojan or spyware, so a first approach to continue the investigation is finding other similar samples that will help provide a picture of this malware family. Continuing with the example, the first step is checking the “Relations” tab in the VirusTotal report to find other related IOCs (Indicators of Compromise). In this case we can observe a few interesting “Contacted URLs” obtained during sandbox detonation:


Although both URLs are interesting, the one used for socket communication (port 8000) uses Spanish strings in the endpoint (“instalado” meaning installed). By clicking on this URL in VirusTotal, we immediately find four additional samples sharing the same described networking behavior: 


These results show that the sample used to start the investigations (first seen 2022-08-27) was not the first one submitted to VirusTotal from the set of samples contacting the suspicious URL. A sample having a similar behavior and named “preso.apk” was submitted two weeks earlier (2022-08-11). The “Network Communications” section in the “Behavior” tab shows how it uses the same URL pattern:


We decided to create a Yara rule to monitor freshly submitted samples also having this same pattern.


Finding patterns for hunting

Writing YARA rules for APK files might be a bit complex, however we have two powerful tools on our side: the “vt” YARA module and VTDiff. 


Using VTDiff 


VTDiff finds common patterns between a set of files and excludes the “too common to be useful” ones. You can also provide an “exclusion list” based on a set of hashes. As a result, we get a shortlist of patterns to consider for our YARA. Considering that the APK format is basically a ZIP file, we need to choose the bundled files to submit to VTDiff. In this case, “classes4.dex” seems a good candidate based on the number of detections:

 


In our new VTDiff session we add classes4.dex hashes in the first box, and in the second box (exclusion list), a random DEX file from a clean Android file:

After VTDiff does its magic, we get several patterns matching most or all of the files submitted:


We can explore how useful these patterns are for hunting by doing a content search (just by clicking on the pattern). This returns how many files in VirusTotal match a pattern:


In this case, the chosen pattern returns 20+ additional dex files. We can find their corresponding APK parents through their relationships.


VT YARA module


The VT YARA module enhances YARA rules in many ways, including the possibility of including behavioral patterns in the rules’ conditions. This includes matching URLs extracted during sandbox detonation. In this particular case, we will take advantage of two facts to create our rule:
  • Samples connects to an endpoint called instalado
  • Samples create and delete a temporary backup file with .xml.bak extension.

With these very simple ideas, we can create the following rule:

import "vt"

rule Zanubis_Behaviour_VT
{
  meta:
      author = "@entdark_"
      email = "fdiaz@virustotal.com"
      created = "2022.09.1"
  condition:
      for any http_req in vt.behaviour.http_conversations : (                                                                                
        http_req.url contains "instalado") 
      and
      for any deleted_file in vt.behaviour.files_deleted : (
        deleted_file contains ".xml.bak")
      and vt.FileType.ANDROID
}

In case you are not familiar with YARA, the previous rule:
  • Imports the VT module to obtain access to VirusTotal’s metadata and behavioral data.
  • Iterates all the aggregated HTTP conversations in the behavioral sandbox report and finds patterns contained within the URL field.
  • Iterates all aggregated deleted files on detonation, and searches for the .xml.bak substring.
  • Verifies that the file is an Android file.

To test this YARA rule, we can manually rescan the samples used to generate it and check if they trigger the rule. However, it is more interesting to run a RetroHunt job and observe how many additional samples show up from VirusTotal’s collection:


We started from a single sample that allowed us to find 3 more through pivoting. Now RetroHunt provided us 14 files thanks to our behavior-based Yara rule. That’s great, isn’t it? 

Please note for the moment the “vt” YARA module can only be used for LiveHunts (availability for RetroHunts is in late Beta, very soon available for everyone!).


Analysis of this Android malicious family

Now that we have several samples to work with, we can go into a more manual analysis phase to make sure they are all related and what are the common capabilities. In this case, the samples belong to a family called Zanubis, which is a “work in progress” banking trojan. It uses accessibility services to overlay its victim’s apps with their own login screen. So far, it seems to exclusively target financial entities from Peru and retrieves data from the victim’s device including contacts, device details and SMS data. All the retrieved data is sent back to the remote Command and Control via websockets.


For each iteration, Zanubis’ actors have been adding additional functionality, sometimes completing features where placeholders were found in previous versions. For the last iteration, analyzed samples seem to focus on improving their social engineering capabilities, in this case including links to Peru’s government sites.

Conclusions

The process described in this post is the one many researchers follow when finding an interesting sample. One of the first things to do is finding something unique enough that helps us take a step back and look for additional samples belonging to the same family. In this particular case, networking indicators based on behavioral analysis helped identify the malicious infrastructure, and from here, we were able to find some patterns to continue hunting. With a small set in our hands, we have different options to quickly explore different options to create exploratory YARA rules, in this case we used the “vt” module to add behavior to our rule.

There are many additional ways to reach the same results, as always we are happy to hear any ideas from you side.

Happy hunting!




Thursday, November 17, 2022

Stopping Cobalt Strike with YARA

 VirusTotal was born with the idea of community in mind - an ecosystem where everybody contributes and benefits. This helped grow our product around the concept of crowdsourced intelligence, where all the security community could contribute in different ways to provide more actionable tools for our users, including researchers and analysts, for detection and threat hunting.

Sometimes we have beautiful success stories on how VirusTotal’s users give back to the community what they get from the platform. In this case, our colleagues from Uppercase created a precise set of YARA rules to detect Cobalt Strike components. You can read more about it here.

Unfortunately, Cobalt Strike has become one of the main components in any attacker’s toolset. Albeit a legitimate tool for pentesting, different versions in the last years have been leaked and abused in many different ways. The first step to create a robust set of YARA rules is to have a consistent set of samples, in this case including all the different Cobalt Strike versions we want to be able to detect. VirusTotal was the platform of choice to gather all the samples needed, and thanks to our new Collections, these samples can easily be grouped in a single set - actually, you can find this Cobalt Strike collection here.

Once all the samples are available for the researcher, there is not a single way to create the YARA rules. A first approach could be checking Commonalities among the samples, in case we find any interesting characteristic or metadata among all the samples that we could use for our rule. Below you can see an example of finding commonalities in a collection of suspicious documents with more than 2000 samples:


Another possibility would be using VTDiff to find what particular bytes these samples have in common, and at the same time, have low prevalence in VirusTotal’s collection in order to qualify them as significant for creating a YARA. Independently of using VTDiff, checking the prevalence for any byte sequence or string in VirusTotal’s collection with a quick search is always a great idea to understand how useful they would be in your rule. Remember you can combine different byte sequences (using the “content” modifier) in your VTIntelligence search. If your rule is purely based on strings and byte sequences, you can mostly test its effectiveness with a few searches in VirusTotal.

There is a more technical approach consisting of reversing the samples and finding something interesting and unique for the detection, which is what was used in this case. But how to know if your rules are good enough? Usually this is an iterative process where we want the first versions of the rules to be a bit loose so we can find more suspicious samples. This is a way for us to understand if a rule can be used for hunting. Once we are satisfied with the results (we are finding all the Cobalt Strike samples we wanted), we want to make sure we don’t detect anything else (avoid false positives), especially when it comes to legitimate software.

For the process described above, usually you want to use RetroHunts, as they will check your rules against the whole VirusTotal collection. When launching your RetroHunt, you can specify the collection of samples you want your rules to be checked against, there is one collection of goodware we can use to make sure our rules don’t detect any of these samples by mistake.


There are different ways you can check if the results obtained from your rules are True Positives, usually in VirusTotal you will find plenty of data points you can use to double check, including verdicts, other crowdsourced rules, community comments, presence in other collections, signatures, etc. Unfortunately, sometimes the results from your rules (or VT Intelligence searches) can be huge, in that case we really encourage creating a collection with them and use the Commonalities feature to get a better understanding of how your rule did and discriminate among your results. 

Once we are satisfied with the results, we suggest deploying your rules in LiveHunt for some time, which basically will execute your rule against anything uploaded to VirusTotal from the moment you deploy it. This way you can monitor its effectiveness and do some final polishing if needed. 

And voila! Your rules are ready to be deployed. In this case, the rules can be found here, and now they are part of our set of crowdsourced YARA rules, so everyone in the VirusTotal’s community will collectively benefit from this effort.

We really appreciate the effort and generosity of our colleagues from Uppercase, and we hope these ideas will help everyone understand a bit more about the creation and deployment of YARA rules. As usual, we are happy to hear from you.

Happy hunting!