Wednesday, September 28, 2022

VT Collections: citius, altius, fortius - communiter

In November 2021 we launched VirusTotal Collections as a way to help organize, share and work with IoCs. Today we are announcing significant improvements that make Collections an actionable tool for strategic intelligence.

As a quick reminder, our original concept for Collections was a simple, shareable mechanism for VirusTotal’s users to group IoCs under a common umbrella. Collections are also an additional source of context for indicators: when an observable belongs to a collection, this is shown under the report’s Community tab (or through derived relationships). Over the last few months, we have been working hard to make Collections even more actionable.

Since our initial announcement, VirusTotal’s users have created more than 18,000 collections, with more than 100k views in total.

What’s new in Collections?

In this new iteration, Collections go one step further and provide data typically associated with a security event. The idea is to extend Collections’ boundaries and evolve the concept from simple sets of IoCs to attacks, campaigns and investigations.

What’s this new Collection data? In addition to the traditional title, description and information about the collection itself (author, creation and last update), we now have adversarial attribution (to one or multiple adversaries, plus their aliases), victimology (both by industry and geography) and external OSINT references, if any. There is also a timeline showing submissions and lookups for the IOCs belonging to the Collection (more details on this later).
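To make the list of fields above more concrete, here is a minimal sketch of how a collection could be modeled locally. All class and field names are hypothetical and chosen for illustration; they are not the names used by VirusTotal's API or data model.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Adversary:
    # Hypothetical representation of an attributed actor and its aliases.
    name: str
    aliases: List[str] = field(default_factory=list)


@dataclass
class Collection:
    # Traditional fields: title, description and collection metadata.
    title: str
    description: str
    author: str
    created: datetime
    last_updated: datetime
    # Fields described as new in this iteration (names are assumptions).
    adversaries: List[Adversary] = field(default_factory=list)
    targeted_industries: List[str] = field(default_factory=list)
    targeted_regions: List[str] = field(default_factory=list)
    osint_references: List[str] = field(default_factory=list)

    def has_attribution(self) -> bool:
        """True when at least one adversary is attached to the collection."""
        return bool(self.adversaries)


# Illustrative data only.
example = Collection(
    title="Example campaign",
    description="Illustrative data only",
    author="analyst",
    created=datetime(2022, 9, 1),
    last_updated=datetime(2022, 9, 28),
    adversaries=[Adversary("ExampleActor", aliases=["EA-1"])],
    targeted_industries=["finance"],
    targeted_regions=["EU"],
)
```

The new fields default to empty lists, which mirrors the fact that not every collection carries attribution, victimology or references.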

You might notice that some of these new fields are not available in your own collections. At the moment, we have given access to them to a limited number of users (typically security experts who have historically collaborated with VirusTotal) in an attempt to keep this data as clean as possible. We will gradually add more users so that everyone can benefit from crowdsourced intelligence while keeping the data as accurate as possible.

Collections also have a few new tabs. The first one details all the IOCs belonging to the collection, divided into the categories the indicators belong to (Files, IPs, Domains and URLs). We can also visualize them under the Graph tab, where an autogenerated graph summarizes the collection by presenting all indicators and their corresponding relationships. The Community tab shows OSINT References for the current Collection and additional references where any of the Collection’s IOCs can be found. This tab also provides Related collections and users’ comments.

The Rules tab lists any crowdsourced rules (YARA, Sigma or traffic-based) matching at least one file in the Collection. You can check exactly which files match a given rule by clicking on it.
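Conceptually, the Rules tab is an inverted index from rules to the files they match. The sketch below shows that idea using made-up hashes and rule names; it is an illustration, not how VirusTotal computes the tab.

```python
from collections import defaultdict

# Hypothetical input: file hash -> names of crowdsourced rules it matched.
file_matches = {
    "hash_a": ["rule_loader", "rule_packer"],
    "hash_b": ["rule_loader"],
    "hash_c": [],  # no rule matched this file
}


def files_by_rule(matches):
    """Invert a file -> rules mapping into a rule -> files mapping."""
    index = defaultdict(list)
    for file_hash, rules in matches.items():
        for rule in rules:
            index[rule].append(file_hash)
    return dict(index)


rules_tab = files_by_rule(file_matches)
```

Only rules with at least one matching file appear in the result, which is the same condition the tab applies.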

There are two more tabs, Commonalities and TTPs, which we will discuss later in this post.

Auto-generated Collections

In addition to the collections our users create, we create and manage several auto-generated collections. There are two types:
  • Based on OSINT data: These collections are based on publications from security vendors where details on the attack, including victimology and attribution, are shared along with IOCs. At the moment we ingest OSINT data from a set of selected providers.

  • Based on YARA rules: We selected a set of Crowdsourced YARA providers who created rules to detect specific malware families or toolsets, and used the stream of live detections to create several live collections.
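As a rough illustration of the YARA-based approach, the following sketch groups a stream of detection events into per-family collections. The rule identifiers, the allowlist and the event format are all assumptions made for this example; the actual pipeline is not described in this post.

```python
from collections import defaultdict

# Hypothetical allowlist: only rules from selected providers feed collections.
TRUSTED_RULES = {
    "provider_x/family_alpha": "family_alpha",
    "provider_y/toolset_beta": "toolset_beta",
}


def build_live_collections(detections):
    """Group (sha256, rule_id) detection events into per-family sets."""
    collections = defaultdict(set)
    for sha256, rule_id in detections:
        family = TRUSTED_RULES.get(rule_id)
        if family is None:
            continue  # rule is not from a selected provider
        collections[family].add(sha256)  # sets drop repeated detections
    return dict(collections)


# Illustrative stream of live detections.
stream = [
    ("aaa", "provider_x/family_alpha"),
    ("bbb", "provider_x/family_alpha"),
    ("aaa", "provider_x/family_alpha"),  # same file detected again
    ("ccc", "unknown/rule"),
    ("ddd", "provider_y/toolset_beta"),
]
live = build_live_collections(stream)
```

Using a set per family keeps each collection free of duplicates as the same sample is detected repeatedly over time.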

Both types of collections help provide context to individual indicators. More importantly, they serve as a continuous feed of technical data that can better serve our users. In addition, collections provide many options in terms of obtaining aggregated data and understanding the evolution of a particular set of samples. 

At the moment we are sourcing these collections from a limited set of trusted providers to keep information as curated and relevant as possible. If you want to contribute to this project, please contact us.

You can conveniently find all Collections under the new “Threat Landscape” section in VTIntelligence.


Threat actors

As already mentioned, we added attribution to Collections. To keep this data as coherent as possible, we are mapping adversaries to an ontology following industry standards such as MITRE and MISP galaxies. You can click on any adversary to get more details on the actor itself:

Adversary cards provide similar tabs to the ones found in Collections, making it very easy to obtain IOCs, track adversary evolution (in terms of submissions and lookups) and obtain TTPs for all of the actor’s campaigns. Tracking new Collections under an Actor card is an excellent way to monitor the actor’s evolution and new activity. Under the Community tab we can also find all the OSINT references in one place.

Information on all actors can be found under a dedicated tab, also in the new “Threat landscape” section, where we can use Filters to find the ones we are interested in, as described in the next section.

Working with collections in VTIntelligence

Beyond their primary role as sets of indicators, as originally described, collections and Threat Actors are new entities in VirusTotal Intelligence that provide additional context for your searches. Let’s start with the basic view listing all collections (in Threat landscape), where you can filter results by different criteria (please be aware that filters are case-sensitive):

Also keep in mind that not all collections include all data, such as Source (of the attack) Region, Victimology (Targeted industries and Targeted Region) and Threat categories (depending on the nature of the attack). The collection's author provides this data.
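The two caveats above (case-sensitive filters and optional fields) can be illustrated with a small local sketch. The field names and the catalog below are hypothetical, and this is not the syntax used by the Threat landscape filters themselves.

```python
def filter_collections(collections, **criteria):
    """Return collections whose fields contain every requested value.

    Matching is exact and case-sensitive; a collection that lacks a
    requested field never matches.
    """
    matched = []
    for collection in collections:
        if all(value in collection.get(name, []) for name, value in criteria.items()):
            matched.append(collection)
    return matched


# Illustrative catalog: not every collection carries every field.
catalog = [
    {"name": "c1", "targeted_industries": ["Finance"], "targeted_regions": ["EU"]},
    {"name": "c2", "targeted_industries": ["Finance"]},
    {"name": "c3"},
]

by_industry = filter_collections(catalog, targeted_industries="Finance")
wrong_case = filter_collections(catalog, targeted_industries="finance")
by_region = filter_collections(catalog, targeted_regions="EU")
```

In this sketch, a filter value with the wrong case returns nothing, and collections without a given field simply drop out of results that filter on it.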

Some other criteria, such as suspected actor, are not (yet) available in this filter. However, this information is indexed, and we can filter collections using VT Intelligence queries as follows:

What else can Collections do for me?

We included TTPs (mapped to MITRE ATT&CK) based on the behavior of samples belonging to the collection in our sandboxes, and we did our best to make this actionable.

Beyond obtaining a list of TTPs, you can get the subset of samples inside a given collection that match any of them. This makes it easier to get technical details on how attackers implemented that particular TTP (for instance, through Behaviour details or through Commonalities when checking the samples that match this specific TTP). In addition, we can use the VirusTotal corpus to check how prevalent a technical item is.

This matters because it provides a mechanism for triaging effective methods of detecting a malware family’s technical implementation of a particular TTP: we can now get the TTP’s technical details and check their prevalence to select the relevant and unique ones that define a particular malware.
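The triage described above can be sketched as a simple selection: keep technical items that appear in many of the collection's samples but are rare in the wider corpus. The thresholds, item names and counts below are invented for illustration, and the function is a hypothetical helper rather than a VirusTotal feature.

```python
def select_distinctive_items(items, collection_size,
                             min_coverage=0.5, max_corpus_hits=100):
    """Keep items seen in many collection samples but few corpus files.

    `items` holds (name, samples_in_collection, files_in_corpus) tuples.
    """
    selected = []
    for name, in_collection, in_corpus in items:
        coverage = in_collection / collection_size
        if coverage >= min_coverage and in_corpus <= max_corpus_hits:
            selected.append((name, coverage, in_corpus))
    # Highest coverage first; rarer corpus prevalence breaks ties.
    return sorted(selected, key=lambda row: (-row[1], row[2]))


# Illustrative observations for a collection of 20 samples.
observed = [
    ("mutex:example_mutex", 18, 20),              # common here, rare elsewhere
    ("registry:common_run_key", 19, 2_000_000),   # common everywhere
    ("file:dropped_rare.tmp", 4, 5),              # rare, but only in a few samples
]
picked = select_distinctive_items(observed, collection_size=20)
```

With these numbers only the mutex survives: the registry key is too prevalent in the corpus to be distinctive, and the dropped file covers too few samples to characterize the family.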

There is one final powerful tool available for collections under the Aggregations tab. Many of you probably remember the Commonalities tool available after a VTI search, which unfortunately only covers the search results shown on screen (typically 20 samples). Aggregations provide identical functionality but for all the samples in a particular collection (up to 10,000), allowing us to work with a more than reasonable number of samples. Keep in mind that we can always create a “temporary” collection from the results of a VTI query, which allows us to use this powerful tool.

The first thing we can see is that aggregations are calculated for all IOCs under different categories (Files, Domains and URLs in this case). For all of them we also have different sections with aggregated data as displayed in the menu on the right, as follows:
  • Detections: Popular threat/malware family name, including AVs verdicts and sandbox detection.

  • Distribution Vectors: Which samples were found in the wild, what their execution parents are, attachments, etc.

  • Threat Network infrastructure: All infrastructure that samples connected to, as well as domains/IPs they embed.

  • Similarity Hashes: Clustering based on different similarity algorithms, including visual similarity, VTHash, etc.

  • Execution Tracing: Most interesting details extracted from the samples’ sandbox detonations.

  • Static Analysis: Commonalities found in other characteristics such as metadata, signatures, sample geometry, and more.
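At its core, this kind of aggregation counts how many samples share each value of a given attribute. Here is a minimal sketch with invented sample records; the field names are assumptions and do not correspond to VirusTotal's report format.

```python
from collections import Counter

# Illustrative sample records.
samples = [
    {"family": "alpha", "contacted_domains": ["c2.example", "cdn.example"], "imphash": "h1"},
    {"family": "alpha", "contacted_domains": ["c2.example"], "imphash": "h1"},
    {"family": "beta", "contacted_domains": ["c2.example"], "imphash": "h2"},
]


def aggregate(samples, field, top=3):
    """Count how many samples share each value of `field`."""
    counts = Counter()
    for sample in samples:
        value = sample.get(field)
        values = value if isinstance(value, list) else [value]
        # Use a set so a value is counted once per sample.
        counts.update({v for v in values if v is not None})
    return counts.most_common(top)


top_domains = aggregate(samples, "contacted_domains")
top_families = aggregate(samples, "family")
```

Here `c2.example` is shared by all three samples, which is the sort of commonality an analyst would pivot on first.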

All of the above greatly simplifies analysts’ tasks. Collections are a great way to gather the results of any interesting VTI search and work with them in many different ways, such as checking timelines for submissions and lookups, finding commonalities, finding overlaps with other collections already attributed to other actors or activity, and following up on their evolution. We can always keep polishing our collections until we are happy with the results.


Collections not only opened the door to better organizing and sharing IoCs while we work with them in VirusTotal, but also helped us pivot to a model where IoCs are organized around security events, incidents and campaigns. This helps add valuable context, such as attribution, victimology and external references. It also helps when working with a set of samples to obtain aggregated information such as TTPs. Collections also provide more contextual information for any observable in VirusTotal. Auto-generated collections help bring OSINT security events into the VirusTotal corpus and self-organize malware families.

There is another advantage. Now the Community tab for any indicator will also show if it belongs to a collection, providing additional context.

All of this makes a big difference. For the first time, VirusTotal is providing tools for actionable strategic security decision making. We are incorporating both the tools and the data to help see beyond technical and tactical intelligence. We understand the difficulties of maintaining a clean crowdsourced ecosystem, but we believe the benefits will greatly outweigh them for our community.

We would be happy to hear your feedback on these new features.

Happy hunting!

