Thursday, May 30, 2024

We Made It, Together: 20 Years of VirusTotal!

Hi Everyone,

We can hardly believe it, but VirusTotal is turning 20 on June 1st! As we sit down to write this, we’re filled with a mix of pride and gratitude. It's been an incredible journey, and we wouldn't be here without the amazing community that has supported us every step of the way.

When we started VirusTotal, our goal was simple: to help make the internet a safer place. We never imagined that two decades later, we'd be here celebrating this milestone with all of you. From the early days to now, it's always been about working together. Whether you're a user, a contributor, or a supporter, you've played a crucial role in our success.

Over the years, we've had the privilege of collaborating with some of the brightest minds in cybersecurity. We've received support and guidance from industry leaders who believed in our mission and helped us grow. To mark this special occasion, we reached out to a few of these key figures to share their thoughts and memories about VirusTotal. Their testimonials highlight the power of community and collaboration:

Adrian Hendrik

"VirusTotal has consistently tackled tough challenges in cybersecurity. By assisting them with detailed analyses and organizing the first-ever VirusTotal training in Japan, I've seen their impact firsthand. Celebrating their integration into Google's parent company was a milestone. As VirusTotal marks 20 years, it's clear they've become essential for detecting malware and supporting cyber threat intelligence. Their contributions are invaluable to security personnel. I hope the younger generation continues this vital work, ensuring VirusTotal thrives for another 20 years."

Adrian Hendrik (unixfreaxjp), Cyber Emergency Center of LACERT, Japan

Costin G. Raiu

"It’s difficult to think of a project that has had a greater impact on our industry than VirusTotal. I believe its success rests on three key pillars: providing easy access to top antivirus engines for users, enabling researchers to efficiently use YARA for pivoting, and the incredible dedication and passion of its team. On this 20th anniversary, happy birthday to VirusTotal and to everyone who has worked tirelessly to make this dream a reality! Cheers also to all who rely on VirusTotal daily for their work! Analizar, siempre!"

Costin G. Raiu, Independent security researcher

Florian Roth

“I've been using both VirusTotal and YARA since their early days. Over the past 12 years, I've written more than 18,000 YARA rules, greatly aided by the features and capabilities of VirusTotal. Today, I consider VirusTotal an indispensable tool for the cybersecurity community. We rely on it to track threat actors, connect the dots, uncover new undetected malware, quality test our detections, and discover related and still unnoticed threats. VirusTotal stands as one of the central pillars of the cybersecurity toolset, if not the most important one.”

Florian Roth, VP R&D at Nextron Systems

George Kurtz

“VirusTotal has become a vital asset for cybersecurity defenders globally, providing essential insights that accelerate detection and response. At CrowdStrike, we are proud to have been the first to integrate our NGAV technology with VirusTotal, reflecting our shared commitment to innovation and security. By harnessing collective intelligence, VirusTotal has significantly elevated cybersecurity standards, ensuring a safer digital environment for all. Congratulations on this remarkable milestone and thank you for your dedication to supporting the security community and protecting organizations worldwide.”

George Kurtz, President/CEO and co-founder of CrowdStrike

Heather Adkins

“For two decades, VirusTotal has maintained an unwavering commitment to partnering across the community, creating transparency around the tools that threat actors are using to undermine global safety. They have had a meaningful impact on countless individuals and organizations, uplifting security teams across the planet, in a challenging asymmetric threat landscape. Thank you for all that you've done for Google, and the world.”

Heather Adkins, VP/Fellow, Security Engineering at Google

Joe Pichlmayr

“When the first multi-scanner systems went online, we could not have imagined how quickly a simple way to get multiple scanner opinions would become a substantial building block for our daily malware analysis work. VirusTotal's amazing and comprehensive analyses have not only become an indispensable part of our analyzer work but have also become an essential building block for our threat intelligence services.”

Joe Pichlmayr, CEO at IKARUS

John Lambert

“Yara cut the gordian knot paralyzing information sharing. It gave defenders a way to share detection when they could not share samples. VirusTotal sped up global defense by providing a common hunting ground containing the world’s more important threats.”

John Lambert, Corporate Vice President and Security Fellow, Microsoft

Mark Kennedy

“Over the past 20 years, VirusTotal, or VT to most of us, has evolved from a simple multi-scanner to a key source of security intelligence. It is relied on by security companies as well as security professionals. Beyond that, VT has been a reliable partner from the very beginning. They have always been ready and willing to add features and APIs to make using their services and integrating it into both products and workflows easier. The vast wealth of data analytics and historical data on files and families, has permanently stitched VT into the fabric of security intelligence. I cannot wait to see what the next 20 years of innovation will produce. Congratulations on the first 20 years!”

Mark Kennedy, Distinguished Engineer Broadcom, AMTSO Chair

Mark Russinovich

“Microsoft believes security is a team sport and the integration of SysInternals with VirusTotal has made it easier to analyze malware and share those results to improve security for all. In addition, Microsoft Defender XDR uses VirusTotal reports as an accurate threat intelligence source, and VirusTotal uses detections from Microsoft Defender antivirus as a primary source of detection”

Mark Russinovich, Azure CTO and Technical Fellow, Microsoft

Mikko Hyppönen

"VirusTotal was a real gamechanger. In addition of building a technical platform, it also built a community. Thank You for your work!"

Mikko Hyppönen, Technology speaker and author. CRO at WithSecure

Parisa Tabriz

"Reflecting on VirusTotal's 20th anniversary, I still remember the launch of their URL scan service back in 2010 and early collaborations with Google Safe Browsing and Chrome. We all had an aligned mission to make the web a safer place for everyone. Twenty years in, lots of progress to be proud of protecting people around the world, and our work continues!"

Parisa Tabriz, VP/GM Chrome & Google Security Princess

Shane Huntley

“Since the earliest days of TAG in 2010, VirusTotal and the team have been a critical partner helping us to defend Google, Google users and the world. We all owe a huge debt to all this team has done and how they have provided so much to the community of those fighting against online threats.”

Shane Huntley, Sr Director Google Threat Intel and cofounder of TAG

One of the things we’re most proud of is how VirusTotal has always been a team effort. From our dedicated staff to our passionate users, everyone has contributed in their own way. It's this collective effort that has allowed us to innovate, evolve, and stay ahead of the ever-changing threat landscape.

What's Next?

We'd love to hear your stories! Share your favorite memories or how VirusTotal has impacted your work on Twitter/X, LinkedIn, and other social networks with the hashtag #VirusTotal20Years. We'll be collecting the best stories and sending some cool swag to the top contributors. Stay tuned for more exciting announcements, events, and blog posts about some behind-the-scenes stories from our early days and key milestones in our history throughout our anniversary year!

As we look to the future, we remain committed to our mission. There's still a lot of work to be done, and we know we can't do it alone. We're counting on your continued support, feedback, and collaboration to keep pushing the boundaries and making the digital world safer for everyone.

Thank you for being a part of our journey. Here's to many more years of working together to fight cyber threats and protect our digital lives.

Best regards,

The VirusTotal Founding Team

From left to right:

  • Julio Canto: Wrote the very first lines of code for VT and launched the first version, still in charge of adding all the new engines and tools we use.
  • Alejandro Bermúdez: The mastermind behind how our analyzer farm works. He keeps everything running smoothly to this day.
  • Francisco Santos: Started out designing our very first website, databases, and all those storage systems we rely on. Now he leads the backend analysis team.
  • Bernardo Quintero: Had the initial idea for VT (blame him if anything breaks!) and now focuses on using AI to make threat analysis even smarter.
  • Victor Manuel Alvarez: Gave the world YARA, helped design VT Intelligence and Hunting, and just recently announced YARA-X.
  • Emiliano Martínez: If you've used our VT API, that's Emiliano's work. He's also a co-designer of VT Intelligence and currently keeps everything running as our Product Manager.

Wednesday, May 29, 2024


Tracking Threat Actors Using Images and Artifacts

When tracking adversaries, we commonly focus on the malware they employ in the final stages of the kill chain and infrastructure, often overlooking samples used in the initial ones.
In this post, we will explore some ideas to track adversary activity leveraging images and artifacts mostly used during delivery. We presented this approach at the FIRST CTI in Berlin and at Botconf in Nice.

Hunting early

In threat hunting and detection engineering activities, analysts typically focus heavily on the latter stages of the kill chain – from execution to actions on objectives (Figure 1). This is mainly because there is more information available about adversaries in these phases, and it's easier to search for clues using endpoint detection and response (EDR), security information and event management (SIEM), and other solutions.
Figure 1: Stages of the kill chain categorized by their emphasis on threat hunting and detection engineering.
We have been exploring ideas to improve our hunting for samples built in the weaponization phase and distributed in the delivery phase, focusing on the detection of suspicious Microsoft Office documents (Word, Excel, and PowerPoint), PDF files, and emails.
In threat intelligence platforms and cybersecurity in general, green and red colors are commonly used to quickly indicate results and identify whether or not something is malicious. This is because they are perceived as representing good or bad, respectively.
Multiple studies in psychology have demonstrated how colors can influence our decision-making process. VirusTotal, through the third-party engines integrated into it, shows users when something is detected and therefore deemed "malicious," and when something is not detected and considered "benign."
For example, the sample in Figure 2 belongs to a Microsoft Word document distributed by the SideWinder group during the year 2024.
Figure 2: Document used by the SideWinder APT group
The sample in question was identified at the time of writing this post by 31 antivirus engines, leaving no doubt that it is indeed a real malware sample. In the process of pivoting to identify new samples or related infrastructure, starting with Figure 2, the analyst will likely click on the URL detected by 11 out of the 91 engines, and the domains detected by 17 and 15 engines, respectively, to see if there are other samples communicating with them. The remaining two domains are, in this case, easily identified as legitimate domains that were likely contacted by the sandbox during its execution.
Figure 3: Relationships within the SideWinder APT group document
Further down in the VirusTotal report for the same sample (Figure 3), the analyst will likely click on the ZIP file listed as "compressed parent" to check if there are other samples within this ZIP besides the current one. They may also click on the XML file detected by 8 engines, and the LNK file detected by 4 engines. The remaining files in the bundled files section probably won't be clicked, as the green color indicates they are not malicious, and also because they have less enticing formats, mainly XML and JPEG. But what if we explore them?

XML files generated by Microsoft Office

When you create a new Microsoft Office file, it automatically generates a series of embedded XML files containing information about the document. Additionally, if you use images in the document, they are also embedded within it. Microsoft Office files are compressed files (similar to ZIP files). In VirusTotal, when a Microsoft Word file is uploaded, you can see all these embedded files in the embedded files section.
We have mainly focused on three types of embedded files within Office documents:
  • Images: Many threat actors use images related to the organizations or entities they intend to impersonate. They do this to make documents appear legitimate and gain the trust of their victims.

  • [Content_Types].xml: This file specifies the content types and relationships within the Office Open XML (OOXML) document. It essentially defines the types of content and how they are organized within the file structure.

  • Styles.xml: Stores stylistic definitions for your document. These styles provide consistent formatting instructions for fonts, paragraph spacing, colors, numbering, lists, and much more.

Our hypothesis is: If malicious Microsoft Word documents are copied and pasted during the weaponization building process, with only the content being modified, the hashes of the [Content_Types].xml and styles.xml files will likely remain the same.
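The hypothesis can be checked locally with a few lines of Python. The following is a minimal sketch, not the tooling we used: it assumes the document is available as bytes, treats any member under a media folder as an image, and the function name hash_ooxml_parts is purely illustrative.

```python
import hashlib
import io
import zipfile


def hash_ooxml_parts(doc_bytes):
    """Return {member name: sha256} for the embedded parts discussed above.

    Assumption: images live under a "media" folder (for example word/media/),
    which is the usual layout but is not guaranteed for every document.
    """
    hashes = {}
    with zipfile.ZipFile(io.BytesIO(doc_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            lowered = name.lower()
            is_target = (
                lowered == "[content_types].xml"
                or lowered.endswith("/styles.xml")
                or "/media/" in lowered
            )
            if is_target:
                hashes[name] = hashlib.sha256(archive.read(name)).hexdigest()
    return hashes
```

Two documents that were copied and only had their body text changed should return the same digests for [Content_Types].xml and styles.xml, while the digest of the main document part (not hashed here) would differ.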

Office documents

To check our hypothesis, we selected a set of samples used during delivery and belonging to the threat actors listed in Figure 4:
Figure 4: Number of samples per actor within the scope
Let’s analyze some of the results we obtained per actor.

APT28 – Images

We started by focusing on images APT28 has reused for different delivery samples (Figure 5).
Figure 5: Images shared in multiple documents by APT28
Each line in the Figure 5 graph represents the same image, and each point represents at least two samples that used that particular image.
The second image of the graph shows how it was used by different Office documents at different points in time, from 2018 to 2022 (dates related to their upload to VirusTotal).
Now, the chart in Figure 6 visualizes each of these images.
Figure 6: Content of the images shared in multiple documents by APT28
  • The first image is just a simple line with no particular meaning. It's embedded in over 100 files known by VirusTotal.

  • The second image is a hand and has 14 compressed parents.

  • The third image consists of black circles and also has over 100 compressed parents.

  • The last image is like a Word page with a table, presenting a fake EDA Roadmap of the European Commission. The image format is EMF (an old format) and it has 4 compressed parents.

If we delve into the compressed parents of the second image (the one with the hand), we can see how the image is used in Office documents that are part of a campaign reported by Mandiant attributed to APT28. The image of the hand was used in fake Word documents for hotel reservations, particularly in a small section where the client was supposed to sign.
Figure 7: Pivoting through a specific image used by APT28

SideWinder – Images

SideWinder (aka RAZOR TIGER) is a group focused on carrying out operations against military targets in Pakistan. This group has traditionally reused images, which might help in monitoring their activity.
Figure 8: Images shared in multiple documents by RAZOR TIGER
In particular, the image in Figure 9 was used in a sample uploaded in September 2021 and in a second one uploaded in March 2022. The image in question is the signature of Baber Bilal Haider.
Figure 9: Two different samples of RAZOR TIGER share the same image of a handwritten signature

Gamaredon – [Content_Types].xml and styles.xml

For Gamaredon we found they reused styles.xml and [Content_Types].xml in different documents, which helped reveal new samples.
The chart in Figure 10 displays all the [Content_Types].xml files from Gamaredon's Office documents.
Figure 10: [Content_Types].xml shared in multiple documents by Gamaredon Group
There are a large number of samples that share the same [Content_Types].xml. It's important to highlight that these [Content_Types].xml files are not necessarily exclusively used by Gamaredon, and can be found in other legitimate files created by users worldwide. However, some of these [Content_Types].xml might be interesting to monitor.
Styles.xml files are usually less generic, which should make them a better candidate to monitor:
Figure 11: Styles.xml shared in multiple documents by Gamaredon Group
We see styles.xml files are less reused than [Content_Types].xml. This could be because some of the samples used by this actor for distribution are created from scratch or reusing legitimate documents.
We used the patterns identified in the styles.xml files to launch a retrohunt on VirusTotal. Figure 12 visually represents the original set of styles.xml files (left) and those that were added later after running the retrohunt (right).
Figure 12: Initial graph of the styles.xml and its parents used by Gamaredon (left). Final graph after identifying new styles.xml and their parents using retrohunt in VirusTotal (right)
One of the new styles.xml files found in our retrohunt has 17 compressed parents, meaning it was included in 17 Office files.
Figure 13: Number of parent documents for a specific styles.xml file used by Gamaredon
All the parents were malicious, some of them identical and the rest very similar to each other. The content of many of them referred to "Foreign institutions of Ukraine - Embassy of Ukraine in Hungary," containing a table with phone numbers and information about the embassy, such as social media links and email accounts. Here's an example:
Figure 14: Document used by Gamaredon in one of its campaigns that includes multiple images which can be used to monitor new samples
The information for social media includes the logos of these platforms, such as the Facebook logo, Skype logo, an image of a telephone, etc. By pivoting on the image of the Facebook icon, we find that it has 12 additional compressed parents, meaning it appears in 12 documents, all of them sharing the same styles.xml file.
Visualizing all together, we find a set of about 12-14 images used within the same timeframe by the actor. All of these images can be found in the “Embassy of Ukraine in Hungary” document.
Figure 15: Pivoting through the Facebook image that included the document in Figure 14
There's a pattern evident in the previous image where different images were included in files uploaded simultaneously. This pattern is associated with multiple documents used in the same Embassy of Ukraine in Hungary campaign, all of which used the social media images described above.

Styles.xml shared between threat actors

Another aspect we explored was if different threat actors shared similar styles.xml files in their documents. Styles.xml files are somewhat more specific and unique than [Content_Types].xml files because they can contain styles created by threat actors or by legitimate entities that originally created the document and then were modified by the actor. This makes them stand out more and can help in identifying threat actor activity.
This doesn't necessarily imply they share information to conduct separate operations, although in some cases, it could be a scenario worth considering.
Figure 16: styles.xml shared between different threat actors
Of all styles.xml files related to actors in our initial set, only six of them were found to be shared by at least two actors. Some styles defined by the styles.xml file are very generic and could identify almost any type of file. However, there are others that could be interesting to explore further.
An interesting case is the styles.xml file that seems to be shared by Razor Tiger, APT28, and UAC-0099. Specifically, the samples from APT28 and UAC-0099 are interesting because they were uploaded to VirusTotal within short time frames, suggesting they might belong to the same threat actor.
You can see the list of hashes in the appendix of this blog.
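Finding files shared by at least two actors amounts to inverting an actor-to-hashes mapping. A minimal sketch of that step follows. The function name, the input structure, and the placeholder hash values in the usage note are illustrative assumptions, not our actual data.

```python
from collections import defaultdict


def shared_between_actors(hashes_by_actor, min_actors=2):
    """Map each embedded-file hash to the actors whose documents contain it.

    hashes_by_actor: {actor name: iterable of sha256 strings} (assumed input).
    Only hashes seen for at least `min_actors` distinct actors are returned.
    """
    actors_by_hash = defaultdict(set)
    for actor, hashes in hashes_by_actor.items():
        for file_hash in hashes:
            actors_by_hash[file_hash].add(actor)
    return {
        file_hash: sorted(actors)
        for file_hash, actors in actors_by_hash.items()
        if len(actors) >= min_actors
    }
```

For example, with the placeholder input {"ActorA": ["h1", "h2"], "ActorB": ["h2"], "ActorC": ["h3"]}, only "h2" is returned, mapped to ActorA and ActorB. The same function works for styles.xml and [Content_Types].xml hashes alike.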

[Content_Types].xml shared between threat actors

Like in the previous case, we checked if there were Office documents among different threat actors sharing [Content_Types].xml:
Figure 17: [Content_Types].xml shared between different threat actors
In this case, there are eleven [Content_Types].xml files that are shared by at least two different actors.
An interesting case here is the file dfa90f373b8fd8147ee3e4bfe1ee059e536cc1b068f7ec140c3fc0e6554f331a, which is shared by Gamaredon, APT37, Mustang Panda, APT28, SideCopy, and UAC-0099. Again, there could be different explanations for this.
Another interesting case that is worth analyzing in detail is [Content_Types].xml with hash 4ea40d34cfcaf69aa35b405c575c7b87e35c72246f04d2d0c5f381bc50fc8b3d, which is only shared by APT28 and APT29.
You can see the list of hashes in the appendix of this blog.

AI to the rescue

The images reused by attackers seem to be a promising idea we decided to further explore.
We used the VirusTotal API to download and unzip a set of Office documents used for delivery, which gave us all the embedded images. Then we used Gemini to automatically describe what these images were about.
Figure 18: Results obtained with Gemini after processing some of the embedded images in the documents used by the threat actors
Figure 18 shows some examples of images that were incorporated by certain actors. There were also other results that were not helpful, mainly related to images that did not show a logo or anything specific that indicated what they were.
Figure 19: Results obtained with Gemini after processing some of the embedded images in the documents used by the threat actors
Using the VirusTotal API to obtain the documents you are looking for, and combining the results with Gemini to analyze their images automatically, can help analysts monitor suspicious documents and build their own database of samples that use specific images, for example government or company images. This approach is interesting not only for threat hunting but also for brand monitoring.
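The download-and-unzip step can be sketched as follows. This is a hedged illustration rather than our pipeline: it assumes the file download endpoint of the VirusTotal API v3, an API key with download privileges, and that images sit under a media folder inside the document. The function names are hypothetical, and the image description step is intentionally left out.

```python
import io
import urllib.request
import zipfile

# Assumed endpoint: file download in the VirusTotal API v3.
VT_DOWNLOAD_URL = "https://www.virustotal.com/api/v3/files/{sha256}/download"


def download_sample(sha256, api_key):
    """Download a sample's bytes (requires a key allowed to download files)."""
    request = urllib.request.Request(
        VT_DOWNLOAD_URL.format(sha256=sha256),
        headers={"x-apikey": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def extract_images(doc_bytes):
    """Return {member name: image bytes} for files under a media folder."""
    images = {}
    with zipfile.ZipFile(io.BytesIO(doc_bytes)) as archive:
        for name in archive.namelist():
            if "/media/" in name.lower() and not name.endswith("/"):
                images[name] = archive.read(name)
    return images
```

Each value returned by extract_images could then be handed to an image description model and the resulting text stored next to the sample hash.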

PDF Documents

Images dropped by Acrobat Reader

Unlike Office documents, PDF files don't contain embedded XML files or images, although some PDF files may be created from Office documents. Some of our sandboxes include Adobe Acrobat Reader to open PDF documents, which generates a thumbnail of the first page in BMP format. This image is stored in the directory C:\Users\\AppData\LocalLow\Adobe\Acrobat\DC\ConnectorIcons. Consequently, our sandboxes provide this BMP image as a dropped file from the PDF, allowing us to pivot.
To illustrate this functionality, see Figure 20 attributed to Blind Eagle, a cybercrime actor associated with Latin America.
Figure 20: Content of a PDF file related to Blind Eagle threat actor
Figure 20 was provided by our sandbox. In the "relations" tab, we can see the BMP image as a dropped file:
Figure 21: BMP file generated by the sandbox that can be used for pivoting
The BMP file itself also shows relations, in particular up to 6 PDF files in the "execution parents" section. In other words, there are other PDFs that look exactly the same as the initial one.
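The same pivot can be done programmatically. The sketch below assumes the file relationship endpoints of the VirusTotal API v3 (for example execution_parents, compressed_parents, or email_parents) and the usual response shape with a top-level "data" list. The helper names and the default limit are illustrative choices.

```python
import json
import urllib.request

# Assumed endpoint layout for file relationships in the VirusTotal API v3.
VT_RELATIONSHIP_URL = "https://www.virustotal.com/api/v3/files/{sha256}/{relationship}"


def relationship_url(sha256, relationship, limit=20):
    """Build the URL that lists objects related to a file."""
    base = VT_RELATIONSHIP_URL.format(sha256=sha256, relationship=relationship)
    return f"{base}?limit={limit}"


def related_ids(response_body):
    """Extract the identifiers of the related objects from a JSON response."""
    payload = json.loads(response_body)
    return [item["id"] for item in payload.get("data", [])]


def fetch_related(sha256, relationship, api_key):
    """Fetch one page of related objects, e.g. the PDFs that dropped a BMP."""
    request = urllib.request.Request(
        relationship_url(sha256, relationship),
        headers={"x-apikey": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return related_ids(response.read())
```

Calling fetch_related with the thumbnail's hash and "execution_parents" would return the first page of PDFs that produced that thumbnail. Pagination is not handled in this sketch.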
Typically, many actors engaged in financial crime activities utilize widely spread PDF files to deceive their victims, making this approach highly valuable. Another interesting example we found involves phishing activities targeting a Russian bank called "Tinkoff Bank."
The PDF files urge victims to accept an invitation from this bank to participate in a project.
Figure 22: The content of a PDF file used by cybercrime actors
Applying the same approach, we identified 20 files with identical content, most of them classified as malicious by AV engines.
Figure 23: BMP file generated by the sandbox that can be used for pivoting, in this case shared by 20 other PDF files
There are some limitations to this approach. For instance, the PDF file might be slightly modified (font size, some letter/word, color, …) which would generate a completely different hash value for the thumbnail we use to pivot.
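This limitation follows directly from how cryptographic hashes behave, as the short illustration below shows. The byte strings are made-up stand-ins for two thumbnails, not real BMP data.

```python
import hashlib


def sha256_hex(data):
    """Return the hexadecimal SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


# Stand-ins for two thumbnails that differ in a single byte (one pixel value).
original_thumbnail = b"BM" + bytes(64)
modified_thumbnail = original_thumbnail[:-1] + b"\x01"

# Same size, almost the same content, yet the digests do not match,
# so a hash-based pivot from one thumbnail will not reach the other.
digests_match = sha256_hex(original_thumbnail) == sha256_hex(modified_thumbnail)
```

Here digests_match is False. Catching near-duplicates would need a similarity measure rather than an exact hash, which is outside the scope of this post.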

Other artifacts dropped during detonation

Just like the BMP files generated by Acrobat Reader, there are other interesting files that might be dropped during sandbox detonation. These artifacts can be useful on some occasions.
The first example is a JavaScript file dropped in another PDF attributed to Blind Eagle.
Figure 24: BMP file generated by the sandbox that can be used for pivoting, another example of Blind Eagle threat actor
The dropped JavaScript file's name during the PDF execution was "Chrome Cache Entry: 566", indicating that this file was likely generated by opening a URL through Chrome, possibly triggered by a sandbox click on a link within the PDF. Examining the file's contents, we observe some strings and variables in Spanish.
Figure 25: Artifact generated by the sandbox via Google Chrome when connecting to a domain
The strings “registerResourceDictionary”, “sampleCustomStringId”, and “rf_RefinementTitle_ManagedPropertyName” are related to Microsoft SharePoint, as we were able to confirm. These files were probably generated after visiting sites that have Microsoft SharePoint functionalities. We found that all the PDFs containing this artifact dropped by Google Chrome came from a website belonging to the Government of Colombia.
Figure 26: Flow of artifact generation related to Google Chrome that can be used for pivoting in VirusTotal
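Checking a dropped artifact for these strings is easy to automate. The following is a minimal sketch that assumes the artifact's content is already available as bytes. The marker list comes from the strings quoted above, while the function name is illustrative.

```python
# Strings observed in the dropped JavaScript file and tied to SharePoint.
SHAREPOINT_MARKERS = (
    "registerResourceDictionary",
    "sampleCustomStringId",
    "rf_RefinementTitle_ManagedPropertyName",
)


def matched_markers(content):
    """Return the SharePoint-related markers present in a dropped artifact."""
    text = content.decode("utf-8", errors="replace")
    return [marker for marker in SHAREPOINT_MARKERS if marker in text]
```

An artifact that matches all three markers is a reasonable candidate for the pivot described above. A single match on its own is weak evidence.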

Email files

Many threat actors incorporate images in their emails, such as company logos, to deceive victims. We used this to identify several mailing campaigns where the same footer was used.

Campaign impersonating universities

On November 13, 2023, we shared details about a new campaign impersonating universities, primarily located in Latin America. By leveraging the presence of social network logos in the footer, we were able to find more universities in different continents targeted by the same attacker.
Figure 27: Email impersonating a university that contains multiple images
Figure 27 shows several images, including the University of Chile's logo and building, as well as images related to social networks like YouTube, Facebook, and Twitter.
Pivoting through the images related to the University of Chile doesn't yield good results, as it's too specific. However, if we pivot through the images of the social media footer, represented as email attachments, we can observe multiple files using the same logo.
Figure 28: Using the images from the email footer to pivot and identify new emails
Just by analyzing one of the social media logos, we saw 33 email parents, all of them related to the same campaign.
Figure 29: Other emails identified through image pivoting techniques
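To pivot on footer images outside the VirusTotal interface, the image attachments of an email can be hashed locally. This sketch uses Python's standard email package and assumes the message is available as raw .eml bytes. The function name is illustrative.

```python
import hashlib
from email import message_from_bytes, policy


def image_attachment_hashes(eml_bytes):
    """Return (name, sha256) pairs for every image part found in an email."""
    message = message_from_bytes(eml_bytes, policy=policy.default)
    results = []
    for part in message.walk():
        if part.get_content_maintype() != "image":
            continue
        payload = part.get_payload(decode=True)  # undo base64 and similar encodings
        if payload:
            # Inline images may have no filename, so fall back to the MIME type.
            name = part.get_filename() or part.get_content_type()
            results.append((name, hashlib.sha256(payload).hexdigest()))
    return results
```

Hashes that recur across many unrelated emails, such as generic social media icons, are the ones worth pivoting on. Images specific to a single impersonated organization tend to return few results.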

Campaigns impersonating companies

Another common case is adding a company logo to the email signature to enhance credibility. Logos of delivery companies, banks, and suppliers were among the images we observed most often during our research.
For example, this email utilizes the corporate image of China Anhui Technology Import and Export Co Ltd in the footer.
Figure 30: Email impersonating a Chinese organization using the company logo in the footer
Pivoting through the image we found 20 emails using the same logo.
Figure 31: Other emails identified through image pivoting techniques

Wrapping up

We can potentially trace malicious actors by examining artifacts linked to the initial spreading documents, and in the case of images, AI can help us automate potential victim identification and other hunting aspects.
In order to make this even easier, we are planning to incorporate a new bundled_files field into the IOCs JSON structure, which basically will help to create livehunt rules. In the meantime you can use vt_behaviour_files_dropped.sha256 for those scenarios where the files are dropped.
In certain situations, the styles.xml and [Content_Types].xml files within office documents can provide valuable clues for identifying and tracking the same threat actor. The method presented here offers an alternative to traditional hunting or pivoting techniques, serving as a valuable addition to a team's hunting activities.
We hope you found this research interesting and useful, and as always we are happy to hear your feedback.
Happy hunting!


[Content_Types].xml shared between threat actors

Shared by:

  • APT33, APT32
  • APT29, APT28 (sha256 4ea40d34cfcaf69aa35b405c575c7b87e35c72246f04d2d0c5f381bc50fc8b3d)
  • FIN7, Gamaredon, APT28, APT32
  • FIN7, APT33, TA505, Mustang Panda
  • Gamaredon, APT33
  • Gamaredon, Hazy Tiger, APT33
  • Razor Tiger, APT28, UAC-0099
  • Razor Tiger, SideCopy
  • Gamaredon, APT37, Mustang Panda, APT28, UAC-0099, SideCopy (sha256 dfa90f373b8fd8147ee3e4bfe1ee059e536cc1b068f7ec140c3fc0e6554f331a)
  • FIN7, Hazy Tiger
  • Mustang Panda, APT32

styles.xml shared between threat actors

Shared by:

  • APT28, UAC-0099, Razor Tiger
  • Hazy Tiger, Gamaredon, APT33
  • TA505, Gamaredon
  • APT28, FIN7, Razor Tiger, APT32, APT33
  • Hazy Tiger, FIN7
  • APT32, SideCopy, Mustang Panda, Razor Tiger