Monday, May 20, 2024

YARA is dead, long live YARA-X

For over 15 years, YARA has been growing and evolving until it became an indispensable tool in every malware researcher's toolbox. Throughout this time YARA has seen numerous updates, with new features added and countless bugs fixed. But today, I'm excited to announce the biggest change yet: a full rewrite.

YARA-X is a completely new implementation of YARA in Rust, and it has the following goals:

  • Better user experience: The new command-line interface is more modern and colorful, and error reports are now more explicative. More features aimed at improving the user's experience will be incorporated in the future.

  • Rule-level compatibility: While achieving 100% compatibility is tough, our aim is to make YARA-X 99% compatible with YARA at the rule level. Incompatibilities should be minimal and thoroughly documented.

  • Improved performance: YARA is known for its speed, but certain rules, especially those utilizing regular expressions or complex loops, can slow it down. YARA-X excels with these rules, often delivering significantly faster results. Our ultimate goal is for YARA-X to outperform YARA across the board.

  • Enhanced reliability and security: YARA's complexity in C code can lead to bugs and security vulnerabilities. YARA-X is built with Rust, offering greater reliability and security.

  • Developer-friendly: We're prioritizing ease of integration into other projects and simplified maintenance. Official APIs for Python, Golang, and C are provided to facilitate seamless integration. YARA-X also addresses some of the design flaws that made YARA challenging to maintain and extend.

Why a rewrite?

Was a complete rewrite necessary to achieve such goals? This question lingered in my mind for a long time before deciding to rewrite YARA. Rewriting is risky, it introduces new bugs, backward compatibility issues, and doubles the maintenance efforts, since legacy code doesn't disappear after launching the new system. In fact, the legacy system may be still in use for years, if not decades.

However, I believe a rewrite was the right decision for multiple reasons:

  • YARA is not a large project, it's a medium-size project that lacks subsystems or components large enough to be migrated in isolation. Incremental migration to Rust was impractical because large portions of the code are interconnected.
  • The improvements I envisioned required significant design changes. Implementing these in the existing C codebase would involve extensive rewrites, carrying the same risks as starting fresh with Rust.
  • After a year of working on the project, I’ve found Rust easier to maintain than C. Rust offers stronger reliability guarantees and simplifies integrating third-party code, especially for multi-platform projects.

Is YARA really dead?

Despite the dramatic title of this post, YARA is not actually dead. I’m aware that many people and organizations rely on YARA to get important work done, and I don’t want to let them down.

YARA is still being maintained, and future releases will include bug fixes and minor features. However, don’t expect new large features or modules. All efforts to enhance YARA, including the addition of new modules, will now focus on YARA-X.

What's the current state of YARA-X?

YARA-X is still in beta, but is mature and stable enough for use, specially from the command-line interface or one-shot Python scripts. While the APIs may still undergo minor changes, the foundational aspects are already established.

At VirusTotal, we have been running YARA-X alongside YARA for a while, scanning millions of files with tens of thousands of rules, and addressing discrepancies between the two. This means that YARA-X is already battle-tested. These tests have even uncovered YARA bugs!

Please test YARA-X and don't hesitate to open an issue if you find a bug or some feature that you want to see implemented.

What's next?

My aim is to surpass YARA in every possible aspect with YARA-X. I want it to be so superior that existing YARA users willingly migrate to YARA-X for its undeniable advantages, not because they are forced to do so.

Publishing a beta version is only the first step towards this goal. I'll continue to enhance YARA-X, releasing updates and sharing insights through blog posts like this one.

Stay tuned, because this journey has only just begun.

Wednesday, May 15, 2024

Crowdsourced AI += ByteDefend

We are pleased to announce the integration of a new solution into our Crowdsourced AI initiative. This model, developed by Dr. Ran Dubin from the Department of Computer Science at Ariel University and head of ByteDefend Cyber Lab at the Ariel Cyber Innovation Center, is designed to analyze suspicious macros in Microsoft Office files, including Word, Excel, and PowerPoint.

VirusTotal's Crowdsourced AI initiative leverages various AI models and community contributions to strengthen cyber defense strategies. Like any other security solution, AI-based models are not infallible, but they offer invaluable contributions by complementing other technologies in analyzing and detecting new threats. The integration of ByteDefend enhances VirusTotal's Code Insight capabilities, currently with up to three independent AI engines for Microsoft Office documents.

Here is the most recent example at the time of writing: all three models agree that the analyzed XLS file is malicious, each providing different levels of detail.


Here's another example where the models don't agree. ByteDefend flags a DOC file as malicious, while Hispasec's engine says it's benign. These disagreements are interesting because even though the final verdict can be subjective depending on the context (what's risky in one situation might not be in another), the models clearly explain how the macros work. This gives the human analyst all the information they need to make the final call..


AI reports’ results are available via VT Intelligence, allowing the use of the "bytedefend_ai_analysis:" modifier to search into the resulting AI’s output, and "bytedefend_ai_verdict:" to search by verdict - malicious or benign. As an example, below we show the results of searching for ByteDefend reports where "telegram" is mentioned and the verdict is "malicious". This search is performed using the following query: bytedefend_ai_analysis:telegram and bytedefend_ai_verdict:malicious


We extend our thanks to Dr. Ran Dubin and the ByteDefend Cyber Lab for their valuable contribution to VirusTotal's Crowdsourced AI initiative. We are continuously working to expand this effort by welcoming more contributors with diverse skills and expertise. Our goal is to build a collaborative and powerful defense strategy to tackle the constantly evolving landscape of cyber threats. We encourage others in the security community to join us in this effort.

Monday, May 06, 2024

VirusTotal's Mission Continues: Sharing Knowledge, Protecting Together

With the recent announcement of Google Threat Intelligence, I want to take this opportunity, as VirusTotal's founder, to directly address our community and reiterate our unwavering commitment to our core mission.

First and foremost, I want to assure our entire community, from security researchers and industry partners to individual users, that VirusTotal's core mission remains unchanged. We remain deeply dedicated to collective intelligence and collaboration, fostering a platform where everyone can come together to share knowledge, access valuable threat information, and contribute to the fight against cyber threats.

Google Threat Intelligence is a new offering that builds upon the strengths of Google, Mandiant, VirusTotal, and other sources. It will be available as a premium tier, evolving the existing VirusTotal Enterprise platform, as well as the Mandiant Advantage Threat Intelligence one.

Importantly, VirusTotal remains committed to a level playing field, ensuring all partners, including Google Threat Intelligence, have equal access to the crowdsourced data VirusTotal collects. We also want to assure you that the core features and functionalities of VirusTotal will remain free and accessible to everyone, as always.

The strength of VirusTotal lies in its network of contributors and the vast amount of data they provide. This data serves as a valuable resource for the entire security industry, empowering our partners and others to enhance their products and contribute to a more secure digital world. This collaborative approach, based on transparency and equal access, strengthens the industry as a whole, ultimately leading to better protection for everyone.

We understand that change can be unsettling, but we want to assure you that VirusTotal is here to stay. We are excited about the future and the opportunity to continue sharing knowledge and protecting together with all of you, making the digital world a safer place through the power of collective intelligence.

Thank you for your continued support.

Bernardo Quintero
Founder of VirusTotal