Sunday, December 25, 2011

, ,

Moving to Google App Engine

Sunday, December 25, 2011 Marco de la Vega 22 comments
We are releasing a new VirusTotal version, most of you will not notice much of a change (other than the new layout), nonetheless, it is full of exciting features.  So as to describe them, we are also launching this new blog.

The most noticeable difference is in its backstage, we have moved to Google App Engine. We expect this to bring transparent scalability and high availability (cross your fingers) to VirusTotal. Some months ago we migrated our private API to Google's infrastructure and we could not be happier with the decision, we have forgotten about administration and we can now focus exclusively on coding.

You will also perceive that very often your file uploads will be immediate, this is because we have made use of HTML5 (in those browsers that support it) to compute the hash of the file on the client-side so as to avoid submissions of files that are already present in our store. Additionally, the maximum allowed file size has been increased to 32MB, in coherence with the App Engine's request handler limits.

Regarding the URL scanner, it is probably the VirusTotal feature that has experienced the greatest number of changes:
  • Thanks to App Engine's services the analysis is much faster than before.
  • We have integrated a couple of new engines (VX Vault, SCUMWARE.org, CLEAN MX, etc.) and the total number of scanners now adds up to 19.
  • We added an extended additional information section to the URL reports which includes detailed information returned by the scanning engines (Trend Micro description, Websense ThreatSeeker category, etc.) and by other services that provide information related to the domain/host of the scanned URL (e.g. EXPOSURE).
  • As its predecessor, the new URL scanner also downloads the files (response content) at the scanned URLs, however, this new version will only enqueue for antivirus scanning those files that are not text or similar content (HTML, XML, etc.).
  • With independence of the nature of the response content, the URL scanner will always record the server response headers, this might prove itself useful in tracking the bad guys since very often they will be making use of customised server setups that return certain headers that may be used for fingerprinting.
Neither have we forgotten about the public API and we are releasing its second version, improving the response format so as to be able to add new information to it in the future without having to change the parsers that you might have in place. This new API provides a closer integration with a new version of the private API, so that moving from one to another is far easier than before.

VirusTotal Community has also been subjected to several modifications. You may now vote a file or URL as malicious or harmless without having to comment on it. These votes are used (along with other notions provided by the tools present in VirusTotal) to build a file/URL reputation index that replaces the old safety score. This new index runs from -100 (unanimously malicious) to 100 (unanimously harmless). At the same time, there are no longer standard tags in comments, it must be you (making use of the hashtag - # - symbol) the one that explicitly defines a tag for the comment. Other changes include a new user reputation system that is detailed in the corresponding VirusTotal documentation section.

But not everything have been improvements, unfortunately, in this very first App Engine release the twitter-like public profile comments have been removed and the statistics section has been considerably reduced. The latter is something we pretend to improve over the coming weeks along with other new features. Our roadmap for the near future would, thus, look something like this:
  • Recover the ability to compact VirusTotal reports or transform them to popular formats such as bbcode, HTML, CSV, etc.
  • Include the NSRL file information, which has not been migrated for this very first App Engine release.
  • Improve the statistics section, including not only file scanning indicators but also URL scanning statistics and VirusTotal Community activity.
  • Expand the number of notions in the VirusTotal Community tab so as to create a greater buzz.
  • Include new URL and domain scanners: Malware Domain Blocklist, Palevo Tracker, Malware Patrol, etc.
  • Allow VirusTotal API users to define a URL where their scan results can be posted back as soon as they are available so as to avoid periodic polling for result retrieving.
  • Appoint VirusTotal Community moderators that may ban offensive comments, track down users faking their own or other users' reputation, and ensure the overall quality of the comments in the Community.
  • Build a malware research board that complements VirusTotal Community.
  • Translate the site to as many languages as possible.
And some other features that must remain confidential for the time being which we are completely sure that will delight software developers and site owners.

As usual, we would love to receive your feedback and suggestions, and we hope the new release results in a better VirusTotal experience.