Retrohunt

Last updated: July 2, 2024

Retrohunt finds code-wise similar malware samples with a single click, within seconds. This capability improves significantly on the traditional YARA-based approach to retrohunting, which requires more skill and time and is generally less effective at finding different versions of the same malware.

A Retrohunt searches your private data and a global threat feed containing more than 100 million unique samples, with no limit on how far back in time it searches.

To launch a Retrohunt, click the Retrohunt button on any piece of executable code, such as a submitted file or a memory region. Retrohunt is not supported for non-executable content, such as scripts or documents.

Each Retrohunt searches for similar samples based on all functions in the binary, excluding benign functions. This prevents matches that rest only on legitimately shared code, such as compiler code or benign third-party libraries.

In the pictured example, a Retrohunt will search for the 75 relevant functions while excluding 33 benign functions. Retrohunt is agnostic to actual code detections, working equally on detected and undetected code.
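The function selection described above can be pictured with a short Python sketch. This is an illustration only, not Retrohunt's implementation: the `Function` class, the `is_benign` flag, and `select_relevant` are hypothetical names, and how the product actually classifies a function as benign is not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    """Hypothetical record for one function found in a binary."""
    address: int
    is_benign: bool  # assumed flag: compiler code, benign third-party library, etc.


def select_relevant(functions):
    """Keep only the functions that would take part in the search."""
    return [f for f in functions if not f.is_benign]


# Mirror the counts from the example: 75 relevant and 33 benign functions.
functions = [
    Function(address=0x1000 + i * 0x10, is_benign=(i >= 75))
    for i in range(108)
]

relevant = select_relevant(functions)
excluded = len(functions) - len(relevant)
print(f"searching with {len(relevant)} functions, excluding {excluded} benign ones")
```

Note that the filter looks only at the benign flag, not at whether a function was detected as malicious, which matches the statement that Retrohunt works equally on detected and undetected code.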

Similarity between functions, and thus between samples, is determined by identifying structural similarities rather than byte sequences or patterns. This technology, which is the same as that used for code detections, is explained in detail here.

Once a Retrohunt is launched, a search query (1) is issued. The search syntax and the results view are explained here. The hash in the query acts as a selector for the code underlying the binary file that has this hash.

The search results (2) are displayed like those of any other query, listing the analyses that contain similar code. Three additional columns are included: the similarity score (3), by which the table is sorted; the function counts (4), indicating how many functions the query and the match contain; and the match location in the analysis (5). In this example, the matching code is located in the process with PID 1872 at the base address 0x4b0000. Clicking the location entry opens the analysis report and navigates directly to this region for easy pivoting.
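For readers who think in terms of data rather than screenshots, the three extra columns can be modeled as a small Python sketch. Everything here is an assumption made for illustration: the `Match` class, its field names, the 0.0 to 1.0 score scale, the descending sort order, and the sample values other than PID 1872 and base address 0x4b0000 are hypothetical and do not describe an actual Retrohunt API or export format.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """Hypothetical row of a Retrohunt result table."""
    analysis_id: str      # made-up identifier of the matching analysis
    similarity: float     # assumed scale: 0.0 (unrelated) to 1.0 (identical)
    query_functions: int  # functions in the query code
    match_functions: int  # functions in the matching code
    pid: int              # process that contains the matching code
    base_address: int     # base address of the matching region


def sort_by_similarity(matches):
    """Order rows by similarity score, best match first (assumed direction)."""
    return sorted(matches, key=lambda m: m.similarity, reverse=True)


def format_location(match):
    """Render the match location as a PID and base address."""
    return f"PID {match.pid} @ {match.base_address:#x}"


rows = [
    Match("analysis-a", 0.62, 75, 80, 2040, 0x400000),
    Match("analysis-b", 0.97, 75, 75, 1872, 0x4B0000),
]

best = sort_by_similarity(rows)[0]
print(best.analysis_id, format_location(best))
```

Keeping the location as separate PID and base-address fields, rather than one string, makes it simple to pivot to the same memory region in another tool.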

You can also Retrohunt for just a selection of functions in a binary file using the IDA plugin described here.