Microsoft Office is arguably the most widely used suite for processing documents, spreadsheets, and
presentations. Due to its popularity, it is continuously abused to carry out malicious campaigns. Threat
actors exploit the platform’s dynamic features to launch their attacks and penetrate millions of
hosts.
This work explores the modern landscape of malicious Microsoft Office documents, exposing the means
that malware authors use. We leverage a taxonomy of the tools used to weaponise Microsoft Office
documents and explore the modus operandi of malicious actors. Moreover, we generated and publicly shared
a specially crafted dataset of benign and malicious documents containing
many dynamic features, such as VBA macros and DDE. Such a dataset is crucial for a fair and realistic
analysis, an open issue in the current state of the art. It allows us to draw sound conclusions on the malicious
features and behaviour. More precisely, we extract the necessary features with an automated analysis
pipeline and use machine learning to classify a document as benign or malicious efficiently and accurately,
achieving an F1 score above 0.98 and outperforming current state-of-the-art detection algorithms.