meta_pe: fix rich header length check for hash calculation #50

knowmalware · 2016-09-23T01:16:28Z

The original Rich Signature write-up:
http://www.ntcore.com/files/richsign.htm
searches 400 bytes for the "Rich" string. The pefile module
searches 128 bytes for the string. I have found that 128 is
sometimes not enough, and 400 feels rather large, so I have
chosen a round (hex) value in between.

I also take a lesson from the original write-up and search for
NULL values, but added a search for the PE header as well.

Note that because we rely on pefile module for the Rich Header
Values, that array will be incomplete, and thus differ from that
used for hash calculation, until pefile itself is fixed.
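
A minimal sketch of the bounded search described above, not the PR's actual code. The function name `find_rich_marker`, the 0x80 start offset, and the 0x100 window are assumptions for illustration; the message only says the chosen value is a round hex number between 128 and 400.

```python
RICH_MARKER = b"Rich"
DOS_STUB_END = 0x80  # assumed start of the Rich header (end of the usual DOS stub)


def find_rich_marker(data: bytes, window: int = 0x100, start: int = DOS_STUB_END) -> int:
    """Return the file offset of b"Rich" within `window` bytes of `start`, or -1."""
    # bytes.find only reports matches that lie entirely inside [start, start + window).
    return data.find(RICH_MARKER, start, start + window)
```

With a 128-byte window, a marker sitting 0x90 bytes past the start is missed; the wider window finds it.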

marnao · 2017-01-18T13:45:36Z

@knowmalware thanks for the pull request. We definitely need to increase the search area for the rich header ending, although I'm not sure what the optimal value is. Your guess is as good as any.

I'm not sure I follow the other part of your modification, specifically around looking for the null values and PE header. When would this be useful? Do you have any samples you could point to?

knowmalware · 2017-01-20T03:02:59Z

The other part of my modification only matters when the Rich header has been tampered with or replaced. The search for the PE header is the first attempt to find the end, as the Rich header should be right before the PE header. The search for NULL values is the fall-back, as the Rich header should not contain any NULL dwords. In practice, I usually see a set of NULL bytes before the PE header, so it made sense to me.
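
The two-step fallback described here could look roughly like the following. This is a sketch under stated assumptions, not the code in the PR: `guess_rich_end`, the 0x80 start, and the 0x100 search limit are hypothetical, and the NULL check walks dword-aligned offsets from the start.

```python
def guess_rich_end(data: bytes, start: int = 0x80, limit: int = 0x100) -> int:
    """Guess where Rich-header-like data ends when no "Rich" marker is present."""
    stop = min(len(data), start + limit)

    # First attempt: the Rich header should sit right before the PE signature.
    pe_off = data.find(b"PE\x00\x00", start, stop)
    if pe_off != -1:
        return pe_off

    # Fallback: a valid Rich header should not contain an all-zero dword.
    for off in range(start, stop - 3, 4):
        if data[off:off + 4] == b"\x00\x00\x00\x00":
            return off

    # Nothing found: treat the whole search window as the candidate region.
    return stop
```

Note that when the PE signature is found first, any NULL padding in front of it is included in the returned range; a stricter variant might trim it.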

If you're uncomfortable with this, I can change the PR to just not produce a hash if the Rich string doesn't exist, so that the code doesn't cause an exception when analyzing a PE file produced by a non-Microsoft compiler. But I'd prefer to leave it as-is for malware analysis purposes, as any changes to the Rich header could still be interesting from a similarity perspective.

marnao · 2017-02-03T18:41:46Z

@wxsBSD would you mind reviewing this change? You're probably more familiar with this stuff than I am given your work on yara.

wxsBSD · 2017-02-05T03:46:39Z

Shouldn't it be possible to go from the Rich header start to (at most) the NT header start? I.e., it should be from 0x80 to uint32(0x3c). You can also be extra careful and ensure it ends with DanS.

wxsBSD · 2017-02-05T12:39:17Z

Also, starting at 0x80 works because nobody ever changes the size of the DOS stub. The right thing to do is calculate the starting offset and ensure it is Rich.
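
As an illustration of bounding the search by the NT header offset and validating the `DanS` marker, here is a hedged sketch that works on raw bytes. The name `rich_region` and the fixed 0x80 start are assumptions. It relies on the usual Rich header layout: the dword after `Rich` is the XOR key, and the decoded data begins with `DanS`.

```python
import struct


def rich_region(data: bytes, start: int = 0x80):
    """Return (key, decoded_bytes) for the Rich header, or None if it does not validate."""
    if len(data) < 0x40:
        return None
    # e_lfanew at 0x3C holds the file offset of the NT header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)

    # Only look for the "Rich" marker between the DOS stub and the NT header.
    rich_off = data.find(b"Rich", start, e_lfanew)
    if rich_off == -1 or rich_off + 8 > e_lfanew:
        return None

    key = data[rich_off + 4:rich_off + 8]  # XOR key stored right after "Rich"
    raw = data[start:rich_off]
    decoded = bytes(b ^ key[i % 4] for i, b in enumerate(raw))

    # Extra care: the decoded region must begin with the DanS marker.
    if not decoded.startswith(b"DanS"):
        return None
    return key, decoded
```

A more robust variant would compute the starting offset instead of assuming 0x80, for example by scanning backwards from `Rich` for the encoded `DanS` dword.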

wxsBSD · 2017-02-06T14:42:43Z

I haven't looked at Frank's code, but does erocarrera/pefile@a3e5d09 not look like it makes rich header parsing more robust, in the manner I'm describing? It seems to search for the ending up to the NT header.

knowmalware · 2017-02-07T03:35:20Z

I half agree with Wes. I'll change my PR to search from 0x80 to pe.NT_HEADERS.get_file_offset(), which is what pefile should be doing instead of searching through to the OPTIONAL_HEADER.

If there are non-null bytes between 0x80 and the start of the PE header, I'd still be interested in that result from a malware analysis perspective, but perhaps it should be called something other than the rich sig in that case. I'll update this PR accordingly.

agrajag9

Mods at 247 look good, but mods at 257 can be better accounted for by code in issue 55 (#55)

The latest version of pefile provides easy access to the deobfuscated rich header by accessing the 'clear_data' key of the parsed rich header.

knowmalware · 2017-09-02T03:19:05Z

Latest release of pefile makes this much easier. Updated the code to use the clear data exposed by pefile.
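
For readers following along, hashing the clear data might look like the sketch below. It is not the code from this PR: `rich_header_md5` is a hypothetical helper, MD5 is an assumed choice of digest, and the input is assumed to be the dictionary that recent pefile releases return from `PE.parse_rich_header()` (or `None` when no Rich header is found).

```python
import hashlib


def rich_header_md5(rich):
    """Return the MD5 hex digest of the deobfuscated Rich header, or None if absent."""
    # `rich` is assumed to be a dict with a 'clear_data' bytes entry, or None.
    if not rich or not rich.get("clear_data"):
        return None
    return hashlib.md5(rich["clear_data"]).hexdigest()
```

With pefile installed, this would be called along the lines of `rich_header_md5(pefile.PE(path).parse_rich_header())`. Returning `None` rather than raising keeps files built by non-Microsoft toolchains from causing an exception.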

knowmalware · 2017-09-02T03:21:03Z

should also fix #55

agrajag9 suggested changes Mar 24, 2017

Simplify rich header hash calculation with latest pefile.

d99c819
