-
Notifications
You must be signed in to change notification settings - Fork 156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
meta_pe: fix rich header length check for hash calculation #50
base: master
Are you sure you want to change the base?
Conversation
The original Rich Signature write-up: http://www.ntcore.com/files/richsign.htm searches 400 bytes for the "Rich" string. The pefile module searches 128 bytes for the string. I have found that 128 is sometimes not enough, and 400 feels rather large, so I have choosen a round (hex) value in between. I also take a lesson from the original write-up and search for NULL values, but added a search for the PE header as well. Note that because we rely on pefile module for the Rich Header Values, that array will be incomplete, and thus differ from that used for hash calculation, until pefile itself is fixed.
@knowmalware thanks for the pull request.. we definitely need to increase the search area for the rich header ending, although I'm not sure what the optimal value is. Your guess is as good as any. I'm not sure I follow the other part of your modification, specifically around looking for the null values and PE header. When would this be useful? Do you have any samples you could point to? |
The other part of my modification only matters when the Rich header has been tampered with or replaced. The search for the PE header is the first attempt to find the end, as the Rich header should be right before the PE header. The search for NULL values is the fall-back, as the Rich header should not contain any NULL dwords. In practice, I usually see a set of NULL bytes before the PE header, so it made sense to me. If you're uncomfortable with this, I can change the PR to just not produce a hash if the Rich string doesn't exist, so that the code doesn't cause an exception when analyzing a PE file produced by a non-Microsoft compiler. But I'd prefer to leave it as-is for malware analysis purposes, as any changes to the Rich header could still be interesting from a similarity perspective. |
@wxsBSD would you mind reviewing this change? You're probably more familiar with this stuff than I am given your work on yara. |
Shouldn't it be possible to go from the Rich header start to (at most) the NT header start? IE: It should be from 0x80 to uint32(0x3c). You can also be extra careful and ensure it ends with DanS. |
Also, starting at 0x80 works because nobody ever changes the size of the DOS stub. The right thing to do is calculate the starting offset and ensure it is Rich. |
I haven't looked at Frank's code, but does erocarrera/pefile@a3e5d09 not look like it makes rich header parsing more robust, in the manner I'm describing? It seems to search for the ending up to the NT header. |
I half agree with Wes. I'll change my PR to search from 0x80 to If there's non-null bytes between 0x80 and the start of the PE header, I'd still be interested in that result from a malware analysis perspective, but perhaps it should be called something other than the rich sig in that case. I'll updated this PR accordingly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mods at 247 look good, but mods at 257 can be better accounted for by code in issue 55 (#55)
The latest version of pefile provides easy access to the deobfuscated rich header by accessing the 'clear_data' key of the parsed rich header.
Latest release of pefile makes this much easier. Updated the code to use the clear data exposed by pefile. |
should also fix #55 |
The original Rich Signature write-up:
http://www.ntcore.com/files/richsign.htm
searches 400 bytes for the "Rich" string. The pefile module
searches 128 bytes for the string. I have found that 128 is
sometimes not enough, and 400 feels rather large, so I have
choosen a round (hex) value in between.
I also take a lesson from the original write-up and search for
NULL values, but added a search for the PE header as well.
Note that because we rely on pefile module for the Rich Header
Values, that array will be incomplete, and thus differ from that
used for hash calculation, until pefile itself is fixed.