-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up instr_zeroes_ymmh/zmmh() #1
base: master
Are you sure you want to change the base?
Conversation
Check operands before decoding the instruction because it's much faster.
If we have raw bits, then look at them instead of iterating over all encodings because it's much faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks overall reasonable to me, if you're happy with it...simpler than trying to get the flag even if the overhead maybe isn't quite as low.
byte *bits; | ||
|
||
/* Handle zeroall/vzeroupper special case. */ | ||
if (instr->opcode == OP_vzeroall || (zmm && instr->opcode == OP_vzeroupper)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per DynamoRIO#6217 we want to remove the zmm
condition on vzeroupper.
vex_evex = false; | ||
num_prefixes = 0; | ||
bits = instr_get_raw_bits(instr); | ||
decode_sizeof(GLOBAL_DCONTEXT, bits, &num_prefixes, &rip_rel_pos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Best to check the return value to ensure success.
bits = instr_get_raw_bits(instr); | ||
decode_sizeof(GLOBAL_DCONTEXT, bits, &num_prefixes, &rip_rel_pos); | ||
for (i = 0; i < num_prefixes; i++) { | ||
vex_evex |= bits[i] == VEX_2BYTE_PREFIX_OPCODE || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is C where bool is not a first-class type and using bitwise operators can get into trouble...even when all components are pure logical 0-or-1 operations it still makes me nervous (maybe some numeric-as-bool condition will be added in the future and result in a subtle hard to find bug...it has happened before) and I would request logical rather than bitwise combination.
bool | ||
instr_zeroes_ymmh(instr_t *instr) | ||
static bool | ||
instr_zeroes_high(instr_t *instr, bool zmm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instr_zeroes_simd_high or instr_zeroes_high_simd to distinguish from GPRs?
vex_evex = false; | ||
num_prefixes = 0; | ||
bits = instr_get_raw_bits(instr); | ||
decode_sizeof(GLOBAL_DCONTEXT, bits, &num_prefixes, &rip_rel_pos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh so you asked about the dcontext: right. It is used only for the ISA mode. Really we want to use the ISA mode of the instr, from instr_get_isa_mode(instr).
Hmm, this ends up being messy, unfotunately:
I see code in instr_get_eflags() among others which does this for such a situation:
dcontext_t *dcontext = get_thread_private_dcontext();
dr_isa_mode_t old_mode;
dr_set_isa_mode(dcontext, instr_get_isa_mode(instr), &old_mode);
decode_eflags_usage(dcontext, instr_get_raw_bits(instr), &instr->eflags,
DR_QUERY_INCLUDE_ALL);
dr_set_isa_mode(dcontext, old_mode, NULL);
But...for standalone decoding there is no thread-private dcontext, and it's racy to mess with the global dcontext for a local operation: that is DynamoRIO#1595 and it looks like nobody has tried to solve it.
A different solution would be to pass in the ISA mode instead of the dcontext, maybe to a non-exported inner function.
(This will only matter for mixed-bitwidth decoding.)
No description provided.