Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

basic block tracing broken #68

Open
pbst opened this issue Dec 10, 2018 · 3 comments · May be fixed by angr/angr#1348
Open

basic block tracing broken #68

pbst opened this issue Dec 10, 2018 · 3 comments · May be fixed by angr/angr#1348

Comments

@pbst
Copy link
Contributor

pbst commented Dec 10, 2018

I already asked for help on this topic before in the slack help
channel
. Then I
tried to use driller for Linux ELF binaries. Now I am running it against the
CGC challenge and also documented the process a lot more verbose.

The official driller example looks
similar to the code snipped I used. In order to test different binaries the
path to the target binary has to be changed accordingly.

#!/usr/bin/env python3

import driller
import logging

logging.getLogger('driller.driller').setLevel('DEBUG')
logging.getLogger('angr.exploration_techniques.tracer').setLevel('DEBUG')

d = driller.Driller("./CADET_00001",  # path to the target binary
                    b"racecar", # initial testcase
                    b"\xff" * 65535, # AFL bitmap with no discovered transitions
                   )

new_inputs = d.drill()

print("new_inputs", new_inputs)

Everything works nice when I use the
CADET_00001
challenge binary taken from the angr-doc repo.

Now. This is not true for CADET_00001.2018-12-03(see attached).
This binary was built using the cgc-linux-dev box from darpa cgc repo.

DEBUG   | 2018-12-09 11:13:53,779 | driller.driller | [CADET_00001.2018-12-03] drilling started on Sun Dec  9 11:13:53 2018.
WARNING | 2018-12-09 11:13:53,780 | driller.driller | Debug directory is not set. Will not log fuzzing bitmap.
DEBUG   | 2018-12-09 11:13:53,947 | driller.driller | Drilling into b'racecar'.
DEBUG   | 2018-12-09 11:13:53,948 | driller.driller | Input is b'racecar'.
DEBUG   | 2018-12-09 11:13:53,955 | angr.exploration_techniques.tracer | Trace: 1/241
DEBUG   | 2018-12-09 11:13:53,966 | angr.exploration_techniques.tracer | Trace: 2/241
DEBUG   | 2018-12-09 11:13:53,982 | angr.exploration_techniques.tracer | Trace: 3/241
DEBUG   | 2018-12-09 11:13:53,990 | angr.exploration_techniques.tracer | Trace: 4/241
DEBUG   | 2018-12-09 11:13:54,031 | angr.exploration_techniques.tracer | Trace: 5/241
DEBUG   | 2018-12-09 11:13:54,031 | driller.driller | Found a diverted state, exploring to some extent.
DEBUG   | 2018-12-09 11:13:54,032 | driller.driller | [CADET_00001.2018-12-03] dumping input for 0x8048716 -> 0x8048601.
DEBUG   | 2018-12-09 11:13:54,032 | driller.driller | Generated: b''
DEBUG   | 2018-12-09 11:13:54,033 | driller.driller | [CADET_00001.2018-12-03] started symbolic exploration at Sun Dec  9 11:13:54 2018.
DEBUG   | 2018-12-09 11:13:56,330 | driller.driller | [CADET_00001.2018-12-03] stopped symbolic exploration at Sun Dec  9 11:13:56 2018.
DEBUG   | 2018-12-09 11:13:56,344 | angr.exploration_techniques.tracer | Trace: 6/241
Traceback (most recent call last):
  File "./doit1.py", line 14, in <module>
    new_inputs = d.drill()
  File "/home/phil/src/angr-dev/driller/driller/driller_main.py", line 85, in drill
    list(self._drill_input())
  File "/home/phil/src/angr-dev/driller/driller/driller_main.py", line 141, in _drill_input
    simgr.step()
  File "/home/phil/src/angr-dev/angr/angr/misc/hookset.py", line 75, in __call__
    result = current_hook(self.func.__self__, *args, **kwargs)
  File "/home/phil/src/angr-dev/angr/angr/exploration_techniques/driller_core.py", line 39, in step
    simgr.step(stash=stash, **kwargs)
  File "/home/phil/src/angr-dev/angr/angr/misc/hookset.py", line 75, in __call__
    result = current_hook(self.func.__self__, *args, **kwargs)
  File "/home/phil/src/angr-dev/angr/angr/exploration_techniques/tracer.py", line 115, in step
    return simgr.step(stash=stash, **kwargs)
  File "/home/phil/src/angr-dev/angr/angr/misc/hookset.py", line 80, in __call__
    return self.func(*args, **kwargs)
  File "/home/phil/src/angr-dev/angr/angr/sim_manager.py", line 344, in step
    successors = self.step_state(state, successor_func=successor_func, **run_args)
  File "/home/phil/src/angr-dev/angr/angr/misc/hookset.py", line 75, in __call__
    result = current_hook(self.func.__self__, *args, **kwargs)
  File "/home/phil/src/angr-dev/angr/angr/exploration_techniques/tracer.py", line 146, in step_state
    assert len(succs_dict[None]) == 1
AssertionError

After I added some extra debug output it looks like this.

DEBUG   | 2018-12-09 21:23:04,365 | driller.driller | [CADET_00001.2018-12-03] drilling started on Sun Dec  9 21:23:04 2018.
WARNING | 2018-12-09 21:23:04,365 | driller.driller | Debug directory is not set. Will not log fuzzing bitmap.
DEBUG   | 2018-12-09 21:23:04,510 | driller.driller | Drilling into b'racecar'.
DEBUG   | 2018-12-09 21:23:04,510 | driller.driller | Input is b'racecar'.
DEBUG   | 2018-12-09 21:23:04,517 | angr.exploration_techniques.tracer | state.history.events[-1].constant.ast <Bool file_0_stdin_6_6_8{UNINITIALIZED} == 114>
DEBUG   | 2018-12-09 21:23:04,518 | angr.exploration_techniques.tracer | Trace: 1/241 (0x8048705)
DEBUG   | 2018-12-09 21:23:04,527 | angr.exploration_techniques.tracer | state.history.events[-1].constant.ast <Bool file_0_stdin_6_6_8{UNINITIALIZED} == 114>
DEBUG   | 2018-12-09 21:23:04,528 | angr.exploration_techniques.tracer | Trace: 2/241 (0x8048735)
DEBUG   | 2018-12-09 21:23:04,542 | angr.exploration_techniques.tracer | state.history.events[-1].constant.ast <Bool file_0_stdin_6_6_8{UNINITIALIZED} == 114>
DEBUG   | 2018-12-09 21:23:04,543 | angr.exploration_techniques.tracer | Trace: 3/241 (0x804874c)
DEBUG   | 2018-12-09 21:23:04,550 | angr.exploration_techniques.tracer | state.history.events[-1].constant.ast <Bool file_0_stdin_6_6_8{UNINITIALIZED} == 114>
DEBUG   | 2018-12-09 21:23:04,550 | angr.exploration_techniques.tracer | Trace: 4/241 (0x8048716)
DEBUG   | 2018-12-09 21:23:04,586 | angr.exploration_techniques.tracer | state.history.events[-1].constant.ast <Bool file_0_stdin_6_6_8{UNINITIALIZED} == 114>
DEBUG   | 2018-12-09 21:23:04,586 | angr.exploration_techniques.tracer | Trace: 5/241 (0x8048731)
DEBUG   | 2018-12-09 21:23:04,587 | driller.driller | Found a diverted state, exploring to some extent.
DEBUG   | 2018-12-09 21:23:04,587 | driller.driller | [CADET_00001.2018-12-03] dumping input for 0x8048716 -> 0x8048601.
DEBUG   | 2018-12-09 21:23:04,587 | driller.driller | Generated: b''
DEBUG   | 2018-12-09 21:23:04,588 | driller.driller | [CADET_00001.2018-12-03] started symbolic exploration at Sun Dec  9 21:23:04 2018.
DEBUG   | 2018-12-09 21:23:06,662 | driller.driller | [CADET_00001.2018-12-03] stopped symbolic exploration at Sun Dec  9 21:23:06 2018.
DEBUG   | 2018-12-09 21:23:06,673 | angr.exploration_techniques.tracer | state.history.events[-1].constant.ast <Bool False>
DEBUG   | 2018-12-09 21:23:06,674 | angr.exploration_techniques.tracer | Trace: 6/241 (0x8048601)
Traceback (most recent call last):
  File "./doit1.py", line 16, in <module>
    new_inputs = d.drill()
  File "/home/phil/src/angr-dev/driller/driller/driller_main.py", line 85, in drill
    list(self._drill_input())
  File "/home/phil/src/angr-dev/driller/driller/driller_main.py", line 141, in _drill_input
    simgr.step()
  File "/home/phil/src/angr-dev/angr/angr/misc/hookset.py", line 75, in __call__
    result = current_hook(self.func.__self__, *args, **kwargs)
  File "/home/phil/src/angr-dev/angr/angr/exploration_techniques/driller_core.py", line 39, in step
    simgr.step(stash=stash, **kwargs)
  File "/home/phil/src/angr-dev/angr/angr/misc/hookset.py", line 75, in __call__
    result = current_hook(self.func.__self__, *args, **kwargs)
  File "/home/phil/src/angr-dev/angr/angr/exploration_techniques/tracer.py", line 115, in step
    return simgr.step(stash=stash, **kwargs)
  File "/home/phil/src/angr-dev/angr/angr/misc/hookset.py", line 80, in __call__
    return self.func(*args, **kwargs)
  File "/home/phil/src/angr-dev/angr/angr/sim_manager.py", line 344, in step
    successors = self.step_state(state, successor_func=successor_func, **run_args)
  File "/home/phil/src/angr-dev/angr/angr/misc/hookset.py", line 75, in __call__
    result = current_hook(self.func.__self__, *args, **kwargs)
  File "/home/phil/src/angr-dev/angr/angr/exploration_techniques/tracer.py", line 154, in step_state
    assert len(succs_dict[None]) == 1
AssertionError

rhelmot wrote "if the constraint is False it means it was a false comparison
between two concrete values, which is very scary".

A False constraint appears before we even enter the main function.

In the CADET_00001.2018-12-03 binary the main function is not at the entry
point, but there is some init function before it. For the testcase binary
CADET_00001 the entrypoint points sits right on it's main function.

I patched CADET_00001.2018-12-03 entrypoint from 0x080485fc to 0x08048601
(now refered to as CADET_00001.2018-12-03.fix). This will skip the init
function and go right into the main.

Driller seems to work fine on CADET_00001.2018-12-03.fix.

All the init stub does is:

  1. call a subroutine with static arguments that effectively does nothing.
  2. zero out the parity flag, zero flag, sign flag
  3. return 0

To my understanding only the lowest address of a basic block should be in the
trace. The last non-False entry in the log points at 0x08048731, this area does not
look like the beginning or end of a basic block.

0x0804872f      50             push eax
0x08048730      9d             popfd
0x08048731      6a00           push 0

The qemu trace logs 0x08048731.

Trace 0x55c7453ec350 [080485fc]
Trace 0x55c7453ec3a0 [08048705]
Trace 0x55c7453ec420 [08048735]
Trace 0x55c7453ec4e0 [0804874c]
Trace 0x55c7453ec540 [08048716]
Trace 0x55c7453ec650 [08048731]
Trace 0x55c7453ec6a0 [08048601]
[...]

When the push eax; popfd get nopped out everything works fine.

If we leave the binary untouched but remove the address 0x08048731 from the
qemu_runner trace it works too.

So I guess driller/angr/vex/... don't like it when they get fed basic blocks
that aren't basic blocks? Where would I have to look next in order to fix this issue or
get it fixed?

Environment

I am using angr-dev, so the version of driller and its dependencies should be
fairly recent. The OS is Ubuntu 18.04 on amd64.

Files

zip contains all the mentioned CADET binaries
CADET_00001.zip

@rhelmot
Copy link
Member

rhelmot commented Dec 10, 2018

The issue of how to synchronize qemu and angr's differing notions of basic blocks is a perpetual headache for tracer. Tracer was refactored recently in order to make it less of a pile of spaghetti code and more extensible, and it looks like we lost track of whatever trick we were using beforehand to solve this particular edge case.

The tracer refactor mostly tried to make explicit each edge case that would need to be synchronized. If you look into exploration_techniques/tracer.py, the _update_state_tracking function does this more or less. Each if-branch constitutes a different kind of easily fixable desync, until the bottom case which calls _analyze_misfollow, which is for more complicated desyncs. I think this should fall under the latter category. So basically if you can identify the root cause behind this desync (does qemu or angr just end basic blocks after popfd?) you can add a case for it there.

@pbst
Copy link
Contributor Author

pbst commented Dec 11, 2018

Inside _update_state_tracking this example gets handled as the 'normal case'.

The jumpkind gets detected as Ijk_EmWarn. Does it sound reasonable to checking for this specific
jumpkind and then skip the step be a reasonable heuristic?

Would a Ijk_EmWarn appear under other 'normal' circumstances? To my understanding it only shows up when there is a 'emulation critical (FATAL) error', so this would just be a recovery attempt to get back on the trace by skipping (only?) invalid basic blocks.

@pbst pbst linked a pull request Dec 16, 2018 that will close this issue
@marekzmyslowski
Copy link

I'm having similar issue. I'm using the shellphish/mechaphish docker image and very simple example compiled with gcc. The code is executed with shellphuzz -c 1 -d 1 /home/angr/a.out

#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
  char buffer[6] = {0};
  int i;
  int *null = 0;

  read(0, buffer, 6);
  if (buffer[0] == '7' && buffer[1] == '/' && buffer[2] == '4'
      && buffer[3] == '2' && buffer[4] == 'a' && buffer[5] == '8') {
    i = *null;
  }

  puts("No problem");
}
WARNING | 2019-03-12 14:38:01,455 | angr.state_plugins.symbolic_memory | Filling memory at 0x7ffffffffff0028 with 8 unconstrained bytes referenced from 0x400ee1 (__libc_start_main+0x71 in a.out (0x400ee1))
WARNING | 2019-03-12 14:38:01,485 | angr.state_plugins.symbolic_memory | Filling memory at 0x7ffffffffff0030 with 8 unconstrained bytes referenced from 0x400ee1 (__libc_start_main+0x71 in a.out (0x400ee1))
WARNING | 2019-03-12 14:38:01,515 | angr.state_plugins.symbolic_memory | Filling memory at 0x7ffffffffff0038 with 8 unconstrained bytes referenced from 0x400ee1 (__libc_start_main+0x71 in a.out (0x400ee1))
WARNING | 2019-03-12 14:38:01,537 | angr.state_plugins.symbolic_memory | Filling memory at 0x7ffffffffff0040 with 8 unconstrained bytes referenced from 0x400ee1 (__libc_start_main+0x71 in a.out (0x400ee1))
WARNING | 2019-03-12 14:38:01,562 | angr.state_plugins.symbolic_memory | Filling memory at 0x7ffffffffff0048 with 8 unconstrained bytes referenced from 0x400ee1 (__libc_start_main+0x71 in a.out (0x400ee1))
Traceback (most recent call last):
  File "/home/angr/angr-dev/driller/driller/local_callback.py", line 122, in <module>
    for new_input in d.drill_generator():
  File "/home/angr/angr-dev/driller/driller/driller_main.py", line 101, in drill_generator
    for i in self._drill_input():
  File "/home/angr/angr-dev/driller/driller/driller_main.py", line 141, in _drill_input
    simgr.step()
  File "/home/angr/angr-dev/angr/angr/misc/hookset.py", line 75, in __call__
    result = current_hook(self.func.__self__, *args, **kwargs)
  File "/home/angr/angr-dev/angr/angr/exploration_techniques/driller_core.py", line 39, in step
    simgr.step(stash=stash, **kwargs)
  File "/home/angr/angr-dev/angr/angr/misc/hookset.py", line 75, in __call__
    result = current_hook(self.func.__self__, *args, **kwargs)
  File "/home/angr/angr-dev/angr/angr/exploration_techniques/tracer.py", line 115, in step
    return simgr.step(stash=stash, **kwargs)
  File "/home/angr/angr-dev/angr/angr/misc/hookset.py", line 80, in __call__
    return self.func(*args, **kwargs)
  File "/home/angr/angr-dev/angr/angr/sim_manager.py", line 343, in step
    successors = self.step_state(state, successor_func=successor_func, **run_args)
  File "/home/angr/angr-dev/angr/angr/misc/hookset.py", line 75, in __call__
    result = current_hook(self.func.__self__, *args, **kwargs)
  File "/home/angr/angr-dev/angr/angr/exploration_techniques/tracer.py", line 142, in step_state
    assert len(succs_dict[None]) == 1
AssertionError

Any Idea what the issue can be?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants