-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Number states of the env #70
Comments
Here's what I'd say, you may want to cross-check this. Consider a random room from the standard The total number of items is the sum of the item type counts: Note that box-targets are technically just regular floor tiles, as they can be either empty, occupied by a player or a box. To calculate the number of arrangements, we can use the concept of permutations of multisets. Since the boxes are identical, the arrangements of the boxes themselves do not contribute to different possibilities. Furthermore, we know that the floor tiles are also identical and cannot be distinguished from one another. Therefore, the number of arrangements can be calculated like this: Or, more generally. if the multiplicities of the elements of a finite multiset For Sokoban, the room size, box amount and even player amount can vary. So let's define a function that calculates the state space given an arbitrary environment to find out how many possible states there are. def calculate_state_space(env_or_obs):
# allow for both env and observation as input
if isinstance(env_or_obs, gym.Env):
arr_walls, arr_goals, arr_boxes, arr_player = env_or_obs.render(mode='raw')
else:
arr_walls, arr_goals, arr_boxes, arr_player = env_or_obs # assuming render mode is 'raw'
room_size = int(np.sum((arr_walls-1)*-1))
boxes = int(np.sum(arr_boxes))
players = int(np.sum(arr_player))
floor = room_size-boxes-players
state_amount = math.factorial(room_size) / (math.factorial(players) * math.factorial(boxes) * math.factorial(floor))
return int(state_amount) env = gym.make('Sokoban-v0')
env.reset()
print(calculate_state_space(env)) |
How I can get the number states of env? I try to used policy iteration and value iteration to solve this!
The text was updated successfully, but these errors were encountered: