Action space
____________
The action space is a set of primitives that the agent can use to interact with the environment.
The primitives are divided into categories based on the type of interaction they perform.
Below is a list of the primitives supported by BrowserGym:

+----------+-------------------------------------------------+----------------------------------------------------------------------------------------------+
| Category | Primitive                                       | Description                                                                                  |
+==========+=================================================+==============================================================================================+
| bid      | fill(bid, text)                                 | Fill an input field with text.                                                               |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | click(bid, button)                              | Click an element.                                                                            |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | dblclick(bid, button)                           | Double-click an element.                                                                     |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | hover(bid)                                      | Hover the mouse over an element.                                                             |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | press(bid, key comb)                            | Focus an element and press a combination of keys.                                            |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | focus(bid)                                      | Focus an element.                                                                            |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | clear(bid)                                      | Clear an input field.                                                                        |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | select_option(bid, options)                     | Select one or multiple options in a drop-down element.                                       |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | drag_and_drop(from bid, to bid)                 | Drag and drop one element to another.                                                        |
+----------+-------------------------------------------------+----------------------------------------------------------------------------------------------+
| coord    | mouse_move(x, y)                                | Move the mouse to a location.                                                                |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | mouse_down(x, y, button)                        | Move the mouse to a location then press and hold a mouse button.                             |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | mouse_up(x, y, button)                          | Move the mouse to a location then release a mouse button.                                    |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | mouse_click(x, y, button)                       | Move the mouse to a location and click a mouse button.                                       |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | mouse_dblclick(x, y, button)                    | Move the mouse to a location and double-click a mouse button.                                |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | mouse_drag_and_drop(from x, from y, to x, to y) | Drag and drop from a location to a location.                                                 |
+----------+-------------------------------------------------+----------------------------------------------------------------------------------------------+
| keyboard | keyboard_down(key)                              | Press and hold a keyboard key.                                                               |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | keyboard_up(key)                                | Release a keyboard key.                                                                      |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | keyboard_press(key comb)                        | Press a combination of keys.                                                                 |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | keyboard_type(text)                             | Type a string of text through the keyboard.                                                  |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | keyboard_insert_text(text)                      | Insert a string of text in the currently focused element.                                    |
+----------+-------------------------------------------------+----------------------------------------------------------------------------------------------+
| tab      | new_tab()                                       | Open a new tab.                                                                              |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | tab_close()                                     | Close the current tab.                                                                       |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | tab_focus(index)                                | Bring a tab to front (activate tab).                                                         |
+----------+-------------------------------------------------+----------------------------------------------------------------------------------------------+
| nav      | go_back()                                       | Navigate to the previous page in history.                                                    |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | go_forward()                                    | Navigate to the next page in history.                                                        |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | goto(url)                                       | Navigate to a URL.                                                                           |
+----------+-------------------------------------------------+----------------------------------------------------------------------------------------------+
| misc     | scroll(dx, dy)                                  | Scroll horizontally (dx) and vertically (dy).                                                |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | send_msg_to_user(text)                          | Send a message to the user.                                                                  |
+          +-------------------------------------------------+----------------------------------------------------------------------------------------------+
|          | noop()                                          | Do nothing.                                                                                  |
+----------+-------------------------------------------------+----------------------------------------------------------------------------------------------+
| python   | Any Python code (UNSAFE!)                       | Execute code with Playwright, the active page, and the send_msg_to_user primitive available. |
+----------+-------------------------------------------------+----------------------------------------------------------------------------------------------+

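
Actions are passed to the environment as strings that spell out a call to one of these primitives, such as ``"click('a46')"`` in the example that follows. As a minimal sketch, the hypothetical helper ``format_action`` (not part of BrowserGym) builds such strings from a primitive name and its arguments; the element ids and coordinates are illustrative values only.

```python
def format_action(primitive: str, *args) -> str:
    """Render a primitive call as an action string.

    Hypothetical helper for illustration; BrowserGym does not ship it.
    """
    # repr() quotes strings and leaves numbers bare, matching Python call syntax.
    return f"{primitive}({', '.join(repr(a) for a in args)})"


print(format_action("click", "a46"))
print(format_action("fill", "a2164", "Asset tag"))
print(format_action("mouse_move", 120, 45))
print(format_action("noop"))
```

Building the string with ``repr()`` avoids quoting mistakes when the text argument itself contains quotes.
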
Example
"""""""
.. code-block:: python

   import gymnasium as gym
   import browsergym.workarena
   import time

   env = gym.make(
       "browsergym/workarena.servicenow.filter-asset-list",
       headless=False,
   )
   try:
       obs, info = env.reset(seed=10)
       # Perform the following sequence of actions
       actions = ["click('a46')", "click('a2157')", "fill('a2164', 'Asset tag')"]
       for action in actions:
           obs, reward, terminated, truncated, info = env.step(action)
           # Sleep for 3 seconds to see the effect of the action
           time.sleep(3)
   finally:
       env.close()

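
The loop above keeps stepping even if the episode has already ended. One possible variant, sketched below, stops as soon as ``terminated`` or ``truncated`` is set. ``run_actions`` and ``FakeEnv`` are hypothetical names introduced here; ``FakeEnv`` only mimics the five-value return of ``env.step`` so the sketch runs without a browser.

```python
from typing import Any, Iterable


def run_actions(env: Any, actions: Iterable[str]) -> int:
    """Step through actions, stopping once the episode ends.

    Returns the number of steps actually taken.
    """
    steps = 0
    for action in actions:
        obs, reward, terminated, truncated, info = env.step(action)
        steps += 1
        if terminated or truncated:
            break
    return steps


class FakeEnv:
    """Stand-in for a BrowserGym environment (hypothetical, illustration only)."""

    def __init__(self, ends_after: int) -> None:
        self.ends_after = ends_after
        self.count = 0

    def step(self, action: str):
        # Report termination once the configured number of steps is reached.
        self.count += 1
        terminated = self.count >= self.ends_after
        return {}, 0.0, terminated, False, {}


env = FakeEnv(ends_after=2)
steps_taken = run_actions(
    env, ["click('a46')", "click('a2157')", "fill('a2164', 'Asset tag')"]
)
print(steps_taken)
```

With a real environment, the same function can be called with the ``env`` returned by ``gym.make`` after ``env.reset``.
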
For more details please refer to the `WorkArena paper <https://arxiv.org/abs/2403.07718>`_.