Creating a custom task
______________________
Creating a custom task in `BrowserGym` can be done easily by inheriting from the `AbstractBrowserTask` class.

Let's start with an example. We will build a task that starts from the Google Search page and asks for the Eiffel Tower Wikipedia page.

* **Goal**: Search for the 'Eiffel Tower' Wikipedia page.
* **Reward**: The agent gets a reward of 1 if it reaches the expected Wikipedia page on the Eiffel Tower, and 0 otherwise.

.. code-block:: python

    from typing import Tuple

    import playwright.sync_api
    from browsergym.core.task import AbstractBrowserTask


    class SampleTask(AbstractBrowserTask):
        def __init__(self, seed: int) -> None:
            super().__init__(seed)

        @classmethod
        def get_task_id(cls):
            return "sample_task"

First, let's set up the task. To do this, we need to implement the `setup()` function. The starting point is the *https://www.google.com* page and the goal is *"Search for 'Eiffel Tower' Wikipedia page"*.

.. code-block:: python

    class SampleTask(AbstractBrowserTask):
        # ...
        # Code above
        # ...

        def setup(self, page: playwright.sync_api.Page) -> Tuple[str, dict]:
            """Set up everything needed to execute the task."""
            page.goto("https://www.google.com", timeout=10000)
            goal = "Search for 'Eiffel Tower' Wikipedia page."
            info = {}
            return goal, info

Next, we need to compute a reward. For this, we'll implement our validation criteria in the `validate()` function. In our case, we consider the task completed when the Eiffel Tower Wikipedia page is reached. Otherwise, the task is not yet solved and the reward is 0.

.. code-block:: python

    class SampleTask(AbstractBrowserTask):
        # ...
        # Code above
        # ...

        def validate(
            self, page: playwright.sync_api.Page, chat_messages: list[str]
        ) -> Tuple[float, bool, str, dict]:
            """Compute reward based on reaching final URL."""
            if page.url == "https://en.wikipedia.org/wiki/Eiffel_Tower":
                return 1.0, True, "Task completed", {}
            else:
                return 0.0, False, "", {}

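The exact string comparison above is strict: a URL that carries a fragment, a query string, or a trailing slash would not match. Whether a looser match is appropriate depends on the task. The following is only a sketch of one possible comparison using the standard library; the names `normalize` and `reached_target` are hypothetical helpers and are not part of BrowserGym.

```python
from urllib.parse import urlsplit

EXPECTED_URL = "https://en.wikipedia.org/wiki/Eiffel_Tower"


def normalize(url: str) -> tuple[str, str]:
    """Reduce a URL to (host, path), dropping scheme, query, fragment and trailing slash."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path.rstrip("/")
    return host, path


def reached_target(current_url: str, expected_url: str = EXPECTED_URL) -> bool:
    """Return True when both URLs point at the same host and path."""
    return normalize(current_url) == normalize(expected_url)


# A fragment does not change the page that was reached.
print(reached_target("https://en.wikipedia.org/wiki/Eiffel_Tower#History"))
# A different article is still rejected.
print(reached_target("https://en.wikipedia.org/wiki/Paris"))
```

Inside `validate()`, such a helper could replace the `page.url == ...` test if the looser match is what the task intends.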
We can also implement code that completes the task itself, an oracle (a.k.a. cheat) version. For this, we'll fill out the `cheat()` function.

.. code-block:: python

    class SampleTask(AbstractBrowserTask):
        # ...
        # Code above
        # ...

        def cheat(self, page: playwright.sync_api.Page, chat_messages: list[str]) -> None:
            """Solve the task in a single step using a hard-coded Playwright solution."""
            page.get_by_text("Search").fill("Eiffel Tower")
            page.get_by_text("Google Search").click()
            page.get_by_text("Eiffel Tower - Wikipedia").click()

Finally, we implement the `teardown()` function. This function cleans up resources before the environment is closed. In our case, nothing needs to be done, so we will leave it empty.

.. code-block:: python

    class SampleTask(AbstractBrowserTask):
        # ...
        # Code above
        # ...

        def teardown(self) -> None:
            # Nothing to do for this task.
            pass

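The `setup()`, `validate()` and `teardown()` logic can be checked without launching a browser. The sketch below assumes the environment calls `setup()` once at reset, `validate()` after each step, and `teardown()` on close. `FakePage` and `OfflineSampleTask` are hypothetical stand-ins written for this illustration: the class copies the method bodies from above but deliberately does not inherit from `AbstractBrowserTask`, so it runs without BrowserGym or Playwright installed.

```python
from typing import Tuple


class FakePage:
    """Hypothetical stand-in for playwright.sync_api.Page; it only tracks the URL."""

    def __init__(self) -> None:
        self.url = "about:blank"

    def goto(self, url: str, timeout: int = 0) -> None:
        # A real page would load the document; here we only record the address.
        self.url = url


class OfflineSampleTask:
    """Copy of the SampleTask logic without the BrowserGym base class."""

    def setup(self, page) -> Tuple[str, dict]:
        page.goto("https://www.google.com", timeout=10000)
        return "Search for 'Eiffel Tower' Wikipedia page.", {}

    def validate(self, page, chat_messages: list) -> Tuple[float, bool, str, dict]:
        if page.url == "https://en.wikipedia.org/wiki/Eiffel_Tower":
            return 1.0, True, "Task completed", {}
        return 0.0, False, "", {}

    def teardown(self) -> None:
        pass


page = FakePage()
task = OfflineSampleTask()

# Start of the episode: the page is on Google, so the task is not solved yet.
goal, info = task.setup(page)
reward_before, done_before, _, _ = task.validate(page, [])

# Simulate the agent arriving on the target page.
page.goto("https://en.wikipedia.org/wiki/Eiffel_Tower")
reward_after, done_after, message, _ = task.validate(page, [])

task.teardown()
print(reward_before, done_before, reward_after, done_after, message)
```

This only exercises the reward logic. The `cheat()` method depends on real page content and still needs a live browser to be verified.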
Our folder structure should look like the following:

.. code-block:: bash

    .
    ├── tasks
    │   ├── __init__.py
    │   └── sample_task.py
    └── run_task.py

Next, we register the task in the gym environment by adding the following code to the `__init__.py` of the `tasks` package:

.. code-block:: python

    from browsergym.core.registration import register_task

    from .sample_task import SampleTask

    register_task(id=SampleTask.get_task_id(), task_class=SampleTask)

Now that the task is registered, it can be run with the following code, which you can put in the `run_task.py` file:

.. code-block:: python

    import gymnasium as gym
    import tasks  # importing the package registers the gym environment

    env = gym.make("browsergym/sample_task")
    obs, info = env.reset()
    done = False
    while not done:
        action = "noop()"  # placeholder; replace with your agent's action
        obs, reward, terminated, truncated, info = env.step(action)
        # The episode ends when the task terminates or the environment truncates it.
        done = terminated or truncated
        print(f"Reward: {reward}, Done: {done}, Info: {info}")

    env.close()

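The loop above sends the same action on every step, so it helps to cap the number of steps while experimenting. The following is a minimal sketch of such a loop, written against the Gymnasium `reset()`/`step()` interface. `run_episode` and `StubEnv` are hypothetical names introduced for this example; `StubEnv` merely imitates an environment so the sketch runs without a browser.

```python
from typing import Any, Callable, Tuple


def run_episode(env: Any, policy: Callable[[Any], str], max_steps: int = 10) -> Tuple[float, int]:
    """Run one episode and return (total_reward, steps_taken)."""
    obs, info = env.reset()
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        action = policy(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        # Stop as soon as the task is solved or the environment cuts the episode.
        if terminated or truncated:
            return total_reward, step
    # Step budget exhausted without the episode ending.
    return total_reward, max_steps


class StubEnv:
    """Hypothetical environment that terminates with reward 1 on the third step."""

    def __init__(self) -> None:
        self.steps = 0

    def reset(self):
        self.steps = 0
        return {}, {}

    def step(self, action: str):
        self.steps += 1
        finished = self.steps >= 3
        return {}, (1.0 if finished else 0.0), finished, False, {}


total, steps = run_episode(StubEnv(), policy=lambda obs: "noop()")
print(f"Total reward: {total}, steps: {steps}")
```

With the real environment, `env` would be the object returned by `gym.make("browsergym/sample_task")` and `policy` would be the agent's action function.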