Commit · 1596fb1
1 Parent(s): df2b142
style(nyz): modify WIP icon
README.md
CHANGED
@@ -26,18 +26,23 @@ If you want to contact us & join us, you can ✉️ to our team : <opendilab@p
 
 # Overview of Model Zoo
 <sup>(1): "๐" means that this algorithm doesn't support this environment.</sup>
-<sup>(2): "
+<sup>(2): "⏳" means that the corresponding model is in the upload waitinglist (Work In Progress).</sup>
 ### Deep Reinforcement Learning
+<details open>
+<summary>(Click to Collapse)</summary>
+
 | Algo.\Env. | [LunarLander](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [BipedalWalker](https://di-engine-docs.readthedocs.io/en/latest/13_envs/bipedalwalker.html) | [Pendulum](https://di-engine-docs.readthedocs.io/en/latest/13_envs/pendulum.html) | [Pong](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [SpaceInvaders](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Qbert](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Hopper](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Halfcheetah](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Walker2d](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) |
 | :-------------: | :-------------: | :------------------------: | :------------: | :--------------: | :------------: | :------------------: | :---------: | :---------: | :---------: |
 | [PPO](https://arxiv.org/pdf/1707.06347.pdf) | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-ppo) | | | | | | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v4-PPO) | | |
-| [PG](https://proceedings.neurips.cc/paper/1999/file/464d828b85b0bed98e80ade0a5c43b0f-Paper.pdf) |
-| [A2C](https://arxiv.org/pdf/1602.01783.pdf) |
-| [IMPALA](https://arxiv.org/pdf/1802.01561.pdf)
-| [DQN](https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf) |
-| [DDPG](https://arxiv.org/pdf/1509.02971.pdf) |
-| [TD3](https://arxiv.org/pdf/1802.09477.pdf) |
-| [SAC](https://arxiv.org/pdf/1801.01290.pdf)
+| [PG](https://proceedings.neurips.cc/paper/1999/file/464d828b85b0bed98e80ade0a5c43b0f-Paper.pdf) | ⏳ | | | | | | ⏳ | | |
+| [A2C](https://arxiv.org/pdf/1602.01783.pdf) | ⏳ | | | | | | ⏳ | | |
+| [IMPALA](https://arxiv.org/pdf/1802.01561.pdf) | ⏳ | | | | | | ⏳ | | |
+| [DQN](https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf) | ⏳ | | | | | | ๐ | ๐ | ๐ |
+| [DDPG](https://arxiv.org/pdf/1509.02971.pdf) | ⏳ | | | ๐ | ๐ | ๐ | ⏳ | | |
+| [TD3](https://arxiv.org/pdf/1802.09477.pdf) | ⏳ | | | ๐ | ๐ | ๐ |[✅](https://huggingface.co/OpenDILabCommunity/Hopper-v4-TD3) | | |
+| [SAC](https://arxiv.org/pdf/1801.01290.pdf) | ⏳ | | | ๐ | ๐ | ๐ | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v4-SAC) | | |
+
+</details>
 
 
 ### Multi-Agent Reinforcement Learning