---
license: mit
tags:
- chess
- game-ai
- pytorch
- safetensors
library_name: transformers
datasets:
- Maxlegrec/ChessFENS
---

# ChessBot Chess Model

This is a ChessBot model for chess move prediction and position evaluation. It is far weaker than Stockfish, but stronger than most human players.
For stronger play, reduce the sampling temperature `T` (lower is stronger).

## Model Description

The ChessBot model is a transformer-based architecture designed for chess gameplay. It can:
- Predict the next best move given a chess position (FEN)
- Evaluate chess positions
- Generate move probabilities
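
All three capabilities take a position as a FEN string. Since python-chess is already a dependency, it can be used to validate a FEN before querying the model; a small sketch, independent of the model itself:

```python
import chess

# The standard starting position as a FEN string
fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

# chess.Board raises ValueError for a malformed FEN
board = chess.Board(fen)

print(board.turn == chess.WHITE)   # True: white to move
print(board.legal_moves.count())   # 20 legal moves in the start position
```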

## Please Like if this model is useful to you :)

A like goes a long way!

## Usage

```python
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("Maxlegrec/ChessBot", trust_remote_code=True)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Example usage
fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

# Sample move from policy
move = model.get_move_from_fen_no_thinking(fen, T=0.1, device=device)
print(f"Policy-based move: {move}")
#e2e4

# Get the best move using value analysis
value_move = model.get_best_move_value(fen, T=0, device=device)
print(f"Value-based move: {value_move}")
#e2e4

# Get position evaluation
position_value = model.get_position_value(fen, device=device)
print(f"Position value [black_win, draw, white_win]: {position_value}")
#[0.2318, 0.4618, 0.3064]

# Get move probabilities
probs = model.get_move_from_fen_no_thinking(fen, T=1, device=device, return_probs=True)
top_moves = sorted(probs.items(), key=lambda x: x[1], reverse=True)[:5]
print("Top 5 moves:")
for move, prob in top_moves:
    print(f"  {move}: {prob:.4f}")
#Top 5 moves:
#  e2e4: 0.9285
#  d2d4: 0.0712
#  g1f3: 0.0001
#  e2e3: 0.0000
#  c2c3: 0.0000
```

## Requirements

- torch>=2.0.0
- transformers>=4.48.1
- python-chess>=1.10.0
- numpy>=1.21.0

## Model Architecture

The architecture is strongly inspired by the LCzero project, although it is implemented in PyTorch.

- **Transformer layers**: 10
- **Hidden size**: 512
- **Feed-forward size**: 736
- **Attention heads**: 8
- **Vocabulary size**: 1929 (chess moves)
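
From these hyperparameters, a back-of-the-envelope parameter count can be sketched. This assumes a standard transformer block with Q/K/V/output projections and a two-matrix feed-forward; biases, norms, and any LCzero-specific embeddings or heads are ignored, so the real count will differ somewhat:

```python
layers, hidden, ff, vocab = 10, 512, 736, 1929

attn_per_layer = 4 * hidden * hidden  # Q, K, V, and output projections
ffn_per_layer = 2 * hidden * ff       # up- and down-projection matrices
policy_head = hidden * vocab          # projection onto the 1929 move classes

total = layers * (attn_per_layer + ffn_per_layer) + policy_head
print(f"~{total / 1e6:.1f}M parameters")  # ~19.0M parameters
```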

## Training Data

This model was trained on data from the LCzero project, consisting of around 750M chess positions. I will publish the training dataset very soon.

## Limitations

- The model works best with standard chess positions
- Performance may vary with unusual or rare positions
- Requires GPU for optimal inference speed