|
--- |
|
datasets: |
|
- imagenet-1k |
|
metrics: |
|
- accuracy |
|
library_name: timm |
|
--- |
|
|
|
# Model Card for Model ID |
|
|
|
Based on **quasi-linear hyperbolic systems of PDEs** [[Liu et al, 2023](https://github.com/liuyao12/ConvNets-PDE-perspective)], the QLNet enters uncharted waters of ConvNet model space marked by the use of (element-wise) multiplication instead of ReLU as the primary nonlinearity. It achieves comparable performance as ResNet50 on ImageNet-1k (acc=**78.4**), demonstrating that it has the same level of capacity/expressivity, and deserves more study (hyper-paremeter tuning, optimizer, etc.) by the community. |
|
|
|
|
|
 |
|
|
|
One notable feature is that the architecture (trained or not) admits a *continuous* symmetry in its parameters. Check out the [notebook](https://colab.research.google.com/#fileId=https://huggingface.co/liuyao/QLNet/blob/main/QLNet_symmetry.ipynb) for a demo that makes a particular transformation on the weights while leaving the output *unchanged*. |
|
|
|
*This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).* |
|
|
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
Instead of the `bottleneck` block of ResNet50 which consists of 1x1, 3x3, 1x1 in succession, this simplest version of QLNet does a 1x1, splits into two equal halves and **multiplies** them, then applies a 3x3 (depthwise), and a 1x1, all *without* activation functions except at the end of the block, where a *radial activation function* that we call `hardball` is applied. |
|
|
|
|
|
|
|
- **Developed by:** Yao Liu 刘杳 |
|
- **Model type:** Convolutional Neural Network (ConvNet) |
|
- **License:** [More Information Needed] |
|
- **Finetuned from model:** N/A (*trained from scratch*) |
|
|
|
### Model Sources [optional] |
|
|
|
<!-- Provide the basic links for the model. --> |
|
|
|
- **Repository:** [ConvNet from the PDE perspective](https://github.com/liuyao12/ConvNets-PDE-perspective) |
|
- **Paper:** [A Novel ConvNet Architecture with a Continuous Symmetry](https://arxiv.org/abs/2308.01621) |
|
- **Demo:** [More Information Needed] |
|
|
|
## How to Get Started with the Model |
|
|
|
Use the code below to get started with the model. |
|
|
|
[More Information Needed] |
|
|
|
## Training Details |
|
|
|
### Training and Testing Data |
|
|
|
ImageNet-1k |
|
|
|
[More Information Needed] |
|
|
|
### Training Procedure |
|
|
|
We use the training script in `timm` |
|
|
|
``` |
|
python3 train.py ../datasets/imagenet/ --model resnet50 --num-classes 1000 --lr 0.1 --warmup-epochs 5 --epochs 240 --weight-decay 1e-4 --sched cosine --reprob 0.4 --recount 3 --remode pixel --aa rand-m7-mstd0.5-inc1 -b 192 -j 6 --amp --dist-bn reduce |
|
``` |
|
|
|
#### Preprocessing [optional] |
|
|
|
[More Information Needed] |
|
|
|
|
|
#### Training Hyperparameters |
|
|
|
- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision --> |
|
|
|
#### Speeds, Sizes, Times [optional] |
|
|
|
<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. --> |
|
|
|
### Results |
|
|
|
top1 acc = 78.40 |
|
|
|
#### Summary |
|
|
|
|
|
|
|
## Model Examination [optional] |
|
|
|
<!-- Relevant interpretability work for the model goes here --> |
|
|
|
[More Information Needed] |
|
|
|
## Technical Specifications [optional] |
|
|
|
### Model Architecture and Objective |
|
|
|
[More Information Needed] |
|
|
|
### Compute Infrastructure |
|
|
|
[More Information Needed] |
|
|
|
#### Hardware |
|
|
|
single GPU :( |
|
|
|
#### Software |
|
|
|
[More Information Needed] |
|
|
|
## Citation [optional] |
|
|
|
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. --> |
|
|
|
**BibTeX:** |
|
|
|
[More Information Needed] |
|
|
|
**APA:** |
|
|
|
[More Information Needed] |
|
|
|
## Glossary [optional] |
|
|
|
<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. --> |
|
|
|
[More Information Needed] |
|
|
|
## More Information [optional] |
|
|
|
[More Information Needed] |
|
|
|
## Model Card Authors [optional] |
|
|
|
[More Information Needed] |
|
|
|
## Model Card Contact |
|
|
|
[More Information Needed] |