EAR-WACV25-DAKiet-TSM

The model was presented in the paper .

This model is a Temporal Shift Module (TSM) based video classification model with a resnext50_32x4d backbone.
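
As a rough illustration of what the temporal shift in TSM does, the plain-Python sketch below moves a fraction of the channels along the time axis. It is not the repository's code, which operates on PyTorch tensors inside the network; the function name `temporal_shift` and the list-of-lists layout are assumptions made for this example. With `shift_div=8` (the value passed in the training command), one eighth of the channels take their value from the next frame, one eighth from the previous frame, and the remaining channels stay in place.

```python
def temporal_shift(frames, shift_div=8):
    """Shift part of the channels along time (illustrative sketch only).

    frames: list of T frames, each a list of C channel values.
    Returns a new T x C list; positions with no source frame are zero-filled.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // shift_div  # channels per shifted group

    out = [[0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for ch in range(num_channels):
            if ch < fold:
                src = t + 1      # first group: read from the next frame
            elif ch < 2 * fold:
                src = t - 1      # second group: read from the previous frame
            else:
                src = t          # remaining channels are left unshifted
            if 0 <= src < num_frames:
                out[t][ch] = frames[src][ch]
    return out


# Hypothetical toy input: 3 frames with 8 channels each.
frames = [[(t + 1) * 10 + ch for ch in range(8)] for t in range(3)]
shifted = temporal_shift(frames, shift_div=8)
```

Because the shift only moves existing values, it lets a 2D backbone mix information between neighbouring frames without adding parameters.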

GitHub repository: https://github.com/fdfyaytkt/EAR-WACV25-DAKiet-TSM

Data

The model was trained on a combination of datasets:

  • Toyota Smarthome dataset: Used for activity recognition.
  • ETRI-Activity3D: RGB videos (specific subsets or full dataset used depending on configuration).
  • ETRI-Activity3D-LivingLab: RGB videos (specific subsets or full dataset used depending on configuration).

Two configurations are detailed below, with their respective public leaderboard scores:

Config 1 (Public Leaderboard: 0.84402)

  • Toyota Smarthome dataset
  • ETRI-Activity3D - RGB videos (RGB_P091-P100)
  • ETRI-Activity3D-LivingLab - RGB videos (RGB(P201-P230))

Config 2 (Public Leaderboard: 0.78856)

  • Toyota Smarthome dataset
  • ETRI-Activity3D - RGB videos (full)
  • ETRI-Activity3D-LivingLab - RGB videos (full)

Running

Example training and evaluation commands are provided below. Refer to the repository for complete details and options.

Train

python main.py elderly RGB --arch resnext50_32x4d --num_segments 8 --gd 20 --lr 0.001 --wd 1e-4 --lr_steps 20 40 --epochs 100 --batch-size 4 -j 32 --dropout 0.5 --consensus_type=avg --eval-freq=1 --shift --shift_div=8 --shift_place=blockres --npb
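
To make the `--lr 0.001 --lr_steps 20 40` arguments concrete, here is a minimal sketch of a step learning-rate schedule. It assumes the learning rate is multiplied by 0.1 at each listed epoch, which is a common convention for TSM-style training scripts but is not confirmed for this repository; the function name `step_lr` and the `gamma` parameter are hypothetical.

```python
def step_lr(base_lr, epoch, lr_steps=(20, 40), gamma=0.1):
    """Return the learning rate for a given epoch under a step schedule.

    Assumption: the rate is scaled by `gamma` once for every step in
    `lr_steps` that the current epoch has reached.
    """
    steps_passed = sum(epoch >= step for step in lr_steps)
    return base_lr * gamma ** steps_passed


# Print the rate around each assumed decay point of a 100-epoch run.
for epoch in (0, 19, 20, 39, 40, 99):
    print(epoch, step_lr(0.001, epoch))
```

Under this assumption the rate stays at its lowest value from epoch 40 until training ends at epoch 100.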

Eval

python generate_submission.py elderly --arch=resnext50_32x4d --csv_file=submission.csv --weights=checkpoint/TSM_elderly_RGB_resnext50_32x4d_shift8_blockres_avg_segment8_e100/ckpt.best.pth.tar --test_segments=8 --batch_size=1 --test_crops=1
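
The `--test_segments=8` option sets how many frames are sampled from each video at evaluation time. The sketch below shows one common way to do this: split the video into equal segments and take the centre frame of each. This is an assumption for illustration; the repository's sampler may differ (for example in index base or sampling mode), and the function name `center_segment_indices` is hypothetical.

```python
def center_segment_indices(num_frames, num_segments=8):
    """Pick one frame index per segment, taken at the segment centre.

    Indices are 0-based. Assumes num_frames >= num_segments.
    """
    tick = num_frames / num_segments  # segment length in frames
    return [int(tick / 2 + tick * i) for i in range(num_segments)]


# Hypothetical 80-frame video sampled with 8 segments.
indices = center_segment_indices(80, num_segments=8)
```

With `--test_crops=1`, each sampled frame would contribute a single crop, so one video yields eight input images under this sketch.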