license: mit
language:
  - en
metrics:
  - accuracy
pipeline_tag: video-classification
tags:
  - robotics

University of Technology Chemnitz, Germany
Department Robotics and Human Machine Interaction
Author: Robert Schulz

Action Recognition


1 Overview

This repository provides a PyTorch model for action recognition that was trained on several datasets (see 2 Pretrained Models). The model consists of a multi-stage 3D CNN feature extraction module followed by a classification head. It achieves state-of-the-art results on the UCF101 dataset.

Figure 1 Model architecture

2 Pretrained Models

2.1 TUC-AR Dataset

Dataset Homepage

Short Description

  • RGB and depth input recorded with an Intel RealSense D435 depth camera
  • 7 subjects
  • 3 perspectives per sequence
  • 11,031 sequences (train 8,893 / val 2,138)
  • 6 (+1) action categories (6 actions plus the None class)

Input

| Dimension | Fixed | Value | Parameter | Description |
|---|---|---|---|---|
| 0 | no | ? | Batch Size | Number of samples that will be propagated through the network (number of sequences) |
| 1 | yes | 30 | Sequence Length | Number of frames in one sequence |
| 2 | yes | 4 | Input Channels | Number of channels of one frame (RGB+D=4) |
| 3 | yes | 400 | Width | Width of one frame |
| 4 | yes | 400 | Height | Height of one frame |
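The layout in the input table can be checked before a batch is passed to the network. The following is a minimal sketch and not part of the published model code: `check_input_shape` and `TUC_AR_INPUT` are hypothetical names, and the helper only inspects a shape tuple such as the one returned by `tensor.shape`.

```python
# Hypothetical helper (not part of the repository): validates a batch shape
# against the fixed dimensions listed in the input table.
TUC_AR_INPUT = (30, 4, 400, 400)  # sequence length, channels, width, height


def check_input_shape(shape, expected=TUC_AR_INPUT):
    """Return the batch size if `shape` is (batch, *expected), else raise."""
    shape = tuple(shape)
    if len(shape) != len(expected) + 1:
        raise ValueError(
            f"expected {len(expected) + 1} dimensions, got {len(shape)}"
        )
    if shape[1:] != tuple(expected):
        raise ValueError(f"expected (batch, *{tuple(expected)}), got {shape}")
    if shape[0] < 1:
        raise ValueError("batch size must be at least 1")
    return shape[0]
```

Only the batch dimension is free, so a call such as `check_input_shape(x.shape)` fails early on a wrong frame count or a missing depth channel.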

Output

| Dimension | Fixed | Value | Parameter | Description |
|---|---|---|---|---|
| 0 | no | ? | Batch Size | Number of samples that will be propagated through the network (number of sequences) |
| 1 | yes | 7 | Number of action classes | Number of action classes (6 actions plus None) |

Class indices:

0 - None
1 - Waving
2 - Pointing
3 - Clapping
4 - Follow
5 - Walking
6 - Stop
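To turn one row of the model output into a readable label, the index list above can be used as a lookup table. This is a minimal sketch under two assumptions that the model card does not state explicitly: the output holds one score per listed class, and the highest score marks the predicted class. `TUC_AR_LABELS` and `decode_prediction` are hypothetical names.

```python
# Class indices as listed above for the TUC-AR model.
TUC_AR_LABELS = ["None", "Waving", "Pointing", "Clapping", "Follow", "Walking", "Stop"]


def decode_prediction(scores, labels=TUC_AR_LABELS):
    """Map one row of per-class scores to (index, label) via argmax."""
    if len(scores) != len(labels):
        raise ValueError(f"expected {len(labels)} scores, got {len(scores)}")
    index = max(range(len(scores)), key=lambda i: scores[i])
    return index, labels[index]


# Example with made-up scores for a single sequence.
print(decode_prediction([0.1, 2.3, 0.4, -1.0, 0.0, 0.2, 0.5]))  # (1, 'Waving')
```

With a PyTorch output tensor, a single row can be passed as `output[i].tolist()`.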

Usage

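import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint from the Hugging Face Hub (cached after the first call).
model_path = hf_hub_download(repo_id='SchulzR97/TUC-AR-C3D', filename='tuc-ar.pth')
# Assumption: the file stores a full pickled model, not a state dict.
# PyTorch >= 2.6 then needs weights_only=False to unpickle it.
model = torch.load(model_path, weights_only=False)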

2.2 UCF101 Dataset

Dataset Homepage

Input

| Dimension | Fixed | Value | Parameter | Description |
|---|---|---|---|---|
| 0 | no | ? | Batch Size | Number of samples that will be propagated through the network (number of sequences) |
| 1 | yes | 60 | Sequence Length | Number of frames in one sequence |
| 2 | yes | 3 | Input Channels | Number of channels of one frame (RGB=3) |
| 3 | yes | 400 | Width | Width of one frame |
| 4 | yes | 400 | Height | Height of one frame |
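Because every dimension except the batch size is fixed, the memory needed for one input batch follows directly from the two input tables. The sketch below works through that arithmetic. It assumes 32-bit floating-point inputs (4 bytes per value), which the model card does not specify, and `INPUT_SHAPES` and `input_megabytes` are hypothetical names.

```python
from math import prod

# Fixed per-sequence dimensions from the input tables
# (sequence length, channels, width, height).
INPUT_SHAPES = {
    "tuc-ar": (30, 4, 400, 400),
    "ucf101": (60, 3, 400, 400),
}


def input_megabytes(name, batch_size=1, bytes_per_value=4):
    """Size of one input batch in MB (10**6 bytes); float32 is assumed."""
    return batch_size * prod(INPUT_SHAPES[name]) * bytes_per_value / 1e6


print(input_megabytes("tuc-ar"))  # 76.8
print(input_megabytes("ucf101"))  # 115.2
```

These figures cover the input tensor only. Activations inside the network need additional memory, so they give a lower bound when choosing a batch size.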

Output

| Dimension | Fixed | Value | Parameter | Description |
|---|---|---|---|---|
| 0 | no | ? | Batch Size | Number of samples that will be propagated through the network (number of sequences) |
| 1 | yes | 101 | Number of action classes | Number of action classes |

Usage

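import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint from the Hugging Face Hub (cached after the first call).
model_path = hf_hub_download(repo_id='SchulzR97/TUC-AR-C3D', filename='ucf101.pth')
# Assumption: the file stores a full pickled model, not a state dict.
# PyTorch >= 2.6 then needs weights_only=False to unpickle it.
model = torch.load(model_path, weights_only=False)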