Developer Guide¶
Requirements Installation¶
Install the dependencies of each model from its own requirements file. For example, for the non-local model:
pip install -r mindvideo/example/nonlocal/requirements.txt
Configuration Files¶
The configuration files of each supported model are located in ./mindvideo/config. Each .yaml file contains the settings used for training, evaluation and inference of that model, for example the model name, model settings, learning rate, loss and optimizer.
Load Model Checkpoints¶
Links for downloading the pre-trained models are listed at https://gitee.com/yanlq46462828/zjut_mindvideo/tree/master
Dataset Preparation¶
Links to the datasets supported by MindVideo are listed at https://gitee.com/yanlq46462828/zjut_mindvideo/tree/master. They include ActivityNet, Kinetics400, Kinetics600, UCF101, Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17, MOT16, Charades, Collective Activity, Columbia Consumer Video, DAVIS, HMDB51, FBMS, MSVD, Sports-1M, THUMOS, UBI-Fights, tyvos.
Put all training and evaluation data into one directory, then set data_root in data.json to that directory, for example:
"data_root": "/home/publicfile/dataset/tracking"
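The data_root change can also be scripted. The following is a minimal sketch, assuming data.json is a JSON object with a top-level "data_root" key; the helper name set_data_root and the temporary file used for the demonstration are illustrative and not part of MindVideo.

```python
import json
import tempfile
from pathlib import Path


def set_data_root(json_path, new_root):
    """Rewrite the "data_root" entry of a JSON config file and return the new config."""
    path = Path(json_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["data_root"] = str(new_root)
    path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config


# Demonstration on a throwaway file; point json_path at the real data.json instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "data.json"
    demo_path.write_text(json.dumps({"data_root": "/old/path"}), encoding="utf-8")
    updated = set_data_root(demo_path, "/home/publicfile/dataset/tracking")
    reloaded = json.loads(demo_path.read_text(encoding="utf-8"))
```

Other keys in the file are preserved because the whole object is loaded, modified and written back.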
Within mindvideo, the data processing code for each supported dataset can be found under the data folder.
Customize a Model¶
Here we show how to use a model in MindVideo on top of MindSpore. MindVideo supports the C3D, I3D, X3D, R(2+1)D, NonLocal, ViST, fairMOT, VisTR and ARN models.
Create a Model
To begin with, create a model based on one of the C3D, I3D, X3D, R(2+1)D, NonLocal, ViST, fairMOT, VisTR and ARN models. For example, to develop a model named I3D, write the code in builder.py:
def build_model(cfg):
    """build model"""
    return ClassFactory.get_instance_from_cfg(cfg, ModuleType.MODEL)

def build_layer(cfg):
    """build layer"""
    return ClassFactory.get_instance_from_cfg(cfg, ModuleType.LAYER)
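The builder functions above rely on a registry-style factory. The sketch below illustrates the general pattern with a simplified stand-in; MiniFactory, its register decorator and the ToyI3D class are hypothetical and do not reproduce MindVideo's actual ClassFactory implementation.

```python
class MiniFactory:
    """Simplified, hypothetical stand-in for a registry-based class factory."""

    _registry = {}

    @classmethod
    def register(cls, module_type):
        """Return a decorator that stores a class under (module_type, class name)."""
        def decorator(target):
            cls._registry[(module_type, target.__name__)] = target
            return target
        return decorator

    @classmethod
    def get_instance_from_cfg(cls, cfg, module_type):
        """Instantiate the class named by cfg["type"], passing the other keys as kwargs."""
        kwargs = dict(cfg)  # copy so the caller's config is not mutated
        name = kwargs.pop("type")
        return cls._registry[(module_type, name)](**kwargs)


@MiniFactory.register("model")
class ToyI3D:
    """Placeholder model that only records its configuration."""

    def __init__(self, num_classes=400):
        self.num_classes = num_classes


def build_model(cfg):
    """Build a model from a config mapping, mirroring the builder shown above."""
    return MiniFactory.get_instance_from_cfg(cfg, "model")


model = build_model({"type": "ToyI3D", "num_classes": 400})
```

With this pattern, adding a new model only requires registering its class; the builder itself does not change.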
Pass Parameters
Next, define the parameters of the model in a .yaml file. Taking the I3D model as an example:
model_name: i3d_rgb
dataset_sink_mode: False

# Context settings
context:
    mode: 0 # 0--Graph Mode; 1--Pynative Mode
    device_target: "GPU"

# Model settings
model:
    type: i3d_rgb
    num_classes: 400

learning_rate:
    lr_scheduler: "cosine_annealing"
    lr: 0.0012
    lr_epochs: [2, 4]
    lr_gamma: 0.1
    eta_min: 0.0
    t_max: 100
    max_epoch: 5
    warmup_epochs: 4

optimizer:
    type: 'SGD'
    momentum: 0.9
    weight_decay: 0.0004
    loss_scale: 1024

loss:
    type: SoftmaxCrossEntropyWithLogits
    sparse: True
    reduction: "mean"

train:
    pre_trained: False
    pretrained_model: ""
    ckpt_path: "./output/"
    epochs: 100
    save_checkpoint_epochs: 5
    save_checkpoint_steps: 1875
    keep_checkpoint_max: 10
    run_distribute: False

eval:
    pretrained_model: ""

infer:
    pretrained_model: ""
    batch_size: 16
    image_path: ""
    normalize: True
    output_dir: "./infer_output"

# Kinetics Dataset Config
data_loader:
    train:
        dataset:
            type: Kinetic400
            path: "/home/publicfile/kinetics-400"
            shuffle: True
            split: 'train'
            seq: 64
            num_parallel_workers: 8
            batch_size: 16
        map:
            operations:
                - type: VideoResize
                  size: [256, 256]
                - type: VideoRandomCrop
                  size: [224, 224]
                - type: VideoRandomHorizontalFlip
                  prob: 0.5
                - type: VideoToTensor
            input_columns: ["video"]
    eval:
        dataset:
            type: Kinetic400
            path: "/home/publicfile/kinetics-dataset"
            split: 'val'
            seq: 64
            shuffle: True
            num_parallel_workers: 8
            seq_mode: 'discrete'
        map:
            operations:
                - type: VideoShortEdgeResize
                  size: 256
                - type: VideoCenterCrop
                  size: [224, 224]
                - type: VideoToTensor
            input_columns: ["video"]
    group_size: 1
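To make the learning_rate settings more concrete, here is a sketch of a cosine-annealing schedule with linear warmup, using the standard formula lr_t = eta_min + 0.5 * (lr - eta_min) * (1 + cos(pi * t / t_max)). How MindVideo combines warmup_epochs, t_max and max_epoch internally is not stated in this guide, so the function below, its name and its per-epoch granularity are assumptions.

```python
import math


def cosine_annealing_lr(epoch, base_lr=0.0012, eta_min=0.0, t_max=100, warmup_epochs=4):
    """Return the learning rate for a zero-based epoch index (illustrative only)."""
    if epoch < warmup_epochs:
        # Linear warmup: ramp up to base_lr over the first warmup_epochs epochs.
        return base_lr * (epoch + 1) / warmup_epochs
    # Standard cosine annealing from base_lr down to eta_min at epoch t_max.
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))


# Learning rate for each of the first 100 epochs, using the values from the YAML file.
schedule = [cosine_annealing_lr(epoch) for epoch in range(100)]
```

The default arguments mirror the lr, eta_min, t_max and warmup_epochs values in the configuration above.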
Customize DataLoaders¶
Here we show how to develop a new DataLoader and use it in our tool. A new DataLoader is needed when a model has special requirements for loading data.
In this project, the abstract dataloader is defined in the builder.py file in ./mindvideo/data.
In general, a new dataloader includes four functions: build_dataset_sampler, build_dataset, build_transforms and register_builtin_dataset. The build_dataset_sampler function builds the sampler, the build_dataset function builds the dataset, the build_transforms function builds the data transform pipeline, and the register_builtin_dataset function registers a MindSpore built-in dataset class.
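As an illustration of the role of build_transforms, the sketch below turns a list of operation configs, shaped like the map.operations entries in the YAML file, into a single callable. The TRANSFORMS table and the toy Scale and Offset operations are hypothetical placeholders; MindVideo's real transforms operate on video data and have different signatures.

```python
from functools import reduce

# Hypothetical registry mapping a config "type" to a transform factory.
TRANSFORMS = {
    "Scale": lambda factor: (lambda frames: [f * factor for f in frames]),
    "Offset": lambda value: (lambda frames: [f + value for f in frames]),
}


def build_transforms(operations):
    """Compose the configured operations into one function, applied in list order."""
    steps = []
    for op_cfg in operations:
        kwargs = dict(op_cfg)  # copy so the config list stays unchanged
        factory = TRANSFORMS[kwargs.pop("type")]
        steps.append(factory(**kwargs))
    return lambda frames: reduce(lambda data, step: step(data), steps, frames)


pipeline = build_transforms([
    {"type": "Scale", "factor": 2},
    {"type": "Offset", "value": 1},
])
result = pipeline([1, 2, 3])  # each value is doubled, then incremented
```

Keeping the operation order in the config list means the pipeline can be changed without touching the builder code.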
Customize Trainers¶
MindVideo provides two approaches to training, evaluation and inference for each supported model. Both require MindSpore, installed as described on its official website. The first is to run the training or evaluation files under the example folder, organized by model name; these are independent modules designed for beginners. The second is to use the train and inference interfaces shared by all models under the root folder of the repository, together with the YAML file that contains the parameters of each model, which also allows some parameter configuration for a quick start. For the second approach, taking I3D as an example, run the following command for training:
python train.py -c zjut_mindvideo/mindvideo/config/i3d/i3d_rgb.yaml
Run the following commands for inference and evaluation:
python infer.py -c zjut_mindvideo/mindvideo/config/i3d/i3d_rgb.yaml
python eval.py -c zjut_mindvideo/mindvideo/config/i3d/i3d_rgb.yaml
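The -c flag passes the path of the YAML file to each script. A minimal sketch of how such an entry point could read that flag is shown below; the long option name --config and the parse_args helper are assumptions for illustration, not the actual contents of train.py.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line of a hypothetical train/eval/infer entry point."""
    parser = argparse.ArgumentParser(description="Run a model from a YAML config.")
    parser.add_argument("-c", "--config", required=True,
                        help="path to the model's YAML configuration file")
    return parser.parse_args(argv)


# Equivalent to: python train.py -c zjut_mindvideo/mindvideo/config/i3d/i3d_rgb.yaml
args = parse_args(["-c", "zjut_mindvideo/mindvideo/config/i3d/i3d_rgb.yaml"])
```

Passing argv explicitly keeps the helper easy to exercise in tests; with argv=None, argparse reads the real command line.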