# Developer Guide

### Requirements Installation
Install the dependencies of each model from its requirements file. Taking the non-local model as an example:

```text
pip install -r mindvideo/example/nonlocal/requirements.txt
```
### Configuration Files
The configuration files for each supported model are located in ./mindvideo/config. Each .yaml file holds the settings used for training, evaluation and inference of that model, such as the model name, model type, learning rate, loss and optimizer.

### Load Model Checkpoints
Links for downloading the pre-trained models are listed at https://gitee.com/yanlq46462828/zjut_mindvideo/tree/master

### Dataset Preparation
Links to the datasets supported by MindVideo are listed at https://gitee.com/yanlq46462828/zjut_mindvideo/tree/master. They include ActivityNet, Kinetics400, Kinetics600, UCF101, Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17, MOT16, Charades, Collective Activity, Columbia Consumer Video, DAVIS, HMDB51, FBMS, MSVD, Sports-1M, THUMOS, UBI-Fights, tyvos.

Put all training and evaluation data into one directory, then set **data_root** in data.json to that directory, like this:
```text
"data_root": "/home/publicfile/dataset/tracking"
```
Within MindVideo, the data processing code for each dataset can be found under the data folder.
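
The **data_root** change above can also be scripted. The following is a minimal sketch, assuming data.json is a flat JSON object with **data_root** as a top-level key; the helper name `set_data_root` is hypothetical and not part of MindVideo.

```python
import json
import tempfile
from pathlib import Path


def set_data_root(config_path, data_root):
    """Set the top-level "data_root" key of a JSON file, keeping all other keys."""
    path = Path(config_path)
    # Start from the existing content if the file is already there.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config["data_root"] = str(data_root)
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config


# Demonstration on a temporary file, so no real data.json is touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "data.json"
    demo_path.write_text('{"data_root": ""}', encoding="utf-8")
    updated = set_data_root(demo_path, "/home/publicfile/dataset/tracking")
    print(updated["data_root"])
```

If the real data.json nests **data_root** under another key, the assignment needs to be adapted accordingly.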

### Customize a Model
This section shows how to define a model and use it with MindSpore.
MindVideo supports the C3D, I3D, X3D, R(2+1)D, NonLocal, ViST, fairMOT, VisTR and ARN models.

- Create a Model

To begin with, create a model based on one of the C3D, I3D, X3D, R(2+1)D, NonLocal, ViST, fairMOT, VisTR and ARN models. For example, to develop a model named I3D, the builder functions in builder.py are written as follows:
```text
def build_model(cfg):
    """build model"""
    return ClassFactory.get_instance_from_cfg(cfg, ModuleType.MODEL)


def build_layer(cfg):
    """build layer"""
    return ClassFactory.get_instance_from_cfg(cfg, ModuleType.LAYER)
```
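
The builder functions above delegate to a class factory that creates an instance from a config entry. The sketch below illustrates one way such a factory could work, under the assumption that classes are registered by name and looked up through the config's `type` field. `ToyClassFactory`, `ToyModuleType`, `ToyI3D` and `build_model_sketch` are hypothetical stand-ins, not MindVideo's actual implementation.

```python
from enum import Enum


class ToyModuleType(Enum):
    """Hypothetical module categories, mirroring MODEL and LAYER above."""
    MODEL = "model"
    LAYER = "layer"


class ToyClassFactory:
    """Toy registry that maps (module type, name) to a class."""
    _registry = {module_type: {} for module_type in ToyModuleType}

    @classmethod
    def register(cls, module_type, name):
        """Return a decorator that registers a class under the given name."""
        def decorator(target):
            cls._registry[module_type][name] = target
            return target
        return decorator

    @classmethod
    def get_instance_from_cfg(cls, cfg, module_type):
        """Instantiate the class named by cfg["type"] with the remaining keys."""
        params = dict(cfg)  # copy so the caller's config is not modified
        name = params.pop("type")
        if name not in cls._registry[module_type]:
            raise KeyError(f"{name!r} is not registered under {module_type.name}")
        return cls._registry[module_type][name](**params)


@ToyClassFactory.register(ToyModuleType.MODEL, "toy_i3d")
class ToyI3D:
    """Placeholder model that only stores its configuration."""
    def __init__(self, num_classes):
        self.num_classes = num_classes


def build_model_sketch(cfg):
    """Counterpart of build_model for the toy factory."""
    return ToyClassFactory.get_instance_from_cfg(cfg, ToyModuleType.MODEL)


model = build_model_sketch({"type": "toy_i3d", "num_classes": 400})
print(type(model).__name__, model.num_classes)
```

With this pattern, adding a model amounts to registering a class and referring to its name in the .yaml file.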
- Pass Parameters

Then, define the parameters of the model in a .yaml file. Taking the I3D model as an example:
```text
model_name: i3d_rgb
dataset_sink_mode: False
```

### Context settings
```text
context:
    mode: 0 #0--Graph Mode; 1--Pynative Mode
    device_target: "GPU"
```

### Model settings
```text
model:
    type: i3d_rgb 
    num_classes: 400


learning_rate:
    lr_scheduler: "cosine_annealing"
    lr: 0.0012
    lr_epochs: [2, 4]
    lr_gamma: 0.1
    eta_min: 0.0
    t_max: 100
    max_epoch: 5
    warmup_epochs: 4

optimizer:
    type: 'SGD'
    momentum: 0.9
    weight_decay: 0.0004
    loss_scale: 1024

loss:
    type: SoftmaxCrossEntropyWithLogits
    sparse: True
    reduction: "mean"

train:
    pre_trained: False
    pretrained_model: ""
    ckpt_path: "./output/"
    epochs: 100
    save_checkpoint_epochs: 5
    save_checkpoint_steps: 1875
    keep_checkpoint_max: 10
    run_distribute: False

eval:
    pretrained_model: ""

infer:
    pretrained_model: ""
    batch_size: 16
    image_path: ""
    normalize: True
    output_dir: "./infer_output"
```
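
The `learning_rate` block above selects a cosine annealing scheduler. As a rough illustration of what such a schedule computes, the sketch below assumes a linear warmup over `warmup_epochs` followed by the standard cosine annealing formula evaluated per epoch. How MindVideo actually combines these fields is not stated here, and the function name `cosine_annealing_lr` is hypothetical.

```python
import math


def cosine_annealing_lr(epoch, base_lr=0.0012, eta_min=0.0, t_max=100, warmup_epochs=4):
    """Learning rate for a zero-based epoch index (illustrative only)."""
    if epoch < warmup_epochs:
        # Linear warmup: ramp from base_lr / warmup_epochs up to base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine annealing: decay from base_lr towards eta_min over t_max epochs.
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))


# Print the schedule at a few epochs using the values from the config above.
for epoch in (0, 3, 4, 50, 100):
    print(epoch, cosine_annealing_lr(epoch))
```

Under these assumptions the rate reaches `lr` at the end of warmup and falls to `eta_min` at epoch `t_max`.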

### Kinetics Dataset Config
```text
data_loader:
    train:
        dataset:
            type: Kinetic400
            path: "/home/publicfile/kinetics-400"
            shuffle: True
            split: 'train'
            seq: 64
            num_parallel_workers: 8
            batch_size: 16
              
        map:
            operations:
                - type: VideoResize
                  size: [256, 256]
                - type: VideoRandomCrop
                  size: [224, 224]
                - type: VideoRandomHorizontalFlip
                  prob: 0.5
                - type: VideoToTensor
            input_columns: ["video"]

    eval:
        dataset:
            type: Kinetic400
            path: "/home/publicfile/kinetics-dataset"
            split: 'val'
            seq: 64
            shuffle: True
            num_parallel_workers: 8
            seq_mode: 'discrete'
            
        map:
            operations:
                - type: VideoShortEdgeResize
                  size: 256
                - type: VideoCenterCrop
                  size: [224, 224]
                - type: VideoToTensor
            input_columns: ["video"]
group_size: 1
```
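
Each entry under `map.operations` in the config above names a transform `type` together with its arguments. One plausible way to turn such a list into a pipeline is sketched below. The transforms here are toy stand-ins that operate on nested Python lists, and `ToyCenterCrop`, `ToyHorizontalFlip`, `TOY_TRANSFORMS` and `build_transform_pipeline` are hypothetical names rather than MindVideo classes.

```python
class ToyCenterCrop:
    """Crop every frame (a list of rows) around its center."""
    def __init__(self, size):
        self.height, self.width = size

    def __call__(self, video):
        cropped = []
        for frame in video:
            top = (len(frame) - self.height) // 2
            left = (len(frame[0]) - self.width) // 2
            cropped.append([row[left:left + self.width]
                            for row in frame[top:top + self.height]])
        return cropped


class ToyHorizontalFlip:
    """Mirror every frame left to right."""
    def __call__(self, video):
        return [[row[::-1] for row in frame] for frame in video]


# Lookup table from the "type" string in the config to a transform class.
TOY_TRANSFORMS = {
    "ToyCenterCrop": ToyCenterCrop,
    "ToyHorizontalFlip": ToyHorizontalFlip,
}


def build_transform_pipeline(operations):
    """Instantiate each configured operation and chain them in order."""
    steps = []
    for operation in operations:
        params = dict(operation)  # copy so the config itself stays untouched
        transform_cls = TOY_TRANSFORMS[params.pop("type")]
        steps.append(transform_cls(**params))

    def apply(video):
        for step in steps:
            video = step(video)
        return video

    return apply


pipeline = build_transform_pipeline([
    {"type": "ToyCenterCrop", "size": [2, 2]},
    {"type": "ToyHorizontalFlip"},
])
one_frame_video = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
print(pipeline(one_frame_video))
```

The operations run in the order they are listed, which is why resizing comes before cropping in the config.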

### Customize DataLoaders
This section shows how to develop a new DataLoader and use it in MindVideo. A new DataLoader is needed when a model has special requirements for loading its data.

In this project, the abstract dataloader builders are defined in the builder.py file in ./mindvideo/data.

In general, a new dataloader involves four functions:

- build_dataset_sampler builds the sampler.
- build_dataset builds the dataset.
- build_transforms builds the data transform pipeline.
- register_builtin_dataset registers a MindSpore built-in dataset class.

### Customize Trainers
MindVideo provides two ways to train, evaluate and run inference for each supported model. Both require MindSpore, which can be installed by following the instructions on its official website. The first is to run the training or evaluation files under the example folder, which are organized by model name; each is an independent module designed for beginners. The second is to use the train and inference interfaces shared by all models in the root folder of the repository, together with the YAML file that contains the parameters of each model, which allows a quick start from a ready-made configuration. For the second approach, taking I3D as an example, run the following command for training:
```text
python train.py -c zjut_mindvideo/mindvideo/config/i3d/i3d_rgb.yaml
```
Run the following commands for inference and evaluation:
```text
python infer.py -c zjut_mindvideo/mindvideo/config/i3d/i3d_rgb.yaml
python eval.py -c zjut_mindvideo/mindvideo/config/i3d/i3d_rgb.yaml
```
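
All three commands above pass the configuration file through the `-c` flag. As a minimal sketch of how an entry script might read that flag, the following uses the standard argparse module; the long option name `--config` and the function name `parse_args` are assumptions, and the real scripts may accept further options.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(description="Sketch of a config-driven entry point")
    # "-c" matches the commands above; "--config" is an assumed long form.
    parser.add_argument("-c", "--config", required=True,
                        help="path to the YAML file with the model's parameters")
    return parser.parse_args(argv)


args = parse_args(["-c", "zjut_mindvideo/mindvideo/config/i3d/i3d_rgb.yaml"])
print(args.config)
```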