Dataset Annotation and Format Conversion

This guide explains how to annotate datasets and convert them into the format MMPose expects for training and testing.

Dataset Annotation

If you use Label Studio, follow the instructions in the Label Studio to COCO document to annotate your data and export the results as a Label Studio .json file. Also save the Code from the Labeling Interface as an .xml file.

Note

MMPose DOES NOT impose any restrictions on the annotation tools you use. Annotated results are acceptable as long as they meet MMPose's data format requirements. We warmly welcome community contributions of further tutorials and conversion scripts for other dataset annotation tools.

Browse Dataset

MMPose provides a useful tool to browse the dataset. You can visualize the raw annotations and the transformed annotations after data augmentation, which is helpful for debugging.

Please refer to this document for more details.

Download Open-source Datasets via MIM

OpenXLab offers free, pre-formatted datasets in a variety of fields. The platform's search function lets you find the dataset you are looking for quickly and easily, and the formatted datasets make it easier to run tasks across datasets.

We recommend you check out this how-to guide to learn more details.

Format Conversion Scripts

We provide some scripts to convert the raw annotations into the format compatible with MMPose (namely, COCO style).
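To make "COCO style" concrete, here is a minimal sketch of a COCO keypoint annotation file built as a Python dictionary. The image name, category name, keypoint names, and coordinates are made up for illustration, and the exact set of fields written by each MMPose converter may differ from this sketch.

```python
import json

# Minimal COCO-style keypoint dataset with one image and one instance.
# All names and numbers below are illustrative placeholders.
coco = {
    "images": [
        {"id": 1, "file_name": "000001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            # Flat list of (x, y, v) triplets. v is 0 (not labeled),
            # 1 (labeled but not visible) or 2 (labeled and visible).
            "keypoints": [320.0, 120.0, 2, 300.0, 200.0, 2, 0.0, 0.0, 0],
            "num_keypoints": 2,
            "bbox": [250.0, 100.0, 150.0, 300.0],  # [x, y, width, height]
            "area": 150.0 * 300.0,
            "iscrowd": 0,
        },
    ],
    "categories": [
        {
            "id": 1,
            "name": "animal",
            "keypoints": ["nose", "neck", "tail"],
            "skeleton": [[1, 2], [2, 3]],  # 1-based keypoint index pairs
        },
    ],
}


def count_labeled_keypoints(keypoints):
    """Count (x, y, v) triplets whose visibility flag v is greater than zero."""
    return sum(1 for v in keypoints[2::3] if v > 0)


ann = coco["annotations"][0]
category = coco["categories"][0]

# Each keypoint name in the category needs one (x, y, v) triplet.
assert len(ann["keypoints"]) == 3 * len(category["keypoints"])
assert count_labeled_keypoints(ann["keypoints"]) == ann["num_keypoints"]

# The structure serializes directly to a .json annotation file.
serialized = json.dumps(coco)
```

The two assertions are the kind of consistency check that is worth running on any file produced by a custom conversion script.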

Animal Pose

Animal-Pose (ICCV'2019)

@InProceedings{Cao_2019_ICCV,
    author = {Cao, Jinkun and Tang, Hongyang and Fang, Hao-Shu and Shen, Xiaoyong and Lu, Cewu and Tai, Yu-Wing},
    title = {Cross-Domain Adaptation for Animal Pose Estimation},
    booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
    month = {October},
    year = {2019}
}

For the Animal-Pose dataset, the images and annotations can be downloaded from the official website. The script tools/dataset_converters/parse_animalpose_dataset.py converts raw annotations into the format compatible with MMPose. The pre-processed annotation files are available. If you would like to generate the annotations yourself, follow these steps:

Download the raw images and annotations and extract them under $MMPOSE/data. Make them look like this:

mmpose
├── mmpose
├── docs
├── tests
├── tools
├── configs
`── data
    │── animalpose
        │
        │-- VOC2012
        │   │-- Annotations
        │   │-- ImageSets
        │   │-- JPEGImages
        │   │-- SegmentationClass
        │   │-- SegmentationObject
        │
        │-- animalpose_image_part2
        │   │-- cat
        │   │-- cow
        │   │-- dog
        │   │-- horse
        │   │-- sheep
        │
        │-- PASCAL2011_animal_annotation
        │   │-- cat
        │   │   |-- 2007_000528_1.xml
        │   │   |-- 2007_000549_1.xml
        │   │   │-- ...
        │   │-- cow
        │   │-- dog
        │   │-- horse
        │   │-- sheep
        │
        │-- annimalpose_anno2
        │   │-- cat
        │   │   |-- ca1.xml
        │   │   |-- ca2.xml
        │   │   │-- ...
        │   │-- cow
        │   │-- dog
        │   │-- horse
        │   │-- sheep
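
Before running the converter, it can help to confirm that the extracted data matches the layout above. The following is a small sketch of such a check. The helper name missing_dirs is hypothetical and not part of MMPose, and the list covers only a subset of the directories shown in the tree.

```python
from pathlib import Path

# A subset of the sub-directories listed in the tree above, relative to
# data/animalpose. "annimalpose_anno2" is spelled as it appears in the tree.
EXPECTED_DIRS = [
    "VOC2012/Annotations",
    "VOC2012/JPEGImages",
    "animalpose_image_part2",
    "PASCAL2011_animal_annotation",
    "annimalpose_anno2",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Run from the $MMPOSE directory.
    missing = missing_dirs("data/animalpose")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```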

Run command
```
python tools/dataset_converters/parse_animalpose_dataset.py
```
The generated annotation files are put in $MMPOSE/data/animalpose/annotations.

The official dataset does not provide the official train/val/test set split. We choose the images from PascalVOC for train & val. In total, we have 3608 images and 5117 annotations for train+val, where 2798 images with 4000 annotations are used for training, and 810 images with 1117 annotations are used for validation. Those images from other sources (1000 images with 1000 annotations) are used for testing.
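
To compare a generated annotation file against the image and annotation counts quoted above, a short script like the following can be used. It is a sketch that assumes the file has the usual COCO top-level keys "images" and "annotations". The helper name summarize_coco is hypothetical, and the file paths are passed on the command line because the generated file names are not listed here.

```python
import json
import sys


def summarize_coco(path):
    """Return (number of images, number of annotations) in a COCO-style file."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # Missing keys are treated as empty lists rather than raising an error.
    return len(data.get("images", [])), len(data.get("annotations", []))


if __name__ == "__main__":
    # Usage: python summarize_coco.py <annotation.json> [<annotation.json> ...]
    for path in sys.argv[1:]:
        n_images, n_annotations = summarize_coco(path)
        print(f"{path}: {n_images} images, {n_annotations} annotations")
```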

COFW

COFW (ICCV'2013)

@inproceedings{burgos2013robust,
  title={Robust face landmark estimation under occlusion},
  author={Burgos-Artizzu, Xavier P and Perona, Pietro and Doll{\'a}r, Piotr},
  booktitle={Proceedings of the IEEE international conference on computer vision},
  pages={1513--1520},
  year={2013}
}

For COFW data, please download from COFW Dataset (Color Images). Move COFW_train_color.mat and COFW_test_color.mat to $MMPOSE/data/cofw/ and make them look like:

mmpose
├── mmpose
├── docs
├── tests
├── tools
├── configs
`── data
    │── cofw
        |── COFW_train_color.mat
        |── COFW_test_color.mat

Run pip install h5py first to install the dependency, then run the following script under $MMPOSE:

python tools/dataset_converters/parse_cofw_dataset.py

This produces the following structure:

mmpose
├── mmpose
├── docs
├── tests
├── tools
├── configs
`── data
    │── cofw
        |── COFW_train_color.mat
        |── COFW_test_color.mat
        |── annotations
        |   |── cofw_train.json
        |   |── cofw_test.json
        |── images
            |── 000001.jpg
            |── 000002.jpg

DeepPoseKit

Desert Locust (Elife'2019)

@article{graving2019deepposekit,
  title={DeepPoseKit, a software toolkit for fast and robust animal pose estimation using deep learning},
  author={Graving, Jacob M and Chae, Daniel and Naik, Hemal and Li, Liang and Koger, Benjamin and Costelloe, Blair R and Couzin, Iain D},
  journal={Elife},
  volume={8},
  pages={e47994},
  year={2019},
  publisher={eLife Sciences Publications Limited}
}

For the Vinegar Fly, Desert Locust, and Grévy’s Zebra datasets, the annotation files can be downloaded from DeepPoseKit-Data. The script tools/dataset_converters/parse_deepposekit_dataset.py converts raw annotations into the format compatible with MMPose. The pre-processed annotation files are available at vinegar_fly_annotations, locust_annotations, and zebra_annotations. If you would like to generate the annotations yourself, follow these steps:

Download the raw images and annotations and extract them under $MMPOSE/data. Make them look like this:

mmpose
├── mmpose
├── docs
├── tests
├── tools
├── configs
`── data
    |
    |── DeepPoseKit-Data
    |   `── datasets
    |       |── fly
    |       |   |── annotation_data_release.h5
    |       |   |── skeleton.csv
    |       |   |── ...
    |       |
    |       |── locust
    |       |   |── annotation_data_release.h5
    |       |   |── skeleton.csv
    |       |   |── ...
    |       |
    |       `── zebra
    |           |── annotation_data_release.h5
    |           |── skeleton.csv
    |           |── ...
    |
    │── fly
        `-- images
            │-- 0.jpg
            │-- 1.jpg
            │-- ...

Note that the images can be downloaded from vinegar_fly_images, locust_images, and zebra_images.

Run command
```
python tools/dataset_converters/parse_deepposekit_dataset.py
```
The generated annotation files are put in $MMPOSE/data/fly/annotations, $MMPOSE/data/locust/annotations, and $MMPOSE/data/zebra/annotations.

Since the official dataset does not provide the test set, we randomly select 90% images for training, and the rest (10%) for evaluation.
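
A random 90%/10% split of this kind can be sketched as follows. This only illustrates the idea. The helper name split_train_val, the fixed seed, and the shuffle-then-slice approach are assumptions, and the conversion script may implement the split differently.

```python
import random


def split_train_val(image_ids, train_fraction=0.9, seed=0):
    """Shuffle the ids reproducibly and split them into (train, val) lists."""
    ids = sorted(image_ids)    # fixed starting order, independent of input order
    rng = random.Random(seed)  # local generator, leaves global state untouched
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


train_ids, val_ids = split_train_val(range(100))
print(len(train_ids), len(val_ids))  # 90 10
```

Fixing the seed keeps the split reproducible, so training and evaluation runs always see the same partition.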

Macaque

MacaquePose (bioRxiv'2020)

@article{labuguen2020macaquepose,
  title={MacaquePose: A novel ‘in the wild’macaque monkey pose dataset for markerless motion capture},
  author={Labuguen, Rollyn and Matsumoto, Jumpei and Negrete, Salvador and Nishimaru, Hiroshi and Nishijo, Hisao and Takada, Masahiko and Go, Yasuhiro and Inoue, Ken-ichi and Shibata, Tomohiro},
  journal={bioRxiv},
  year={2020},
  publisher={Cold Spring Harbor Laboratory}
}

For the MacaquePose dataset, images and annotations can be downloaded from the download page. The script tools/dataset_converters/parse_macaquepose_dataset.py converts raw annotations into the format compatible with MMPose. The pre-processed macaque_annotations are available. If you would like to generate the annotations yourself, follow these steps:

Download the raw images and annotations and extract them under $MMPOSE/data. Make them look like this:

mmpose
├── mmpose
├── docs
├── tests
├── tools
├── configs
`── data
    │── macaque
        │-- annotations.csv
        │-- images
        │   │-- 01418849d54b3005.jpg
        │   │-- 0142d1d1a6904a70.jpg
        │   │-- 01ef2c4c260321b7.jpg
        │   │-- 020a1c75c8c85238.jpg
        │   │-- 020b1506eef2557d.jpg
        │   │-- ...

Run command
```
python tools/dataset_converters/parse_macaquepose_dataset.py
```
The generated annotation files are put in $MMPOSE/data/macaque/annotations.

Since the official dataset does not provide the test set, we randomly select 12500 images for training, and the rest for evaluation.

Human3.6M

Human3.6M (TPAMI'2014)

@article{h36m_pami,
  author = {Ionescu, Catalin and Papava, Dragos and Olaru, Vlad and Sminchisescu,  Cristian},
  title = {Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  publisher = {IEEE Computer Society},
  volume = {36},
  number = {7},
  pages = {1325-1339},
  month = {jul},
  year = {2014}
}

For Human3.6M, please download from the official website and place the files under $MMPOSE/data/h36m. Then run the preprocessing script:

python tools/dataset_converters/preprocess_h36m.py --metadata {path to metadata.xml} --original data/h36m

This will extract camera parameters and pose annotations at full framerate (50 FPS) and downsampled framerate (10 FPS). The processed data should have the following structure:

mmpose
├── mmpose
├── docs
├── tests
├── tools
├── configs
`── data
    ├── h36m
        ├── annotation_body3d
        |   ├── cameras.pkl
        |   ├── fps50
        |   |   ├── h36m_test.npz
        |   |   ├── h36m_train.npz
        |   |   ├── joint2d_rel_stats.pkl
        |   |   ├── joint2d_stats.pkl
        |   |   ├── joint3d_rel_stats.pkl
        |   |   `── joint3d_stats.pkl
        |   `── fps10
        |       ├── h36m_test.npz
        |       ├── h36m_train.npz
        |       ├── joint2d_rel_stats.pkl
        |       ├── joint2d_stats.pkl
        |       ├── joint3d_rel_stats.pkl
        |       `── joint3d_stats.pkl
        `── images
            ├── S1
            |   ├── S1_Directions_1.54138969
            |   |   ├── S1_Directions_1.54138969_00001.jpg
            |   |   ├── S1_Directions_1.54138969_00002.jpg
            |   |   ├── ...
            |   ├── ...
            ├── S5
            ├── S6
            ├── S7
            ├── S8
            ├── S9
            `── S11
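
The relation between the fps50 and fps10 annotation sets can be pictured as keeping every fifth frame of a 50 FPS sequence. The sketch below illustrates that idea only. The helper name downsample is hypothetical, and which frame of each group of five the preprocessing script actually keeps is not stated here, so the offset of zero is an assumption.

```python
def downsample(frames, source_fps=50, target_fps=10):
    """Keep every (source_fps // target_fps)-th frame, starting with the first."""
    if source_fps % target_fps != 0:
        raise ValueError("source_fps must be an integer multiple of target_fps")
    stride = source_fps // target_fps
    return frames[::stride]


frames_50fps = list(range(20))   # stand-in for 20 consecutive frame indices
print(downsample(frames_50fps))  # [0, 5, 10, 15]
```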

After that, the annotations need to be transformed into COCO format which is compatible with MMPose. Please run:

python tools/dataset_converters/h36m_to_coco.py

MPII

MPII (CVPR'2014)

@inproceedings{andriluka14cvpr,
  author = {Mykhaylo Andriluka and Leonid Pishchulin and Peter Gehler and Schiele, Bernt},
  title = {2D Human Pose Estimation: New Benchmark and State of the Art Analysis},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2014},
  month = {June}
}

During training and inference on MPII, prediction results are saved in .mat format by default. We also provide a tool to convert this .mat file to the more readable .json format.

python tools/dataset_converters/mat2json.py ${PRED_MAT_FILE} ${GT_JSON_FILE} ${OUTPUT_PRED_JSON_FILE}

For example,

python tools/dataset_converters/mat2json.py work_dirs/res50_mpii_256x256/pred.mat data/mpii/annotations/mpii_val.json pred.json

Label Studio

Label Studio

@misc{Label Studio,
  title={{Label Studio}: Data labeling software},
  url={https://github.com/heartexlabs/label-studio},
  note={Open source software available from https://github.com/heartexlabs/label-studio},
  author={
    Maxim Tkachenko and
    Mikhail Malyuk and
    Andrey Holmanyuk and
    Nikolai Liubimov},
  year={2020-2022},
}

If you use Label Studio, follow the instructions in the Label Studio to COCO document to annotate your data and export the results as a Label Studio .json file. Also save the Code from the Labeling Interface as an .xml file.

We provide a script to convert Label Studio .json annotation file to COCO .json format file. It can be used by running the following command:

python tools/dataset_converters/labelstudio2coco.py ${LS_XML_FILE} ${LS_JSON_FILE} ${OUTPUT_COCO_JSON_FILE}

For example,

python tools/dataset_converters/labelstudio2coco.py config.xml project-1-at-2023-05-13-09-22-91b53efa.json output/result.json

UBody2D

UBody (CVPR'2023)

@article{lin2023one,
  title={One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer},
  author={Lin, Jing and Zeng, Ailing and Wang, Haoqian and Zhang, Lei and Li, Yu},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023},
}

For the UBody dataset, videos and annotations can be downloaded from the OSX homepage.

Download and extract them under $MMPOSE/data, and make them look like this:

mmpose
├── mmpose
├── docs
├── tests
├── tools
├── configs
`── data
    │── UBody
        ├── annotations
        │   ├── ConductMusic
        │   ├── Entertainment
        │   ├── Fitness
        │   ├── Interview
        │   ├── LiveVlog
        │   ├── Magic_show
        │   ├── Movie
        │   ├── Olympic
        │   ├── Online_class
        │   ├── SignLanguage
        │   ├── Singing
        │   ├── Speech
        │   ├── TVShow
        │   ├── TalkShow
        │   └── VideoConference
        ├── splits
        │   ├── inter_scene_test_list.npy
        │   └── intra_scene_test_list.npy
        ├── videos
        │   ├── ConductMusic
        │   ├── Entertainment
        │   ├── Fitness
        │   ├── Interview
        │   ├── LiveVlog
        │   ├── Magic_show
        │   ├── Movie
        │   ├── Olympic
        │   ├── Online_class
        │   ├── SignLanguage
        │   ├── Singing
        │   ├── Speech
        │   ├── TVShow
        │   ├── TalkShow
        │   └── VideoConference

We provide a script to convert videos to images and split annotations into train/val sets. It can be used by running the following command:

python tools/dataset_converters/ubody_kpts_to_coco.py --data-root ${UBODY_DATA_ROOT}

For example,

python tools/dataset_converters/ubody_kpts_to_coco.py --data-root data/UBody