This chapter introduces the overall framework of MMPose and provides links to detailed tutorials.

What is MMPose


MMPose is a PyTorch-based open-source toolkit for pose estimation and a member of the OpenMMLab project. It contains a rich set of algorithms for 2D multi-person human pose estimation, 2D hand pose estimation, 2D face landmark detection, 133-keypoint whole-body human pose estimation, fashion landmark detection and animal pose estimation, as well as related components and modules. Its overall framework is shown below.

MMPose consists of 8 main components:

  • apis provides high-level APIs for model inference

  • structures provides data structures like bbox, keypoint and PoseDataSample

  • datasets supports various datasets for pose estimation

    • transforms contains many useful data augmentation transforms

  • codecs provides pose encoders and decoders: an encoder encodes poses (mostly keypoints) into learning targets (e.g. heatmaps), and a decoder decodes model outputs into pose predictions

  • models provides all components of pose estimation models in a modular structure

    • pose_estimators defines all pose estimation model classes

    • data_preprocessors is for preprocessing the input data of the model

    • backbones provides a collection of backbone networks

    • necks contains various neck modules

    • heads contains various prediction heads that perform pose estimation

    • losses contains various loss functions

  • engine provides runtime components related to pose estimation

    • hooks provides various hooks of the runner

  • evaluation provides metrics for evaluating model performance

  • visualization is for visualizing skeletons, heatmaps and other information
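
To make the role of codecs more concrete, here is a minimal sketch of the encode/decode round trip described above, written with plain NumPy. The function names `encode_keypoints` and `decode_heatmaps` are hypothetical and are not part of the MMPose API. The sketch also assumes the keypoints are already given in heatmap pixel coordinates. Real codecs typically also handle coordinate scaling between image and heatmap space, keypoint visibility, and sub-pixel refinement.

```python
import numpy as np


def encode_keypoints(keypoints, heatmap_size, sigma=2.0):
    """Encode (K, 2) keypoints (x, y) into K Gaussian heatmaps of shape (K, H, W).

    Hypothetical helper for illustration; assumes the keypoints are already
    expressed in heatmap pixel coordinates.
    """
    width, height = heatmap_size
    xs = np.arange(width, dtype=np.float32)            # shape (W,)
    ys = np.arange(height, dtype=np.float32)[:, None]  # shape (H, 1)
    heatmaps = np.zeros((len(keypoints), height, width), dtype=np.float32)
    for k, (x, y) in enumerate(keypoints):
        # Broadcasting (W,) against (H, 1) yields an (H, W) Gaussian centred on (x, y).
        heatmaps[k] = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return heatmaps


def decode_heatmaps(heatmaps):
    """Decode (K, H, W) heatmaps into (K, 2) keypoints and (K,) scores via argmax."""
    num_keypoints, _, width = heatmaps.shape
    flat = heatmaps.reshape(num_keypoints, -1)
    indices = flat.argmax(axis=1)
    scores = flat.max(axis=1)
    # A flat index equals y * width + x, so recover (x, y) with modulo and division.
    keypoints = np.stack([indices % width, indices // width], axis=1)
    return keypoints.astype(np.float32), scores


# Round trip: two keypoints on a heatmap that is 48 wide and 64 high.
keypoints = np.array([[10.0, 20.0], [30.0, 5.0]], dtype=np.float32)
heatmaps = encode_keypoints(keypoints, heatmap_size=(48, 64))
decoded, scores = decode_heatmaps(heatmaps)
```

In this sketch the encoder output plays the role of the learning target for a heatmap-based head, and the decoder is what would be applied to the model output at inference time to obtain keypoint predictions and confidence scores.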

How to Use this Guide

We have prepared detailed guidelines for all types of users:

