T2M-GPT+:Text-to-Motion Generation with Discrete Representations and Large Language Models
Jianrong Zhang1,*
Yangsong Zhang2,*
Xiaodong Cun3
Xi Shen4
Xiaojian Shen5
Hehe Fan1
Yi Yang1,†
* Equal Contribution
† Corresponding Author
1 Zhejiang University
2 Ant Group
3 Tencent AI Lab
4 Intellindust
5 Jilin University
Visual Results
[1] Zhang et al. Motiondiffuse:
Text-driven human motion generation with diffusion model. arXiv, 2022.
[2] Zhang et al. T2M-GPT: Generating Human Motion from
Textual Descriptions with Discrete Representations. CVPR, 2023.
[3] Tevet et al. Human motion diffusion model. ICLR, 2023.
[4] Chen et al. Executing your Commands via Motion Diffusion in Latent Space. CVPR, 2023.
Text Prompting: a man walks forward slowly, then turns around.
MotionDiffuse [1]
T2M-GPT [2]
MDM [3]
MLD [4]
Ours (T2M-GPT+)
Ours (T2M-GIT+)
Ground-Truth
Text Prompting: person appears to be running in straight line then jumps over something and continues running.
MotionDiffuse [1]
T2M-GPT [2]
MDM [3]
MLD [4]
Ours (T2M-GPT+)
Ours (T2M-GIT+)
Ground-Truth
Text Prompting: a person marches in place while raising its arms, then it takes a break, then it
starts running in place with raised arms.
MotionDiffuse [1]
T2M-GPT [2]
MDM [3]
MLD [4]
Ours (T2M-GPT+)
Ours (T2M-GIT+)
Ground-Truth
Text Prompting: a person walks in a right bend direction.
MotionDiffuse [1]
T2M-GPT [2]
MDM [3]
MLD [4]
Ours (T2M-GPT+)
Ours (T2M-GIT+)
Ground-Truth
Text Prompting: a person walks straight forward, turns, then runs back.
MotionDiffuse [1]
T2M-GPT [2]
MDM [3]
MLD [4]
Ours (T2M-GPT+)
Ours (T2M-GIT+)
Ground-Truth
Text Prompting: a person carries something it their hands and then uses their right hand to
throw it.
MotionDiffuse [1]
T2M-GPT [2]
MDM [3]
MLD [4]
Ours (T2M-GPT+)
Ours (T2M-GIT+)
Ground-Truth
Text Prompting: hands in fighting position while the left foot kicks aggressively up and over.
MotionDiffuse [1]
T2M-GPT [2]
MDM [3]
MLD [4]
Ours (T2M-GPT+)
Ours (T2M-GIT+)
Ground-Truth
Text Prompting: a person does jumping jacks.
MotionDiffuse [1]
T2M-GPT [2]
MDM [3]
MLD [4]
Ours (T2M-GPT+)
Ours (T2M-GIT+)
Ground-Truth
Text Prompting: someone waits a moment and jumps to the right.
MotionDiffuse [1]
T2M-GPT [2]
MDM [3]
MLD [4]
Ours (T2M-GPT+)
Ours (T2M-GIT+)
Ground-Truth
Text Prompting: the person is running over a vault.
MotionDiffuse [1]
T2M-GPT [2]
MDM [3]
MLD [4]
Ours (T2M-GPT+)
Ours (T2M-GIT+)
Ground-Truth
© This webpage was in part inspired from this
template.