Qualitative Result
Text Prompt: “a man does a push-up and then uses his arms to balance himself back to his feet”
Our Result:
![](https://mscvprojects.ri.cmu.edu/f23team11/wp-content/uploads/sites/88/2023/05/mdm_results_1-1024x287.png)
Baseline:
![](https://mscvprojects.ri.cmu.edu/f23team11/wp-content/uploads/sites/88/2023/05/ours_results_1-1024x208.png)
The text prompt here does not appear in the training set and is relatively complex. The baseline method fails to generate the “push-up” motion correctly. In contrast, by using a Large Language Model to further explain “push-up” and “balance himself” to the model, our framework generates the correct “push-up” motion and the subsequent motions.