{"id":402,"date":"2023-12-18T02:16:12","date_gmt":"2023-12-18T02:16:12","guid":{"rendered":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/?page_id=402"},"modified":"2023-12-19T04:33:28","modified_gmt":"2023-12-19T04:33:28","slug":"human-object-interaction-fall-2024","status":"publish","type":"page","link":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/","title":{"rendered":"Human-Object Interaction (Fall 2024)"},"content":{"rendered":"\n<p>Our framework consists of two modules: a human-object interaction synthesis module that generates full-body motion, and a hand-object interaction synthesis module that refines the hand pose and hand-object interaction.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Human-Object Interaction Synthesis<\/strong><\/h3>\n\n\n\n<p>We propose a framework that controls an existing diffusion-based, text-driven human motion synthesis model to perform hand-object interaction through prompt engineering.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"607\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-11-1024x607.png\" alt=\"\" class=\"wp-image-494\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-11-1024x607.png 1024w, https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-11-300x178.png 300w, https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-11-768x455.png 768w, https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-11-1536x911.png 1536w, https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-11.png 1700w\" sizes=\"auto, (max-width: 706px) 89vw, (max-width: 767px) 82vw, 740px\" \/><figcaption class=\"wp-element-caption\">Framework of Our Human-Object Interaction Synthesis 
Module<\/figcaption><\/figure>\n\n\n\n<p>A CLIP text encoder first converts the text prompt into a text embedding, which serves as an input to the human motion synthesis model. An object encoder takes an initial object mesh as input, and its output embedding is used to control the synthesis process of the frozen, pretrained diffusion-based human motion synthesis model.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Hand-Object Contactness Module<\/strong><\/h3>\n\n\n\n<p>Our hand-object contactness module contains two networks: a hand pose CVAE model, which generates the hand pose, and a ContactNet, which predicts the contact region on the object. A contact consistency optimization module then uses the predicted contact region to further refine the hand pose. This framework is inspired by the GraspTTA model mentioned in the related work.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"328\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-12-1024x328.png\" alt=\"\" class=\"wp-image-495\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-12-1024x328.png 1024w, https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-12-300x96.png 300w, https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-12-768x246.png 768w, https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-12-1536x492.png 1536w, https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-12.png 1940w\" sizes=\"auto, (max-width: 706px) 89vw, (max-width: 767px) 82vw, 740px\" \/><figcaption class=\"wp-element-caption\">Framework of Our Hand-Object Contactness Synthesis Module<\/figcaption><\/figure>\n\n\n\n<p>The ContactNet is a ConvNet with text and object point 
cloud as input, and the contact region is predicted as a 2-class (contact\/not-contact) classification task.<\/p>\n\n\n\n<p>The other parts of this framework are still being refined, so they are not described in detail here. We present preliminary results in the Experiments section.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Our framework consists of two modules, a human-object interaction synthesis module to generate the full-body motion and a hand-object interaction synthesis module to refine hand pose and hand-object interaction. Human-Object Interaction Synthesis We propose a framework controlling existing diffusion-based text-driven human motion synthesis model to perform hand-object interaction by prompt engineering. A CLIP text encoder &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Human-Object Interaction (Fall 2024)&#8221;<\/span><\/a><\/p>\n","protected":false},"author":173,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-402","page","type-page","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Human-Object Interaction (Fall 2024) - Human Motion Synthesis<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Human-Object Interaction (Fall 2024) - Human Motion Synthesis\" \/>\n<meta property=\"og:description\" content=\"Our framework consists of two 
modules, a human-object interaction synthesis module to generate the full-body motion and a hand-object interaction synthesis module to refine hand pose and hand-object interaction. Human-Object Interaction Synthesis We propose a framework controlling existing diffusion-based text-driven human motion synthesis model to perform hand-object interaction by prompt engineering. A CLIP text encoder &hellip; Continue reading &quot;Human-Object Interaction (Fall 2024)&quot;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/\" \/>\n<meta property=\"og:site_name\" content=\"Human Motion Synthesis\" \/>\n<meta property=\"article:modified_time\" content=\"2023-12-19T04:33:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-11-1024x607.png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/human-object-interaction-fall-2024\\\/\",\"url\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/human-object-interaction-fall-2024\\\/\",\"name\":\"Human-Object Interaction (Fall 2024) - Human Motion Synthesis\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/human-object-interaction-fall-2024\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/human-object-interaction-fall-2024\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/wp-content\\\/uploads\\\/sites\\\/88\\\/2023\\\/12\\\/image-11-1024x607.png\",\"datePublished\":\"2023-12-18T02:16:12+00:00\",\"dateModified\":\"2023-12-19T04:33:28+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/human-object-interaction-fall-2024\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/human-object-interaction-fall-2024\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/human-object-interaction-fall-2024\\\/#primaryimage\",\"url\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/wp-content\\\/uploads\\\/sites\\\/88\\\/2023\\\/12\\\/image-11.png\",\"contentUrl\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/wp-content\\\/uploads\\\/sites\\\/88\\\/2023\\\/12\\\/image-11.png\",\"width\":1700,\"height\":1008},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11
\\\/human-object-interaction-fall-2024\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Human-Object Interaction (Fall 2024)\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/#website\",\"url\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/\",\"name\":\"Human Motion Synthesis\",\"description\":\"Chih-Chun Yang, Tianhui Cai | Advisor: Fernando De la Torre\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/f23team11\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Human-Object Interaction (Fall 2024) - Human Motion Synthesis","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/","og_locale":"en_US","og_type":"article","og_title":"Human-Object Interaction (Fall 2024) - Human Motion Synthesis","og_description":"Our framework consists of two modules, a human-object interaction synthesis module to generate the full-body motion and a hand-object interaction synthesis module to refine hand pose and hand-object interaction. Human-Object Interaction Synthesis We propose a framework controlling existing diffusion-based text-driven human motion synthesis model to perform hand-object interaction by prompt engineering. 
A CLIP text encoder &hellip; Continue reading \"Human-Object Interaction (Fall 2024)\"","og_url":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/","og_site_name":"Human Motion Synthesis","article_modified_time":"2023-12-19T04:33:28+00:00","og_image":[{"url":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-11-1024x607.png","type":"","width":"","height":""}],"twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/","url":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/","name":"Human-Object Interaction (Fall 2024) - Human Motion Synthesis","isPartOf":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/#website"},"primaryImageOfPage":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/#primaryimage"},"image":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/#primaryimage"},"thumbnailUrl":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-11-1024x607.png","datePublished":"2023-12-18T02:16:12+00:00","dateModified":"2023-12-19T04:33:28+00:00","breadcrumb":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/#primaryimage","url":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/sites\/88\/2023\/12\/image-11.png","contentUrl":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-content\/uploads\/s
ites\/88\/2023\/12\/image-11.png","width":1700,"height":1008},{"@type":"BreadcrumbList","@id":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/human-object-interaction-fall-2024\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/"},{"@type":"ListItem","position":2,"name":"Human-Object Interaction (Fall 2024)"}]},{"@type":"WebSite","@id":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/#website","url":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/","name":"Human Motion Synthesis","description":"Chih-Chun Yang, Tianhui Cai | Advisor: Fernando De la Torre","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-json\/wp\/v2\/pages\/402","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-json\/wp\/v2\/users\/173"}],"replies":[{"embeddable":true,"href":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-json\/wp\/v2\/comments?post=402"}],"version-history":[{"count":4,"href":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-json\/wp\/v2\/pages\/402\/revisions"}],"predecessor-version":[{"id":517,"href":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-json\/wp\/v2\/pages\/402\/revisions\/517"}],"wp:attachment":[{"href":"https:\/\/mscvprojects.ri.cmu.edu\/f23team11\/wp-json\/wp\/v2\/media?parent=402"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}