{"id":324,"date":"2022-12-21T00:52:46","date_gmt":"2022-12-21T00:52:46","guid":{"rendered":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/?page_id=324"},"modified":"2022-12-21T01:01:46","modified_gmt":"2022-12-21T01:01:46","slug":"spring-2022-2","status":"publish","type":"page","link":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/","title":{"rendered":"Spring 2022"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">2D Pose Estimation<\/h2>\n\n\n\n<p>We experimented extensively with OpenPose [1] in both real-world and simulation settings. For the real-world data, we used the Shibuya crossing live video &#8211; a 24\/7 live stream of a traffic intersection. For the simulation setting, we set up the JTA Dataset Mods [2] to hook into the GTA 5 game and collect the dataset. <\/p>\n\n\n\n<p><strong><span style=\"text-decoration: underline\">Experiments and Results on Shibuya videos<\/span><\/strong><\/p>\n\n\n\n<p>Applying OpenPose directly to the test video gives the following result:<\/p>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"1080\" style=\"aspect-ratio: 1920 \/ 1080;\" width=\"1920\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/1.mp4\"><\/video><figcaption>Directly applying OpenPose on test video<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"932\" height=\"463\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-16.png\" alt=\"\" class=\"wp-image-236\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-16.png 932w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-16-300x149.png 300w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-16-768x382.png 768w\" sizes=\"auto, 
(max-width: 767px) 89vw, (max-width: 1000px) 54vw, (max-width: 1071px) 543px, 580px\" \/><\/figure>\n\n\n\n<p>Since people are mainly in the bottom-left and top-right portions of the frame while waiting, and in the middle while crossing the intersection, we can crop the video to these regions and run OpenPose on each crop. The results are as follows:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"506\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-17-1024x506.png\" alt=\"\" class=\"wp-image-237\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-17-1024x506.png 1024w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-17-300x148.png 300w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-17-768x379.png 768w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-17.png 1116w\" sizes=\"auto, (max-width: 767px) 89vw, (max-width: 1000px) 54vw, (max-width: 1071px) 543px, 580px\" \/><figcaption>One solution: Crop and resize<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"360\" style=\"aspect-ratio: 640 \/ 360;\" width=\"640\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/2.mp4\"><\/video><figcaption>Bottom left crop<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"180\" style=\"aspect-ratio: 320 \/ 180;\" width=\"320\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/3.mp4\"><\/video><figcaption>Top right crop<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"360\" style=\"aspect-ratio: 640 \/ 360;\" 
width=\"640\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/4.mp4\"><\/video><figcaption>Center crop<\/figcaption><\/figure>\n\n\n\n<p>We observe that OpenPose performs well for a specific video resolution and scale of people in the frame. We make the following observations:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"490\" height=\"280\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/scale.jpg\" alt=\"\" class=\"wp-image-106\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/scale.jpg 490w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/scale-300x171.jpg 300w\" sizes=\"auto, (max-width: 490px) 100vw, 490px\" \/><figcaption>The relative scale of humans in the video frame<\/figcaption><\/figure>\n\n\n\n<p>As a base requirement, a person needs a relative scale of at least 1\/10 x 1\/16 of the frame (Height x Width) and an absolute scale of at least 59 x 23 pixels. We found that anything smaller than this has low detection fidelity. 
<\/p>\n\n\n\n<p>We also experimented with various zoom levels:<\/p>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"720\" style=\"aspect-ratio: 1280 \/ 720;\" width=\"1280\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/1-1.mp4\"><\/video><figcaption>Directly applying OpenPose on test video<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"360\" style=\"aspect-ratio: 640 \/ 360;\" width=\"640\" controls src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/2-1.mp4\"><\/video><figcaption>2x Zoom<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"240\" style=\"aspect-ratio: 426 \/ 240;\" width=\"426\" controls src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/3-1.mp4\"><\/video><figcaption>3x Zoom<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"180\" style=\"aspect-ratio: 320 \/ 180;\" width=\"320\" controls src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/4-1.mp4\"><\/video><figcaption>4x Zoom<\/figcaption><\/figure>\n\n\n\n<p>We also experimented with combining cropping and zooming:<\/p>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"720\" style=\"aspect-ratio: 1280 \/ 720;\" width=\"1280\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/1-2.mp4\"><\/video><figcaption>Directly applying OpenPose on test video<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"360\" style=\"aspect-ratio: 640 \/ 360;\" width=\"640\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/2-2.mp4\"><\/video><figcaption>Video cropped to bottom left<\/figcaption><\/figure>\n\n\n\n<figure 
class=\"wp-block-video\"><video height=\"180\" style=\"aspect-ratio: 320 \/ 180;\" width=\"320\" controls src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/3-2.mp4\"><\/video><figcaption>4x zoom<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"280\" style=\"aspect-ratio: 432 \/ 280;\" width=\"432\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/4-2.mp4\"><\/video><figcaption>Video cropped to bottom right<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"110\" style=\"aspect-ratio: 188 \/ 110;\" width=\"188\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/5.mp4\"><\/video><figcaption>4x zoom<\/figcaption><\/figure>\n\n\n\n<p>Cropping and zooming the video improves the results. But is this a feasible method? Can we always do this during deployment in our test domain?<\/p>\n\n\n\n<p>Disadvantages:<\/p>\n\n\n\n<p>(1) At low resolution, the model cannot tell objects from people. <\/p>\n\n\n\n<p>(2) Low resolution also makes it hard for models to tell people apart from the background. <\/p>\n\n\n\n<p>(3) Existing models are not trained on \u201cbird\u2019s eye view\u201d data.<\/p>\n\n\n\n<p>One solution is to build our own dataset, which has the following advantages:<\/p>\n\n\n\n<p>(1) Control the number of cameras and their angles.<\/p>\n\n\n\n<p>(2) Define the scene: weather, time of day, etc.<\/p>\n\n\n\n<p>(3) Obtain ground-truth joint keypoints from the physics engine.<\/p>\n\n\n\n<p>We therefore apply the same crop-and-zoom approach to the synthetic data. 
The results are as follows:<\/p>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"400\" style=\"aspect-ratio: 600 \/ 400;\" width=\"600\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/1-3.mp4\"><\/video><figcaption>Test video after zoom and crop to region-of-interest<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"400\" style=\"aspect-ratio: 600 \/ 400;\" width=\"600\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/3-3.mp4\"><\/video><figcaption>OpenPose results<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"312\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-21-1024x312.png\" alt=\"\" class=\"wp-image-246\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-21-1024x312.png 1024w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-21-300x92.png 300w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-21-768x234.png 768w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-21.png 1121w\" sizes=\"auto, (max-width: 767px) 89vw, (max-width: 1000px) 54vw, (max-width: 1071px) 543px, 580px\" \/><figcaption>Left: Ground truth; Right: OpenPose Prediction<\/figcaption><\/figure>\n\n\n\n<p>OpenPose mispredicts some joints even after crop-and-resize. Hence, we cannot use OpenPose off the shelf and need to train our own pose estimation models on the dataset we generate.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">3D Pose Estimation<\/h2>\n\n\n\n<p>1. 
PARE<\/p>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"720\" style=\"aspect-ratio: 1280 \/ 720;\" width=\"1280\" controls loop muted src=\"http:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/2-3.mp4\"><\/video><\/figure>\n\n\n\n<p>2. ROMP<\/p>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"720\" style=\"aspect-ratio: 1280 \/ 720;\" width=\"1280\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/1-5.mp4\"><\/video><\/figure>\n\n\n\n<p>PARE and ROMP fail when camera angles are elevated<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Failure cases:<\/h2>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"935\" height=\"466\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-18.png\" alt=\"\" class=\"wp-image-239\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-18.png 935w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-18-300x150.png 300w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-18-768x383.png 768w\" sizes=\"auto, (max-width: 767px) 89vw, (max-width: 1000px) 54vw, (max-width: 1071px) 543px, 580px\" \/><figcaption>3D pose estimation failure cases<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"1280\" style=\"aspect-ratio: 2288 \/ 1280;\" width=\"2288\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/2-4.mp4\"><\/video><figcaption>People at the top aren&#8217;t detected due to the scale issue mentioned above<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"498\" 
src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-19-1024x498.png\" alt=\"\" class=\"wp-image-241\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-19-1024x498.png 1024w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-19-300x146.png 300w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-19-768x373.png 768w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-19.png 1127w\" sizes=\"auto, (max-width: 767px) 89vw, (max-width: 1000px) 54vw, (max-width: 1071px) 543px, 580px\" \/><figcaption>Applying the crop-and-resize approach to 3D pose estimation models as well<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"304\" style=\"aspect-ratio: 560 \/ 304;\" width=\"560\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/3-4.mp4\"><\/video><figcaption>ROMP applied to the cropped video: results on the top-right crop<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"304\" style=\"aspect-ratio: 512 \/ 304;\" width=\"512\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/4-3.mp4\"><\/video><figcaption>Results after applying super-resolution with SRGAN [3] to the cropped video<\/figcaption><\/figure>\n\n\n\n<p>Even with cropping and super-resolution, the results are still poor. The training datasets are also part of the problem. 
The datasets these models were trained on are as follows:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1003\" height=\"368\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-20.png\" alt=\"\" class=\"wp-image-243\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-20.png 1003w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-20-300x110.png 300w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-20-768x282.png 768w\" sizes=\"auto, (max-width: 767px) 89vw, (max-width: 1000px) 54vw, (max-width: 1071px) 543px, 580px\" \/><figcaption>Challenges with 3D pose estimation<\/figcaption><\/figure>\n\n\n\n<p>Hence, neither ROMP nor PARE performs well in the test domain. We then test the BEV model on the Tepper dataset. <\/p>\n\n\n\n<p>3. 
BEV<\/p>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"1024\" style=\"aspect-ratio: 3584 \/ 1024;\" width=\"3584\" controls loop muted src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/3__.mp4\"><\/video><figcaption>BEV improves the results at this camera angle<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"256\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/dl_0.1-1024x256.jpg\" alt=\"\" class=\"wp-image-150\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/dl_0.1-1024x256.jpg 1024w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/dl_0.1-300x75.jpg 300w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/dl_0.1-768x192.jpg 768w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/dl_0.1-1536x384.jpg 1536w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/dl_0.1-2048x512.jpg 2048w\" sizes=\"auto, (max-width: 767px) 89vw, (max-width: 1000px) 54vw, (max-width: 1071px) 543px, 580px\" \/><figcaption>BEV predicts accurate human meshes in the front and bird&#8217;s-eye views<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"345\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/aa_0.1-1024x345.jpg\" alt=\"\" class=\"wp-image-153\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/aa_0.1-1024x345.jpg 1024w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/aa_0.1-300x101.jpg 300w, 
https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/aa_0.1-768x259.jpg 768w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/aa_0.1-1536x518.jpg 1536w, https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/04\/aa_0.1-2048x691.jpg 2048w\" sizes=\"auto, (max-width: 767px) 89vw, (max-width: 1000px) 54vw, (max-width: 1071px) 543px, 580px\" \/><figcaption>However, BEV still fails when tested on cropped and zoomed images of our GTA 5 dataset<\/figcaption><\/figure>\n\n\n\n<p>Hence, as in the 2D pose estimation case, we still have to train the BEV model on the GTA 5 dataset we create.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">References<\/h2>\n\n\n\n<p>[1] Cao, Zhe, et al. &#8220;OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields.&#8221;&nbsp;<em>CVPR<\/em> 2017.<\/p>\n\n\n\n<p>[2] Fabbri, Matteo, et al. &#8220;Learning to detect and track visible and occluded body joints in a virtual world.&#8221;&nbsp;<em>ECCV<\/em> 2018.<\/p>\n\n\n\n<p>[3] Ledig, Christian, et al. &#8220;Photo-realistic single image super-resolution using a generative adversarial network.&#8221;&nbsp;<em>CVPR<\/em> 2017.<\/p>\n\n\n\n<p>[4] Wang, A., Biswas, A., Admoni, H., &amp; Steinfeld, A. (2022). Towards Rich, Portable, and Large-Scale Pedestrian Data Collection. ArXiv, abs\/2203.01974.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>2D Pose Estimation We experimented extensively with Openpose in both real-world and simulation settings. For the real-world data, we experimented on the Shibuya crossing live video &#8211; a traffic intersection that streams a live video feed 24\/7. 
For the simulation setting, we set up the JTA Dataset Mods [2] to hook on to the GTA &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Spring 2022&#8221;<\/span><\/a><\/p>\n","protected":false},"author":137,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-324","page","type-page","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Spring 2022 - Modeling and Understanding Pedestrian Behavior<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Spring 2022 - Modeling and Understanding Pedestrian Behavior\" \/>\n<meta property=\"og:description\" content=\"2D Pose Estimation We experimented extensively with Openpose in both real-world and simulation settings. For the real-world data, we experimented on the Shibuya crossing live video &#8211; a traffic intersection that streams a live video feed 24\/7. 
For the simulation setting, we set up the JTA Dataset Mods [2] to hook on to the GTA &hellip; Continue reading &quot;Spring 2022&quot;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/\" \/>\n<meta property=\"og:site_name\" content=\"Modeling and Understanding Pedestrian Behavior\" \/>\n<meta property=\"article:modified_time\" content=\"2022-12-21T01:01:46+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-16.png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/spring-2022-2\\\/\",\"url\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/spring-2022-2\\\/\",\"name\":\"Spring 2022 - Modeling and Understanding Pedestrian 
Behavior\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/spring-2022-2\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/spring-2022-2\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/wp-content\\\/uploads\\\/sites\\\/66\\\/2022\\\/12\\\/image-16.png\",\"datePublished\":\"2022-12-21T00:52:46+00:00\",\"dateModified\":\"2022-12-21T01:01:46+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/spring-2022-2\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/spring-2022-2\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/spring-2022-2\\\/#primaryimage\",\"url\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/wp-content\\\/uploads\\\/sites\\\/66\\\/2022\\\/12\\\/image-16.png\",\"contentUrl\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/wp-content\\\/uploads\\\/sites\\\/66\\\/2022\\\/12\\\/image-16.png\",\"width\":932,\"height\":463},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/spring-2022-2\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Spring 2022\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/#website\",\"url\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/\",\"name\":\"Modeling and Understanding Pedestrian Behavior\",\"description\":\"Students: Adithya Sampath, Mu Chien Hsu | Advisors: Laszlo Jeni (CMU) and Koichiro Niinuma (Fujitsu) | 
Industry Sponsor: Fujitsu |\",\"publisher\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/#organization\",\"name\":\"Modeling and Understanding Pedestrian Behavior\",\"url\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/wp-content\\\/uploads\\\/sites\\\/66\\\/2022\\\/05\\\/cropped-download.png\",\"contentUrl\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/wp-content\\\/uploads\\\/sites\\\/66\\\/2022\\\/05\\\/cropped-download.png\",\"width\":532,\"height\":250,\"caption\":\"Modeling and Understanding Pedestrian Behavior\"},\"image\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2022team11\\\/#\\\/schema\\\/logo\\\/image\\\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Spring 2022 - Modeling and Understanding Pedestrian Behavior","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/","og_locale":"en_US","og_type":"article","og_title":"Spring 2022 - Modeling and Understanding Pedestrian Behavior","og_description":"2D Pose Estimation We experimented extensively with Openpose in both real-world and simulation settings. 
For the real-world data, we experimented on the Shibuya crossing live video &#8211; a traffic intersection that streams a live video feed 24\/7. For the simulation setting, we set up the JTA Dataset Mods [2] to hook on to the GTA &hellip; Continue reading \"Spring 2022\"","og_url":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/","og_site_name":"Modeling and Understanding Pedestrian Behavior","article_modified_time":"2022-12-21T01:01:46+00:00","og_image":[{"url":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-16.png","type":"","width":"","height":""}],"twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/","url":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/","name":"Spring 2022 - Modeling and Understanding Pedestrian Behavior","isPartOf":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/#website"},"primaryImageOfPage":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/#primaryimage"},"image":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/#primaryimage"},"thumbnailUrl":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-16.png","datePublished":"2022-12-21T00:52:46+00:00","dateModified":"2022-12-21T01:01:46+00:00","breadcrumb":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/#primaryimage","url":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-16.png","contentUrl":"https:\/\/mscvprojects.ri.cmu.edu\
/2022team11\/wp-content\/uploads\/sites\/66\/2022\/12\/image-16.png","width":932,"height":463},{"@type":"BreadcrumbList","@id":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/spring-2022-2\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/"},{"@type":"ListItem","position":2,"name":"Spring 2022"}]},{"@type":"WebSite","@id":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/#website","url":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/","name":"Modeling and Understanding Pedestrian Behavior","description":"Students: Adithya Sampath, Mu Chien Hsu | Advisors: Laszlo Jeni (CMU) and Koichiro Niinuma (Fujitsu) | Industry Sponsor: Fujitsu |","publisher":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/#organization","name":"Modeling and Understanding Pedestrian Behavior","url":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/#\/schema\/logo\/image\/","url":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/05\/cropped-download.png","contentUrl":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-content\/uploads\/sites\/66\/2022\/05\/cropped-download.png","width":532,"height":250,"caption":"Modeling and Understanding Pedestrian 
Behavior"},"image":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/#\/schema\/logo\/image\/"}}]}},"_links":{"self":[{"href":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-json\/wp\/v2\/pages\/324","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-json\/wp\/v2\/users\/137"}],"replies":[{"embeddable":true,"href":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-json\/wp\/v2\/comments?post=324"}],"version-history":[{"count":2,"href":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-json\/wp\/v2\/pages\/324\/revisions"}],"predecessor-version":[{"id":343,"href":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-json\/wp\/v2\/pages\/324\/revisions\/343"}],"wp:attachment":[{"href":"https:\/\/mscvprojects.ri.cmu.edu\/2022team11\/wp-json\/wp\/v2\/media?parent=324"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}