{"id":154,"date":"2023-05-01T02:07:22","date_gmt":"2023-05-01T02:07:22","guid":{"rendered":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/?page_id=154"},"modified":"2023-05-01T02:31:20","modified_gmt":"2023-05-01T02:31:20","slug":"baseline-2","status":"publish","type":"page","link":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/","title":{"rendered":"Baseline SRT"},"content":{"rendered":"\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-flow wp-block-group-is-layout-flow\">\n<p class=\"has-primary-color has-text-color has-large-font-size\" style=\"line-height:0.3\"><strong>SRT<\/strong><\/p>\n\n\n\n<p>Our current work is inspired by Scene Representation Transformers (SRT), which use data-driven priors to generate novel views from multi-view inputs. Given a set of multi-view inputs and a query pose, SRT generates the image corresponding to that query pose. <\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"794\" height=\"292\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.41.29-PM.png\" alt=\"\" class=\"wp-image-132\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.41.29-PM.png 794w, https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.41.29-PM-300x110.png 300w, https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.41.29-PM-768x282.png 768w\" sizes=\"auto, (max-width: 767px) 89vw, (max-width: 1000px) 54vw, (max-width: 1071px) 543px, 580px\" \/><figcaption class=\"wp-element-caption\">SRT pipeline<\/figcaption><\/figure>\n<\/div>\n\n\n<p>Our baseline model closely resembles SRT, with two differences. 
First, we encode pose features into the image features rather than directly into the images, which lets us use geometric priors when training the transformer. Second, instead of using a query pose to generate a novel view, we use query image features to predict the pose. <\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"784\" height=\"320\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.52.14-PM.png\" alt=\"\" class=\"wp-image-141\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.52.14-PM.png 784w, https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.52.14-PM-300x122.png 300w, https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.52.14-PM-768x313.png 768w\" sizes=\"auto, (max-width: 767px) 89vw, (max-width: 1000px) 54vw, (max-width: 1071px) 543px, 580px\" \/><figcaption class=\"wp-element-caption\">Baseline SRT pipeline<\/figcaption><\/figure>\n<\/div>\n\n\n<p class=\"has-large-font-size\" style=\"line-height:0.3\"><strong>Results<\/strong><\/p>\n\n\n\n<p>We trained the baseline with only three input images. Some results from this experiment are shown below. <\/p>\n\n\n\n<p class=\"has-medium-font-size\" style=\"line-height:0.3\"><strong>Qualitative Results<\/strong><\/p>\n\n\n\n<p>Given three input images with known camera poses and one query image, our model regresses the pose of the query image. 
Note that the query image has a very wide baseline relative to the input views, yet the model still regresses its pose accurately.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"982\" height=\"612\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.57.21-PM.png\" alt=\"\" class=\"wp-image-146\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.57.21-PM.png 982w, https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.57.21-PM-300x187.png 300w, https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.57.21-PM-768x479.png 768w\" sizes=\"auto, (max-width: 767px) 89vw, (max-width: 1000px) 54vw, (max-width: 1071px) 543px, 580px\" \/><figcaption class=\"wp-element-caption\">Visualization of the inputs and outputs for Baseline SRT<\/figcaption><\/figure>\n\n\n\n<p class=\"has-medium-font-size\" style=\"line-height:0.3\"><strong>Quantitative Results<\/strong><\/p>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<p>Evaluated on a subset of the CO3Dv2 dataset, our model achieves an accuracy of approximately 85% despite using only three input views.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"644\" height=\"670\" src=\"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-10.02.49-PM.png\" alt=\"\" class=\"wp-image-147\" srcset=\"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-10.02.49-PM.png 644w, 
https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-10.02.49-PM-288x300.png 288w\" sizes=\"auto, (max-width: 644px) 100vw, 644px\" \/><\/figure>\n<\/div><\/div>\n","protected":false},"excerpt":{"rendered":"<p>SRT Our current work is inspired by Scene representation transformers which use data-driven priors to generate novel views from multi-view inputs.&nbsp;Given a set of multiview inputs and a query pose, the SRT generates the image corresponding to the query pose. Our Baseline model shares a significant resemblance with SRT, except for two distinguishing factors. Firstly, &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Baseline SRT&#8221;<\/span><\/a><\/p>\n","protected":false},"author":153,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-154","page","type-page","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Baseline SRT - 2023 Team 5<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Baseline SRT - 2023 Team 5\" \/>\n<meta property=\"og:description\" content=\"SRT Our current work is inspired by Scene representation transformers which use data-driven priors to generate novel views from multi-view inputs.&nbsp;Given a set of multiview inputs and a query pose, the SRT generates the image corresponding to the query pose. 
Our Baseline model shares a significant resemblance with SRT, except for two distinguishing factors. Firstly, &hellip; Continue reading &quot;Baseline SRT&quot;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/\" \/>\n<meta property=\"og:site_name\" content=\"2023 Team 5\" \/>\n<meta property=\"article:modified_time\" content=\"2023-05-01T02:31:20+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.41.29-PM.png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/baseline-2\\\/\",\"url\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/baseline-2\\\/\",\"name\":\"Baseline SRT - 2023 Team 
5\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/baseline-2\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/baseline-2\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/wp-content\\\/uploads\\\/sites\\\/76\\\/2023\\\/05\\\/Screenshot-2023-04-30-at-9.41.29-PM.png\",\"datePublished\":\"2023-05-01T02:07:22+00:00\",\"dateModified\":\"2023-05-01T02:31:20+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/baseline-2\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/baseline-2\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/baseline-2\\\/#primaryimage\",\"url\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/wp-content\\\/uploads\\\/sites\\\/76\\\/2023\\\/05\\\/Screenshot-2023-04-30-at-9.41.29-PM.png\",\"contentUrl\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/wp-content\\\/uploads\\\/sites\\\/76\\\/2023\\\/05\\\/Screenshot-2023-04-30-at-9.41.29-PM.png\",\"width\":794,\"height\":292},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/baseline-2\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Baseline SRT\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/#website\",\"url\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/\",\"name\":\"2023 Team 5\",\"description\":\"LEARN FOR STRUCTURE FROM 
MOTION\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/mscvprojects.ri.cmu.edu\\\/2023team5\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Baseline SRT - 2023 Team 5","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/","og_locale":"en_US","og_type":"article","og_title":"Baseline SRT - 2023 Team 5","og_description":"SRT Our current work is inspired by Scene representation transformers which use data-driven priors to generate novel views from multi-view inputs.&nbsp;Given a set of multiview inputs and a query pose, the SRT generates the image corresponding to the query pose. Our Baseline model shares a significant resemblance with SRT, except for two distinguishing factors. Firstly, &hellip; Continue reading \"Baseline SRT\"","og_url":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/","og_site_name":"2023 Team 5","article_modified_time":"2023-05-01T02:31:20+00:00","og_image":[{"url":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.41.29-PM.png","type":"","width":"","height":""}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/","url":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/","name":"Baseline SRT - 2023 Team 5","isPartOf":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/#website"},"primaryImageOfPage":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/#primaryimage"},"image":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/#primaryimage"},"thumbnailUrl":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.41.29-PM.png","datePublished":"2023-05-01T02:07:22+00:00","dateModified":"2023-05-01T02:31:20+00:00","breadcrumb":{"@id":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/#primaryimage","url":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.41.29-PM.png","contentUrl":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-content\/uploads\/sites\/76\/2023\/05\/Screenshot-2023-04-30-at-9.41.29-PM.png","width":794,"height":292},{"@type":"BreadcrumbList","@id":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/baseline-2\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/"},{"@type":"ListItem","position":2,"name":"Baseline SRT"}]},{"@type":"WebSite","@id":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/#website","url":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/","name":"2023 Team 5","description":"LEARN FOR STRUCTURE FROM 
MOTION","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-json\/wp\/v2\/pages\/154","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-json\/wp\/v2\/users\/153"}],"replies":[{"embeddable":true,"href":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-json\/wp\/v2\/comments?post=154"}],"version-history":[{"count":5,"href":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-json\/wp\/v2\/pages\/154\/revisions"}],"predecessor-version":[{"id":180,"href":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-json\/wp\/v2\/pages\/154\/revisions\/180"}],"wp:attachment":[{"href":"https:\/\/mscvprojects.ri.cmu.edu\/2023team5\/wp-json\/wp\/v2\/media?parent=154"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}