{"id":14479,"date":"2025-08-14T18:49:00","date_gmt":"2025-08-14T17:49:00","guid":{"rendered":"https:\/\/www.keris-studio.fr\/blog\/?p=14479"},"modified":"2025-08-22T10:33:28","modified_gmt":"2025-08-22T09:33:28","slug":"comfyui-ai-for-architecture-case-study-02-using-gemini-to-compose","status":"publish","type":"post","link":"https:\/\/www.keris-studio.fr\/blog\/?p=14479","title":{"rendered":"COMFYUI &#8211; AI\/Archi02\u00a0: Gemini to compose"},"content":{"rendered":"\n<p><\/p>\n\n\n<h1>Task<\/h1>\n<p><strong>ComfyUI Architectural Tutorial: Photorealistic Rendering with the Gemini 2.0 Flash Model<\/strong><\/p>\n<p>In the world of architectural design, speed and flexibility are essential. With the integration of multimodal models like Gemini 2.0 Flash into ComfyUI, it is now possible to merge visual elements, manipulate them with text, and achieve photorealistic renderings in just a few clicks.<\/p>\n<p>This tutorial explores a unique workflow that uses Gemini 2.0 Flash to combine multiple images and a 3D model into a coherent and detailed render. 
<a href=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/empty_00001_.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-14488\" src=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/empty_00001_.png\" alt=\"\" width=\"1920\" height=\"1280\" srcset=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/empty_00001_.png 1920w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/empty_00001_-300x200.png 300w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/empty_00001_-1024x683.png 1024w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/empty_00001_-768x512.png 768w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/empty_00001_-1536x1024.png 1536w\" sizes=\"auto, (max-width: 1920px) 100vw, 1920px\" \/><\/a><!--more--><\/p>\n<p><em>Disclaimer: Image generation with gemini-2.0-flash-preview is currently unavailable in several countries in Europe, the Middle East, and Africa. A VPN can still be used to work around the restriction.<\/em><\/p>\n<p><strong>The Step-by-Step Workflow<\/strong><\/p>\n<p>Here is an overview of the workflow. This process is linear and aims to merge information from different sources to create a final render.<\/p>\n<h2><strong>Step 1: Loading Visual Sources<\/strong><\/h2>\n<p>The first step is to load all the elements we want to combine for our render.<\/p>\n<ul>\n<li><strong>Load3D (image 1)<\/strong>: This node is an innovative starting point. It loads a 3D model (in this case, an .obj file named MAISON-RHI-EXPORT2textures.obj). It generates an image from this 3D model based on defined camera and rendering settings. This is the foundation of our architectural scene. The output of this node will be used as our \u00ab\u00a0image 1\u00a0\u00bb.<\/li>\n<li><strong>LoadImage (image 2)<\/strong>: This node loads a standard image. In this workflow, the livingroom02.JPG image is imported. 
It likely contains the furniture we want to integrate into our scene.<\/li>\n<li><strong>LoadImage (image 3)<\/strong>: Another LoadImage node is used to import a third image (group-of-men-png-pluspng-group-of-men.png). This is a group of people who will be added to the final render.<\/li>\n<\/ul>\n<p><a href=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/02a.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-14484\" src=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/02a.png\" alt=\"\" width=\"1999\" height=\"516\" srcset=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/02a.png 1999w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/02a-300x77.png 300w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/02a-1024x264.png 1024w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/02a-768x198.png 768w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/02a-1536x396.png 1536w\" sizes=\"auto, (max-width: 1999px) 100vw, 1999px\" \/><\/a> <a href=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/02b.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-14485\" src=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/02b.png\" alt=\"\" width=\"2020\" height=\"839\" srcset=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/02b.png 2020w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/02b-300x125.png 300w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/02b-1024x425.png 1024w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/02b-768x319.png 768w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/02b-1536x638.png 1536w\" sizes=\"auto, (max-width: 2020px) 100vw, 2020px\" \/><\/a><\/p>\n<h2><strong>Step 2: Preparing the Input for Gemini<\/strong><\/h2>\n<p>The Gemini 2.0 Flash model is multimodal and capable of processing multiple images at once.<\/p>\n<ul>\n<li><strong>MultiImagesInput<\/strong>: This node is a preprocessor that gathers our source images. 
It takes the outputs of the Load3D node and the two LoadImage nodes (image 1, image 2, and image 3) and combines them into a single input for the Gemini model. This allows the model to understand the context of each image and use them together.<\/li>\n<\/ul>\n<h2><strong>Step 3: The Gemini Artificial Intelligence<\/strong><\/h2>\n<p>This is where the magic happens. The GeminiFlash node is the core of the workflow, where the image generation takes place.<\/p>\n<ul>\n<li><strong>GeminiFlash<\/strong>: This node receives the set of images we prepared in the previous step. It also takes a detailed text <strong>prompt<\/strong> as input to guide the generation.\n<ul>\n<li><strong>The prompt<\/strong>: \u00ab\u00a0Put the furnitures of image_2 into the room of image_1, match perspective and light, add plants and images on the walls, add the group of guys from image_3 sitting in the sofas\u00a0\u00bb.<\/li>\n<li><strong>The model<\/strong>: The specified model is gemini-2.0-flash-exp-image-generation, which is a fast and powerful experimental version for image generation.<\/li>\n<li><strong>The output<\/strong>: This node generates a final image, which is the result of merging all visual sources according to the prompt&rsquo;s instructions. 
It also provides a text message that confirms the success of the generation and recalls the parameters used.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><a href=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/image-51.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-14483\" src=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/image-51.png\" alt=\"\" width=\"369\" height=\"440\" srcset=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/image-51.png 369w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/image-51-252x300.png 252w\" sizes=\"auto, (max-width: 369px) 100vw, 369px\" \/><\/a><\/p>\n<h2><strong>Step 4: Visualization and Saving<\/strong><\/h2>\n<p>The final step is to display the result and save it.<\/p>\n<ul>\n<li><strong>PreviewImage<\/strong>: This node displays the image generated by the Gemini model, allowing you to view it directly in ComfyUI.<\/li>\n<li><strong>ShowText|pysssss<\/strong>: This node displays the text confirmation message from the Gemini node, which is useful for checking that everything went smoothly.<\/li>\n<\/ul>\n<p><strong>Tips for Architecture with Gemini Flash<\/strong><\/p>\n<ul>\n<li><strong>Detailed Prompts<\/strong>: The power of this workflow lies in the precision of your prompt. 
Describe exactly how the elements should interact with each other and with the environment.<\/li>\n<li><strong>Quality of Sources<\/strong>: The better the quality of your images and 3D model, the more impressive the final render will be.<\/li>\n<li><strong>API Key<\/strong>: Note that this type of workflow requires an API key to access Gemini services, which must be configured in the GeminiFlash node.<\/li>\n<\/ul>\n<p>3 images input<\/p>\n<h1><a href=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/02c.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-14487\" src=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/02c.png\" alt=\"\" width=\"1942\" height=\"883\" srcset=\"https:\/\/www.keris-studio.fr\/blog\/wp-content\/02c.png 1942w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/02c-300x136.png 300w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/02c-1024x466.png 1024w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/02c-768x349.png 768w, https:\/\/www.keris-studio.fr\/blog\/wp-content\/02c-1536x698.png 1536w\" sizes=\"auto, (max-width: 1942px) 100vw, 1942px\" \/><\/a><\/h1>\n<h1>Traduction Fran\u00e7aise<\/h1>\n<p><strong>Tutoriel ComfyUI pour l&rsquo;Architecture : Rendu Photographique avec le Mod\u00e8le Gemini 2.0 Flash<\/strong><\/p>\n<p>Dans le monde de la conception architecturale, la rapidit\u00e9 et la flexibilit\u00e9 sont essentielles. Avec l&rsquo;int\u00e9gration de mod\u00e8les multimodaux comme Gemini 2.0 Flash dans ComfyUI, il est d\u00e9sormais possible de fusionner des \u00e9l\u00e9ments visuels, de les manipuler avec du texte et d&rsquo;obtenir des rendus photor\u00e9alistes en quelques clics.<\/p>\n<p>Ce tutoriel explore un workflow unique qui utilise Gemini 2.0 Flash pour combiner plusieurs images et un mod\u00e8le 3D en un rendu coh\u00e9rent et d\u00e9taill\u00e9.<\/p>\n<p><strong>Le Workflow \u00c9tape par \u00c9tape<\/strong><\/p>\n<p>Voici un aper\u00e7u du workflow. 
Ce processus est lin\u00e9aire et vise \u00e0 fusionner des informations de diff\u00e9rentes sources pour cr\u00e9er un rendu final.<\/p>\n<p><strong>\u00c9tape 1 : Chargement des Sources Visuelles<\/strong><\/p>\n<p>La premi\u00e8re \u00e9tape consiste \u00e0 charger tous les \u00e9l\u00e9ments que nous souhaitons combiner pour notre rendu.<\/p>\n<ul>\n<li><strong>Load3D (image 1)<\/strong> : Ce n\u0153ud est un point de d\u00e9part innovant. Il charge un mod\u00e8le 3D (ici, un fichier .obj appel\u00e9 MAISON-RHI-EXPORT2textures.obj). Il g\u00e9n\u00e8re une image \u00e0 partir de ce mod\u00e8le 3D selon des param\u00e8tres de cam\u00e9ra et de rendu d\u00e9finis. C&rsquo;est le fondement de notre sc\u00e8ne architecturale. La sortie de ce n\u0153ud sera utilis\u00e9e comme notre \u00ab\u00a0image 1\u00a0\u00bb.<\/li>\n<li><strong>LoadImage (image 2)<\/strong> : Ce n\u0153ud charge une image standard. Dans ce workflow, l&rsquo;image livingroom02.JPG est import\u00e9e. Elle contient probablement les meubles que nous voulons int\u00e9grer dans notre sc\u00e8ne.<\/li>\n<li><strong>LoadImage (image 3)<\/strong> : Un autre n\u0153ud LoadImage est utilis\u00e9 pour importer une troisi\u00e8me image (group-of-men-png-pluspng-group-of-men.png). Il s&rsquo;agit d&rsquo;un groupe de personnes qui seront ajout\u00e9es au rendu final.<\/li>\n<\/ul>\n<p><strong>\u00c9tape 2 : Pr\u00e9paration de l&rsquo;Entr\u00e9e pour Gemini<\/strong><\/p>\n<p>Le mod\u00e8le Gemini 2.0 Flash est multimodal et capable de traiter plusieurs images en m\u00eame temps.<\/p>\n<ul>\n<li><strong>MultiImagesInput<\/strong> : Ce n\u0153ud est un pr\u00e9processeur qui rassemble nos images sources. Il prend les sorties des trois n\u0153uds LoadImage (image 1, image 2, et image 3) et les combine en une seule entr\u00e9e pour le mod\u00e8le Gemini. 
C&rsquo;est ce qui permet au mod\u00e8le de comprendre le contexte de chaque image et de les utiliser ensemble.<\/li>\n<\/ul>\n<p><strong>\u00c9tape 3 : L&rsquo;Intelligence Artificielle de Gemini<\/strong><\/p>\n<p>C&rsquo;est ici que la magie op\u00e8re. Le n\u0153ud GeminiFlash est le c\u0153ur du workflow, l\u00e0 o\u00f9 la g\u00e9n\u00e9ration d&rsquo;images se produit.<\/p>\n<ul>\n<li><strong>GeminiFlash<\/strong> : Ce n\u0153ud re\u00e7oit l&rsquo;ensemble des images que nous avons pr\u00e9par\u00e9 \u00e0 l&rsquo;\u00e9tape pr\u00e9c\u00e9dente. Il prend \u00e9galement en entr\u00e9e un <strong>prompt<\/strong> textuel tr\u00e8s d\u00e9taill\u00e9 qui guide la g\u00e9n\u00e9ration.\n<ul>\n<li><strong>Le prompt<\/strong> : \u00ab\u00a0Put the furnitures of image_2 into the room of image_1, match perspective and light, add plants and images on the walls, add the group of guys from image_3 sitting in the sofas\u00a0\u00bb (Mettez les meubles de l&rsquo;image 2 dans la pi\u00e8ce de l&rsquo;image 1, faites correspondre la perspective et la lumi\u00e8re, ajoutez des plantes et des tableaux aux murs, ajoutez le groupe de gars de l&rsquo;image 3 assis dans les canap\u00e9s).<\/li>\n<li><strong>Le mod\u00e8le<\/strong> : Le mod\u00e8le sp\u00e9cifi\u00e9 est gemini-2.0-flash-exp-image-generation, qui est une version exp\u00e9rimentale rapide et puissante pour la g\u00e9n\u00e9ration d&rsquo;images.<\/li>\n<li><strong>La sortie<\/strong> : Ce n\u0153ud g\u00e9n\u00e8re une image finale, qui est l&rsquo;aboutissement de la fusion de toutes les sources visuelles selon les instructions du prompt. 
Il fournit \u00e9galement un message texte qui confirme le succ\u00e8s de la g\u00e9n\u00e9ration et rappelle les param\u00e8tres utilis\u00e9s.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><strong>\u00c9tape 4 : Visualisation et Sauvegarde<\/strong><\/p>\n<p>La derni\u00e8re \u00e9tape consiste \u00e0 afficher le r\u00e9sultat et \u00e0 le sauvegarder.<\/p>\n<ul>\n<li><strong>PreviewImage<\/strong> : Ce n\u0153ud affiche l&rsquo;image g\u00e9n\u00e9r\u00e9e par le mod\u00e8le Gemini, vous permettant de la visualiser directement dans ComfyUI.<\/li>\n<li><strong>ShowText|pysssss<\/strong> : Ce n\u0153ud affiche le message texte de confirmation du n\u0153ud Gemini, ce qui est utile pour v\u00e9rifier que tout s&rsquo;est bien pass\u00e9.<\/li>\n<\/ul>\n<p><strong>Conseils pour l&rsquo;Architecture avec Gemini Flash<\/strong><\/p>\n<ul>\n<li><strong>Prompts d\u00e9taill\u00e9s :<\/strong> La puissance de ce workflow r\u00e9side dans la pr\u00e9cision de votre prompt. D\u00e9crivez exactement comment les \u00e9l\u00e9ments doivent interagir entre eux et avec l&rsquo;environnement.<\/li>\n<li><strong>Qualit\u00e9 des sources :<\/strong> Plus vos images et votre mod\u00e8le 3D sont de bonne qualit\u00e9, plus le rendu final sera impressionnant.<\/li>\n<li><strong>API Key :<\/strong> Notez que ce type de workflow n\u00e9cessite une cl\u00e9 API pour acc\u00e9der aux services de Gemini, qui doit \u00eatre configur\u00e9e dans le n\u0153ud GeminiFlash.<\/li>\n<\/ul>\n<p>Ce workflow montre une mani\u00e8re tr\u00e8s efficace d&rsquo;utiliser la puissance de la g\u00e9n\u00e9ration multimodale pour le design architectural, en combinant des rendus 3D, des assets (meubles, personnes) et des instructions textuelles pour cr\u00e9er des visualisations riches et complexes.<\/p>","protected":false},"excerpt":{"rendered":"<p>Task ComfyUI Architectural Tutorial: Photorealistic Rendering with the Gemini 2.0 Flash Model In the world of architectural design, speed and flexibility are essential. 
With the integration of multimodal models like Gemini 2.0 Flash into ComfyUI, it is now possible to merge visual elements, manipulate them with text, and achieve photorealistic renderings in just a few &hellip; <a href=\"https:\/\/www.keris-studio.fr\/blog\/?p=14479\" class=\"more-link\">Continuer la lecture de <span class=\"screen-reader-text\">COMFYUI &#8211; AI\/Archi02\u00a0: Gemini to compose<\/span>  <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":14484,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[190,593,14,8],"tags":[57,546,600,611],"class_list":["post-14479","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-architecture-2","category-artificial","category-conception","category-methodologie","tag-architecture","tag-artificial-intelligence","tag-comfyui","tag-gemini"],"_links":{"self":[{"href":"https:\/\/www.keris-studio.fr\/blog\/index.php?rest_route=\/wp\/v2\/posts\/14479","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.keris-studio.fr\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.keris-studio.fr\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.keris-studio.fr\/blog\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.keris-studio.fr\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=14479"}],"version-history":[{"count":3,"href":"https:\/\/www.keris-studio.fr\/blog\/index.php?rest_route=\/wp\/v2\/posts\/14479\/revisions"}],"predecessor-version":[{"id":14635,"href":"https:\/\/www.keris-studio.fr\/blog\/index.php?rest_route=\/wp\/v2\/posts\/14479\/revisions\/14635"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.keris-studio.fr\/blog\/index.php?rest_route=\/wp\/v2\/media\/14484"}],"wp:attachment":[{"href":"http
s:\/\/www.keris-studio.fr\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=14479"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.keris-studio.fr\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=14479"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.keris-studio.fr\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=14479"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}