<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[NastaranAI]]></title><description><![CDATA[A personal exploration of AI that bridges the gap between foundational math, critical research, and production-ready code.]]></description><link>https://blog.nastaran.ai</link><image><url>https://substackcdn.com/image/fetch/$s_!1JfO!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce8012a4-92e4-4e85-931b-5c42446f683b_1280x1280.png</url><title>NastaranAI</title><link>https://blog.nastaran.ai</link></image><generator>Substack</generator><lastBuildDate>Fri, 15 May 2026 01:56:41 GMT</lastBuildDate><atom:link href="https://blog.nastaran.ai/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Nastaran Moghadasi]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[nastaranai@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[nastaranai@substack.com]]></itunes:email><itunes:name><![CDATA[Nastaran Moghadasi]]></itunes:name></itunes:owner><itunes:author><![CDATA[Nastaran Moghadasi]]></itunes:author><googleplay:owner><![CDATA[nastaranai@substack.com]]></googleplay:owner><googleplay:email><![CDATA[nastaranai@substack.com]]></googleplay:email><googleplay:author><![CDATA[Nastaran Moghadasi]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Teaching an AI to Paint Like Monet: A Deep Dive into DCGANs from Scratch]]></title><description><![CDATA[I taught a neural network to paint like Monet. 
A complete walkthrough of building a DCGAN from scratch, covering the math, PyTorch implementation, and training.]]></description><link>https://blog.nastaran.ai/p/how-to-build-dcgan-monet-style-art</link><guid isPermaLink="false">https://blog.nastaran.ai/p/how-to-build-dcgan-monet-style-art</guid><dc:creator><![CDATA[Nastaran Moghadasi]]></dc:creator><pubDate>Mon, 05 Jan 2026 14:02:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!QCv8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Over a weekend, I dove into generative models with a project I&#8217;d been eager to try: teaching a neural network to paint in the style of the great Impressionist, Claude Monet. This post documents my journey, from analyzing Monet&#8217;s iconic style to implementing a Deep Convolutional Generative Adversarial Network (DCGAN) from scratch. 
You can check out the source code on <a href="https://github.com/NastaranMO/dcgan-monet-generation">GitHub</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QCv8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QCv8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!QCv8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png 848w, https://substackcdn.com/image/fetch/$s_!QCv8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!QCv8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QCv8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png" width="1456" height="832" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2945565,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181457622?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QCv8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!QCv8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png 848w, https://substackcdn.com/image/fetch/$s_!QCv8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!QCv8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa964f4bc-9dc3-493a-9a08-01e9e9358a7d_1792x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2>The &#8220;Why&#8221;: Capturing Impressionism with Adversarial Nets</h2><p>The challenge of artistic style transfer is not just about mimicking colors or shapes; it&#8217;s about capturing the essence of an artist&#8217;s technique, their brushstrokes, their use of light, and their compositional tendencies. 
Monet&#8217;s work, characterized by its soft focus, vibrant yet natural color palettes, and emphasis on light&#8217;s ephemeral qualities, presents a fascinating and complex style for an AI to learn.</p><p>While various techniques for style transfer exist, Generative Adversarial Networks (GANs) are particularly well-suited for this task. A GAN consists of two neural networks, a Generator and a Discriminator, locked in a competitive game. The Generator&#8217;s goal is to create images so realistic that they fool the Discriminator, whose job is to distinguish real images from fake ones. This adversarial process forces the Generator to learn the intricate and subtle patterns of the training data, rather than just creating a superficial copy. </p><div class="pullquote"><p>I chose the DCGAN architecture specifically because its use of convolutional layers is highly effective at learning hierarchical spatial features, which is essential for understanding the composition and texture of paintings.</p></div><h2>Understanding the Canvas: An Exploration of Monet&#8217;s Style</h2><p>Before building the model, I first had to understand my data. 
<a href="https://www.kaggle.com/competitions/gan-getting-started/data">The dataset</a> consists of 300 high-quality digital reproductions of Monet&#8217;s paintings. While the dataset is small, a thorough Exploratory Data Analysis (EDA) revealed a remarkable consistency in Monet&#8217;s style, which I believed a model could learn.</p><p>My analysis of the color profiles showed that Monet&#8217;s paintings have a distinct palette. The pixel distribution histograms revealed a slight bias towards warmer tones, with the blue channel showing the highest variance, likely reflecting his famous water lilies and sky scenes. The color analysis further confirmed this, showing a tendency towards mid-range brightness and a wide range of saturation, which gives his work its characteristic vibrancy.</p><div class="pullquote"><p>After 5,000 epochs, the model was able to generate convincing, novel images that captured the essence of Monet&#8217;s style. </p></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DbaK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce957-62c0-4f91-9bc3-0860c2b4c4ef_2048x1621.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DbaK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce957-62c0-4f91-9bc3-0860c2b4c4ef_2048x1621.png 424w, https://substackcdn.com/image/fetch/$s_!DbaK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce957-62c0-4f91-9bc3-0860c2b4c4ef_2048x1621.png 848w, 
https://substackcdn.com/image/fetch/$s_!DbaK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce957-62c0-4f91-9bc3-0860c2b4c4ef_2048x1621.png 1272w, https://substackcdn.com/image/fetch/$s_!DbaK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce957-62c0-4f91-9bc3-0860c2b4c4ef_2048x1621.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DbaK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce957-62c0-4f91-9bc3-0860c2b4c4ef_2048x1621.png" width="1456" height="1152" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3abce957-62c0-4f91-9bc3-0860c2b4c4ef_2048x1621.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89ea5522-6ba2-4d7e-aa75-be6f0019f1ef_2048x1621.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Bar charts showing color analysis of Monet dataset including brightness, saturation, and hue distribution.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Bar charts showing color analysis of Monet dataset including brightness, saturation, and hue distribution." title="Bar charts showing color analysis of Monet dataset including brightness, saturation, and hue distribution." 
srcset="https://substackcdn.com/image/fetch/$s_!DbaK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce957-62c0-4f91-9bc3-0860c2b4c4ef_2048x1621.png 424w, https://substackcdn.com/image/fetch/$s_!DbaK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce957-62c0-4f91-9bc3-0860c2b4c4ef_2048x1621.png 848w, https://substackcdn.com/image/fetch/$s_!DbaK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce957-62c0-4f91-9bc3-0860c2b4c4ef_2048x1621.png 1272w, https://substackcdn.com/image/fetch/$s_!DbaK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3abce957-62c0-4f91-9bc3-0860c2b4c4ef_2048x1621.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>Figure 1: Color analysis of the Monet dataset, showing brightness, saturation, and dominant color palettes.</em></figcaption></figure></div><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Oxxi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc9bec1-7e09-4b56-baeb-d606ca93c0e7_2048x1622.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Oxxi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc9bec1-7e09-4b56-baeb-d606ca93c0e7_2048x1622.png 424w, https://substackcdn.com/image/fetch/$s_!Oxxi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc9bec1-7e09-4b56-baeb-d606ca93c0e7_2048x1622.png 848w, https://substackcdn.com/image/fetch/$s_!Oxxi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc9bec1-7e09-4b56-baeb-d606ca93c0e7_2048x1622.png 1272w, https://substackcdn.com/image/fetch/$s_!Oxxi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc9bec1-7e09-4b56-baeb-d606ca93c0e7_2048x1622.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Oxxi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc9bec1-7e09-4b56-baeb-d606ca93c0e7_2048x1622.png" width="1456" height="1153" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9cc9bec1-7e09-4b56-baeb-d606ca93c0e7_2048x1622.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/892f1bd9-bf88-40b5-b3ce-6e69936b23e6_2048x1622.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1153,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;RGB pixel value histograms comparing red, green, and blue channel distributions in Monet paintings.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="RGB pixel value histograms comparing red, green, and blue channel distributions in Monet paintings." title="RGB pixel value histograms comparing red, green, and blue channel distributions in Monet paintings." 
srcset="https://substackcdn.com/image/fetch/$s_!Oxxi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc9bec1-7e09-4b56-baeb-d606ca93c0e7_2048x1622.png 424w, https://substackcdn.com/image/fetch/$s_!Oxxi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc9bec1-7e09-4b56-baeb-d606ca93c0e7_2048x1622.png 848w, https://substackcdn.com/image/fetch/$s_!Oxxi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc9bec1-7e09-4b56-baeb-d606ca93c0e7_2048x1622.png 1272w, https://substackcdn.com/image/fetch/$s_!Oxxi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cc9bec1-7e09-4b56-baeb-d606ca93c0e7_2048x1622.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><em>Figure 2: RGB pixel value distributions across a sample of Monet&#8217;s paintings.</em></figcaption></figure></div><p>This initial analysis was crucial. It not only provided a quantitative fingerprint of Monet&#8217;s style but also informed my decisions for preprocessing the data, such as normalizing the images to a range that would be suitable for the GAN&#8217;s activation functions.</p><h2>The Adversarial Dance: The Mathematics of a GAN</h2><p>The competitive dynamic between the Generator (G) and the Discriminator (D) is formalized through a minimax loss function. The goal is to find a balance, a Nash equilibrium, in which the Generator produces images that are indistinguishable from reality and the Discriminator is no better than 50/50 at telling them apart. The value function, V(D, G), is expressed as:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\min_G \\max_D V(D, G) = \\mathbb{E}_{x \\sim p_{data}(x)}[\\log D(x)] + \\mathbb{E}_{z \\sim p_{z}(z)}[\\log(1 - D(G(z)))]&quot;,&quot;id&quot;:&quot;NVYTVFZRRI&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>Let&#8217;s break this down:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\mathbb{E}_{x \\sim p_{data}(x)}[\\log D(x)]&quot;,&quot;id&quot;:&quot;UMUZOWOKNT&quot;}" data-component-name="LatexBlockToDOM"></div><p>This term represents the Discriminator&#8217;s ability to correctly identify real images. The Discriminator&#8217;s output, D(x), is its estimated probability that an image &#8216;x&#8217;, here drawn from the real data distribution, is authentic. 
The Discriminator wants to maximize this term, driving D(x) towards 1 (real).</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\mathbb{E}_{z \\sim p_{z}(z)}[\\log(1 - D(G(z)))]&quot;,&quot;id&quot;:&quot;ULQMCKPVEM&quot;}" data-component-name="LatexBlockToDOM"></div><p>This term represents the Discriminator&#8217;s ability to identify fake images. The Generator creates an image G(z) from a random noise vector &#8216;z&#8217;. The Discriminator then evaluates this fake image. The Generator wants to fool the Discriminator, so it tries to make D(G(z)) as close to 1 as possible, which in turn minimizes this term. The Discriminator, on the other hand, wants to correctly identify the fake image, so it tries to make D(G(z)) as close to 0 as possible, which maximizes this term.</p><p>By training these two networks against each other, the Generator becomes progressively better at capturing the underlying distribution of the real data, which in this case, is the artistic style of Monet.</p><h2>Building the Brushes: The DCGAN Architecture</h2><p>Following the principles of the original DCGAN paper, I designed the Generator and Discriminator networks using PyTorch. 
The key is using transposed convolutions in the Generator to upsample from a random noise vector into a full image, and standard strided convolutions in the Discriminator to downsample an image into a single probability.</p><p>Here is a simplified view of the Generator&#8217;s architecture:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_hNW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ac08656-c729-4ea7-834f-addc8830362e_1460x1218.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_hNW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ac08656-c729-4ea7-834f-addc8830362e_1460x1218.png 424w, https://substackcdn.com/image/fetch/$s_!_hNW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ac08656-c729-4ea7-834f-addc8830362e_1460x1218.png 848w, https://substackcdn.com/image/fetch/$s_!_hNW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ac08656-c729-4ea7-834f-addc8830362e_1460x1218.png 1272w, https://substackcdn.com/image/fetch/$s_!_hNW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ac08656-c729-4ea7-834f-addc8830362e_1460x1218.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_hNW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ac08656-c729-4ea7-834f-addc8830362e_1460x1218.png" width="1456" height="1215" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2ac08656-c729-4ea7-834f-addc8830362e_1460x1218.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1215,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:281506,&quot;alt&quot;:&quot;Code of DCGAN Generator&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181457622?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ac08656-c729-4ea7-834f-addc8830362e_1460x1218.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Code of DCGAN Generator" title="Code of DCGAN Generator" srcset="https://substackcdn.com/image/fetch/$s_!_hNW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ac08656-c729-4ea7-834f-addc8830362e_1460x1218.png 424w, https://substackcdn.com/image/fetch/$s_!_hNW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ac08656-c729-4ea7-834f-addc8830362e_1460x1218.png 848w, https://substackcdn.com/image/fetch/$s_!_hNW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ac08656-c729-4ea7-834f-addc8830362e_1460x1218.png 1272w, https://substackcdn.com/image/fetch/$s_!_hNW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ac08656-c729-4ea7-834f-addc8830362e_1460x1218.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Generator</figcaption></figure></div><p>The architecture uses a 4x4 kernel for the convolutional layers, which is a common choice in DCGANs to learn spatial features effectively. The stride of 2 in the transposed convolutions allows the network to double the spatial dimensions at each layer, effectively upsampling the image. 
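</p><p>As a quick sanity check on that doubling, the standard output-size formula for a transposed convolution (with no dilation or output padding) can be worked through by hand. The kernel size k = 4 and stride s = 2 come from the description above; the padding p = 1 is assumed here purely for illustration, since the text does not state it and the layer code appears only as an image:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;H_{out} = (H_{in} - 1) \\cdot s - 2p + k = 2(H_{in} - 1) - 2 \\cdot 1 + 4 = 2 H_{in}&quot;}" data-component-name="LatexBlockToDOM"></div><p>Under that assumption, a 4x4 feature map would grow to 8x8, then 16x16, and so on, one doubling per layer.</p><p>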
Batch normalization is used after each layer to stabilize training, and the final <code>Tanh</code> activation function squashes the output pixel values to be between -1 and 1, matching the normalization of the input data.</p><p>The Discriminator is essentially a mirror image of the Generator:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!X6Rx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8017f23-138d-4d9c-8f98-9f70b066a914_1460x1174.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!X6Rx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8017f23-138d-4d9c-8f98-9f70b066a914_1460x1174.png 424w, https://substackcdn.com/image/fetch/$s_!X6Rx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8017f23-138d-4d9c-8f98-9f70b066a914_1460x1174.png 848w, https://substackcdn.com/image/fetch/$s_!X6Rx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8017f23-138d-4d9c-8f98-9f70b066a914_1460x1174.png 1272w, https://substackcdn.com/image/fetch/$s_!X6Rx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8017f23-138d-4d9c-8f98-9f70b066a914_1460x1174.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!X6Rx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8017f23-138d-4d9c-8f98-9f70b066a914_1460x1174.png" width="1456" height="1171" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d8017f23-138d-4d9c-8f98-9f70b066a914_1460x1174.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1171,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:263309,&quot;alt&quot;:&quot;Code of DCGAN Discriminator&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181457622?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8017f23-138d-4d9c-8f98-9f70b066a914_1460x1174.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Code of DCGAN Discriminator" title="Code of DCGAN Discriminator" srcset="https://substackcdn.com/image/fetch/$s_!X6Rx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8017f23-138d-4d9c-8f98-9f70b066a914_1460x1174.png 424w, https://substackcdn.com/image/fetch/$s_!X6Rx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8017f23-138d-4d9c-8f98-9f70b066a914_1460x1174.png 848w, https://substackcdn.com/image/fetch/$s_!X6Rx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8017f23-138d-4d9c-8f98-9f70b066a914_1460x1174.png 1272w, https://substackcdn.com/image/fetch/$s_!X6Rx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8017f23-138d-4d9c-8f98-9f70b066a914_1460x1174.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Discriminator</figcaption></figure></div><p>Here, strided convolutions downsample the image at each step. I used <code>LeakyReLU</code> as the activation function, a common practice in GANs: unlike standard ReLU, it keeps a small non-zero gradient for negative inputs, which helps avoid sparse gradients. The final <code>Sigmoid</code> activation outputs a single probability, the network&#8217;s estimate of whether the input image is real or fake.</p><h2>The Artist at Work: Training</h2><p>I trained the DCGAN for 5,000 epochs on my Apple Silicon-based machine. The entire run took just over an hour, which is a testament to the efficiency of the M-series chips for ML workloads of this size. 
Throughout the training, I used <a href="https://wandb.ai/site/">Weights &amp; Biases (WandB)</a> to log the loss metrics and visualize the generated images at different stages. This was invaluable for monitoring the training dynamics.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ElNn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67da31bb-ab41-4b77-be1f-e3b4fedd3d10_2048x1075.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ElNn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67da31bb-ab41-4b77-be1f-e3b4fedd3d10_2048x1075.png 424w, https://substackcdn.com/image/fetch/$s_!ElNn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67da31bb-ab41-4b77-be1f-e3b4fedd3d10_2048x1075.png 848w, https://substackcdn.com/image/fetch/$s_!ElNn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67da31bb-ab41-4b77-be1f-e3b4fedd3d10_2048x1075.png 1272w, https://substackcdn.com/image/fetch/$s_!ElNn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67da31bb-ab41-4b77-be1f-e3b4fedd3d10_2048x1075.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ElNn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67da31bb-ab41-4b77-be1f-e3b4fedd3d10_2048x1075.png" width="1456" height="764" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/67da31bb-ab41-4b77-be1f-e3b4fedd3d10_2048x1075.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48501db3-da2a-489d-9fe4-5dfd9b838689_2048x1075.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Generator vs Discriminator Loss&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Generator vs Discriminator Loss" title="Generator vs Discriminator Loss" srcset="https://substackcdn.com/image/fetch/$s_!ElNn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67da31bb-ab41-4b77-be1f-e3b4fedd3d10_2048x1075.png 424w, https://substackcdn.com/image/fetch/$s_!ElNn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67da31bb-ab41-4b77-be1f-e3b4fedd3d10_2048x1075.png 848w, https://substackcdn.com/image/fetch/$s_!ElNn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67da31bb-ab41-4b77-be1f-e3b4fedd3d10_2048x1075.png 1272w, https://substackcdn.com/image/fetch/$s_!ElNn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67da31bb-ab41-4b77-be1f-e3b4fedd3d10_2048x1075.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Figure 3: The classic adversarial dance. The Generator and Discriminator losses oscillate as they compete and learn from each other.</em></figcaption></figure></div><p>The loss curves show the characteristic oscillatory behavior of a healthy GAN training process. Neither network overpowers the other for too long, indicating a stable learning environment. It was fascinating to watch the generated images evolve from pure noise into something that started to resemble a coherent, Monet-like scene.</p><h2>Results and Reflections</h2><p>After 5,000 epochs, the model was able to generate convincing, novel images that captured the essence of Monet&#8217;s style. 
The generated images exhibit the soft textures, impressionistic light, and characteristic color palettes found in his work.</p><p>Here is a sample of the generated images at different points in the training process:</p><p>Epoch 500</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RA4Q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1bff09b-dac8-4e96-9ca8-87ca8483ee5a_266x134.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RA4Q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1bff09b-dac8-4e96-9ca8-87ca8483ee5a_266x134.png 424w, https://substackcdn.com/image/fetch/$s_!RA4Q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1bff09b-dac8-4e96-9ca8-87ca8483ee5a_266x134.png 848w, https://substackcdn.com/image/fetch/$s_!RA4Q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1bff09b-dac8-4e96-9ca8-87ca8483ee5a_266x134.png 1272w, https://substackcdn.com/image/fetch/$s_!RA4Q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1bff09b-dac8-4e96-9ca8-87ca8483ee5a_266x134.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RA4Q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1bff09b-dac8-4e96-9ca8-87ca8483ee5a_266x134.png" width="404" height="203.5187969924812" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f1bff09b-dac8-4e96-9ca8-87ca8483ee5a_266x134.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:134,&quot;width&quot;:266,&quot;resizeWidth&quot;:404,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Grid of generated images showing the progression of the DCGAN model learning Monet's style from epoch 500.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Grid of generated images showing the progression of the DCGAN model learning Monet's style from epoch 500." title="Grid of generated images showing the progression of the DCGAN model learning Monet's style from epoch 500." srcset="https://substackcdn.com/image/fetch/$s_!RA4Q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1bff09b-dac8-4e96-9ca8-87ca8483ee5a_266x134.png 424w, https://substackcdn.com/image/fetch/$s_!RA4Q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1bff09b-dac8-4e96-9ca8-87ca8483ee5a_266x134.png 848w, https://substackcdn.com/image/fetch/$s_!RA4Q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1bff09b-dac8-4e96-9ca8-87ca8483ee5a_266x134.png 1272w, https://substackcdn.com/image/fetch/$s_!RA4Q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1bff09b-dac8-4e96-9ca8-87ca8483ee5a_266x134.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><em>Figure 4: Evolution of 
generated images over 500 epochs.</em></figcaption></figure></div><p>Epoch 5000</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DrNI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ae3d3ee-dfd6-48d4-9e8b-7e3f94e773a6_266x134.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DrNI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ae3d3ee-dfd6-48d4-9e8b-7e3f94e773a6_266x134.png 424w, https://substackcdn.com/image/fetch/$s_!DrNI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ae3d3ee-dfd6-48d4-9e8b-7e3f94e773a6_266x134.png 848w, https://substackcdn.com/image/fetch/$s_!DrNI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ae3d3ee-dfd6-48d4-9e8b-7e3f94e773a6_266x134.png 1272w, https://substackcdn.com/image/fetch/$s_!DrNI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ae3d3ee-dfd6-48d4-9e8b-7e3f94e773a6_266x134.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DrNI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ae3d3ee-dfd6-48d4-9e8b-7e3f94e773a6_266x134.png" width="380" height="191.42857142857142" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0ae3d3ee-dfd6-48d4-9e8b-7e3f94e773a6_266x134.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:134,&quot;width&quot;:266,&quot;resizeWidth&quot;:380,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Grid of generated images showing the progression of the DCGAN model learning Monet's style from epoch 5000.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Grid of generated images showing the progression of the DCGAN model learning Monet's style from epoch 5000." title="Grid of generated images showing the progression of the DCGAN model learning Monet's style from epoch 5000." srcset="https://substackcdn.com/image/fetch/$s_!DrNI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ae3d3ee-dfd6-48d4-9e8b-7e3f94e773a6_266x134.png 424w, https://substackcdn.com/image/fetch/$s_!DrNI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ae3d3ee-dfd6-48d4-9e8b-7e3f94e773a6_266x134.png 848w, https://substackcdn.com/image/fetch/$s_!DrNI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ae3d3ee-dfd6-48d4-9e8b-7e3f94e773a6_266x134.png 1272w, https://substackcdn.com/image/fetch/$s_!DrNI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ae3d3ee-dfd6-48d4-9e8b-7e3f94e773a6_266x134.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><em>Figure 5: Evolution of 
generated images over 5,000 epochs. The model progresses from noisy patterns to coherent, Monet-style landscapes.</em></figcaption></figure></div><p>This project was a rewarding experience that deepened my understanding of GANs and the practical nuances of training them. It was a powerful reminder that behind the complex mathematics and code, the goal of generative AI is to create, to learn, and, in this case, to even appreciate art in a new way. The full code and a more detailed technical report are available on my <a href="https://github.com/NastaranMO/dcgan-monet-generation/">GitHub repository</a>.</p><p>I hope this post was insightful, and I welcome any questions or discussions. If you&#8217;re working on generative models or exploring computer vision, I&#8217;d be happy to connect and exchange ideas.</p><div><hr></div><h3>References</h3><p>[1] Radford, A., Metz, L., &amp; Chintala, S. (2015). Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. <em>arXiv preprint arXiv:1511.06434</em>.<br>[2] Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., &#8230; &amp; Bengio, Y. (2014). Generative adversarial nets. In <em>Advances in neural information processing systems</em> (pp. 2672-2680).<br>[3] I&#8217;m Something of a Painter Myself. (2021). <em>Kaggle</em>. Retrieved from <a href="https://www.kaggle.com/competitions/gan-getting-started">https://www.kaggle.com/competitions/gan-getting-started</a></p>
]]></content:encoded></item><item><title><![CDATA[Why Google Maps Won't Let Gemini Take the Wheel]]></title><description><![CDATA[Not everything needs a language model. The smartest engineering teams are keeping generative AI away from their core logic! Here is why.]]></description><link>https://blog.nastaran.ai/p/generative-ai-vs-discriminative-models</link><guid isPermaLink="false">https://blog.nastaran.ai/p/generative-ai-vs-discriminative-models</guid><dc:creator><![CDATA[Nastaran Moghadasi]]></dc:creator><pubDate>Tue, 23 Dec 2025 08:32:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UgK1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A few weeks ago, I watched someone try to hammer a screw into a wall! They succeeded, eventually. But the result was ugly, unstable, and took three times longer than it should have. This is roughly what happens when organizations deploy generative AI for problems that don&#8217;t need it.</p><p>The current discourse around AI has a selection bias problem. We hear constantly about what generative models can do, but we rarely hear about when they shouldn&#8217;t be used. This matters because choosing the wrong model architecture isn&#8217;t just inefficient. 
It can be dangerous.</p><p>Let me explain through a product you probably use every day.</p><h2><strong>The Navigation Problem</strong></h2><p>Consider GPS navigation apps like Google Maps or Waze. These systems solve several distinct problems: classifying traffic patterns, recognizing road signs from imagery, predicting travel times, and computing optimal routes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UgK1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UgK1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png 424w, https://substackcdn.com/image/fetch/$s_!UgK1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png 848w, https://substackcdn.com/image/fetch/$s_!UgK1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!UgK1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UgK1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png" width="1456" height="852" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:852,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5978895,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/182390127?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UgK1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png 424w, https://substackcdn.com/image/fetch/$s_!UgK1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png 848w, https://substackcdn.com/image/fetch/$s_!UgK1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!UgK1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d1eac44-5329-4612-ae51-c2d1be9c5036_2625x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Here is a question worth asking. Which of these tasks actually benefits from a generative model?</p><p>To answer this, we need to be precise about what generative and discriminative actually mean.</p><p>A <strong><a href="https://papers.nips.cc/paper/2001/hash/7b7a53e239400a13bd6be6c91c4f6c4e-Abstract.html">discriminative model</a></strong> learns the boundary between categories. Given input <code>x</code>, it models <code>P(y&#8739;x)</code> directly, the probability of output <code>y</code> conditioned on input. 
It asks: <em>given this road segment and current traffic, is the delay 5 minutes or 20 minutes?</em></p><p>A <strong><a href="https://papers.nips.cc/paper/2001/hash/7b7a53e239400a13bd6be6c91c4f6c4e-Abstract.html">generative model</a></strong> learns the joint distribution <code>P(x,y)</code>; a modern large language model, in particular, models the probability of the next token given the preceding context. It can generate new samples that resemble the training distribution. It asks: <em>what would a plausible route description look like?</em></p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;P(x_{t+1} | x_1, ..., x_t)&quot;,&quot;id&quot;:&quot;HVWASBDQWV&quot;}" data-component-name="LatexBlockToDOM"></div><p>The distinction matters because these models have fundamentally different failure modes.</p><h2><strong>Why Discriminative Models Win for Pathfinding</strong></h2><p>When you ask Google Maps for directions to the airport, the blue line on your screen is computed by algorithms that have nothing to do with generative AI. The underlying computation is typically some variant of <a href="https://ieeexplore.ieee.org/document/4082128">A* search</a> or <a href="https://eudml.org/doc/131436">Dijkstra&#8217;s algorithm</a>, potentially augmented by <a href="https://arxiv.org/abs/2108.11482">graph neural networks for traffic prediction</a>.</p><div class="pullquote"><p>This is precisely the kind of problem where we <em>don&#8217;t</em> want creativity. We want the global optimum.</p></div><p>This is the correct architectural choice, and here is why.</p><p><strong>Ground truth constraints are non-negotiable.</strong> When you are driving at 100 km/h, you need the system to discriminate between a drivable road and a pedestrian path. The model must be bound by the physical reality of the road network. 
A generative model might produce a &#8220;plausible-looking&#8221; route that happens to include a road segment that is closed for construction, simply because the segment fits the statistical pattern of routes it has seen, not because it is actually traversable.</p><p><strong>The problem has a well-defined optimum.</strong> Route planning is a graph optimization problem. Given edge weights (travel times, distances, fuel costs), we want the path that minimizes total cost. This is precisely the kind of problem where we don&#8217;t want creativity. We want the global optimum, and we have principled algorithms that can find it efficiently.</p><p><strong>Reliability dominates over delight.</strong> Generative AI excels when surprising outputs are valuable. In creative writing, an unexpected metaphor might be brilliant. In navigation, an unexpected route can mean you are lost in an unfamiliar neighborhood or late for your flight.</p><p>Imagine a generative navigation system. You ask for directions to the airport, and it suggests a route through what it imagines would be a scenic park road. It does this because in its training data, parks and pleasant drives co-occur frequently. The fact that this particular park has no through-road is lost in the statistical averaging.</p><div class="pullquote"><p>The model has optimized for <em>plausibility</em>, not <em>correctness</em>.</p></div><h2><strong>Wait, Didn&#8217;t Google Just Add Gemini to Maps?</strong></h2><p>Yes. And this is exactly where the nuance lives.</p><p>In late 2024, <a href="https://blog.google/products/maps/gemini-navigation-features-landmark-lens/">Google integrated Gemini into Google Maps</a>. The key question is: what is Gemini actually doing?</p><p>The answer reveals how thoughtful architecture works in practice.</p><h3><strong>What Gemini Does</strong></h3><p>Gemini powers the natural language interface and contextual descriptions. 
When the app tells you &#8220;turn right after the blue Thai restaurant,&#8221; that instruction was generated by a <a href="https://www.ibm.com/think/topics/multimodal-llm">multimodal model</a>. It analyzed Street View imagery and synthesized a human-friendly landmark description.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JWJV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ae6f04-0ec0-44dc-b13e-04771a419362_2816x1246.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JWJV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ae6f04-0ec0-44dc-b13e-04771a419362_2816x1246.png 424w, https://substackcdn.com/image/fetch/$s_!JWJV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ae6f04-0ec0-44dc-b13e-04771a419362_2816x1246.png 848w, https://substackcdn.com/image/fetch/$s_!JWJV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ae6f04-0ec0-44dc-b13e-04771a419362_2816x1246.png 1272w, https://substackcdn.com/image/fetch/$s_!JWJV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ae6f04-0ec0-44dc-b13e-04771a419362_2816x1246.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JWJV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ae6f04-0ec0-44dc-b13e-04771a419362_2816x1246.png" width="1456" height="644" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8ae6f04-0ec0-44dc-b13e-04771a419362_2816x1246.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:644,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6003703,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/182390127?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ae6f04-0ec0-44dc-b13e-04771a419362_2816x1246.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JWJV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ae6f04-0ec0-44dc-b13e-04771a419362_2816x1246.png 424w, https://substackcdn.com/image/fetch/$s_!JWJV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ae6f04-0ec0-44dc-b13e-04771a419362_2816x1246.png 848w, https://substackcdn.com/image/fetch/$s_!JWJV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ae6f04-0ec0-44dc-b13e-04771a419362_2816x1246.png 1272w, https://substackcdn.com/image/fetch/$s_!JWJV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ae6f04-0ec0-44dc-b13e-04771a419362_2816x1246.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is genuinely generative work. It involves creating natural language from raw data (GPS coordinates, business names, visual features). The model is <em>generating</em> descriptions that don&#8217;t exist in any database. It is reasoning about what would be most salient to a human driver scanning the street.</p><p>Gemini also powers semantic search. When you type &#8220;cozy cafe with parking near me,&#8221; you are querying with natural language that needs to be interpreted, not just matched against keywords. This is where the world knowledge of a generative model becomes valuable.</p><p>Crucially, this generation is securely tethered to reality through a process called <strong>grounding</strong>. 
As the specific documentation for <a href="https://ai.google.dev/gemini-api/docs/maps-grounding">Grounding with Google Maps</a> explains, when a user&#8217;s query contains geographical context, the Gemini model invokes the Maps API as a source of truth. The model then generates responses grounded in actual Google Maps data relevant to the location, rather than relying solely on its training weights.</p><h3><strong>What Gemini Doesn&#8217;t Do</strong></h3><p>Gemini does not compute your route.</p><p>The actual pathfinding (the blue line, the ETA, the turn-by-turn sequence) is still handled by optimization algorithms and <a href="https://arxiv.org/abs/2108.11482">graph neural networks</a>. These systems take the road network as a constraint rather than a suggestion. They find the minimum-cost path through a graph, with edge weights updated by discriminative models that predict traffic delays.</p><p>The architecture looks something like this:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3CS0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08c03e7e-baf2-4305-b2d4-23b244ef8745_2000x1414.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3CS0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08c03e7e-baf2-4305-b2d4-23b244ef8745_2000x1414.png 424w, https://substackcdn.com/image/fetch/$s_!3CS0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08c03e7e-baf2-4305-b2d4-23b244ef8745_2000x1414.png 848w, 
https://substackcdn.com/image/fetch/$s_!3CS0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08c03e7e-baf2-4305-b2d4-23b244ef8745_2000x1414.png 1272w, https://substackcdn.com/image/fetch/$s_!3CS0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08c03e7e-baf2-4305-b2d4-23b244ef8745_2000x1414.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3CS0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08c03e7e-baf2-4305-b2d4-23b244ef8745_2000x1414.png" width="1456" height="1029" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/08c03e7e-baf2-4305-b2d4-23b244ef8745_2000x1414.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1029,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:154393,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/182390127?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08c03e7e-baf2-4305-b2d4-23b244ef8745_2000x1414.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3CS0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08c03e7e-baf2-4305-b2d4-23b244ef8745_2000x1414.png 424w, 
https://substackcdn.com/image/fetch/$s_!3CS0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08c03e7e-baf2-4305-b2d4-23b244ef8745_2000x1414.png 848w, https://substackcdn.com/image/fetch/$s_!3CS0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08c03e7e-baf2-4305-b2d4-23b244ef8745_2000x1414.png 1272w, https://substackcdn.com/image/fetch/$s_!3CS0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08c03e7e-baf2-4305-b2d4-23b244ef8745_2000x1414.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is a hybrid architecture. It is hybrid for a reason.</p><h2><strong>The Safety Principle</strong></h2><p>Here is the key insight:</p><div class="pullquote"><p>Google uses Gemini for the <em>description</em> but not for the <em>direction</em>.</p></div><p>If Gemini were allowed to be creative with actual routing decisions, it might hallucinate that a pedestrian bridge is a valid vehicle crossing. Visually, bridges that cars use and bridges that only pedestrians use look similar. 
The model has learned associations, not physical constraints.</p><p>By constraining generative models to the interface layer (natural language input/output, landmark descriptions, semantic search) while keeping pathfinding in the realm of constrained optimization, the system gets the benefits of both paradigms.</p><ul><li><p>The generative model makes the experience feel natural and human.</p></li><li><p>The discriminative and optimization models keep you on actual roads.</p></li></ul><p>This isn&#8217;t a limitation of generative AI. It is appropriate scoping.</p><h2><strong>The Broader Principle</strong></h2><p>Navigation is just one example. The same logic applies across domains.</p><p><strong>Medical diagnosis:</strong> You probably want a discriminative model that estimates the probability of a condition given the patient&#8217;s symptoms rather than a generative model that produces plausible-sounding diagnoses. The latter might generate confident text about a condition the patient doesn&#8217;t have.</p><p><strong>Fraud detection:</strong> The goal is to discriminate between legitimate and fraudulent transactions. A generative model might be useful for creating synthetic training data, but the production classifier should be discriminative.</p><p><strong>Structural engineering:</strong> When computing whether a bridge can support a given load, you want physics simulations and finite element analysis. You do not want a model that generates realistic-looking stress distributions.</p><p>The pattern is clear. 
When the problem has a ground truth that must be respected, when there is a well-defined optimum, and when creative outputs would be failures rather than features, discriminative models and traditional optimization often outperform generative approaches.</p><h2><strong>When Generative AI </strong><em><strong>Is</strong></em><strong> the Right Choice</strong></h2><p>To be clear, generative AI is genuinely transformative for many problems.</p><ul><li><p><strong>Open-ended content creation:</strong> writing, art, music, code generation.</p></li><li><p><strong>Natural language interfaces:</strong> making complex systems accessible through conversation.</p></li><li><p><strong>Semantic understanding:</strong> interpreting intent, summarizing documents, answering questions.</p></li><li><p><strong>Synthetic data generation:</strong> creating training examples for rare cases.</p></li><li><p><strong>Exploration and ideation:</strong> when you want novelty and surprise.</p></li></ul><p>The question isn&#8217;t whether generative AI is powerful. It demonstrably is. The question is whether it is appropriate for the specific problem you are solving.</p><h2><strong>The Meta-Point</strong></h2><p>We are in a moment where generative AI is being treated as a universal solution. The commercial pressure to add AI to every product is immense. But good engineering has always been about choosing the right tool for the job.</p><p>The most sophisticated AI systems today are hybrids. They use generative models where creativity and natural language matter, discriminative models where classification accuracy matters, and traditional algorithms where mathematical guarantees matter. The skill is in knowing which is which.</p><p><strong>The next time someone proposes adding a large language model to a system, it is worth asking: what problem are we actually solving, and is a generative model the right tool for it?</strong></p><p>Sometimes the answer is yes. 
And sometimes, you just need a good classifier and a well-tuned <a href="https://eudml.org/doc/131436">Dijkstra implementation</a>.</p><p><em>All images are generated using <a href="https://blog.google/technology/ai/nano-banana-pro/">Nano Banana Pro</a>.</em></p><h2><strong>References</strong></h2><p>Derrow-Pinion, A., She, J., Wong, D., Lange, O., Hester, T., Perez, L., Nunkesser, M., Lee, S., Guo, X., Wiltshire, B., Battaglia, P. W., Gupta, V., Li, A., Xu, Z., Sanchez-Gonzalez, A., and Li, Y. 2021. ETA Prediction with Graph Neural Networks in Google Maps. arXiv preprint arXiv:2108.11482. <a href="https://arxiv.org/abs/2108.11482">https://arxiv.org/abs/2108.11482</a></p><p>Dijkstra, E. W. 1959. A Note on Two Problems in Connexion with Graphs. Numerische Mathematik, 1(1), 269&#8211;271. <a href="https://eudml.org/doc/131436">https://eudml.org/doc/131436</a></p><p>Google. 2024. Grounding with Google Maps. Google AI for Developers. <a href="https://ai.google.dev/gemini-api/docs/maps-grounding">https://ai.google.dev/gemini-api/docs/maps-grounding</a></p><p>Google. 2024. 
New ways to get around with Google Maps, powered by AI. The Keyword (Google Blog). <a href="https://blog.google/products/maps/gemini-navigation-features-landmark-lens/">https://blog.google/products/maps/gemini-navigation-features-landmark-lens/</a></p><p>Hart, P. E., Nilsson, N. J., and Raphael, B. 1968. A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Transactions on Systems Science and Cybernetics, 4(2), 100&#8211;107. <a href="https://ieeexplore.ieee.org/document/4082128">https://ieeexplore.ieee.org/document/4082128</a></p><p>Ng, A. Y. and Jordan, M. I. 2002. On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes. Advances in Neural Information Processing Systems 14. <a href="https://papers.nips.cc/paper/2001/hash/7b7a53e239400a13bd6be6c91c4f6c4e-Abstract.html">https://papers.nips.cc/paper/2001/hash/7b7a53e239400a13bd6be6c91c4f6c4e-Abstract.html</a></p>]]></content:encoded></item><item><title><![CDATA[How AI Knows a Cat Is Like a Dog: An Intuitive Guide to Word Embeddings]]></title><description><![CDATA[From the "King - Man + Woman &#8776; Queen" math of GloVe to the contextual intelligence of BERT, discover how machines finally learned to read between the lines.]]></description><link>https://blog.nastaran.ai/p/understanding-word-embeddings-glove-vs-bert</link><guid isPermaLink="false">https://blog.nastaran.ai/p/understanding-word-embeddings-glove-vs-bert</guid><dc:creator><![CDATA[Nastaran Moghadasi]]></dc:creator><pubDate>Thu, 18 Dec 2025 20:21:43 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/86c79e9f-e402-4a86-a711-7c012f7d7aaf_2269x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Have you ever wondered how a computer knows that a cat is more like a dog than a car? </p><p>To a machine, words are just strings of characters or arbitrary ID numbers. 
But in the world of Natural Language Processing, we&#8217;ve found a way to give words a home in a multi-dimensional space. In this space, a word&#8217;s neighbors are its semantic relatives.</p><p>In this post, we&#8217;ll explore the fascinating world of <strong>word embeddings</strong>. 
We&#8217;ll start with the intuition (no deep technical dives) and build up a clear understanding, with code, of what word embeddings really are and how they enable AI systems to capture meaning and relationships in human language.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!noq3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddcdf563-72b7-49d7-bafe-7399a7a75358_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!noq3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddcdf563-72b7-49d7-bafe-7399a7a75358_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!noq3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddcdf563-72b7-49d7-bafe-7399a7a75358_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!noq3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddcdf563-72b7-49d7-bafe-7399a7a75358_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!noq3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddcdf563-72b7-49d7-bafe-7399a7a75358_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!noq3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddcdf563-72b7-49d7-bafe-7399a7a75358_1536x1024.png" width="725" height="483.3333333333333" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ddcdf563-72b7-49d7-bafe-7399a7a75358_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1536,&quot;resizeWidth&quot;:725,&quot;bytes&quot;:2931309,&quot;alt&quot;:&quot;word embeddings in nlp&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181978295?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3a3d836-8451-45d5-99a4-8b6cb44d8a95_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="word embeddings in nlp" title="word embeddings in nlp" srcset="https://substackcdn.com/image/fetch/$s_!noq3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddcdf563-72b7-49d7-bafe-7399a7a75358_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!noq3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddcdf563-72b7-49d7-bafe-7399a7a75358_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!noq3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddcdf563-72b7-49d7-bafe-7399a7a75358_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!noq3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddcdf563-72b7-49d7-bafe-7399a7a75358_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Word embeddings in NLP, comparing cat, dog, and car in space.</figcaption></figure></div><h2>The Magic of Word Math: Static Embeddings</h2><p>Imagine if you could do math with ideas. The <a href="https://spotintelligence.com/2023/11/27/glove-embedding/">classic example</a> in the world of embeddings is:</p><div class="pullquote"><p><strong>King - Man + Woman &#8776; Queen</strong></p></div><p>This isn&#8217;t just a clever trick! This is the power of <strong>Static Embeddings</strong> like <strong><a href="https://nlp.stanford.edu/projects/glove/">GloVe</a> </strong>(Global Vectors for Word Representation). </p><p>GloVe works by looking at massive amounts of text to see how often words appear near each other. 
It then assigns each word a fixed numerical vector. Because these vectors represent the &#8220;<em>meaning</em>&#8221;, words that are semantically similar end up close together.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Iht!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe25d592-f517-4fbd-b9b1-0fbadc803d61_894x832.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Iht!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe25d592-f517-4fbd-b9b1-0fbadc803d61_894x832.png 424w, https://substackcdn.com/image/fetch/$s_!0Iht!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe25d592-f517-4fbd-b9b1-0fbadc803d61_894x832.png 848w, https://substackcdn.com/image/fetch/$s_!0Iht!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe25d592-f517-4fbd-b9b1-0fbadc803d61_894x832.png 1272w, https://substackcdn.com/image/fetch/$s_!0Iht!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe25d592-f517-4fbd-b9b1-0fbadc803d61_894x832.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Iht!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe25d592-f517-4fbd-b9b1-0fbadc803d61_894x832.png" width="432" height="402.0402684563758" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe25d592-f517-4fbd-b9b1-0fbadc803d61_894x832.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:894,&quot;resizeWidth&quot;:432,&quot;bytes&quot;:267967,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181978295?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe25d592-f517-4fbd-b9b1-0fbadc803d61_894x832.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0Iht!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe25d592-f517-4fbd-b9b1-0fbadc803d61_894x832.png 424w, https://substackcdn.com/image/fetch/$s_!0Iht!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe25d592-f517-4fbd-b9b1-0fbadc803d61_894x832.png 848w, https://substackcdn.com/image/fetch/$s_!0Iht!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe25d592-f517-4fbd-b9b1-0fbadc803d61_894x832.png 1272w, https://substackcdn.com/image/fetch/$s_!0Iht!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe25d592-f517-4fbd-b9b1-0fbadc803d61_894x832.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 
0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">King is closer to queen than man or woman.</figcaption></figure></div><h2>The Bank Problem: When One Vector Isn&#8217;t Enough</h2><p>As powerful as static models like GloVe are, they have a blind spot called <strong>polysemy</strong>: words with multiple meanings.</p><p>Think about the word <strong>&#8220;bank&#8221;</strong>:</p><ol><li><p>I need to go to the <strong>bank</strong> to deposit some money. (A financial institution).</p></li><li><p>We sat on the <strong>bank</strong> of the river. 
(The edge of a river).</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fccT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41aefbd4-141b-450b-a162-82b6baa5712c_2992x1744.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fccT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41aefbd4-141b-450b-a162-82b6baa5712c_2992x1744.png 424w, https://substackcdn.com/image/fetch/$s_!fccT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41aefbd4-141b-450b-a162-82b6baa5712c_2992x1744.png 848w, https://substackcdn.com/image/fetch/$s_!fccT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41aefbd4-141b-450b-a162-82b6baa5712c_2992x1744.png 1272w, https://substackcdn.com/image/fetch/$s_!fccT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41aefbd4-141b-450b-a162-82b6baa5712c_2992x1744.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fccT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41aefbd4-141b-450b-a162-82b6baa5712c_2992x1744.png" width="728" height="424.5" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/41aefbd4-141b-450b-a162-82b6baa5712c_2992x1744.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:849,&quot;width&quot;:1456,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;bank vs river bank meaning&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="bank vs river bank meaning" title="bank vs river bank meaning" srcset="https://substackcdn.com/image/fetch/$s_!fccT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41aefbd4-141b-450b-a162-82b6baa5712c_2992x1744.png 424w, https://substackcdn.com/image/fetch/$s_!fccT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41aefbd4-141b-450b-a162-82b6baa5712c_2992x1744.png 848w, https://substackcdn.com/image/fetch/$s_!fccT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41aefbd4-141b-450b-a162-82b6baa5712c_2992x1744.png 1272w, https://substackcdn.com/image/fetch/$s_!fccT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41aefbd4-141b-450b-a162-82b6baa5712c_2992x1744.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" 
stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Bank vs river bank (two different meanings of one word).</figcaption></figure></div><p>In a static model like GloVe, the word <em>bank</em> has a single, fixed vector. That vector is effectively a blend of every context the model saw during training, so the model can&#8217;t truly distinguish between a place where you keep your savings and the grassy side of a river.</p><h2>The Solution: Contextual Embeddings with BERT</h2><p>This is where <strong>dynamic (contextual) embeddings</strong> like <strong><a href="https://arxiv.org/abs/1810.04805">BERT</a></strong> (Bidirectional Encoder Representations from Transformers) have changed the game. Unlike GloVe, BERT doesn&#8217;t just look up a word in a fixed dictionary. 
It looks at the <strong>entire sentence</strong> to generate a context-specific vector for a word each time it appears.</p><p>When BERT processes our two bank sentences, it takes the surrounding words into account (like &#8220;river&#8221; or &#8220;deposit&#8221;) and generates a different vector for each occurrence. In other words, the context shapes how the word is represented.</p><p>Here is a simple example of using BERT with PyTorch:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3WU8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7c2218-ab70-45eb-ad20-2773cc47bb3c_1440x2046.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3WU8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7c2218-ab70-45eb-ad20-2773cc47bb3c_1440x2046.png 424w, https://substackcdn.com/image/fetch/$s_!3WU8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7c2218-ab70-45eb-ad20-2773cc47bb3c_1440x2046.png 848w, https://substackcdn.com/image/fetch/$s_!3WU8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7c2218-ab70-45eb-ad20-2773cc47bb3c_1440x2046.png 1272w, https://substackcdn.com/image/fetch/$s_!3WU8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7c2218-ab70-45eb-ad20-2773cc47bb3c_1440x2046.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!3WU8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7c2218-ab70-45eb-ad20-2773cc47bb3c_1440x2046.png" width="1440" height="2046" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ad7c2218-ab70-45eb-ad20-2773cc47bb3c_1440x2046.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2046,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:406455,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181978295?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7c2218-ab70-45eb-ad20-2773cc47bb3c_1440x2046.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3WU8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7c2218-ab70-45eb-ad20-2773cc47bb3c_1440x2046.png 424w, https://substackcdn.com/image/fetch/$s_!3WU8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7c2218-ab70-45eb-ad20-2773cc47bb3c_1440x2046.png 848w, https://substackcdn.com/image/fetch/$s_!3WU8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7c2218-ab70-45eb-ad20-2773cc47bb3c_1440x2046.png 1272w, https://substackcdn.com/image/fetch/$s_!3WU8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad7c2218-ab70-45eb-ad20-2773cc47bb3c_1440x2046.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Output</h3><p>The output shows that BERT assigns different vectors to the word <em>bank</em> based on its surrounding context.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w4mh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F395a6360-4b96-4f74-a8bb-ebbdeb439102_1516x544.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!w4mh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F395a6360-4b96-4f74-a8bb-ebbdeb439102_1516x544.png 424w, https://substackcdn.com/image/fetch/$s_!w4mh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F395a6360-4b96-4f74-a8bb-ebbdeb439102_1516x544.png 848w, https://substackcdn.com/image/fetch/$s_!w4mh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F395a6360-4b96-4f74-a8bb-ebbdeb439102_1516x544.png 1272w, https://substackcdn.com/image/fetch/$s_!w4mh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F395a6360-4b96-4f74-a8bb-ebbdeb439102_1516x544.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w4mh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F395a6360-4b96-4f74-a8bb-ebbdeb439102_1516x544.png" width="1456" height="522" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/395a6360-4b96-4f74-a8bb-ebbdeb439102_1516x544.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:522,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:180195,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181978295?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F395a6360-4b96-4f74-a8bb-ebbdeb439102_1516x544.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" 
alt="" srcset="https://substackcdn.com/image/fetch/$s_!w4mh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F395a6360-4b96-4f74-a8bb-ebbdeb439102_1516x544.png 424w, https://substackcdn.com/image/fetch/$s_!w4mh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F395a6360-4b96-4f74-a8bb-ebbdeb439102_1516x544.png 848w, https://substackcdn.com/image/fetch/$s_!w4mh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F395a6360-4b96-4f74-a8bb-ebbdeb439102_1516x544.png 1272w, https://substackcdn.com/image/fetch/$s_!w4mh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F395a6360-4b96-4f74-a8bb-ebbdeb439102_1516x544.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Sentence 1: I went to the bank to deposit money.</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FVEV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6fd96a0-84c6-472e-a007-ca4bbacc4d1b_1518x540.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FVEV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6fd96a0-84c6-472e-a007-ca4bbacc4d1b_1518x540.png 424w, https://substackcdn.com/image/fetch/$s_!FVEV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6fd96a0-84c6-472e-a007-ca4bbacc4d1b_1518x540.png 848w, https://substackcdn.com/image/fetch/$s_!FVEV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6fd96a0-84c6-472e-a007-ca4bbacc4d1b_1518x540.png 1272w, https://substackcdn.com/image/fetch/$s_!FVEV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6fd96a0-84c6-472e-a007-ca4bbacc4d1b_1518x540.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!FVEV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6fd96a0-84c6-472e-a007-ca4bbacc4d1b_1518x540.png" width="1456" height="518" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d6fd96a0-84c6-472e-a007-ca4bbacc4d1b_1518x540.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:518,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:177171,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181978295?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6fd96a0-84c6-472e-a007-ca4bbacc4d1b_1518x540.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FVEV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6fd96a0-84c6-472e-a007-ca4bbacc4d1b_1518x540.png 424w, https://substackcdn.com/image/fetch/$s_!FVEV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6fd96a0-84c6-472e-a007-ca4bbacc4d1b_1518x540.png 848w, https://substackcdn.com/image/fetch/$s_!FVEV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6fd96a0-84c6-472e-a007-ca4bbacc4d1b_1518x540.png 1272w, https://substackcdn.com/image/fetch/$s_!FVEV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6fd96a0-84c6-472e-a007-ca4bbacc4d1b_1518x540.png 1456w" sizes="100vw" 
loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Sentence 2: We sat on the bank of the river.</figcaption></figure></div><h2>Which Model Should You Use?</h2><p>Choosing the right embedding depends entirely on your specific task and your available computational resources.</p><p><strong>Static Embeddings (like GloVe)</strong> are the best choice when you need a fast, computationally lightweight solution with a small memory footprint. 
They are perfect for straightforward tasks like document classification, where the broader meaning of words is usually sufficient.</p><p>On the other hand, <strong>Contextual Embeddings (such as BERT)</strong>&nbsp;are necessary when your task requires a deep understanding of language and ambiguity, such as question answering or advanced chatbots. They excel at handling words with multiple meanings, which is often the key to an application&#8217;s success. However, keep in mind that they require more computational power and a larger memory footprint. </p><h2>Wrapping Up</h2><p>Embeddings are the foundation of how AI reads and processes our human world. Whether you are using a pre-trained model like BERT or building a simple embedding model from scratch using <strong>PyTorch&#8217;s </strong><code>nn.Embedding</code><strong> layer</strong>, you are essentially building a bridge between human thought and machine calculation.</p><p><strong>What do you think?</strong> If you were training a model from scratch today, what specific vocabulary or niche topic would you want it to learn first? Let me know in the comments &#128071;.</p><p>Note: All illustrations in this post were generated using <a href="https://openai.com/index/dall-e-3/">DALL&#183;E</a> 3.</p><h2>Quick Quiz</h2><p>Let&#8217;s test your understanding. Share your answer in the comments &#128071;.</p><div class="pullquote"><p><strong>How does text data differ from image data in machine learning?</strong></p></div><h2>References</h2><ol><li><p>Devlin, J. et al. (2018). <em>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</em>.</p></li><li><p>Stanford NLP Group. <em>GloVe: Global Vectors for Word Representation</em>.</p></li><li><p>Spot Intelligence. 
<em>GloVe Embeddings Explained</em>.</p></li></ol><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.nastaran.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading NastaranAI! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Diamonds or DALL-E? The Luxury Dilemma]]></title><description><![CDATA[Why I deliberately ignored the best model to save a luxury marketplace from AI fakes.]]></description><link>https://blog.nastaran.ai/p/diamonds-or-dall-e-the-luxury-dilemma</link><guid isPermaLink="false">https://blog.nastaran.ai/p/diamonds-or-dall-e-the-luxury-dilemma</guid><dc:creator><![CDATA[Nastaran Moghadasi]]></dc:creator><pubDate>Tue, 16 Dec 2025 09:14:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/ab285609-d740-4a48-9874-c6a9e53cdb62_2248x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Welcome back, AI Explorers!</p><p>In the world of machine learning, we often get obsessed with a single number: <strong>Accuracy</strong>. 
We ask:</p><div class="pullquote"><p>How often is the model right?</p></div><p>But when you are building for the real world, especially where money and trust are on the line, being right most of the time isn&#8217;t enough. Sometimes, <em>how</em> you are wrong matters just as much.</p><p>In my latest project, I tackled a modern problem: distinguishing genuine photographs from AI-generated images. The specific use case? A high-end luxury jewelry marketplace.</p><p>Users upload photos of the expensive pieces they want to sell on the platform: diamond rings, vintage watches, and rare gems. 
But there is a growing problem: Generative AI tools like MidJourney and DALL-E are getting frighteningly good at rendering photorealistic jewelry that <em>doesn&#8217;t exist</em>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oh27!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d58456-5671-4889-a218-9c93367cfed3_2241x1792.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oh27!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d58456-5671-4889-a218-9c93367cfed3_2241x1792.png 424w, https://substackcdn.com/image/fetch/$s_!oh27!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d58456-5671-4889-a218-9c93367cfed3_2241x1792.png 848w, https://substackcdn.com/image/fetch/$s_!oh27!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d58456-5671-4889-a218-9c93367cfed3_2241x1792.png 1272w, https://substackcdn.com/image/fetch/$s_!oh27!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d58456-5671-4889-a218-9c93367cfed3_2241x1792.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oh27!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d58456-5671-4889-a218-9c93367cfed3_2241x1792.png" width="404" height="322.97802197802196" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/08d58456-5671-4889-a218-9c93367cfed3_2241x1792.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1164,&quot;width&quot;:1456,&quot;resizeWidth&quot;:404,&quot;bytes&quot;:6229359,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181679950?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d58456-5671-4889-a218-9c93367cfed3_2241x1792.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oh27!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d58456-5671-4889-a218-9c93367cfed3_2241x1792.png 424w, https://substackcdn.com/image/fetch/$s_!oh27!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d58456-5671-4889-a218-9c93367cfed3_2241x1792.png 848w, https://substackcdn.com/image/fetch/$s_!oh27!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d58456-5671-4889-a218-9c93367cfed3_2241x1792.png 1272w, https://substackcdn.com/image/fetch/$s_!oh27!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d58456-5671-4889-a218-9c93367cfed3_2241x1792.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>My mission was to build an AI guardrail that could look at an uploaded image and instantly flag if it was <strong>Real (1)</strong> or <strong>AI-Generated (0)</strong>.</p><h3>The Luxury Dilemma</h3><p>When you are dealing with items worth $5,000 or more, the platform&#8217;s reputation is everything. In this scenario, my model faced two very different types of failure:</p><ol><li><p><strong>The Profit Risk (False Negative):</strong> The AI mistakes a real photo of a user&#8217;s diamond ring for a fake. The user gets blocked, gets annoyed, and we lose a potential commission. This is bad, but it&#8217;s a manageable loss of profit.</p></li><li><p><strong>The Reputation Risk (False Positive):</strong> The AI mistakes a scammer&#8217;s AI-generated image for a real photo. The listing goes live. 
A customer pays a fortune for a phantom product. When the item turns out to be fake (or never arrives), the platform&#8217;s credibility is destroyed.</p></li></ol><p>If I had optimized my model only for <strong>Accuracy</strong>, it might have performed well overall. But if it let even a handful of AI fakes slip through because it was trying to be generally correct, the business would be in serious trouble.</p><h3>Enter Precision</h3><p>This is why I focused heavily on <strong>Precision</strong>. In this context, I like to call it the Reputation Metric.</p><p>Precision answers a specific, critical question for this jewelry platform:</p><div class="pullquote"><p><strong>Out of all the listings my model </strong><em><strong>approved</strong></em><strong> as real, how many were actually real?</strong></p></div><p>The formula is simple:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\text{Precision} = \\frac{\\text{True Positives}}{\\text{True Positives} + \\text{False Positives}}&quot;,&quot;id&quot;:&quot;DBXHQLOZDK&quot;}" data-component-name="LatexBlockToDOM"></div><ul><li><p><strong>True Positive:</strong> The model said Real Jewelry, and the photo was real. (Safe!)</p></li><li><p><strong>False Positive:</strong> The model said Real Jewelry, but the image was AI-generated. (Danger!)</p></li></ul><p>A high Precision score means the model acts like a strict gatekeeper: almost everything it approves is genuinely real. Optimizing for it prioritizes eliminating False Positives to protect the platform&#8217;s reputation, even if that means wrongly rejecting some real listings.</p><h3>How I Implemented It</h3><p>In PyTorch, we don&#8217;t need to calculate this manually. 
I used the <code>torchmetrics</code> library to track this alongside accuracy during the validation loops.</p><p>Here is a snippet from my code showing how I tracked this metric:</p><p>Python code:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!S-1D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a28584-01fb-4599-92a1-add34a716f8e_1256x1174.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!S-1D!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a28584-01fb-4599-92a1-add34a716f8e_1256x1174.png 424w, https://substackcdn.com/image/fetch/$s_!S-1D!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a28584-01fb-4599-92a1-add34a716f8e_1256x1174.png 848w, https://substackcdn.com/image/fetch/$s_!S-1D!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a28584-01fb-4599-92a1-add34a716f8e_1256x1174.png 1272w, https://substackcdn.com/image/fetch/$s_!S-1D!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a28584-01fb-4599-92a1-add34a716f8e_1256x1174.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!S-1D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a28584-01fb-4599-92a1-add34a716f8e_1256x1174.png" width="1256" height="1174" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00a28584-01fb-4599-92a1-add34a716f8e_1256x1174.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1174,&quot;width&quot;:1256,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:216630,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181679950?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a28584-01fb-4599-92a1-add34a716f8e_1256x1174.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!S-1D!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a28584-01fb-4599-92a1-add34a716f8e_1256x1174.png 424w, https://substackcdn.com/image/fetch/$s_!S-1D!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a28584-01fb-4599-92a1-add34a716f8e_1256x1174.png 848w, https://substackcdn.com/image/fetch/$s_!S-1D!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a28584-01fb-4599-92a1-add34a716f8e_1256x1174.png 1272w, https://substackcdn.com/image/fetch/$s_!S-1D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a28584-01fb-4599-92a1-add34a716f8e_1256x1174.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>The Takeaway</h3><p>When I ran my hyperparameter search (using <a href="https://optuna.org/">Optuna</a>), I found that the models with the highest <strong>Accuracy</strong> didn&#8217;t always have the highest <strong>Precision</strong>.</p><p>If I were building a casual photo-sharing app, I might have picked the most accurate model. But for a luxury jewelry marketplace where a single fake listing can ruin the brand? I chose the model with the highest <strong>Precision</strong>.</p><p><strong>The lesson:</strong> </p><div class="pullquote"><p>Before you deploy a model, look beyond the accuracy score. Ask yourself: <em>What is the cost of being wrong?</em> If the cost is your reputation, make Precision your priority.</p></div><p>I&#8217;d love to hear your thoughts in the comments! 
If there&#8217;s a specific AI topic you&#8217;re curious about, let me know. I&#8217;d be happy to cover it in a future post.</p><h2>References</h2><ol><li><p>https://docs.pytorch.org/ignite/generated/ignite.metrics.precision.Precision.html</p></li></ol><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.nastaran.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading NastaranAI! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Four Surprising Truths About AI Agents in 2025]]></title><description><![CDATA[The path to successful AI agents is becoming clearer, but it is not what the hype promised.]]></description><link>https://blog.nastaran.ai/p/state-of-agentic-ai-2025-multi-agent-systems</link><guid isPermaLink="false">https://blog.nastaran.ai/p/state-of-agentic-ai-2025-multi-agent-systems</guid><dc:creator><![CDATA[Nastaran Moghadasi]]></dc:creator><pubDate>Mon, 15 Dec 2025 07:16:53 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/edfa6f17-8362-413c-98bd-202279a57da5_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1>Beyond the Buzz: 4 Real Truths About AI Agents in 2025</h1><p>As we are wrapping up 2025, let&#8217;s take a look at the true state of the &#8220;Agentic Revolution.&#8221;</p><p>Since ChatGPT was released, the growth of AI agents has been huge. 
Generative AI is now used everywhere in technical jobs. Everyone is excited about building systems that can reason, plan, and act on their own. Teams in every industry are racing to launch agents that promise to handle complex work and create new possibilities.</p><p>However, beneath this excitement, engineers and product leaders are finding a different reality.</p><p>The first phase of simple prototypes is ending. Now, we face the big challenge of building systems that are reliable, scalable, and ready for real business. What works in a simple demo often breaks when faced with real-world complexity and the hidden costs of autonomy.</p><p>Here are four surprising lessons that technical and product leaders need to understand to succeed with AI agents in 2025. These are the hard truths from the real world that separate the hype from reality.</p><h3>1. The Single &#8220;Super-Agent&#8221; is Disappearing</h3><p>The biggest trend in AI right now is a clear change in structure. We are moving away from the idea of one agent doing everything. We are moving toward Multi-Agent Systems (MAS).</p><p>The idea of a single &#8220;general&#8221; agent that can solve every problem is failing in the real world. 
Even with the best models, a single agent has natural limits. It has limited memory, it makes mistakes (hallucinations), and it cannot handle many different tasks at the same time. It struggles to be an expert in everything and quickly becomes a bottleneck.</p><div class="pullquote"><p>The future is a team, not a hero.</p></div><p>A multi-agent system is a better solution. This is a team of specialized AI agents that work together, talk to each other, and share tasks. Importantly, these teams are managed by an orchestrator: a central manager that breaks down hard tasks and controls the information flow between agents.</p><p>This change brings three key benefits:</p><ul><li><p>Specialized Skills: A hard problem is broken into parts and handed to experts. You might have a &#8220;Researcher&#8221; agent, an &#8220;Analyst&#8221; agent, and a &#8220;Writer&#8221; agent. Each one is tuned for its specific job.</p></li><li><p>Speed: Specialized agents can work at the same time. By doing tasks in parallel, you finish the work much faster.</p></li><li><p>Better Accuracy: Agents can check each other&#8217;s work. One agent can review or verify what another agent did. This reduces errors and makes the final result more reliable.</p></li></ul><h3>2. You Are Focusing on Prompts, But &#8220;Context Engineering&#8221; is the Real Challenge</h3><p>Prompt engineering is popular, but teams building real software are finding that it is only a small part of a bigger challenge: Context Engineering.</p><p>Context engineering is the skill of controlling the information that flows to and from an agent. It involves carefully managing the agent&#8217;s memory to make sure its thinking is correct and efficient. 
</p><p>This is critical because, according to the McKinsey survey &#8220;<a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai">The State of AI in 2025: Agents, Innovation, and Transformation</a>,&#8221; inaccuracy is the number one risk that organizations face with AI today. Nearly one-third of companies report negative consequences specifically because of AI errors.</p><p>If you fail to manage this flow, you will run into serious issues, such as:</p><ul><li><p>Context Bloat: This happens when the conversation history gets too big. It costs more money and confuses the agents with too much irrelevant information.</p></li><li><p>Context Poisoning: This happens when a mistake enters the memory and gets repeated. The agent builds on this bad information, which causes it to make nonsensical decisions.</p></li><li><p>Context Distraction: When a model has too much information to look at, it often just repeats old actions instead of finding new solutions. Performance gets worse long before the memory is technically full.</p></li></ul><div class="pullquote"><p>To master AI agents, you need a new mindset. Stop trying to write the &#8220;perfect prompt&#8221; and start thinking like a systems engineer.</p></div><p>The information you give the AI should go through something like a compiler pipeline: a process that turns raw data into a clean, short, and relevant view for the agent.</p><h3>3. The Biggest Challenge Is Leadership, Not Technology</h3><p>Using truly autonomous AI agents requires more than just new software. It requires a fundamental change in how leaders think.</p><p>For decades, business leaders have tried to reduce risks. We are trained to build predictable processes that do the same thing over and over without errors. 
AI agents are the opposite.</p><div class="pullquote"><p>The unpredictability and the ability to adapt is a feature, not a bug.</p></div><p>Ishit Vachhrajani said this at AWS re:Invent 2025 in the session &#8220;A Leader&#8217;s Guide to Agentic AI&#8221; (<a href="https://www.youtube.com/watch?v=rG8OKTYK6o8&amp;">SNR201</a>). This is hard because leaders are usually rewarded for making things predictable. This new reality can be scary, especially in industries with strict rules and regulations. The answer is not to stop autonomy but to manage it with new principles.</p><p>We must apply a &#8220;Zero-Trust&#8221; mindset to agents. This means &#8220;never trust, always verify.&#8221; Every action an agent takes must be checked, tracked, and aligned with company goals.</p><p>Safety rules cannot be added at the end. They must be built into the foundation of the system. If trust is the foundation, then governance is the structure that keeps everything standing. It ensures that agents operate safely and ethically, even when they work at high speed.</p><p>&#8220;High performers&#8221; in AI are nearly three times as likely as others to fundamentally redesign their workflows. They do not just add AI to old processes. They change how the work gets done by putting agentic AI at the core.</p><h3>4. The Hype is Real, But So is the 87% Failure Rate</h3><p>While AI agents promise to change the world, we must be realistic about where the technology is today.</p><p>Recent tests show a big gap between theory and actual performance.</p><p>Look at these statistics from recent <a href="https://xue-guang.com/post/llm-marl/">benchmarks</a>:</p><blockquote><p>The ChatDev framework, which simulates a software development team, gets only 33.3% correctness on programming tasks in the ProgramDev benchmark. </p><p>The AppWorld benchmark, which tests how agents do tasks across different apps, shows an 86.7% failure rate on complex cases.</p></blockquote><p>These numbers do not mean AI agents are a failure. 
They simply show that we are still in the early days. Moving from a simple prototype to a reliable system that can handle unpredictable business work is a giant step.</p><p>This sobering reality is why we need specialized teams (Point 1), strict context engineering (Point 2), and a safety-first leadership mindset (Point 3). These are not just nice theories. They are necessary to build systems that succeed where today&#8217;s benchmarks show failure.</p><h3>Designing for a New Reality</h3><p>The path to successful AI agents is becoming clearer, but it is not what the hype promised.</p><p>The journey requires moving beyond the idea of a single, powerful agent. We must embrace collaborative teams. We must shift our focus from writing prompts to building complex context pipelines. We need a new leadership mindset that accepts unpredictability but manages it with strict rules.</p><p>As we build these smart and autonomous systems, the key question isn&#8217;t just what they can do. It is how we must adapt our strategies and expectations to manage them well.</p><p>Is your organization ready for that shift?</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.nastaran.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading NastaranAI! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Why Your Model is Failing (Hint: It’s Not the Architecture)]]></title><description><![CDATA[A bug-proof guide to lazy loading, smart transforms, and handling messy production data in PyTorch.]]></description><link>https://blog.nastaran.ai/p/pytorch-bugproof-data-pipeline-guide</link><guid isPermaLink="false">https://blog.nastaran.ai/p/pytorch-bugproof-data-pipeline-guide</guid><dc:creator><![CDATA[Nastaran Moghadasi]]></dc:creator><pubDate>Sat, 13 Dec 2025 19:37:30 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/44f97b9b-1b58-4c40-84b9-15370f8ecb1b_2595x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We&#8217;ve all been there. You spend days tuning hyperparameters and tweaking your architecture, but the loss curve just won&#8217;t cooperate. In my experience, the difference between a successful project and a failure is rarely the model architecture. It&#8217;s almost always the data pipeline.</p><p>I recently built a robust data pipeline solution for a private work project. 
While I can&#8217;t share that proprietary data due to privacy reasons, the challenges I faced are universal: messy file structures, proprietary label formats, and corrupted images.</p><p>To show you exactly how I solved them, I&#8217;ve recreated the solution using the <strong>Oxford 102 Flowers dataset</strong>. It is the perfect playground for this because it mimics real-world messiness: over 8,000 generically named images with labels hidden inside a proprietary MATLAB (<code>.mat</code>) file rather than nice, clean category folders.</p><p>Here is the step-by-step guide to building a bugproof PyTorch data pipeline that handles the mess so your model doesn&#8217;t have to.</p><h2>1. The Strategy: Lazy Loading &amp; The Off-by-One Trap</h2><p>If you can&#8217;t reliably load your data, nothing else matters.</p><p>For this pipeline, I built a custom PyTorch <code>Dataset</code> class focused on <strong>lazy loading</strong>. Instead of loading all 8,000+ images into RAM at once, we store only the file paths during setup (<code>__init__</code>) and load the actual image data on-demand (<code>__getitem__</code>).</p><p><strong>A critical lesson learned:</strong> Watch out for indexing errors. 
The Oxford dataset uses 1-based indexing for its labels, but PyTorch expects 0-based indexing. Catching this off-by-one error early saves you from training a perpetually confused model.</p><h3>The Dataset Skeleton</h3><p>Here is the core structure we need to implement:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QtqO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F892642fe-efa1-4970-b3b3-48390cc9a322_2932x3388.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QtqO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F892642fe-efa1-4970-b3b3-48390cc9a322_2932x3388.png 424w, https://substackcdn.com/image/fetch/$s_!QtqO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F892642fe-efa1-4970-b3b3-48390cc9a322_2932x3388.png 848w, https://substackcdn.com/image/fetch/$s_!QtqO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F892642fe-efa1-4970-b3b3-48390cc9a322_2932x3388.png 1272w, https://substackcdn.com/image/fetch/$s_!QtqO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F892642fe-efa1-4970-b3b3-48390cc9a322_2932x3388.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QtqO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F892642fe-efa1-4970-b3b3-48390cc9a322_2932x3388.png" width="1456" height="1682" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/892642fe-efa1-4970-b3b3-48390cc9a322_2932x3388.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1682,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3001035,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181534509?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F892642fe-efa1-4970-b3b3-48390cc9a322_2932x3388.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QtqO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F892642fe-efa1-4970-b3b3-48390cc9a322_2932x3388.png 424w, https://substackcdn.com/image/fetch/$s_!QtqO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F892642fe-efa1-4970-b3b3-48390cc9a322_2932x3388.png 848w, https://substackcdn.com/image/fetch/$s_!QtqO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F892642fe-efa1-4970-b3b3-48390cc9a322_2932x3388.png 1272w, https://substackcdn.com/image/fetch/$s_!QtqO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F892642fe-efa1-4970-b3b3-48390cc9a322_2932x3388.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>2. Consistency: The Pre-processing Pipeline</h2><p>Real-world data is rarely consistent. In the Flowers dataset, images have wildly different dimensions (e.g., 670x500 vs 500x694). 
PyTorch batches require identical dimensions, so we need a rigorous transform pipeline.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IUzn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea3ae8b-7684-4950-a36e-bf088aae96b4_950x632.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IUzn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea3ae8b-7684-4950-a36e-bf088aae96b4_950x632.png 424w, https://substackcdn.com/image/fetch/$s_!IUzn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea3ae8b-7684-4950-a36e-bf088aae96b4_950x632.png 848w, https://substackcdn.com/image/fetch/$s_!IUzn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea3ae8b-7684-4950-a36e-bf088aae96b4_950x632.png 1272w, https://substackcdn.com/image/fetch/$s_!IUzn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea3ae8b-7684-4950-a36e-bf088aae96b4_950x632.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IUzn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea3ae8b-7684-4950-a36e-bf088aae96b4_950x632.png" width="950" height="632" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ea3ae8b-7684-4950-a36e-bf088aae96b4_950x632.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:632,&quot;width&quot;:950,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:601107,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181534509?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea3ae8b-7684-4950-a36e-bf088aae96b4_950x632.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IUzn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea3ae8b-7684-4950-a36e-bf088aae96b4_950x632.png 424w, https://substackcdn.com/image/fetch/$s_!IUzn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea3ae8b-7684-4950-a36e-bf088aae96b4_950x632.png 848w, https://substackcdn.com/image/fetch/$s_!IUzn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea3ae8b-7684-4950-a36e-bf088aae96b4_950x632.png 1272w, https://substackcdn.com/image/fetch/$s_!IUzn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ea3ae8b-7684-4950-a36e-bf088aae96b4_950x632.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 
0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I strictly avoid simple resizing, which distorts the image. Instead, I use a <code>Resize</code> on the shorter edge to preserve the aspect ratio, followed by a <code>CenterCrop</code> to get our uniform square. 
Finally, we convert to tensors and normalize pixel intensity from 0-255 down to 0-1.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!N-lR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77aa0ba9-7e8a-4eb4-8f9d-1fcbb0bf76dd_2424x2308.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!N-lR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77aa0ba9-7e8a-4eb4-8f9d-1fcbb0bf76dd_2424x2308.png 424w, https://substackcdn.com/image/fetch/$s_!N-lR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77aa0ba9-7e8a-4eb4-8f9d-1fcbb0bf76dd_2424x2308.png 848w, https://substackcdn.com/image/fetch/$s_!N-lR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77aa0ba9-7e8a-4eb4-8f9d-1fcbb0bf76dd_2424x2308.png 1272w, https://substackcdn.com/image/fetch/$s_!N-lR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77aa0ba9-7e8a-4eb4-8f9d-1fcbb0bf76dd_2424x2308.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!N-lR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77aa0ba9-7e8a-4eb4-8f9d-1fcbb0bf76dd_2424x2308.png" width="1456" height="1386" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77aa0ba9-7e8a-4eb4-8f9d-1fcbb0bf76dd_2424x2308.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1386,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1965217,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181534509?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77aa0ba9-7e8a-4eb4-8f9d-1fcbb0bf76dd_2424x2308.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!N-lR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77aa0ba9-7e8a-4eb4-8f9d-1fcbb0bf76dd_2424x2308.png 424w, https://substackcdn.com/image/fetch/$s_!N-lR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77aa0ba9-7e8a-4eb4-8f9d-1fcbb0bf76dd_2424x2308.png 848w, https://substackcdn.com/image/fetch/$s_!N-lR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77aa0ba9-7e8a-4eb4-8f9d-1fcbb0bf76dd_2424x2308.png 1272w, https://substackcdn.com/image/fetch/$s_!N-lR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77aa0ba9-7e8a-4eb4-8f9d-1fcbb0bf76dd_2424x2308.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And here is the output for a sample image:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qf-O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04045afd-bf48-4667-a660-6bd74e6d7d94_1364x610.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qf-O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04045afd-bf48-4667-a660-6bd74e6d7d94_1364x610.png 424w, 
https://substackcdn.com/image/fetch/$s_!Qf-O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04045afd-bf48-4667-a660-6bd74e6d7d94_1364x610.png 848w, https://substackcdn.com/image/fetch/$s_!Qf-O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04045afd-bf48-4667-a660-6bd74e6d7d94_1364x610.png 1272w, https://substackcdn.com/image/fetch/$s_!Qf-O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04045afd-bf48-4667-a660-6bd74e6d7d94_1364x610.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qf-O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04045afd-bf48-4667-a660-6bd74e6d7d94_1364x610.png" width="1364" height="610" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04045afd-bf48-4667-a660-6bd74e6d7d94_1364x610.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:610,&quot;width&quot;:1364,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1045479,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181534509?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04045afd-bf48-4667-a660-6bd74e6d7d94_1364x610.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qf-O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04045afd-bf48-4667-a660-6bd74e6d7d94_1364x610.png 
424w, https://substackcdn.com/image/fetch/$s_!Qf-O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04045afd-bf48-4667-a660-6bd74e6d7d94_1364x610.png 848w, https://substackcdn.com/image/fetch/$s_!Qf-O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04045afd-bf48-4667-a660-6bd74e6d7d94_1364x610.png 1272w, https://substackcdn.com/image/fetch/$s_!Qf-O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04045afd-bf48-4667-a660-6bd74e6d7d94_1364x610.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>3. Augmentation: Endless Variation, Zero Extra Storage</h2><p>One of the biggest advantages of PyTorch&#8217;s on-the-fly augmentation is that it provides endless variation without taking up extra storage.</p><p>By applying random transformations (flips, rotations, and color jitters) only when the image is loaded during training, the model sees a slightly different version of the image every epoch. This forces the model to learn essential features like shape and color rather than memorizing pixels.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KF3R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf920c15-a605-42a5-ba9a-8712a74dac9b_1260x625.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KF3R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf920c15-a605-42a5-ba9a-8712a74dac9b_1260x625.png 424w, https://substackcdn.com/image/fetch/$s_!KF3R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf920c15-a605-42a5-ba9a-8712a74dac9b_1260x625.png 848w, https://substackcdn.com/image/fetch/$s_!KF3R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf920c15-a605-42a5-ba9a-8712a74dac9b_1260x625.png 1272w, https://substackcdn.com/image/fetch/$s_!KF3R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf920c15-a605-42a5-ba9a-8712a74dac9b_1260x625.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!KF3R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf920c15-a605-42a5-ba9a-8712a74dac9b_1260x625.png" width="1260" height="625" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/af920c15-a605-42a5-ba9a-8712a74dac9b_1260x625.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:625,&quot;width&quot;:1260,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1234082,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181534509?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf920c15-a605-42a5-ba9a-8712a74dac9b_1260x625.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KF3R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf920c15-a605-42a5-ba9a-8712a74dac9b_1260x625.png 424w, https://substackcdn.com/image/fetch/$s_!KF3R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf920c15-a605-42a5-ba9a-8712a74dac9b_1260x625.png 848w, https://substackcdn.com/image/fetch/$s_!KF3R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf920c15-a605-42a5-ba9a-8712a74dac9b_1260x625.png 1272w, https://substackcdn.com/image/fetch/$s_!KF3R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf920c15-a605-42a5-ba9a-8712a74dac9b_1260x625.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><div class="pullquote"><p><strong>Note:</strong> Always disable augmentation for validation and testing to ensure your metrics reflect actual performance improvements.</p></div><h2>4. The Bugproof Pipeline: Handling Corrupted Data</h2><p>This is the part that usually gets overlooked in tutorials but is vital in production. A single corrupted image can crash a training run hours after it starts.</p><p>To fix this, we update the <code>__getitem__</code> method to be resilient. If it encounters a bad file (corrupted bytes, empty file, etc.), it shouldn&#8217;t crash. 
Instead, it should log the error and recursively call itself to fetch the <em>next</em> valid image.</p><p>Here is the pattern I use:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WTNm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2d562e-e2a5-428d-b8b0-1c0a55a491f6_2824x2488.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WTNm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2d562e-e2a5-428d-b8b0-1c0a55a491f6_2824x2488.png 424w, https://substackcdn.com/image/fetch/$s_!WTNm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2d562e-e2a5-428d-b8b0-1c0a55a491f6_2824x2488.png 848w, https://substackcdn.com/image/fetch/$s_!WTNm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2d562e-e2a5-428d-b8b0-1c0a55a491f6_2824x2488.png 1272w, https://substackcdn.com/image/fetch/$s_!WTNm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2d562e-e2a5-428d-b8b0-1c0a55a491f6_2824x2488.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WTNm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2d562e-e2a5-428d-b8b0-1c0a55a491f6_2824x2488.png" width="1456" height="1283" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd2d562e-e2a5-428d-b8b0-1c0a55a491f6_2824x2488.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1283,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2277213,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181534509?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2d562e-e2a5-428d-b8b0-1c0a55a491f6_2824x2488.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WTNm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2d562e-e2a5-428d-b8b0-1c0a55a491f6_2824x2488.png 424w, https://substackcdn.com/image/fetch/$s_!WTNm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2d562e-e2a5-428d-b8b0-1c0a55a491f6_2824x2488.png 848w, https://substackcdn.com/image/fetch/$s_!WTNm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2d562e-e2a5-428d-b8b0-1c0a55a491f6_2824x2488.png 1272w, https://substackcdn.com/image/fetch/$s_!WTNm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd2d562e-e2a5-428d-b8b0-1c0a55a491f6_2824x2488.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>5. Telemetry: Know Your Data</h2><p>Finally, I added basic telemetry to the pipeline. By tracking load times and access counts, you can identify if specific images are dragging down your training throughput (e.g., massive high-res files) or if your random sampler is neglecting certain files.</p><p>In my implementation, if an image takes longer than 1 second to load, the system warns me. 
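</p><p>To make the idea concrete, here is a minimal sketch of this kind of telemetry using only the Python standard library. It is an illustration under stated assumptions, not the exact code from the pipeline above: the class name <code>LoadTelemetry</code>, the <code>timed_load</code> method, and the <code>loader</code> callable are all hypothetical names, and the only detail taken from the text is the 1-second warning threshold.</p>

```python
import logging
import time
from collections import Counter

logger = logging.getLogger("data_telemetry")  # hypothetical logger name


class LoadTelemetry:
    """Tracks load times, access counts, and errors (hypothetical helper)."""

    def __init__(self, slow_threshold_s=1.0):
        # Loads slower than this many seconds trigger a warning.
        self.slow_threshold_s = slow_threshold_s
        self.access_counts = Counter()  # path -> number of load attempts
        self.total_load_time_s = 0.0    # summed over successful loads only
        self.loads = 0                  # number of successful loads
        self.errors = 0                 # number of failed loads

    def timed_load(self, path, loader):
        """Call loader(path), recording duration and failures along the way."""
        self.access_counts[path] += 1
        start = time.perf_counter()
        try:
            result = loader(path)
        except Exception:
            # Count the failure, then let the caller decide how to recover.
            self.errors += 1
            raise
        elapsed = time.perf_counter() - start
        self.loads += 1
        self.total_load_time_s += elapsed
        if elapsed > self.slow_threshold_s:
            logger.warning("Slow load: %s took %.2f s", path, elapsed)
        return result

    def summary(self):
        """Return the aggregate numbers to print after training."""
        avg_ms = 1000.0 * self.total_load_time_s / self.loads if self.loads else 0.0
        return {
            "total_images": len(self.access_counts),
            "errors": self.errors,
            "avg_load_ms": avg_ms,
        }
```

<p>One way to wire this in would be to call <code>telemetry.timed_load(path, load_image)</code> inside <code>__getitem__</code>, where <code>load_image</code> stands for whatever function opens and decodes the file. Keeping the counters in a separate object means the dataset class itself stays small. Note that with multiple DataLoader workers each worker process would hold its own copy of the counters, so this sketch assumes single-process loading.</p><p>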
After training, I print a summary like:</p><ul><li><p><strong>Total images:</strong> 8,189</p></li><li><p><strong>Errors encountered:</strong> 2</p></li><li><p><strong>Average load time:</strong> 7.8 ms</p></li></ul><h2>Summary</h2><p>If you are shipping models to production, you need to invest as much time in your data pipeline as you do in your model architecture.</p><p>By implementing <strong>lazy loading</strong>, <strong>consistent transforms</strong>, <strong>on-the-fly augmentation</strong>, and <strong>robust error handling</strong>, you ensure that your sophisticated neural network isn&#8217;t being sabotaged by a broken data strategy.</p>]]></content:encoded></item><item><title><![CDATA[The "Chemistry Table" Test: Why Smart People Fail]]></title><description><![CDATA[Merit is a myth. 
Here are 3 brutal truths from Bell Labs legend Richard Hamming on how to unlock your true potential.]]></description><link>https://blog.nastaran.ai/p/the-chemistry-table-test-why-smart</link><guid isPermaLink="false">https://blog.nastaran.ai/p/the-chemistry-table-test-why-smart</guid><dc:creator><![CDATA[Nastaran Moghadasi]]></dc:creator><pubDate>Sat, 13 Dec 2025 17:44:38 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-afS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#8220;Just do the work, and the rest will follow.&#8221; How many times have we told ourselves that lie?</p><p>It goes like this: You lock yourself in a room. You decline the invitations. You do the <em>deep</em> work, writing the code, drafting the strategy, building the product. Then, you emerge, blinking in the sunlight, expecting to find the world waiting to applaud.</p><p>We want to believe that merit is a magnet. We want to believe that if we just work hard enough, visibility will take care of itself. 
That if the work is good, the audience is guaranteed.</p><p>But that is a fantasy.</p><p>That illusion was shattered for me when I read a transcript of a talk given in 1986 by Richard Hamming. Hamming was a titan at Bell Labs, a man who worked alongside the fathers of Information Theory. Throughout his career, he was obsessed with a nagging question: <strong>Why do some capable people fulfill their promise, while others with equal talent are left wondering what they might have accomplished?</strong></p><p>To answer this, he gave a famous talk titled <strong>&#8220;You and Your Research.&#8221;</strong></p><p>Hamming was speaking to scientists in white coats, but his advice is a brutal reality check for anyone trying to reach their goals.</p><p>Here are three uncomfortable truths Hamming revealed about what it truly takes to reach your full potential.</p><h3>1. The &#8220;Chemistry Table&#8221; Test</h3><p>Hamming didn&#8217;t just theorize about greatness. He investigated it. He used to eat lunch with the physicists, but when the Nobel Prize winners left, the conversation turned dull. So, he moved his tray to the &#8220;Chemistry Table&#8221; to find new ideas. 
He started asking the chemists a terrifying question: <em>&#8220;What are the important problems of your field?&#8221;</em></p><p>After a few weeks of listening, he asked: <em>&#8220;What important problems are you working on?&#8221;</em></p><p>And finally, when the answers didn&#8217;t match up, he delivered the kill shot:</p><div class="pullquote"><p><strong>&#8220;If what you are doing is not important, and if you don&#8217;t think it is going to lead to something important, why are you at Bell Labs working on it?&#8221;</strong></p></div><p>He wasn&#8217;t welcomed at lunch after that.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oNY3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F941dc297-2a4f-424b-ad90-509b28830b25_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oNY3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F941dc297-2a4f-424b-ad90-509b28830b25_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!oNY3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F941dc297-2a4f-424b-ad90-509b28830b25_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!oNY3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F941dc297-2a4f-424b-ad90-509b28830b25_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!oNY3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F941dc297-2a4f-424b-ad90-509b28830b25_1024x1024.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!oNY3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F941dc297-2a4f-424b-ad90-509b28830b25_1024x1024.png" width="344" height="344" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/941dc297-2a4f-424b-ad90-509b28830b25_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:344,&quot;bytes&quot;:2095642,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181503032?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F941dc297-2a4f-424b-ad90-509b28830b25_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oNY3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F941dc297-2a4f-424b-ad90-509b28830b25_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!oNY3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F941dc297-2a4f-424b-ad90-509b28830b25_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!oNY3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F941dc297-2a4f-424b-ad90-509b28830b25_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!oNY3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F941dc297-2a4f-424b-ad90-509b28830b25_1024x1024.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>But most of us are sitting at that Chemistry Table right now. We fill our days with &#8216;busy work&#8217;: <strong>endlessly refactoring code that already works, bikeshedding over variable names in code reviews, or reorganizing the Jira backlog.</strong> It feels productive, but it&#8217;s actually a defense mechanism. We pick the safe, solvable Jira tickets to avoid the terror of the complex architectural problems that actually matter.</p><p>The Rule: If you aren&#8217;t working on the one thing that could actually level up your career, you must re-evaluate your options.</p><h3>2. 
The 50% Rule</h3><p>Your work should speak for itself, right? In fact, this is the biggest lie that we have been told. In creative and technical fields, &#8220;selling&#8221; is often underrated.</p><p>Hamming rejected the idea that good work sells itself. He saw brilliant people at Bell Labs who had world-changing ideas but kept their heads down. They would file a quiet report weeks after a project finished, but by then, the decisions had already been made.</p><p>He stated his rule bluntly:</p><div class="pullquote"><p>&#8220;I believed, in my early days, that you should spend at least as much time in the polish and presentation as you did in the original research. <strong>Now at least 50% of the time must go for the presentation.</strong> It&#8217;s a big, big number.&#8221;</p></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4Com!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2b77fa-17c2-48a5-903c-a32f54d5cd4a_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4Com!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2b77fa-17c2-48a5-903c-a32f54d5cd4a_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!4Com!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2b77fa-17c2-48a5-903c-a32f54d5cd4a_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!4Com!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2b77fa-17c2-48a5-903c-a32f54d5cd4a_1024x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!4Com!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2b77fa-17c2-48a5-903c-a32f54d5cd4a_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4Com!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2b77fa-17c2-48a5-903c-a32f54d5cd4a_1024x1024.png" width="426" height="426" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ad2b77fa-17c2-48a5-903c-a32f54d5cd4a_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:426,&quot;bytes&quot;:2090437,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181503032?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2b77fa-17c2-48a5-903c-a32f54d5cd4a_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4Com!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2b77fa-17c2-48a5-903c-a32f54d5cd4a_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!4Com!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2b77fa-17c2-48a5-903c-a32f54d5cd4a_1024x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!4Com!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2b77fa-17c2-48a5-903c-a32f54d5cd4a_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!4Com!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad2b77fa-17c2-48a5-903c-a32f54d5cd4a_1024x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Why 50%? Because the world is noisy.</p><p>If you build a product but don&#8217;t market it, you haven&#8217;t finished the work. 
If you write a brilliant strategy document but don&#8217;t present it persuasively to your boss, you haven&#8217;t finished the work.</p><p>Hamming noted that if you don&#8217;t advocate for your own ideas, you aren&#8217;t being &#8220;humble.&#8221; You are crippling the impact of your work.</p><div class="pullquote"><p>If you don&#8217;t promote your work, no one will do it for you!</p></div><h3>3. The Compound Interest of Effort</h3><p>We often look at &#8220;geniuses,&#8221; whether it&#8217;s Elon Musk, Taylor Swift, or the star developer sitting next to you, and assume they are just smarter.</p><p>Hamming was suspicious of this. He asked his boss, Bode, how someone like their colleague John Tukey knew so much. Bode&#8217;s answer changed Hamming&#8217;s life: <em>&#8220;Knowledge and productivity are like compound interest.&#8221;</em></p><p>If you have two people of roughly the same ability, and one works just 10% harder than the other, the output isn&#8217;t 10% higher. <strong>By Hamming&#8217;s estimate, it is more than 2x higher over a lifetime.</strong></p><div class="pullquote"><p>&#8220;The more you know, the more you learn; the more you learn, the more you can do; the more you can do, the more the opportunity.&#8221;</p></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-afS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-afS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png 424w, 
https://substackcdn.com/image/fetch/$s_!-afS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!-afS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!-afS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-afS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png" width="488" height="488" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:488,&quot;bytes&quot;:2519039,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181503032?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!-afS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!-afS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!-afS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!-afS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9c7eabb-f21f-4a85-b258-ca11b8ab3c36_1024x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This isn&#8217;t about &#8220;hustle culture&#8221; or burning out. It&#8217;s about the compound interest of focus.</p><p>Hamming realized that if he carved out just one extra hour a day for &#8220;Great Thoughts,&#8221; thinking about the future rather than just putting out fires, that effort would compound. Most of us live life linearly, doing the job and going home. <strong>Great work happens when you invest a small amount of energy today that makes you smarter tomorrow.</strong></p><h3>Luck Favors the Prepared Mind</h3><p>Hamming believed that even if you follow all these rules, you face one final trap: <strong>Success itself.</strong></p><p>Success tempts you to repeat your greatest hits until you become a dinosaur. Hamming&#8217;s advice was to force a reset: change your professional focus every now and then. The change doesn&#8217;t have to be dramatic, but it should make you a beginner again so you remember how to learn. Don&#8217;t just sit in the shade of the oak tree you already grew. You need to plant new acorns. If you write code, look at the algorithms that power it. If you&#8217;re a software engineer, pivot to AI. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hBEU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8195294-6953-46c0-91a6-ef40b3923ab8_1792x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hBEU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8195294-6953-46c0-91a6-ef40b3923ab8_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!hBEU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8195294-6953-46c0-91a6-ef40b3923ab8_1792x1024.png 848w, https://substackcdn.com/image/fetch/$s_!hBEU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8195294-6953-46c0-91a6-ef40b3923ab8_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!hBEU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8195294-6953-46c0-91a6-ef40b3923ab8_1792x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hBEU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8195294-6953-46c0-91a6-ef40b3923ab8_1792x1024.png" width="500" height="285.7142857142857" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8195294-6953-46c0-91a6-ef40b3923ab8_1792x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1456,&quot;resizeWidth&quot;:500,&quot;bytes&quot;:4329287,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181503032?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8195294-6953-46c0-91a6-ef40b3923ab8_1792x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hBEU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8195294-6953-46c0-91a6-ef40b3923ab8_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!hBEU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8195294-6953-46c0-91a6-ef40b3923ab8_1792x1024.png 848w, https://substackcdn.com/image/fetch/$s_!hBEU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8195294-6953-46c0-91a6-ef40b3923ab8_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!hBEU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8195294-6953-46c0-91a6-ef40b3923ab8_1792x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Hamming&#8217;s talk is a reminder that we have more control over our careers than we think. We often attribute success to luck. But as Hamming said, quoting Pasteur: <em>&#8220;Luck favors the prepared mind.&#8221;</em></p><ul><li><p>You prepare your mind by asking the uncomfortable questions at the Chemistry Table.</p></li><li><p>You prepare your career by selling your work as hard as you create it.</p></li><li><p>And you prepare your soul by keeping your door open to the world.</p></li></ul><div class="pullquote"><p>Hamming refused to accept alibis. He didn&#8217;t believe greatness was reserved for the lucky few. He believed it was a choice you make every day. 
So take his final challenge to heart: <strong>&#8220;Therefore, go forth and become great!&#8221;</strong></p></div><p>Here is the link to the full transcript of the <a href="https://www.cs.virginia.edu/~robins/YouAndYourResearch.html">source</a>.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.nastaran.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading NastaranAI! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[How to Read a Scientific Paper Efficiently: A 5-Step Guide]]></title><description><![CDATA[From Google Scholar to critical analysis: a journey through the active process of understanding scientific literature.]]></description><link>https://blog.nastaran.ai/p/how-to-read-a-scientific-paper-efficiently</link><guid isPermaLink="false">https://blog.nastaran.ai/p/how-to-read-a-scientific-paper-efficiently</guid><dc:creator><![CDATA[Nastaran Moghadasi]]></dc:creator><pubDate>Fri, 12 Dec 2025 21:00:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!yQKP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>I&#8217;ve found that one of the most challenging yet rewarding skills to develop is the ability to effectively read and understand 
scientific research papers. It&#8217;s a journey of deconstructing complex ideas, and I&#8217;m excited to share what I&#8217;ve learned with you. This blog post is a guide for anyone who wants to dive into the world of academic research, inspired by the insights from Somdip Dey&#8217;s &#8220;A Beginner&#8217;s Guide to Computer Science Research.&#8221; [1]</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yQKP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yQKP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png 424w, https://substackcdn.com/image/fetch/$s_!yQKP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png 848w, https://substackcdn.com/image/fetch/$s_!yQKP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!yQKP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yQKP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png" width="1456" height="852" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:852,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2106373,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181461117?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yQKP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png 424w, https://substackcdn.com/image/fetch/$s_!yQKP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png 848w, https://substackcdn.com/image/fetch/$s_!yQKP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!yQKP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fe8b1f8-c3ea-4f08-b80e-9af35d4bc093_2626x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Art of Reading a Scientific Paper</h2><p>Reading a research paper is not like reading a novel or a news article. It&#8217;s an active process that requires a systematic approach. The goal is not just to read the words on the page but to engage with the ideas, critically evaluate the research, and connect it to your own knowledge and interests. Here&#8217;s a breakdown of a five-step process that can help you navigate the dense world of academic literature.</p><h3>Step 1: Find Your Focus</h3><p>The first and most crucial step is to choose a subject area that genuinely interests you. Research requires a significant investment of time and effort, and your passion for the topic will be the fuel that keeps you going. Instead of blindly following trends or suggestions, take the time to explore different areas and find what truly excites you. This personal connection to the subject matter will make the entire process more enjoyable and sustainable.</p><h3>Step 2: Master the Art of the Search</h3><p>Once you have a topic in mind, the next step is to find relevant research papers. While a simple Google search might be your first instinct, it&#8217;s essential to use scholarly databases and search engines to ensure the credibility of the sources. Here are some of the most valuable resources for computer science research:</p><table><thead><tr><th>Search Engine/Database</th><th>Description</th></tr></thead><tbody>
<tr><td>Google Scholar</td><td>A freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines.</td></tr>
<tr><td>Microsoft Academic Search</td><td>A free public web search engine for academic publications and literature, developed by Microsoft Research.</td></tr>
<tr><td>ACM Digital Library</td><td>A research, discovery and networking platform containing the full text of every article ever published by ACM.</td></tr>
<tr><td>IEEE Xplore</td><td>A digital library providing access to scientific and technical content published by the Institute of Electrical and Electronics Engineers (IEEE) and its publishing partners.</td></tr>
<tr><td>DBLP</td><td>A computer science bibliography website that provides open bibliographic information on major computer science journals and proceedings.</td></tr>
<tr><td>Scopus</td><td>A multidisciplinary bibliographic database containing abstracts and citations for academic journal articles.</td></tr>
<tr><td>ScienceDirect</td><td>A leading full-text scientific database offering journal articles and book chapters from more than 2,500 peer-reviewed journals and more than 11,000 books.</td></tr>
</tbody></table><p>When searching, use a combination of broad and specific keywords related to your topic. Keep a running list of these keywords, as they will be invaluable for future searches.</p><h3>Step 3: Organize and Categorize</h3><p>As you start collecting papers, you&#8217;ll quickly realize that not all research is the same. Dey suggests categorizing papers into two main types: <strong>argumentative</strong> and <strong>analytical</strong>. Argumentative papers present a new idea and provide evidence to support it, while analytical papers offer a new perspective or analysis of an existing topic. Understanding the type of paper you&#8217;re reading will help you better grasp the author&#8217;s intent.</p><p>To manage your growing library of papers, consider using reference management software like EndNote or BibDesk. These tools can help you organize your references, take notes, and even visualize the connections between different papers.</p><h3>Step 4: The Three-Pass Approach</h3><p>Reading a paper effectively is a multi-step process. 
A popular method, also referenced by Dey, is the &#8220;three-pass approach&#8221; proposed by S. Keshav. [2] This method involves reading the paper three times, with each pass having a different goal:</p><ul><li><p><strong>The First Pass (5-10 minutes):</strong> This is a quick scan to get a general idea of the paper. Read the title, abstract, and introduction. Glance at the section and sub-section headings. Read the conclusions. This will give you a high-level overview of the paper&#8217;s contribution.</p></li><li><p><strong>The Second Pass (up to an hour):</strong> In this pass, you&#8217;ll read the paper more carefully, but you can still ignore the finer details like proofs. Pay attention to the figures, diagrams, and illustrations. This will help you understand the context of the work and the evidence presented.</p></li><li><p><strong>The Third Pass (4-5 hours for beginners):</strong> This is the most detailed pass. The goal here is to understand the paper in its entirety. You should be able to mentally re-implement the paper&#8217;s ideas and identify its strengths and weaknesses.</p></li></ul><h3>Step 5: The Art of Critical Thinking</h3><p>Reading a paper is not a passive activity. The final and most important step is to critically engage with the content. As you read, ask yourself the following questions:</p><ul><li><p>What is the core problem the paper is trying to solve?</p></li><li><p>What is the proposed solution?</p></li><li><p>How is the solution evaluated? Are the benchmarks and evaluations fair?</p></li><li><p>What are the underlying assumptions made by the authors?</p></li><li><p>What are the limitations of the research?</p></li><li><p>Can the work be improved? 
What are the potential avenues for future research?</p></li></ul><p>Answering these questions will not only deepen your understanding of the paper but also help you generate your own ideas and contribute to the conversation.</p><h2>The Journey Continues</h2><p>Reading research papers is a skill that takes time and practice to develop. Don&#8217;t be discouraged if you find it challenging at first. By following a systematic approach and actively engaging with the material, you&#8217;ll gradually build the confidence and expertise to navigate the world of academic research. This journey of a thousand papers begins with a single, well-read one.</p><h2>References</h2><p>[1] Dey, S. (2014). A Beginner&#8217;s Guide to Computer Science Research. <em>XRDS: Crossroads, The ACM Magazine for Students</em>, 20(4), 14-15.</p><p>[2] Keshav, S. (2007). How to Read a Paper. <em>ACM SIGCOMM Computer Communication Review</em>, 37(3), 83-84.</p>]]></content:encoded></item><item><title><![CDATA[Piano Buddy: When AI Jams with You]]></title><description><![CDATA[Never play alone again: Inside the code of a co-creative AI that lives in your browser.]]></description><link>https://blog.nastaran.ai/p/piano-buddy-when-recurrent-neural</link><guid isPermaLink="false">https://blog.nastaran.ai/p/piano-buddy-when-recurrent-neural</guid><dc:creator><![CDATA[Nastaran Moghadasi]]></dc:creator><pubDate>Thu, 11 Dec 2025 20:40:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!L2Z_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We talk a lot about AI replacing humans, but the true potential of the technology lies in <strong>Co-Creative AI</strong>, systems designed to amplify our abilities rather than automate them. Have you ever wanted to play a piano duet but found yourself alone in the room? Or perhaps you&#8217;ve wanted to improvise a melody but felt limited by your technical skills?</p><p>Enter <strong><a href="https://piano.nastaran.ai/">Piano Buddy</a></strong>, an interactive web experiment where human creativity meets machine intelligence in a musical call-and-response. 
It&#8217;s not just a synthesizer; it&#8217;s a listening, reacting musical partner that lives entirely in your browser.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L2Z_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L2Z_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!L2Z_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!L2Z_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!L2Z_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L2Z_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:610956,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.nastaran.ai/i/181365070?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L2Z_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!L2Z_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!L2Z_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!L2Z_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f9423c6-cc1d-4104-97fb-ac71bd2f051d_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In this post, we&#8217;ll dive deep into the technical architecture behind Piano Buddy, exploring how we leverage <strong><a 
href="https://magenta.withgoogle.com/js-announce">Magenta.js</a></strong>, <strong><a href="https://en.wikipedia.org/wiki/Recurrent_neural_network">Recurrent Neural Networks</a> (RNNs)</strong>, and <strong><a href="https://www.i-am.ai/piano-genie.html">Piano Genie</a></strong> to create a seamless jamming experience.</p><h2>The Concept: AI as a Creative Partner</h2><p>Piano Buddy operates on a turn-based interaction model. You play a short melody, and the AI listens. When you pause, the AI picks up where you left off, generating a continuation that aims to match the style, tempo, and key of your input.</p><p>This isn&#8217;t simple playback or random generation. The system uses <a href="https://en.wikipedia.org/wiki/Deep_learning">deep learning</a> models to &#8220;understand&#8221; the musical context you&#8217;ve provided and predict the most musically appropriate follow-up.</p><h2>The Tech Stack: Powered by Magenta.js</h2><p>The core of Piano Buddy is built on <strong><a href="https://magenta.tensorflow.org/">Magenta.js</a></strong>, an open-source library from Google that provides pre-trained music and art models in the browser using <a href="https://www.tensorflow.org/js">TensorFlow.js</a>. This allows us to run sophisticated inference client-side, with no need for a backend Python server to generate the notes. This keeps latency low, which is critical for a musical application.</p><p>We utilize two primary models to make this magic happen:</p><ol><li><p><strong>Piano Genie</strong>: For intelligent input mapping.</p></li><li><p><strong>MusicRNN</strong>: For melody generation.</p></li></ol><h3>1. Smart Input with Piano Genie</h3><p>One of the biggest hurdles in web-based music apps is the input interface. Mapping a full 88-key piano to a computer keyboard is clumsy. 
To solve this, we integrated <strong>Piano Genie</strong>.</p><p>Piano Genie is a model designed to map a small number of inputs (in our case, just 8 buttons) to a full 88-key piano output while keeping the music sounding &#8220;pianistic.&#8221;</p><p>Under the hood, Piano Genie is based on a discrete <strong><a href="https://arxiv.org/pdf/2003.05991">Autoencoder</a></strong> architecture inspired by <strong><a href="https://arxiv.org/abs/1711.00937">VQ-VAE</a></strong>. It learns a discrete latent space of musical contours. The encoder maps 88 keys to one of 8 &#8220;buttons,&#8221; and the decoder learns to map those buttons back to the original music. At runtime, we only use the <strong>decoder</strong>, allowing the user to play complex-sounding melodies with simple button presses.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qIYM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fc58ddc-8c05-4f6b-9de7-b5fe63fc7a77_1366x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qIYM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fc58ddc-8c05-4f6b-9de7-b5fe63fc7a77_1366x768.png 424w, https://substackcdn.com/image/fetch/$s_!qIYM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fc58ddc-8c05-4f6b-9de7-b5fe63fc7a77_1366x768.png 848w, https://substackcdn.com/image/fetch/$s_!qIYM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fc58ddc-8c05-4f6b-9de7-b5fe63fc7a77_1366x768.png 1272w, 
https://substackcdn.com/image/fetch/$s_!qIYM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fc58ddc-8c05-4f6b-9de7-b5fe63fc7a77_1366x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qIYM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fc58ddc-8c05-4f6b-9de7-b5fe63fc7a77_1366x768.png" width="1366" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0fc58ddc-8c05-4f6b-9de7-b5fe63fc7a77_1366x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1366,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qIYM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fc58ddc-8c05-4f6b-9de7-b5fe63fc7a77_1366x768.png 424w, https://substackcdn.com/image/fetch/$s_!qIYM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fc58ddc-8c05-4f6b-9de7-b5fe63fc7a77_1366x768.png 848w, https://substackcdn.com/image/fetch/$s_!qIYM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fc58ddc-8c05-4f6b-9de7-b5fe63fc7a77_1366x768.png 1272w, 
https://substackcdn.com/image/fetch/$s_!qIYM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fc58ddc-8c05-4f6b-9de7-b5fe63fc7a77_1366x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Source: https://www.i-am.ai/piano-genie.html</figcaption></figure></div><h3>2. The Brain: Melody RNN</h3><p>Once you&#8217;ve played your part, the <strong>MusicRNN</strong> takes over. 
This model is a <strong>Long Short-Term Memory (LSTM)</strong> network, a type of Recurrent Neural Network (RNN) specialized for sequential data like text or music.</p><p>Unlike traditional feed-forward networks, RNNs have a &#8220;memory&#8221; (hidden state) that persists across time steps. This allows the model to understand context, knowing that a C major chord played 4 steps ago should influence the note generated <em>now</em>.</p><h4>The Mathematics of Melody</h4><p>From a probabilistic standpoint, the RNN is modeling the conditional probability of the next note event x<sub>t</sub> given the previous notes x<sub>0:t-1</sub>:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;P(x_t | x_{t-1}, x_{t-2}, &#8230;, x_0; \\theta)&quot;,&quot;id&quot;:&quot;PASGWAXPVR&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><p>Here, &#952; represents the learned weights of the neural network. The output of the network is a <strong><a href="https://en.wikipedia.org/wiki/Softmax_function#:~:text=The%20softmax%20function%2C%20also%20known,used%20in%20multinomial%20logistic%20regression.">Softmax</a></strong> probability distribution over the possible MIDI pitches. We then <em>sample</em> from this distribution to choose the next note.</p><p>We control the &#8220;creativity&#8221; of the AI using a <strong>Temperature</strong> parameter (T), which rescales the network&#8217;s raw scores (logits) z<sub>i</sub> before the softmax is applied:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;p_i = \\frac{\\exp(z_i / T)}{\\sum_j \\exp(z_j / T)}&quot;,&quot;id&quot;:&quot;WXRESVKWBO&quot;}" data-component-name="LatexBlockToDOM"></div><p></p><ul><li><p><strong>Low Temperature (T &lt; 1.0)</strong>: The distribution becomes more peaked. The AI plays it safe, choosing the most likely notes.</p></li><li><p><strong>High Temperature (T &gt; 1.0)</strong>: The distribution flattens. 
The AI takes risks, leading to more interesting (but sometimes chaotic) melodies.</p></li></ul><h2>Show Me the Code</h2><p>Let&#8217;s look at the implementation. The core logic for generating a continuation resides in <code>src/rnn.js</code>. We wrap the Magenta <code>MusicRNN</code> model to handle the quantization (snapping notes to a time grid) and generation.</p><pre><code><code>/**
 * Generates a melody continuation using an RNN model.
 * @param {Object} noteSequence - The user&#8217;s input notes.
 * @param {Object} options - Configuration for generation (steps, temperature).
 */
async function getSampleRnn(noteSequence, options = {}) {
  // Merge caller options over the defaults so a partial config still works
  options = { stepsPerQuarter: 2, steps: 50, temperature: 1.3, ...options };
  // 1. Quantize the Input: Snap user&#8217;s loose timing to a grid
  const qns = mm.sequences.quantizeNoteSequence(
    noteSequence,
    options.stepsPerQuarter
  );

  // 2. Generate Continuation: Ask the RNN to dream up what comes next
  const sample = await musicRNN.continueSequence(
    qns,
    options.steps,       // How many steps to generate?
    options.temperature  // How wild should the AI be?
  );

  // 3. Unquantize: Convert back to absolute time for playback
  return mm.sequences.unquantizeSequence(sample);
}
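
// --- Illustrative sketch (not part of the original src/rnn.js) ---
// How a temperature value reshapes an output distribution, following
// p_i = exp(z_i / T) / sum_j exp(z_j / T) from the section above.
// softmaxWithTemperature is a hypothetical helper written for this post;
// the temperature passed to continueSequence is handled inside the library,
// so the app itself never needs to call anything like this.
function softmaxWithTemperature(logits, temperature) {
  // Subtract the largest logit before exponentiating for numerical
  // stability; the shift cancels out in the ratio.
  const maxLogit = Math.max(...logits);
  const exps = logits.map(function (z) {
    return Math.exp((z - maxLogit) / temperature);
  });
  const total = exps.reduce(function (sum, e) { return sum + e; }, 0);
  return exps.map(function (e) { return e / total; });
}

// Example: the same three made-up logits at a low and a high temperature.
// The most likely entry gets a larger share at T = 0.5 than at T = 1.5.
const peaked = softmaxWithTemperature([2.0, 1.0, 0.1], 0.5);
const flattened = softmaxWithTemperature([2.0, 1.0, 0.1], 1.5);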
</code></code></pre><p>In the main application logic (<code>src/script.js</code>), we manage the turn-taking state machine. When the user finishes playing (detected via a timer), we package their notes into a <code>NoteSequence</code> protobuf (the standard data format for Magenta) and pass it to the model.</p><pre><code><code>const aiTurn = async () =&gt; {
    // ... setup UI ...
    
    // Get user&#8217;s recent notes
    const modelInput = sequence.getSequencesByTag(USER_TURN);
    
    // Generate the AI&#8217;s response
    const sample = await getSampleRnn(modelInput, modelConfig);
    
    // Play it back to the user
    for (const n of sample.notes) {
        pianoPlayer.playNoteDown(n.pitch);
        // ... visualization logic ...
        await sleep(n.endTime - n.startTime);
    }
};
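
// --- Illustrative sketch (not from the original src/script.js) ---
// The loop above awaits sleep(n.endTime - n.startTime). NoteSequence times
// are expressed in seconds, while setTimeout expects milliseconds, so a
// timer-based helper needs a conversion somewhere. The two helpers below
// use hypothetical names and show one way to do it; the project's real
// sleep() may already handle this differently.
function noteDurationMs(note) {
  // Convert a note's length from seconds to milliseconds.
  return (note.endTime - note.startTime) * 1000;
}

function sleepMs(ms) {
  // Resolve after roughly ms milliseconds without blocking the UI thread.
  return new Promise(function (resolve) {
    setTimeout(resolve, ms);
  });
}

// Usage inside the playback loop would then read:
//   await sleepMs(noteDurationMs(n));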
</code></code></pre><h2>Future Directions</h2><p>Piano Buddy demonstrates that the web browser is becoming a powerful platform for AI deployment. By moving inference to the client, we democratize access to these creative tools: no high-end GPU server is required.</p><p>Future improvements could include:</p><ul><li><p><strong>Polyphony</strong>: Using models like <code>PerformanceRNN</code> to support chords and simultaneous notes.</p></li><li><p><strong>Style Transfer</strong>: Allowing the user to select <em>whose</em> style the AI should mimic (e.g., &#8220;Play like Chopin&#8221;).</p></li><li><p><strong>Real-time Harmonization</strong>: Generating accompaniment <em>while</em> the user plays, rather than after.</p></li></ul><h2>Try It Yourself</h2><p>The code is open source and available on GitHub. I encourage you to clone it, tweak the temperature settings, and see what kind of musical chaos you can create!</p><ul><li><p><strong>Live Demo</strong>: <a href="https://piano.nastaran.ai">https://piano.nastaran.ai</a></p></li><li><p><strong>Source Code</strong>: <a href="https://github.com/NastaranMO/piano-buddy">https://github.com/NastaranMO/piano-buddy</a></p></li></ul><div><hr></div><p><em><a href="https://piano.nastaran.ai">Piano Buddy</a> was built by <a href="https://nastaran.ai/">Nastaran Moghadasi</a>. If you enjoyed this technical deep dive, subscribe to my Substack for more insights on Generative AI and Large Language Models.</em></p><h2>References</h2><ol><li><p><strong>[Piano Genie]</strong> Donahue, C., Simon, I., &amp; Dieleman, S. (2019). Piano Genie. <em>Proceedings of the 24th International Conference on Intelligent User Interfaces</em>, 147&#8211;158. <a href="https://doi.org/10.1145/3301275.3302289">https://doi.org/10.1145/3301275.3302289</a></p></li><li><p><strong>[LSTM]</strong> Hochreiter, S., &amp; Schmidhuber, J. (1997). Long Short-Term Memory. <em>Neural Computation</em>, 9(8), 1735&#8211;1780. 
<a href="https://doi.org/10.1162/neco.1997.9.8.1735">https://doi.org/10.1162/neco.1997.9.8.1735</a></p></li><li><p><strong>[Magenta]</strong> Roberts, A., Engel, J., Raffel, C., Hawthorne, C., &amp; Eck, D. (2018). A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music. <em>International Conference on Machine Learning</em>, 4364&#8211;4373.</p></li><li><p><strong>[TensorFlow.js]</strong> Smilkov, D., Thorat, N., Assogba, Y., Yuan, A., Kreeger, N., Yu, P., &#8230; &amp; Cai, S. (2019). TensorFlow.js: Machine Learning for the Web and Beyond. <em>SysML Conference</em>.</p></li><li><p><strong>[VQ-VAE]</strong> van den Oord, A., Vinyals, O., &amp; Kavukcuoglu, K. (2017). Neural Discrete Representation Learning. <em>Advances in Neural Information Processing Systems</em>, 30.</p></li><li><p><strong>[Autoencoders]</strong> Hinton, G. E., &amp; Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. <em>Science</em>, 313(5786), 504&#8211;507.</p></li></ol>]]></content:encoded></item></channel></rss>