search.xml

<?xml version="1.0" encoding="utf-8"?>
<search>
  <entry>
    <title>[.NET][C#][SOLID] - DI &amp; IoC (依賴注入與控制反轉) 全面講解</title>
    <url>/posts/3588979794/</url>
    <content><![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>IoC (Inversion of Control) 控制反轉，與OOP SOLID原則中的其中一種設計原則有關，也就是其中的DIP(Dependency Inversion Principle)，是OOP一個非常重要的程式設計思想，對於軟體開發來說十分重要，下面我將<strong>十分詳細的介紹何為DIP、IoC以及DI、為何要使用它們以及如何實作</strong>，相信大家閱讀完會對這個重要的思想了解的更加透徹。</p>
<span id="more"></span>
<h2 id="定義"><a href="#定義" class="headerlink" title="定義"></a>定義</h2><p>DIP以簡單的一句話說明就是</p>
<blockquote>
<p><strong>DIP - Dependency Inversion Principle (依賴倒轉原則)</strong></p>
<ul>
<li>一種原則、思想</li>
<li>高層次的模組不應該依賴低層的模組，低層次的模組也不應該依賴高層次的模組<br><strong>兩者都應該依賴抽象</strong></li>
</ul>
</blockquote>
<blockquote>
<p><strong>IoC - Inversion of Control (控制反轉)</strong></p>
<ul>
<li>一種思想</li>
<li>把對於某個物件的<strong>控制權</strong>移轉給<strong>第三方容器 (IoC Container)</strong></li>
</ul>
</blockquote>
<blockquote>
<p><strong>DI - Dependency Injection (依賴注入)</strong></p>
<ul>
<li>一種設計模式</li>
<li>將依賴通過<strong>注入</strong>的方式提供給需要的模組，是 IoC 與 DIP 的具體表現</li>
<li>把被依賴物件注入被動接收物件中</li>
</ul>
</blockquote>
<p>DIP的定義非常重要，請大家牢記在心。<br>也就是說，程式應該依賴抽象，而不是實作的實體，這可以幫助我們對程式之間解耦，能夠被更好地維護。<br>DI是為了實現DIP和IoC而誕生的實現方式，因此大家在使用物件導向的技巧時，<strong>務必清楚自己在做什麼，和為何這樣做</strong>。</p>
<h2 id="好處-為什麼要使用？"><a href="#好處-為什麼要使用？" class="headerlink" title="好處&#x2F;為什麼要使用？"></a>好處&#x2F;為什麼要使用？</h2><p>在針對各個名詞解釋與實作之前，我想先讓各位了解DIP以及IoC帶來的好處。</p>
<blockquote>
<ol>
<li>可維護性 (maintainability)</li>
<li>寬鬆耦合 (loose coupling)</li>
</ol>
</blockquote>
<ul>
<li>所謂的<strong>可維護性</strong>，就是你在日<strong>後需要修改或更新程式的時候，所花費的時間和精力</strong>，如果修改起來很費時費力，那我們就說他可維護性低。</li>
<li>所謂的耦合度，就是物件與物件間的依賴、相關程度，如果在A類內去new B，B類內又去new C，彼此相互直接依賴，這樣類別之間相互呼叫令彼此有所牽連，便是耦合(coupling)，物件關係越緊密，耦合度越高，耦合度高的程式碼，一旦有任何變動，容易發生<strong>連鎖反應，牽一髮動全身</strong>，因此龐大的軟體更應該考慮<strong>低耦合高內聚</strong>的設計方式。</li>
</ul>
<p>而DIP, IoC和DI可以<strong>讓程式之間解耦</strong>，提高程式的可維護性。<br>基本上「可維護性」和「寬鬆耦合」就是我們學習DIP, IoC和DI的原因</p>
<p>下面我會一一介紹DIP, IoC以及DI，最後附上實際應用，即最後組合出來的結果。</p>
<h2 id="DIP"><a href="#DIP" class="headerlink" title="DIP"></a>DIP</h2><blockquote>
<p><strong>DIP - Dependency Inversion Principle(依賴倒轉原則)</strong></p>
<ul>
<li>一種原則、思想</li>
<li>高層次的模組不應該依賴低層的模組，低層次的模組也不應該依賴高層次的模組<br><strong>兩者都應該依賴抽象</strong></li>
</ul>
</blockquote>
<p>什麼意思呢？讓我們看看下面的範例：</p>
<h3 id="簡單範例"><a href="#簡單範例" class="headerlink" title="簡單範例"></a>簡單範例</h3><figure class="highlight c#"><table><tr><td class="code"><pre><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">Database</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Connect</span>()</span> &#123; <span class="comment">/* database connect logic */</span> &#125;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Disconnect</span>()</span> &#123; <span class="comment">/* database disconnect logic */</span> &#125;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">SaveData</span>(<span class="params"><span class="built_in">string</span> data</span>)</span> &#123; <span class="comment">/* database save data logic */</span> &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">DataAccess</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">private</span> Database _database = <span class="keyword">new</span> Database();</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">SaveData</span>(<span class="params"><span class="built_in">string</span> data</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        _database.Connect();</span><br><span class="line">        _database.SaveData(data);</span><br><span class="line">        _database.Disconnect();</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
<p>上面的程式碼違反了DIP，因為DataAccess<strong>直接依賴</strong>了’Database’類別，如果Database做了任何變動，DataAccess也需要跟著變動，所有有用到關於Database的class也都需要跟著變動，因此，’DataAccess’ class應該依賴抽象的Interface，而不是具體的實作。</p>
<hr>
<figure class="highlight c#"><table><tr><td class="code"><pre><span class="line"><span class="keyword">public</span> <span class="keyword">interface</span> <span class="title">IDatabase</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">void</span> <span class="title">Connect</span>()</span>;</span><br><span class="line">    <span class="function"><span class="keyword">void</span> <span class="title">Disconnect</span>()</span>;</span><br><span class="line">    <span class="function"><span class="keyword">void</span> <span class="title">SaveData</span>(<span class="params"><span class="built_in">string</span> data</span>)</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">SqlServerDatabase</span> : <span class="title">IDatabase</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Connect</span>()</span> &#123; <span class="comment">/* SQL Server database connect logic */</span> &#125;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Disconnect</span>()</span> &#123; <span class="comment">/* SQL Server database disconnect logic */</span> &#125;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">SaveData</span>(<span class="params"><span class="built_in">string</span> data</span>)</span> &#123; <span class="comment">/* SQL Server database save data logic */</span> &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">OracleDatabase</span> : <span class="title">IDatabase</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Connect</span>()</span> &#123; <span class="comment">/* Oracle database connect logic */</span> &#125;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Disconnect</span>()</span> &#123; <span class="comment">/* Oracle database disconnect logic */</span> &#125;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">SaveData</span>(<span class="params"><span class="built_in">string</span> data</span>)</span> &#123; <span class="comment">/* Oracle database save data logic */</span> &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">DataAccess</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">private</span> IDatabase _database;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="title">DataAccess</span>()</span></span><br><span class="line">    &#123;</span><br><span class="line">        _database = <span class="keyword">new</span> SqlServerDatabase();</span><br><span class="line">        <span class="comment">//_database = new OracleDatabase(); 在這裡抽換</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">SaveData</span>(<span class="params"><span class="built_in">string</span> data</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        _database.Connect();</span><br><span class="line">        _database.SaveData(data);</span><br><span class="line">        _database.Disconnect();</span><br><span class="line">    &#125;</span><br><span class="line">&#125; </span><br></pre></td></tr></table></figure>
<p>由上面的範例可知，我們定義了一個IDatabase，規範所有資料庫應該有的action，讓不同的實體Database去實作，而Program本身(DataAccess)，只需要去依賴、使用IDatabase，這樣一來就算未來我從SQL-Server DB遷移到Oracle DB，我<strong>只需要抽換IDatabase的實作實體</strong>(_database指標指向的實際記憶體)，DataAccess內部使用到IDatabase的程式碼一行都不用更改。</p>
<p>但是，可以發現到，雖然透過依賴倒轉，可以改變為依賴抽象，<strong>但程式(DataAccess)本身還是需要new 出instance</strong>，也就是說程式本身(呼叫者)，對於依賴的控制流程具有主導權，這時就需要<strong>控制反轉</strong>。</p>
<h2 id="IoC"><a href="#IoC" class="headerlink" title="IoC"></a>IoC</h2><blockquote>
<ul>
<li>把對於某個物件的<strong>控制權</strong>移轉給<strong>第三方容器 (IoC Container)</strong></li>
</ul>
</blockquote>
<p>IoC 是一種設計原則或思想，它建議我們反轉物件導向設計中的各種控制，以達到各個類別間的解耦。這裡的 <strong>“控制”指的是除了一個類別本身的職責之外的其它所有工作</strong>，如整個軟體的流程控制，物件的依賴或創建等等。</p>
<blockquote>
<ul>
<li>其實意思就是，一個類別本身除了<strong>本身的職責外，不應該擁有太多其他的工作 (SRP)</strong></li>
<li>所以建議將這些對於物件的<strong>控制權(創建、實作實體抽換等等)<strong>，交給</strong>第三方容器</strong> (Framework or Library)。</li>
<li>獲取資源的行為由”主動”轉為”被動”</li>
<li><strong>程式(Application) 依賴物件的「控制流程 (Control Flow)」，由「主動」變成「被動」。就是「控制反轉」</strong></li>
</ul>
</blockquote>
<p>下面這兩張圖簡單解釋了使用IoC後的依賴關係</p>
<ul>
<li><p>這是還沒使用IoC前，我們的應用程式直接依賴於實體類別<br><img data-src="/images/posts/DI-IoC/IoC1.png"></img></p>
</li>
<li><p>這個則是使用IoC後，透過IoC Container，將依賴實體注入至程式中，程式由原來主動的依賴變成被動的接收<br><img data-src="/images/posts/DI-IoC/IoC2.png"></img></p>
</li>
</ul>
<p>好萊塢原則也很貼切的說明了控制反轉的情境</p>
<blockquote>
<p>Don’t call me, I’ll call you.</p>
</blockquote>
<h3 id="IoC-Container"><a href="#IoC-Container" class="headerlink" title="IoC Container"></a>IoC Container</h3><p>廣義上來說， IoC 容器，就是有進行「依賴注入」的地方，<br>你隨便寫一個類別，透過它將所需元件注入給高階模組，便可說是容器。<br>但現在所說的容器通常泛指那些<strong>強大的IoC框架所提供的容器</strong>。</p>
<p>各位可以把IoC容器想像成是儲存一堆使用者<strong>註冊的依賴實體</strong>，IoC Container透過這些使用者註冊的資訊，知道程式需要這個instance並賦予給他，達到不需要修改高階模組的目的。<br>程式在<strong>執行的期間(Runtime)<strong>，需要依賴物件的實體，需要透過IoC Container注入給程式，使用的是</strong>反射原理(Reflection)</strong>-也就是透過程式碼的中間編譯檔，去讀取程式碼內部的資訊。</p>
<p>下面這兩張圖則解釋了IoC如何利用IoC Container做到控制反轉的示意圖</p>
<ul>
<li><p>這個是沒有使用IoC框架時，高階模組主動的去建立所需要的低階模組 (資源)<br><img data-src="/images/posts/DI-IoC/IoC4.png"></img></p>
</li>
<li><p>這個則是使用了IoC框架，透過 <strong>「註冊 (Register)」</strong> 所需要的模組進IoC Container，藉由容器<strong>主動注入依賴實體</strong>進入高階模組<br><img data-src="/images/posts/DI-IoC/IoC3.png"></img></p>
</li>
</ul>
<h3 id="簡單範例-1"><a href="#簡單範例-1" class="headerlink" title="簡單範例"></a>簡單範例</h3><p>這邊提供的簡單範例中，IoC Container用簡單的方式實作，實際上這些工作會交給第三方套件或框架完成，這邊使用簡單的方式實作給大家理解</p>
<figure class="highlight c#"><table><tr><td class="code"><pre><span class="line"><span class="keyword">public</span> <span class="keyword">interface</span> <span class="title">ILogger</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">void</span> <span class="title">Log</span>(<span class="params"><span class="built_in">string</span> message</span>)</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">ConsoleLogger</span> : <span class="title">ILogger</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Log</span>(<span class="params"><span class="built_in">string</span> message</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        Console.WriteLine(<span class="string">$&quot;Log: <span class="subst">&#123;message&#125;</span>&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">UserService</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">private</span> <span class="keyword">readonly</span> ILogger _logger;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="title">UserService</span>(<span class="params">ILogger logger</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        _logger = logger;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">CreateUser</span>(<span class="params"><span class="built_in">string</span> username, <span class="built_in">string</span> password</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        _logger.Log(<span class="string">$&quot;Creating user <span class="subst">&#123;username&#125;</span>&quot;</span>);</span><br><span class="line">        <span class="comment">// Implementation to create a user</span></span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title">Program</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">static</span> <span class="keyword">void</span> <span class="title">Main</span>(<span class="params"><span class="built_in">string</span>[] args</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        ILogger logger = <span class="keyword">new</span> ConsoleLogger();</span><br><span class="line">        UserService userService = <span class="keyword">new</span> UserService(logger);</span><br><span class="line">        userService.CreateUser(<span class="string">&quot;johndoe&quot;</span>, <span class="string">&quot;secret&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
<p>在這裡，<strong>Program就是我們的IoC Container</strong>，UserService依賴ILogger抽象，而透過註冊在Program中的資訊，讓<strong>Program主動將實際的依賴實體(ConsoleLogger物件)，注入到UserService的建構元中</strong>，也就是建構元注入，後面講到DI會再提到。<br><strong>之後如果有需要新的Logger，只需要創建並實作ILogger，再透過Program注入給UserService，UserService內部程式碼一行都不用更改。</strong></p>
<h3 id="IoC與DIP的差別"><a href="#IoC與DIP的差別" class="headerlink" title="IoC與DIP的差別"></a>IoC與DIP的差別</h3><p>控制反轉(IoC)與依賴倒轉(DIP)兩者不相等！</p>
<blockquote>
<p>依賴倒轉，倒轉的是「依賴關係」<br>控制反轉，反轉的是程式依賴物件的「控制流程」</p>
</blockquote>
<h2 id="DI"><a href="#DI" class="headerlink" title="DI"></a>DI</h2><blockquote>
<p>將依賴通過<strong>注入</strong>的方式提供給需要的模組，是 IoC 與 DIP 的具體表現<br>把被依賴物件注入被動接收物件中</p>
</blockquote>
<blockquote>
<p><strong>程式或者開發者不必理會物件是如何產生、保持、至銷毀的生命週期</strong><br>在.NET的DI框架中，生命週期有三種，Transient、Scoped、Singleton，後面講解實作時會再談到。</p>
</blockquote>
<p>DI的背後思想主要是：</p>
<blockquote>
<ol>
<li>為了保證DIP，一個類別應該只依賴抽象</li>
<li>於是具體的實現必須透過某種方式”注入”到這個類別</li>
<li>那麼依據IoC原則，最好透過第三方容器來做到這件事</li>
</ol>
</blockquote>
<p>而DI又有主要的三種形式：</p>
<blockquote>
<ol>
<li>建構元注入 (Constructor Injection)</li>
<li>設值方法注入 (Setter Injection)</li>
<li>介面注入 (Interface Injection)</li>
</ol>
</blockquote>
<p>下面舉個簡單的範例</p>
<h3 id="簡單範例-2"><a href="#簡單範例-2" class="headerlink" title="簡單範例"></a>簡單範例</h3><ol>
<li>建構元注入 (Constructor Injection)<br>屬於最常見的注入方式，IoC Container將實體DI到呼叫者的建構元中，當呼叫者被new(建立)時，就會自動注入相關實體進入建構元。<figure class="highlight c#"><table><tr><td class="code"><pre><span class="line"><span class="keyword">public</span> <span class="keyword">interface</span> <span class="title">ILogger</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">void</span> <span class="title">Log</span>(<span class="params"><span class="built_in">string</span> message</span>)</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">Logger</span> : <span class="title">ILogger</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Log</span>(<span class="params"><span class="built_in">string</span> message</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        Console.WriteLine(<span class="string">&quot;Log: &quot;</span> + message);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">UserService</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">private</span> <span class="keyword">readonly</span> ILogger _logger;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="title">UserService</span>(<span class="params">ILogger logger</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        _logger = logger;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">AddUser</span>(<span class="params"><span class="built_in">string</span> userName</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        _logger.Log(<span class="string">&quot;User Added: &quot;</span> + userName);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
在這裡，當有人去new UserService時，IoC Container就會自動注入當初註冊的實體進入UserService的建構元。</li>
</ol>
<hr>
<ol start="2">
<li><p>設值方法注入 (Setter Injection)<br>透過setter method注入實體，他允許我們在呼叫者實體被創建後(instantiated)，才注入相關依賴實體。</p>
<figure class="highlight c#"><table><tr><td class="code"><pre><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">UserService</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">private</span> ILogger _logger;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">public</span> ILogger Logger</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">set</span> &#123; _logger = <span class="keyword">value</span>; &#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">AddUser</span>(<span class="params"><span class="built_in">string</span> userName</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        _logger.Log(<span class="string">&quot;User Added: &quot;</span> + userName);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
</li>
<li><p>介面注入 (Interface Injection)<br>依賴透過Interface注入近實例中，這個Interface必須定義一個方法來注入依賴，再藉由實例去實作此介面，來實現具體的DI</p>
<figure class="highlight c#"><table><tr><td class="code"><pre><span class="line"><span class="keyword">public</span> <span class="keyword">interface</span> <span class="title">IUserService</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">void</span> <span class="title">AddUser</span>(<span class="params"><span class="built_in">string</span> userName</span>)</span>;</span><br><span class="line">    <span class="function"><span class="keyword">void</span> <span class="title">SetLogger</span>(<span class="params">ILogger logger</span>)</span>; <span class="comment">// 定義注入依賴的方法</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">UserService</span> : <span class="title">IUserService</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">private</span> ILogger _logger;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">AddUser</span>(<span class="params"><span class="built_in">string</span> userName</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        _logger.Log(<span class="string">&quot;User Added: &quot;</span> + userName);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">SetLogger</span>(<span class="params">ILogger logger</span>) <span class="comment">// 實際注入依賴</span></span></span><br><span class="line">    &#123;</span><br><span class="line">        _logger = logger;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li>
</ol>
<h2 id="DIP、IoC與DI的結合-實際應用"><a href="#DIP、IoC與DI的結合-實際應用" class="headerlink" title="DIP、IoC與DI的結合 - 實際應用"></a>DIP、IoC與DI的結合 - 實際應用</h2><p>為大家總結一下上面講的各種名詞，用這張圖簡單概括<br>先讓大家釐清，DIP, IoC, DI, IoC Container之間的關係<br><img data-src="/images/posts/DI-IoC/sumup.png"></img></p>
<h3 id="生活範例"><a href="#生活範例" class="headerlink" title="生活範例"></a>生活範例</h3><p>讓我們用在「餐廳煮東西」來舉例</p>
<blockquote>
<p>DIP: High-level modules should not depend on low-level modules. Both should depend on abstractions.</p>
</blockquote>
<p>在我們的例子中，<strong>廚師就是高階模組</strong>，而<strong>食材是低階模組</strong>，<ins>廚師不應該依賴於特定的食材</ins>，而是應該依賴於可以<ins>用來煮各種餐點的食材的抽象概念</ins>。</p>
<hr>
<blockquote>
<p>Inversion of Control (IoC): The control of the flow of a program is inverted.</p>
</blockquote>
<p>在我們例子中，顧客點餐，廚師備餐，對於餐點準備的控制流程，<ins>顧客不控制備餐的流程，而是接受最終完成的餐點</ins>。</p>
<hr>
<blockquote>
<p>IoC Container: A container that manages and controls the creation and life cycle of objects, and also injects their dependencies.</p>
</blockquote>
<p>在我們的例子中，可以把廚房想像成是IoC容器，他主管了各個食材、廚房用具的生命週期，並確保能夠提供廚師需要的食材或工具。</p>
<hr>
<blockquote>
<p>Dependency Injection (DI): A technique for achieving IoC, where the objects are given their dependencies instead of creating them themselves.</p>
</blockquote>
<p>在我們的例子中，廚師是被提供食材的人（由廚房提供），而不是自己去尋找食材。</p>
<h3 id="結合舉例"><a href="#結合舉例" class="headerlink" title="結合舉例"></a>結合舉例</h3><p>接下來我們接續上面的例子，透過程式的方式來講解上面的所有概念(DIP, IoC, IoC Container, DI)</p>
<figure class="highlight c#"><table><tr><td class="code"><pre><span class="line"><span class="keyword">public</span> <span class="keyword">interface</span> <span class="title">IChef</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">void</span> <span class="title">Cook</span>()</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">Chef</span> : <span class="title">IChef</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">private</span> <span class="keyword">readonly</span> IIngredients _ingredients;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="title">Chef</span>(<span class="params">IIngredients ingredients</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        _ingredients = ingredients;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Cook</span>()</span></span><br><span class="line">    &#123;</span><br><span class="line">        Console.WriteLine(<span class="string">&quot;Cooking with &quot;</span> + _ingredients.GetIngredients());</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">interface</span> <span class="title">IIngredients</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="built_in">string</span> <span class="title">GetIngredients</span>()</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">Ingredients</span> : <span class="title">IIngredients</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="built_in">string</span> <span class="title">GetIngredients</span>()</span></span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="string">&quot;Tomatoes, Onions, Garlic, and Spices&quot;</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title">Kitchen</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">private</span> <span class="keyword">static</span> IChef _chef;</span><br><span class="line">    <span class="keyword">private</span> <span class="keyword">static</span> IIngredients _ingredients;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">static</span> <span class="title">Kitchen</span>()</span></span><br><span class="line">    &#123;</span><br><span class="line">        _ingredients = <span class="keyword">new</span> Ingredients();</span><br><span class="line">        _chef = <span class="keyword">new</span> Chef(_ingredients);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> IChef <span class="title">GetChef</span>()</span></span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">return</span> _chef;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title">Program</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">static</span> <span class="keyword">void</span> <span class="title">Main</span>(<span class="params"><span class="built_in">string</span>[] args</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">var</span> chef = Kitchen.GetChef();</span><br><span class="line">        chef.Cook();</span><br><span class="line">        Console.ReadLine();</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
<ul>
<li>在上面的例子中，’IChef’介面和’Chef’類別遵守了<strong>DIP</strong>，因為他們依賴於抽象的’IIngredients’，而不是特定的食材實體。</li>
<li>而’Chef’類別透過 <strong>「建構元注入」</strong> 的方式，給予’IIngredients’的依賴，而不是自己主動直接創建一個實體。</li>
<li>‘Kitchen’ class則扮演IoC Container的角色，他管理了’Chef’和’Ingredients’的創建與生命週期，並注入’Ingredients’實體進入’Chef’的建構元中。</li>
<li>而Main method可以當成是我們的Application，透過’Kitchen’獲取’Chef’實體，並呼叫Cook() method。</li>
</ul>
<h2 id="NET-C-實現"><a href="#NET-C-實現" class="headerlink" title=".NET C#實現"></a>.NET C#實現</h2><p>下面我將簡單使用.NET預設的DI框架(Microsoft.Extensions.DependencyInjection)來實現註冊依賴實體，與依賴注入。<br>其中還有一些進階的用法，像是把<ins>註冊相關的邏輯抽提出來寫成擴充方法</ins>，還有使用<ins>Attribute與反射來解決建構元注入太多的問題</ins>，但在這篇教學中先使用最簡單的方法實作，為的是讓各位先理解基本的概念與用法，進階用法會在之後的文章詳細介紹。</p>
<h3 id="DI生命週期與註冊"><a href="#DI生命週期與註冊" class="headerlink" title="DI生命週期與註冊"></a>DI生命週期與註冊</h3><p>在.NET的預設DI框架中，註冊實體物件時可以指定其生命週期，分為三種（重要！）</p>
<blockquote>
<ol>
<li>Transient (一次性) : <strong>每次注入時</strong>，都建立一個新的實體。</li>
<li>Scoped (作用域) : <strong>每次的Request</strong>，都建立一個新的實體，同一個Request下，重複利用同一個實體 (這裡的Request 常指Http Request)。</li>
<li>Singleton (單例) : 使用單例模式(Singleton Pattern)，<strong>從程式開始到結束</strong>，只建立一個實體，每次都重複利用同一個，直到程式被終止。</li>
</ol>
</blockquote>
<h3 id="實作"><a href="#實作" class="headerlink" title="實作"></a>實作</h3><p>讓我們繼續以上面餐廳的例子實作<br>首先定義好相關的class與Interface，其中使用DIP我這邊就不特別提了</p>
<figure class="highlight c#"><table><tr><td class="code"><pre><span class="line"><span class="keyword">public</span> <span class="keyword">interface</span> <span class="title">IChef</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">void</span> <span class="title">Cook</span>()</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">Chef</span> : <span class="title">IChef</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">private</span> <span class="keyword">readonly</span> IIngredients _ingredients;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="title">Chef</span>(<span class="params">IIngredients ingredients</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        _ingredients = ingredients;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Cook</span>()</span></span><br><span class="line">    &#123;</span><br><span class="line">        Console.WriteLine(<span class="string">&quot;Cooking with &quot;</span> + _ingredients.GetIngredients());</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">interface</span> <span class="title">IIngredients</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="built_in">string</span> <span class="title">GetIngredients</span>()</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">Ingredients</span> : <span class="title">IIngredients</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="built_in">string</span> <span class="title">GetIngredients</span>()</span></span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="string">&quot;Tomatoes, Onions, Garlic, and Spices&quot;</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<hr>
<p>接著就是我們主要註冊DI實體的地方，在Program.cs的檔案中，我這邊只寫出關鍵的部分。</p>
<figure class="highlight c#"><table><tr><td class="code"><pre><span class="line"><span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">using</span> Microsoft.Extensions.DependencyInjection;</span><br><span class="line"></span><br><span class="line">builder.Services.AddScoped&lt;IChef, Chef&gt;();</span><br><span class="line">builder.Services.AddScoped&lt;IIngredients, Ingredients&gt;();</span><br><span class="line"></span><br><span class="line"><span class="comment">// ...</span></span><br></pre></td></tr></table></figure>
<p>解釋一下，這邊的builder.Services屬於IServiceCollection類，一但呼叫’AddScoped&lt;IIngredients, Ingredients&gt;()’方法，IoC Container就知道要建立一個Ingredients實體，去對應到程式中的三種DI形式之一，並注入實體讓IIngredients指向，在我們的例子中是「建構元注入」，因此，DI框架透過「反射原理」，知道Chef class中的建構元有IIngredients，透過之前註冊的資訊，IoC Continaer主動建立一個Ingredients實體，並注入到Chef class的建構元中。</p>
<p><strong>整理一下上面的流程</strong></p>
<blockquote>
<ol>
<li>利用builder.Services.AddScoped&lt;IIngredients, Ingredients&gt;()註冊依賴資訊，以及生命週期給IoC Container</li>
<li>IoC Container利用反射原理，得知Chef class中的建構元有IIngredients，並與之前註冊的依賴資訊做對應</li>
<li>IoC Container建立一個Ingredients實體，並注入進Chef class的建構元中，讓建構元的IIngredients指標指向</li>
<li>在建構元中，透過建構元參數指向的Ingredients實體，賦值給Chef class的內部欄位_ingredients</li>
</ol>
</blockquote>
<h2 id="結語"><a href="#結語" class="headerlink" title="結語"></a>結語</h2><p>這篇文章我十分詳細的介紹了DIP, IoC, DI的概念與實作，這些概念對於軟體開發來說非常重要，但大家也要清楚理解<strong>這些思想要解決的問題，以及使用它們的好處，清楚自己在做什麼，而不是為設計而設計</strong>，其實OOP很多的pattern，都會有其好處以及trade off，因此了解為何使用就顯得非常重要。</p>
<p>P.S. : </p>
<ul>
<li>我自己很喜歡使用指標和記憶體的概念來理解物件與其值，這對於理解Pass by value&#x2F;reference和Stack, Heap的記憶體分配非常有用，十分建議大家使用。</li>
<li>Microsoft.Extensions.DependencyInjection這個namespace，利用IServiceProvider來管理我們程式中中所註冊的依賴，我們也可以透過注入這個IServiceProvider來取得實體，這在之後的<strong>DI進化的文章會有著關鍵作用。</strong></li>
</ul>
<style>
img{
    width: 70%; 
    margin: 15px auto;
}

</style>]]></content>
      <categories>
        <category>OOP</category>
        <category>SOLID</category>
      </categories>
      <tags>
        <tag>OOP</tag>
        <tag>SOLID</tag>
        <tag>.NET C#</tag>
      </tags>
  </entry>
  <entry>
    <title>CC-沈浸式線上逛街APP-系統與技術介紹</title>
    <url>/posts/2012237495/</url>
    <content><![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>接續上一篇的文章，這篇文章會著重介紹這個系統<br>以下是我在YouTube上對這個專案的介紹與Demo：</p>
<iframe
    src="https://www.youtube.com/embed/aMGnyI2Xe04">
</iframe>
<span class="exturl" data-url="aHR0cHM6Ly9kcml2ZS5nb29nbGUuY29tL2ZpbGUvZC8xTmVDRGk0dWFmQ3UtcF9iVDE2cERLSm50SXluN3FzeUYvdmlldz91c3A9c2hhcmluZw==">Proposal Link<i class="fa fa-external-link-alt"></i></span>

<p>這篇文章的內容會與影片很像，主要分為<strong>動機與介紹、重點功能簡介、系統架構與技術</strong>的順序介紹<br>系統功能Demo, 還請觀看影片介紹</p>
<span id="more"></span>
<p><img data-src="/images/posts/CC-project-demo/1.png" 
style="width: 90%; margin: 15px auto;"></p>
<h2 id="動機與介紹"><a href="#動機與介紹" class="headerlink" title="動機與介紹"></a>動機與介紹</h2><img data-src="/images/posts/CC-project-demo/2.png" style="width: 70%; margin: 15px auto;">
首先我們主要的TA之一是線上逛街族, 針對這個族群我們統整出簡單的兩點：第一, 他們在線上瀏覽或滑商品時, 通常沒有特定的消費目的, 只是想要滑滑看看, 無目的的瀏覽行為, 第二, 這種行為主要是以消遣、獲得樂趣為目的, 而不一定是真的想要買商品

<hr>
<img data-src="/images/posts/CC-project-demo/3.png" style="width: 70%; margin: 15px auto;">
根據上面的族群設定, 我們的專案主要會focus在解決或滿足下面的問題, 第一, 我們是針對無目的性的消費情境底下的需求, 第二, 我們希望帶給使用者沈浸式的體驗, 所謂的沈浸式體驗, 就是在舒服且不受干擾的狀態下, 接收自己有興趣的資訊, 以此讓使用者能在我們這個APP上做消遣、殺時間的動作, 這麼做的目的很大一部分是為了捕捉使用者的微時刻, 這些微時刻的數據可以代表使用者不同時刻的喜好或者決策, 第四，我們這個APP也會瞄準直播的市場與直播平台合作，而這些直播主的煩惱，便是下播後無法持續創造收入，因此我們除了希望能延長商品的生命週期外，也可以透過捕捉使用者行為來提供直播主缺乏的Consumer Insight

<hr>
<img data-src="/images/posts/CC-project-demo/4.png" style="width: 70%; margin: 15px auto;">
為了達成以上的目的，我們主要有這幾項手段，第一, 研究推薦演算法與捕捉使用者行為，透過使用者不同時刻的行為數據來作為推薦的依據，而非傳統的電商使用歷史紀錄來做推薦，以次讓使用者能夠在每時每刻接收有興趣的資訊，達到沈浸式的體驗。第二，透過簡潔的介面，降低使用者瀏覽時的壓力，讓使用者更願意停留在APP上，第三，透過分享貼文、追蹤、評論等等的社群機制，讓使用者能夠了解親朋好友們有興趣或者好評的商品，也能夠享受社群的樂趣，增加APP的黏著度，最後，透過APP上種種捕捉使用者行為數據的機制，可以將這些數據提供給與我們合作的第三方廠商，讓廠商更了解使用者的喜好

<h2 id="重點功能簡介"><a href="#重點功能簡介" class="headerlink" title="重點功能簡介"></a>重點功能簡介</h2><img data-src="/images/posts/CC-project-demo/5.png" style="width: 70%; margin: 15px auto;">
再來介紹一些重點功能的簡短敘述，首先第一個是我們的商品貼文，也是商品的主體，貼文特色主要以滿板設計與資訊收合來達到雜訊最小化的目的，另外後面Demo也會呈現推薦商品的形式，與現在短影音的方式很像，透過推薦與給人耳目一新的商品，帶給使用者殺時間的樂趣，另外，透過商品貼文，也可以成為直播主下播後銷售的利器，與一般電商不同的是，我們主動推薦商家商品，而且是透過使用者當前的行為喜好，而不是被動等待使用者搜尋或是透過購買紀錄來做推薦

<hr>
<img data-src="/images/posts/CC-project-demo/6.png" style="width: 70%; margin: 15px auto;">
再來是我們的主頁面，透過滑卡的機制增加互動感，也可以同時捕捉使用者行為，以做到即時推薦，這麼做除了可以增加沈浸體驗，也可以更廣泛的推薦商品，提高曝光程度

<hr>
<img data-src="/images/posts/CC-project-demo/7.png" style="width: 70%; margin: 15px auto;">
主頁面與探索頁面都可以透過使用者的滑卡行為、停留時間與點擊率，來彼此優化推薦內容，做到雙向推薦的功能

<hr>
<img data-src="/images/posts/CC-project-demo/8.png" style="width: 70%; margin: 15px auto;">
探索頁面除了會推薦使用者感興趣的商品，也會隨機推薦相關商品，增加新奇度，其中最大塊的貼文則為推薦分數最高的商品，透過矩陣式樹狀結構，帶給使用者簡潔與大量瀏覽的感受

<hr>
<img data-src="/images/posts/CC-project-demo/9.png" style="width: 70%; margin: 15px auto;">
CC提供商城、買家、直播主追蹤的機制，加上個人化自己的動態牆與留言區，增加社群互動性，以社群的力量增加APP的黏著度

<hr>
<img data-src="/images/posts/CC-project-demo/10.png" style="width: 70%; margin: 15px auto;">
分享貼文除了可以增加互動的樂趣，也是提供消費者洞見的大平台
最重要的是，可以藉由社群的力量，一傳十十傳百，來行銷商品

<h2 id="系統架構與技術"><a href="#系統架構與技術" class="headerlink" title="系統架構與技術"></a>系統架構與技術</h2><p>先附上系統架構圖：<br><img data-src="/images/posts/CC-project-demo/CC_structure.jpeg" style="width: 70%; margin: 15px auto;"></p>
<p>首先前端主要由Angular寫成，搭配<strong>模組化與物件導向設計，增加程式的可維護性</strong>，最後透過PWA包裝成跨平台的APP<br>藉由Angular能夠模組化與元件化的特性，我們將各個功能切分成不同模組，提升程式低耦合高內聚的特性，另外，Token驗證、路由保護、API攔截器、資料格式化Pipeline等等，也都抽離出來實作，其中，核心的商業邏輯切分在我們的Service模組，與畫面邏輯分離，再藉由依賴注入讓各模組使用，這樣的設計，讓我們的專案在未來更改與維護擴充時非常省時與省力。</p>
<p>另外一個主要的功能是推薦演算法，我們透過標籤(label)與雅卡爾相似度(Jaccard Similarity)，做了一個Example Based的推薦引擎，主要是用MongoDB pipeline與後端演算構築而成。</p>
<p>這裡也是我第一次使用前端框架，來寫一個比較完整的專案，由於我是後端出身，所以選擇一個自己寫起來最舒服的Angular框架，有種在寫後端的親切感XD。</p>
<p>後端主要由NodeJS寫成，資料庫則為MongoDB，一樣是將各個功能進行模組化的拆分，與資料庫索引的優化，以及推薦演算法的實現。</p>
<p>另外WebServer的部分則是由IIS代理，加上GitLab CI&#x2F;CD &amp; GitLab Runner做到自動化整合與部屬的功能</p>
<style>
.video-container
{
    padding-top: 60% !important;
}
</style>]]></content>
      <categories>
        <category>專案</category>
        <category>side project</category>
      </categories>
      <tags>
        <tag>專案</tag>
        <tag>side project</tag>
      </tags>
  </entry>
  <entry>
    <title>大一大二參加畢業專案與程式比賽心得分享</title>
    <url>/posts/534426495/</url>
    <content><![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>小弟我在剛上大一沒多久，就私mail系上的程式老師，詢問是否可以參與更多的專案或研究，也一併把我的履歷給教授看。<br>沒想到教授大方地給我幾條路選，可以跟教授做研究，出去實習，或是和學長們做專案。<br>大一的我想要慢慢累積實力，於是決定先<strong>和學長們做專案</strong>，去比賽，累積經驗。<br>在歷經<strong>國科會專案、資訊競賽、畢業專案競賽等等</strong>後，我非常感謝教授和學長們給我這個機會，讓我可以擁有這些寶貴的經驗！<br>這篇文章主要著重在心得與分享，比較技術面的內容會在下篇文章詳細介紹。</p>
<span id="more"></span>

<h2 id="比賽心得與定位"><a href="#比賽心得與定位" class="headerlink" title="比賽心得與定位"></a>比賽心得與定位</h2><p>在這個Team內，我主要負責程式開發，包括<strong>前端、伺服器架設與管理和一部分的後端</strong>。<br>前前後後比了<strong>國科會大專生計畫、智慧創新、資訊服務和系上的畢業專案競賽</strong>，大約歷時1年多的時間，在這段時間裡，我們歷經<strong>發想、開發、維護、寫技術文件、UI&#x2F;UX設計到演算法研究等等</strong>，這也是我大一結束為止做過相對完整的專案。</p>
<hr>
<p>這是我們智慧創新和資訊服務競賽時的照片，我們直接把成品給評審滑XD，因為以我們的系統來說相對完整，也不怕噴Bug，雖然都只進到決賽，但我也更加知道這類型比賽的準備方向（基本上就是要對到當前流行的主題），<del>所以明年就準備AI吧</del><br><img data-src="/images/posts/CC-experience/智慧創新.JPG" 
style="width: 70%; margin: 15px auto;"><br><img data-src="/images/posts/CC-experience/資服.jpg" 
style="width: 70%; margin: 15px auto;"></p>
<hr>
<p>這是系上畢業專案的比賽，非常榮幸拿到第二名！當天還直接被其他資訊公司的主管遞名片，受寵若驚。<br><img data-src="/images/posts/CC-experience/畢業專案.JPG" 
style="width: 70%; margin: 15px auto;"><br><img data-src="/images/posts/CC-experience/畢業專案得獎.JPG" 
style="width: 70%; margin: 15px auto;"></p>
<h2 id="結語"><a href="#結語" class="headerlink" title="結語"></a>結語</h2><p>總的來說，這是我<strong>參與多人協作、實作一個完整專案的寶貴經驗</strong>，這也成為我日後開發其他專案的養分(Google學生開發者社群, 資訊競賽, etc.)，除了<strong>技術面的大幅成長，專案管理、人際互動、時間管理</strong>也都是成長的一部分，主動去尋找機會，得到的會比你想像的多，雖然有點辛苦就是了，但我覺得很值得！所以時間管理真的超重要，這也讓我在工程師這條路上變得更主動，相信只要努力，自己絕對值得更好的！</p>
<p>比較技術面的內容會在之後的文章提及</p>
]]></content>
      <categories>
        <category>心得</category>
        <category>比賽</category>
      </categories>
      <tags>
        <tag>心得</tag>
        <tag>比賽</tag>
      </tags>
  </entry>
  <entry>
    <title>政大Google學生開發者社群-政大通NCCUPass-心得 (第二學期)</title>
    <url>/posts/3042436019/</url>
    <content><![CDATA[<h1 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h1><p>繼上篇<a href="https://mao-code.github.io/posts/277323671/#more">GDSC的心得文</a>，這篇我會介紹我們下學期的開發經歷、得獎、專案簡介以及未來這個專案的走向，屬於一個小小的紀錄心得文！</p>
<span id="more"></span>

<h1 id="專案"><a href="#專案" class="headerlink" title="專案"></a>專案</h1><h2 id="簡介"><a href="#簡介" class="headerlink" title="簡介"></a>簡介</h2><p>政大通 - NCCUPass 團隊熱切懷抱著重塑校園體驗的夢想，點燃學生的熱情和創造力，以共同打造一個充滿活力、連結力和創新力的校園生活。經由深入校園的人際交流與觀察，我們發現學生面對著種種生活上的不便與困難，然而目前鮮少有專門為學生打造的解決方案。學生間的線下互動與真實聯繫日益減少，導致難以找到同學幫忙或組團合作的機會。此外，午晚餐時的人潮湧入校園周邊餐廳，更帶來了無法即時點餐的困擾。這些現況讓我們看到一個獨特的契機，我們決心以「政大通 - NCCUPass 」 APP 的形式來化解這些困境。</p>
<p>「政大通 - NCCUPass 」不僅提供政大學生一個集社交、實用和創新於一身的平台，更是一個綜合性的應用程式，旨在加強他們的校園體驗。我們希望這個 APP 能成為解決校園生活難題的得力幫手！</p>
<p>在社團「政大 Google 學生開發者社群」的擁護下，我們匯聚各領域的精英，凝聚開放分享的心態，攜手共創這個改變學生生活的嶄新 APP ！政大只是我們的第一塊版圖，我們的未來目標是影響全國的大學生，為他們帶來更加豐富、更加便利的校園體驗！</p>
<h2 id="目前功能簡介"><a href="#目前功能簡介" class="headerlink" title="目前功能簡介"></a>目前功能簡介</h2><h3 id="學生任務功能-增強校園互動體驗"><a href="#學生任務功能-增強校園互動體驗" class="headerlink" title="學生任務功能 - 增強校園互動體驗"></a>學生任務功能 - 增強校園互動體驗</h3><ul>
<li>動機發想<ul>
<li>缺乏線下互動與真實聯繫：<br>我們注意到越來越多的大學生在日常生活中依賴數位平台和社交媒體，而非與人面對面的互動。這種現象在校園中尤為明顯，比起在現實中進行真實的互動，學生更傾向於在虛擬世界中建立社交連結。我們對大學生進行訪談、問卷調查和社交媒體分析，收集了大量的數據和反饋。這些數據揭示了大學生內心深處對面對面互動的渴望，以及更真實的交流，更深刻的情感，卻在線下互動中遇到了種種挑戰與限制。基於這些資訊，我們更加確定缺乏線下互動和真實聯繫的問題造成大學生的困擾，這些數據與心靈共鳴，激發了我們的靈感。我們深信，缺乏線下互動和真實聯繫的問題，不僅是個人的困擾，更是一種社會現象，需要我們攜手改變。於是，我們投入心血，開發了一個獨特的解決方案——一個能在現實世界中促進人與人之間有意義連結的奇妙工具。</li>
</ul>
</li>
<li>功能敘述<ul>
<li>這個功能旨在促進學生之間的線下互動，並為他們提供一個自由而實用的平台，能以此完成各種有趣的任務。學生們可以發布任務，例如：幫忙帶宵夜、一起揪團出去玩等等。學生可以在我們的 APP 上自由發布任務，並尋找其他學生來接受挑戰，每一次的任務都是獨特的社交機會，讓學生們在校園中建立更多真實的聯繫和友誼。除此之外，這個功能還擁有相當多面的好處。首先，它鼓勵學生積極參與校園生活，擺脫線上世界的束縛。透過與其他學生一起完成任務，他們可以體驗到珍貴的團隊合作和互助精神。此外，如果同學對彼此的服務滿意，可以提供金錢或其他回饋，藉著小小賺外快的機會，也可以累積一定的金錢！相信這個制度能鼓勵學生發揮創意和努力工作，同時也建立相互尊重和價值交換的文化。</li>
</ul>
</li>
</ul>
<h3 id="預約外帶功能與午餐快選器-－-用餐省時無等待，專屬學生的美食提前預訂"><a href="#預約外帶功能與午餐快選器-－-用餐省時無等待，專屬學生的美食提前預訂" class="headerlink" title="預約外帶功能與午餐快選器 － 用餐省時無等待，專屬學生的美食提前預訂"></a>預約外帶功能與午餐快選器 － 用餐省時無等待，專屬學生的美食提前預訂</h3><ul>
<li><p>動機發想</p>
<ul>
<li><p>餐點供應不足：<br>尤其在午晚餐尖峰時段，餐廳經常供不應求，學生需要花費大量的時間排隊等待。</p>
</li>
<li><p>用餐時間緊湊：<br>大學生的生活節奏快，經常要在短時間內完成吃飯等基本活動，而在繁忙的用餐時間內排隊等待餐點往往會浪費寶貴的時間。</p>
</li>
<li><p>無法預估用餐需求：<br>在繁忙的用餐時間內，無法預估用餐需求，可能需要長時間等待，或因時間太短而做了不太理想的食物選擇。</p>
</li>
</ul>
</li>
<li><p>功能敘述</p>
<ul>
<li>這個功能將為大學生們帶來極速、便利、無等待的用餐體驗！我們與學校附近的店家攜手合作，確保學生們不必等待太久。現在，同學只需要提前預約外帶，下課後直接前往店家，餐點早已準備好，輕鬆取走即可！不再浪費時間排隊，享受自由自在的用餐時光！這個預約外帶功能不僅是大學生們用餐的時尚選擇，更是忙碌學習生活的得力助手。在繁忙的用餐尖峰時段，可以輕鬆預訂心儀的美食，避免為了取餐而耽誤其他重要事務。</li>
</ul>
</li>
</ul>
<h2 id="連結"><a href="#連結" class="headerlink" title="連結"></a>連結</h2><ul>
<li><span class="exturl" data-url="aHR0cHM6Ly9uY2N1cGFzcy5jb20v">官方網站<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly93d3cuaW5zdGFncmFtLmNvbS9uY2N1cGFzcy8=">IG粉專<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly93d3cubGlua2VkaW4uY29tL2NvbXBhbnkvbmNjdXBhc3M=">LinkedIn<i class="fa fa-external-link-alt"></i></span></li>
</ul>
<h1 id="得獎"><a href="#得獎" class="headerlink" title="得獎"></a>得獎</h1><p>NCCUPass在當天的期末發表會得到最佳人氣專案奬的殊榮！<br><img data-src="/images/posts/2023NCCUPass/GDSC_award.JPG" 
style="width: 70%; margin: 15px auto;"></p>
<p><img data-src="/images/posts/2023NCCUPass/GDSC_award_on_stage.JPG" 
style="width: 70%; margin: 15px auto;"></p>
<p>當天的期末發表會：<br><img data-src="/images/posts/2023NCCUPass/GDSC_final.JPG" 
style="width: 70%; margin: 15px auto;"></p>
<h1 id="技術"><a href="#技術" class="headerlink" title="技術"></a>技術</h1><p>在經過一個學期後，我們也慢慢拓展技術範圍，未來也將因應需求機動地改變<br>在這邊就只放上系統架構圖，不贅述太多技術細節</p>
<h2 id="系統架構圖"><a href="#系統架構圖" class="headerlink" title="系統架構圖"></a>系統架構圖</h2><p><img data-src="/images/posts/2023NCCUPass/NCCUPass_structure.png" 
style="width: 70%; margin: 15px auto;"></p>
<h1 id="未來"><a href="#未來" class="headerlink" title="未來"></a>未來</h1><p>政大是我們的第一塊版圖，我們的目標是全台灣的大學，目前先穩紮穩打在政大站穩腳步，在此期間會積極地與其他組織或社團合作、參與各項競賽以及參加創投機構的活動，我們的理念是讓大學生們彼此間的關係更加密切與真實，使他們的生活更加數位化與便利！</p>
]]></content>
      <categories>
        <category>專案</category>
        <category>心得</category>
        <category>GDSC</category>
        <category>side project</category>
        <category>NCCUPass</category>
      </categories>
      <tags>
        <tag>專案</tag>
        <tag>GDSC</tag>
        <tag>NCCUPass</tag>
      </tags>
  </entry>
  <entry>
    <title>政大Google學生開發者社群-心得（第一學期）</title>
    <url>/posts/277323671/</url>
    <content><![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>在剛上大二時，除了手邊和學長的專案，我也參加了政大的Google學生開發者社群。主要是希望透過提案，<strong>利用科技來解決問題</strong>，在這裡我也深深感受到Google的文化，那就是<strong>不會怕去嘗試、不怕去解決問題</strong>，我也認識在這個社群中的各個人才。在這個社群中，我除了擔任自己<strong>專案領導人</strong>的角色外，我也身兼<strong>後端技術長</strong>的任務，也是想要發揮我目前所學到的、最成熟的後端知識。除了技術面，如何管理一個團隊、一項專案，也是一門很大的課題。希望在未來可以繼續把自己的這項專案做到完善，解決真正的問題！</p>
<span id="more"></span>

<h2 id="定位與職位"><a href="#定位與職位" class="headerlink" title="定位與職位"></a>定位與職位</h2><ul>
<li><h3 id="專案定位"><a href="#專案定位" class="headerlink" title="專案定位"></a>專案定位</h3></li>
</ul>
<p>我這次的提案叫做「政大通NCCUPass」，以<strong>政大校園生活為出發點</strong>，想要做一款能夠解決政大學生生活上不便的APP，透過專案組的討論與調查，帶來<strong>更便利、聰明與數位化</strong>的校園生活，讓校園內的需求能夠即時被滿足，也讓這款APP融入政大學生的生活中，成為政大校園內不可或缺的一部分，因此，「政大通-NCCUPass」的提案就此誕生</p>
<hr>
<p>下面是我們第一次期末發表的海報和照片<br>海報內呈現的是我們規劃、也正在開發的功能<br><img data-src="/images/posts/GDSC-NCCUPass-experience-1/poster.jpg" 
style="width: 70%; margin: 15px auto;"></p>
<hr>
<p>這個是期末發表當天的攤位介紹照片<br><img data-src="/images/posts/GDSC-NCCUPass-experience-1/img4.jpg" 
style="width: 70%; margin: 15px auto;"></p>
<p><img data-src="/images/posts/GDSC-NCCUPass-experience-1/img2.JPG" 
style="width: 70%; margin: 15px auto;"></p>
<p><img data-src="/images/posts/GDSC-NCCUPass-experience-1/img1.JPG" 
style="width: 70%; margin: 15px auto;"></p>
<p>參與證書<br><img data-src="/images/posts/GDSC-NCCUPass-experience-1/GDSC_certificate.png" 
style="width: 70%; margin: 15px auto;"></p>
<hr>
<ul>
<li><h3 id="職位"><a href="#職位" class="headerlink" title="職位"></a>職位</h3></li>
</ul>
<p>在這個專案中，我擔任<strong>Project Leader(專案主持人)<strong>，與</strong>Backend Tech-Lead(後端技術長)<strong>的身份，帶領總共</strong>11人的團隊</strong>，在我們這個團隊中，主要分成四組，前端、後端、UI&#x2F;UX、文書，除了負責主要的後端技術，也要帶領後端的組員們提升實力、協調與規劃。以專案領導人的角度來看，除了需要<strong>規劃整個專案的進度與走向外，也要監督進度、協調各組、人際溝通等等</strong>，除了利用各種專案管理的工具外，也要撰寫各種文件與流程圖，老實說滿累的XD。但也是遇到一群願意跟隨我的隊友們，一起成長，真的非常得感謝他們🙏</p>
<h2 id="技術"><a href="#技術" class="headerlink" title="技術"></a>技術</h2><ul>
<li><h3 id="系統架構圖"><a href="#系統架構圖" class="headerlink" title="系統架構圖"></a>系統架構圖</h3></li>
</ul>
<p>下面是我們的系統架構圖，因為我是負責後端的部分，所以會著重畫後端的架構<br><img data-src="/images/posts/GDSC-NCCUPass-experience-1/NCCUPass-Structure.jpg" 
style="width: 70%; margin: 15px auto;"></p>
<ul>
<li><h3 id="後端技術細節"><a href="#後端技術細節" class="headerlink" title="後端技術細節"></a>後端技術細節</h3></li>
</ul>
<p>根據上面的架構圖，我們的專案主要部署在Ubuntu的主機上，所有的服務利用Docker Compose統一建立，另外還有利用GitLab Runner做到自動化整合與自動化部屬的功能。後端程式是由.NET C#撰寫，這邊我使用的是Software Layer Architecture Pattern(軟體分層架構)，結合各種Design Pattern再加上自己的一些變形，資料庫是使用MongoDB的Replica-Set，也有使用Redis做快取，照片和一些公開的檔案主要放在我們的File Server上。其他技術細節我就列在下面，想看的人可以參考一下XD</p>
<figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line">- Software Layer Architecture pattern (多層)</span><br><span class="line">- Repository pattern</span><br><span class="line">- Unit of Work pattern</span><br><span class="line">- Mediator pattern &amp; CQRS</span><br><span class="line">- Password Salting and Encryption</span><br><span class="line">- JWT &amp; RBAC</span><br><span class="line">- Automapper</span><br><span class="line">- Exception handler抽離</span><br><span class="line">- Dapper &amp; EF combination</span><br><span class="line">- Redis (Cache)</span><br><span class="line">- Docker Networking</span><br><span class="line">- Docker Volume</span><br><span class="line">- Docker hub</span><br><span class="line">- appsettings 組態切換</span><br><span class="line">- MongoDB replica-set</span><br><span class="line">    - key-file (internal authentication)</span><br><span class="line">- Git 多人協作</span><br><span class="line">- Swagger / OpenAPI</span><br><span class="line">- Docker File Server</span><br><span class="line">- Docker mongoDB backup daily</span><br><span class="line">- Docker Compose</span><br><span class="line">- JMeter壓力測試</span><br><span class="line">- GC mode區別(Workstation, Server)</span><br><span class="line">- SignalR 雙向溝通</span><br><span class="line">- 測試 (K6 stress testing, )</span><br><span class="line">- 自動化發送Email (python selenium)</span><br><span class="line">- SSH with Linux server</span><br><span class="line">- FCM (to push device notification)</span><br><span class="line">- Linux server</span><br><span class="line">- Cloudflare domain and SSL/TLS</span><br><span class="line">- Nginx  (on server and on docker)</span><br><span class="line">    - redirect (setting files)</span><br><span class="line">    - ssl setting(certificate, key)</span><br><span class="line">- Shell script自動備份資料庫</span><br></pre></td></tr></table></figure>
<p>之後也會一直新增，因為現在處於專案開發的初期。</p>
<h2 id="結語"><a href="#結語" class="headerlink" title="結語"></a>結語</h2><p>最後，這個專案雖然只是在發展初期，但我希望未來可以發展到我預想的樣子，也非常感謝一路願意跟隨我、幫助我的隊友們，單打獨鬥真的比不上團隊合作👍，也希望各位未來也可以繼續幫助我啦，現在打分享文可能還太早，但我就是想要趁學期末趕快記錄一下哈哈</p>
]]></content>
      <categories>
        <category>專案</category>
        <category>心得</category>
        <category>GDSC</category>
        <category>side project</category>
        <category>NCCUPass</category>
      </categories>
      <tags>
        <tag>專案</tag>
        <tag>GDSC</tag>
        <tag>NCCUPass</tag>
      </tags>
  </entry>
  <entry>
    <title>Basic OOP 基礎物件導向</title>
    <url>/posts/3787153742/</url>
    <content><![CDATA[<h1 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h1><p>物件導向（Object-Oriented Programming, OOP）是一種程式設計範式，強調使用包含數據（屬性）和方法（功能）的物件來設計和構建應用程序。提高軟體的重用性、靈活性和擴充性。</p>
<p>而對於物件導向，最基礎需要知道以下：</p>
<blockquote>
<p><strong>一個抽象</strong><br><strong>兩個目的</strong><br><strong>三個特性</strong><br><strong>五個原則</strong></p>
</blockquote>
<span id="more"></span>

<h1 id="一個抽象"><a href="#一個抽象" class="headerlink" title="一個抽象"></a>一個抽象</h1><h2 id="抽象-Abstraction"><a href="#抽象-Abstraction" class="headerlink" title="抽象 (Abstraction)"></a>抽象 (Abstraction)</h2><p>在OOP的背景下，抽象是指<strong>隱藏複雜的實作細節</strong>，僅展示物件的必要特性的能力。這簡化了與物件的互動，使編程更直觀、更高效。</p>
<p>例如想要使用交通工具的物件，只需要知道交通工具提供的介面，而不需要知道具體的交通工具是什麼，以及其實作細節<br><img data-src="/images/posts/OOP-basic/abstraction.png" 
style="width: 70%; margin: 15px auto;"></p>
<h1 id="兩個目的"><a href="#兩個目的" class="headerlink" title="兩個目的"></a>兩個目的</h1><h2 id="低耦合-Low-Coupling"><a href="#低耦合-Low-Coupling" class="headerlink" title="低耦合 (Low Coupling)"></a>低耦合 (Low Coupling)</h2><p>低耦合是指程式中<strong>不同的類別或模組之間應該有盡可能少的依賴關係</strong>。這使得<strong>一個類別或模組的變更不太可能影響到其他的類別或模組</strong>，從而使得程式更容易維護和擴展。</p>
<h2 id="高內聚（High-Cohesion）"><a href="#高內聚（High-Cohesion）" class="headerlink" title="高內聚（High Cohesion）"></a>高內聚（High Cohesion）</h2><p>高內聚是指一個類別或模組應該只<strong>專注於完成一項特定的任務或一組緊密相關的任務</strong>。這使得程式更有組織，更易於理解和維護。</p>
<p><img data-src="/images/posts/OOP-basic/CandC.png" 
style="width: 70%; margin: 15px auto;"></p>
<h1 id="三個特性"><a href="#三個特性" class="headerlink" title="三個特性"></a>三個特性</h1><h2 id="繼承-Inheritance"><a href="#繼承-Inheritance" class="headerlink" title="繼承 (Inheritance)"></a>繼承 (Inheritance)</h2><p>繼承允許新創建的類別（子類別）繼承一個或多個現有類別（父類別）的屬性和方法。這促進了程式碼重用和擴展性。</p>
<h2 id="封裝-Encapsulation"><a href="#封裝-Encapsulation" class="headerlink" title="封裝 (Encapsulation)"></a>封裝 (Encapsulation)</h2><p>封裝是將數據（屬性）和行為（方法）綁定到單個單位（類別）中，並<strong>限制對該單位內部的直接訪問</strong>。這有助於保護數據和隱藏實現細節。</p>
<h2 id="多型-Polymorphism"><a href="#多型-Polymorphism" class="headerlink" title="多型 (Polymorphism)"></a>多型 (Polymorphism)</h2><p>多型允許對<strong>不同類別的物件使用共同的接口</strong>。這意味著可以在不同類別的物件上執行同一操作，而每個類別可以以不同的方式響應相同的操作。</p>
<h1 id="五個原則-SOLID"><a href="#五個原則-SOLID" class="headerlink" title="五個原則 (SOLID)"></a>五個原則 (SOLID)</h1><h2 id="S-單一職責原則（Single-Responsibility-Principle）"><a href="#S-單一職責原則（Single-Responsibility-Principle）" class="headerlink" title="S - 單一職責原則（Single Responsibility Principle）:"></a>S - 單一職責原則（Single Responsibility Principle）:</h2><p>一個類別應該只有一個改變的理由，這意味著<strong>一個類別應該只做一件事</strong>。</p>
<h2 id="O-開放封閉原則（Open-Closed-Principle）"><a href="#O-開放封閉原則（Open-Closed-Principle）" class="headerlink" title="O - 開放封閉原則（Open&#x2F;Closed Principle）:"></a>O - 開放封閉原則（Open&#x2F;Closed Principle）:</h2><p>軟體實體（類別、模組、函數等）應該<strong>對擴展開放，對修改封閉</strong>。這意味著應該能夠在不修改現有代碼的情況下擴展其功能。</p>
<h2 id="L-里氏替換原則（Liskov-Substitution-Principle）"><a href="#L-里氏替換原則（Liskov-Substitution-Principle）" class="headerlink" title="L - 里氏替換原則（Liskov Substitution Principle）:"></a>L - 里氏替換原則（Liskov Substitution Principle）:</h2><p><strong>子類別應該能夠替換其父類別而不影響程序的正常運行</strong>。</p>
<h2 id="I-接口隔離原則（Interface-Segregation-Principle）"><a href="#I-接口隔離原則（Interface-Segregation-Principle）" class="headerlink" title="I - 接口隔離原則（Interface Segregation Principle）:"></a>I - 接口隔離原則（Interface Segregation Principle）:</h2><p>不應強迫客戶依賴於它們不使用的接口。換句話說，<strong>更小和更具體的接口優於大而通用的接口</strong>。</p>
<h2 id="D-依賴反轉原則（Dependency-Inversion-Principle）"><a href="#D-依賴反轉原則（Dependency-Inversion-Principle）" class="headerlink" title="D - 依賴反轉原則（Dependency Inversion Principle）:"></a>D - 依賴反轉原則（Dependency Inversion Principle）:</h2><p><strong>高層模組不應該依賴低層模組，兩者都應該依賴於抽象</strong>；抽象不應該依賴於細節，細節應該依賴於抽象。這有助於減少類別之間的直接依賴，從而提高系統的靈活性和可重用性。</p>
<h1 id="Design-Pattern概述"><a href="#Design-Pattern概述" class="headerlink" title="Design Pattern概述"></a>Design Pattern概述</h1><p>設計模式是一組作為最佳實踐的解決方案，用來解決特定類型的重複出現的設計問題。這些模式不是現成的程式碼，而是可以在許多不同情況下使用的模板。它們提高了代碼的可重用性、靈活性和維護性。</p>
<p>設計模式通常分為三大類：</p>
<ol>
<li>創建型模式（Creational Patterns）<br>這些模式與物件創建機制有關，幫助創建物件的方式使得系統獨立於物件的創建和組合方式。</li>
<li>結構型模式（Structural Patterns）<br>這些模式與物件的組合有關，通常用於形成大的物件結構</li>
<li>行為型模式（Behavioral Patterns）<br>這些模式關注物件間的通訊和責任分配</li>
</ol>
]]></content>
      <categories>
        <category>OOP</category>
      </categories>
      <tags>
        <tag>OOP</tag>
      </tags>
  </entry>
  <entry>
    <title>[NLP][ML] Transformer (1) - Structure</title>
    <url>/posts/4283617483/</url>
    <content><![CDATA[<h1 id="Overview"><a href="#Overview" class="headerlink" title="Overview"></a>Overview</h1><p>The Transformer is a <strong>deep learning architecture</strong> introduced in the paper “Attention Is All You Need” by Vaswani et al. in 2017. It revolutionized the field of natural language processing (NLP) and brought significant advancements in various <strong>sequence-to-sequence tasks</strong>. The Transformer architecture, thanks to its <strong>attention mechanisms</strong>, enables efficient processing of sequential data while <strong>capturing long-range dependencies</strong>.</p>
<hr>
<p>Transformer is a <strong>Seq2Seq(Sequence to Sequence) model</strong>. It uses <strong>Encoder-Decoder structure</strong><br>Below is a simple diagram：</p>
<p><img data-src="/images/posts/NLP-series/transformer-1.gif" 
style="width: 70%; margin: 15px auto;"><br>Source: <span class="exturl" data-url="aHR0cHM6Ly9haS5nb29nbGVibG9nLmNvbS8yMDE2LzA5L2EtbmV1cmFsLW5ldHdvcmstZm9yLW1hY2hpbmUuaHRtbA==">https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html<i class="fa fa-external-link-alt"></i></span></p>
<p>The line between Encoder and Decoder represents the “Attention”.<br>The thicker the line, the more the Decoder below pays more attention to some Chinese characters above when generating an English word.</p>
<span id="more"></span>
<p>In the below sections, I will introduce the structure of transformer and the attention machanism.<br>For the <strong>detail explaination of key components in transformer and the details of attention</strong>, you can refer to the <a href="https://mao-code.github.io/posts/2443192075/#more">next article</a>.</p>
<h1 id="Structure"><a href="#Structure" class="headerlink" title="Structure"></a>Structure</h1><p>Below is the strucutre of Transformer.</p>
<p><img data-src="/images/posts/NLP-series/transformer-2.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>The part <strong>on the left in the figure is the Encoder, and the part on the right is the Decoder</strong>.<br>You can notice that the structures of the left and right sides are actually quite similar.<br>The Encoder and Decoder usually contain many blocks with the same layer structure, and each layer will have <strong>multi-head attention and Feed Forward Network</strong>.</p>
<h2 id="Encoder"><a href="#Encoder" class="headerlink" title="Encoder"></a>Encoder</h2><p><img data-src="/images/posts/NLP-series/transformer-3.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>We just mentioned that the Encoder will be divided into many blocks.<br>We will <strong>first convert a whole row of input sequence data into a whole row of vectors</strong>, and then the processing steps of each block are as follows:</p>
<ol>
<li>Firstly, after considering all the input information of the input vector through self-attention, output a row of vectors. (I will introduce how self-attention considers all input information later)</li>
<li>Throw this row of vectors into the feed forward network of Fully Connected (FC).</li>
<li>The final output vector is the output of the block.</li>
</ol>
<hr>
<p>However, what the block does in the original Transformer is more complicated, the details are as follows:<br><img data-src="/images/posts/NLP-series/transformer-4.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>Suppose we will follow the method just now, and the <strong>output result of the input vector after self-attention is called a</strong>. we also need to <strong>pull the original input (we first call it b )</strong> and add it to a to get a+b . Such a network architecture is called a <strong>residual connection</strong>.</p>
<p>After that, we will do <strong>layer normalization</strong> on the result of a+b. It will calculate the average $m$(mean) and standard deviation $\sigma$(standard deviation) of the input vector, and then calculate it according to the formula: divide the input minus the mean $m$ by the standard deviation $\sigma$. It’s here that we really get <strong>the input of the FC network</strong>.</p>
<p>The <strong>FC network also has a residual architecture</strong>, so we will add the input of the FC network to its output to get a new output, and then do layer normalization again. This is the <strong>real output of a block in Transformer Encoder</strong>.</p>
<hr>
<p>Now, let’s look back at the structure diagram of Encoder:</p>
<p><img data-src="/images/posts/NLP-series/transformer-5.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>First, in the place of input, <strong>convert the input into a vector through Embedding</strong>, and then add <strong>positional encoding</strong> (because if only self-attention is used, there will be a lack of unknown information)</p>
<p>Next we see <strong>Multi-Head Attention</strong>, which is <strong>the block of self-attention and Add&amp;Norm means residual plus layer normalization</strong>.</p>
<p>Finally, <strong>doing Add&amp;Norm again after FC’s feed forward network is the output of the whole block</strong>, and this block will be <strong>repeated n times</strong>.</p>
<h2 id="Decoder"><a href="#Decoder" class="headerlink" title="Decoder"></a>Decoder</h2><p>Then let us look at the Decoder:</p>
<p><img data-src="/images/posts/NLP-series/transformer-2.png" 
style="width: 70%; margin: 15px auto;"></p>
<p><strong>Input the sequence obtained at the previous time</strong>, and then perform the same Embedding and Positional Encoding and then enter the block repeated N times.<br>The difference is that there is an extra <strong>“Masked” (note the red box) in the Multi-Head Attention</strong> when we first entered. What does it mean?</p>
<p>Masked means that the model will <strong>only pay attention to the part it has already generated</strong>, and <strong>will not accidentally pay attention to the words generated in the future</strong>. Since the output of the Decoder is generated <strong>one by one</strong>, it has no way to consider its future input. It seems a bit vague to say this, and I will make it clearer when I talk about self-attention in the next article.</p>
<p>After repeating the block N times, after the <strong>Linear Layer and Softmax</strong>, we can get the output <strong>probability distribution</strong> we want, and we can sample according to this distribution or take the value with the highest probability to get the output sequence.</p>
<p>There is still one of the most critical self-attention mechanisms that has not been explained in detail. Let’s introduce how it pays attention to all input sequences and performs parallel processing in the next article!</p>
<h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h1><ul>
<li><span class="exturl" data-url="aHR0cHM6Ly93d3cueW91dHViZS5jb20vd2F0Y2g/dj1lTWx4NWZGTm9ZYw==">3Blue1Brown<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9pdGhlbHAuaXRob21lLmNvbS50dy9hcnRpY2xlcy8xMDI4MDM5Mg==">iThome - Day 27 Transformer (Recommend)<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9pdGhlbHAuaXRob21lLmNvbS50dy9hcnRpY2xlcy8xMDI4MTI0Mg==">iThome - Day 28 Self-Attention (Recommend)<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9oYWNrbWQuaW8vQGFibGl1L0JrWG16REJtcg==">Transformer 李宏毅深度學習 (Recommend)<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9zcGVlY2guZWUubnR1LmVkdS50dy9+aHlsZWUvbWwvbWwyMDIxLWNvdXJzZS1kYXRhL3NlcTJzZXFfdjkucGRm">Transformer 李宏毅老師簡報<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly93d3cueW91dHViZS5jb20vY2hhbm5lbC9VQzJnZ2p0dXVXdnhySEhIaWFESDFkbFE=">李宏毅老師YouTube channel<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzE3MDYuMDM3NjI=">Attention is all you need (paper)<i class="fa fa-external-link-alt"></i></span></li>
</ul>
]]></content>
      <tags>
        <tag>ML</tag>
        <tag>AI</tag>
        <tag>NLP</tag>
      </tags>
  </entry>
  <entry>
    <title>[NLP][ML] Adapters &amp; LoRA</title>
    <url>/posts/38266156/</url>
    <content><![CDATA[<h1 id="Overview"><a href="#Overview" class="headerlink" title="Overview"></a>Overview</h1><p>In this article, I will provide an introduction to adapters and LoRA, including their <strong>definitions, purposes, and functions.</strong> I will also explore their various <strong>applications</strong> and, lastly, delve into the distinctions that set them apart(the <strong>differences between them</strong>).</p>
<span id="more"></span>

<h1 id="Adapter"><a href="#Adapter" class="headerlink" title="Adapter"></a>Adapter</h1><h2 id="What-are-Adapters"><a href="#What-are-Adapters" class="headerlink" title="What are Adapters?"></a>What are Adapters?</h2><p>According to <span class="exturl" data-url="aHR0cHM6Ly93d3cuYW5hbHl0aWNzdmlkaHlhLmNvbS9ibG9nLzIwMjMvMDQvdHJhaW5pbmctYW4tYWRhcHRlci1mb3Itcm9iZXJ0YS1tb2RlbC1mb3Itc2VxdWVuY2UtY2xhc3NpZmljYXRpb24tdGFzay8jOn46dGV4dD1BZGFwdGVycyUyMGFyZSUyMGxpZ2h0d2VpZ2h0JTIwYWx0ZXJuYXRpdmVzJTIwdG8sbW9kdWxhciUyMGFwcHJvYWNoJTIwdG8lMjB0cmFuc2ZlciUyMGxlYXJuaW5nLg==">this article<i class="fa fa-external-link-alt"></i></span><br>We can give the definition of adapters:</p>
<blockquote>
<p><strong>Adapters are lightweight alternatives to fully fine-tuned pre-trained models.</strong><br>Currently, <strong>adapters are implemented as small feedforward neural networks</strong> that are <strong>inserted between layers of a pre-trained model.</strong><br>They provide a <strong>parameter-efficient, computationally efficient, and modular approach</strong> to transfer learning. The following image shows added adapter.</p>
</blockquote>
<p>The image below clearly shows the usage flow of apadters<br>(Source: <span class="exturl" data-url="aHR0cHM6Ly9hZGFwdGVyaHViLm1sLw==">AdapterHub<i class="fa fa-external-link-alt"></i></span>)<br><img data-src="/images/posts/NLP-series/adapter.gif" 
style="width: 70%; margin: 15px auto;"></p>
<blockquote>
<p>During training, <strong>all the weights of the pre-trained model are frozen</strong> such that only the adapter weights<br>are updated, resulting in <strong>modular knowledge representations</strong>. They can be easily <strong>extracted, interchanged,</strong><br><strong>independently distributed, and dynamically plugged</strong> into a language model. These properties highlight the<br>potential of adapters in advancing the NLP field astronomically.</p>
</blockquote>
<h2 id="What-is-the-purpose-and-function-of-adapters"><a href="#What-is-the-purpose-and-function-of-adapters" class="headerlink" title="What is the purpose and function of adapters?"></a>What is the purpose and function of adapters?</h2><ol>
<li>Purpose:<br>Large Language Models are computationally expensive and memory-intensive. <strong>Fine-tuning the entire model for each specific task</strong> can be impractical due to resource constraints. Adapters provide a solution by allowing for more efficient and <strong>targeted modifications</strong> to the model for different tasks. This approach <strong>saves both computational power and memory</strong>, enabling the deployment of a single pre-trained model for multiple tasks.</li>
<li>Functions:<ol>
<li><p>Efficient Fine-tuning: Instead of fine-tuning the entire model, adapters enable <strong>fine-tuning only a small subset of parameters</strong> related to a specific task. This fine-tuning process is faster and requires fewer resources.</p>
</li>
<li><p>Task-specific Modifications: Adapters allow you to add <strong>task-specific layers or modifications</strong> to the pre-trained model without altering the core architecture. This makes it easier to adapt the model for various tasks like text classification, named entity recognition, sentiment analysis, etc.</p>
</li>
<li><p>Versatility: With adapters, a single pre-trained LLM can be <strong>adapted for a wide range of tasks.</strong> This versatility is beneficial in scenarios where deploying and maintaining separate models for each task might be impractical.</p>
</li>
<li><p>Interoperability: Adapters enable the combination of pre-trained models with task-specific modifications in a standardized way. This facilitates sharing, collaboration, and research in the NLP community.</p>
</li>
<li><p>Transfer Learning: Adapters enhance the <strong>effectiveness of transfer learning.</strong> Models pre-trained on <strong>large and diverse datasets can be fine-tuned on smaller</strong>, task-specific datasets using adapters, improving performance on specific tasks.</p>
</li>
<li><p>Incremental Updates: Adapters allow for easy updates to the model. Instead of retraining the entire model, only the adapters related to a specific task need to be fine-tuned when new data or requirements arise.</p>
</li>
</ol>
</li>
</ol>
<p>Overall, adapters are a <strong>mechanism that strikes a balance between the benefits of fine-tuning for specific tasks and the efficiency of reusing pre-trained LLMs.</strong> They enable the NLP community to leverage the power of these large models while tailoring them to a diverse set of applications.</p>
<h2 id="The-applications-of-adapters"><a href="#The-applications-of-adapters" class="headerlink" title="The applications of adapters"></a>The applications of adapters</h2><p>I list some practical applications that can use adapters to enhance.</p>
<ol>
<li><p>Efficient Task Adaptation: Adapters make it possible to fine-tune a pretrained model for specific tasks with minimal computational resources and time. This is particularly useful for industries that require quick adaptation to changing trends or requirements.</p>
</li>
<li><p>Multilingual Applications: Adapters can be used to enable a pretrained model to perform tasks in multiple languages. This is valuable for businesses operating in global markets.</p>
</li>
<li><p>Domain-Specific NLP: Adapting models with domain-specific adapters (e.g., medical, legal, financial) enhances their performance on tasks specific to those domains.</p>
</li>
<li><p>Personalization: Adapters can be used to personalize a general-purpose model for individual users or contexts, leading to more relevant and tailored responses.</p>
</li>
</ol>
<h1 id="LoRA-Low-Rank-Adaptation"><a href="#LoRA-Low-Rank-Adaptation" class="headerlink" title="LoRA (Low-Rank Adaptation)"></a>LoRA (Low-Rank Adaptation)</h1><h2 id="What-is-LoRA"><a href="#What-is-LoRA" class="headerlink" title="What is LoRA?"></a>What is LoRA?</h2><ul>
<li>Low-Rank Adaptation, or LoRA, is proposed, which <strong>freezes the pre-trained model weights</strong> and <strong>injects trainable rank decomposition matrices into each layer</strong> of the <strong>Transformer</strong> architecture, greatly <strong>reducing the number of trainable parameters</strong> for downstream tasks.</li>
</ul>
<p>And let’s see the full fine-tuning definition:</p>
<ul>
<li>Full fine-tuning LLM, which <strong>retrains all model parameters</strong>, becomes less feasible. Using GPT-3 175B as an example — deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive.</li>
</ul>
<p>In <span class="exturl" data-url="aHR0cHM6Ly9iZHRlY2h0YWxrcy5jb20vMjAyMy8wNS8yMi93aGF0LWlzLWxvcmEv">this article<i class="fa fa-external-link-alt"></i></span>, it clearly compare the fine-tuning and LoRA approaches. And in <span class="exturl" data-url="aHR0cHM6Ly9zaC10c2FuZy5tZWRpdW0uY29tL2JyaWVmLXJldmlldy1sb3JhLWxvdy1yYW5rLWFkYXB0YXRpb24tb2YtbGFyZ2UtbGFuZ3VhZ2UtbW9kZWxzLWZhZjVkZGQ1ODAyZiM6fjp0ZXh0PUxvUkElMkMlMjBMb3clMkRSYW5rJTIwTExNJTIwRmluZSUyRFR1bmluZyUyQyUyMFJlZHVjZSUyMFJlcXVpcmVkJTIwTWVtb3J5JnRleHQ9TG93JTJEUmFuayUyMEFkYXB0YXRpb24lMkMlMjBvciUyMExvUkEsdHJhaW5hYmxlJTIwcGFyYW1ldGVycyUyMGZvciUyMGRvd25zdHJlYW0lMjB0YXNrcy4=">this article<i class="fa fa-external-link-alt"></i></span>, it dives more into the machanism of LoRA. If you wnat to know the full knowledge, you can refer to <span class="exturl" data-url="aHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzIxMDYuMDk2ODU=">the paper of LoRA<i class="fa fa-external-link-alt"></i></span>.<br>In the next sections, I will introduce more ideas of LoRA based on these references. Note that I only organize the contents of these articles and add some mark and note on my own.</p>
<hr>
<h3 id="How-does-fine-tuning-LLMs-work"><a href="#How-does-fine-tuning-LLMs-work" class="headerlink" title="How does fine-tuning LLMs work?"></a>How does fine-tuning LLMs work?</h3><p>Open-source LLMs such as LLaMA,Vicuna are foundation models that <strong>have been pre-trained on hundreds of billions of words.</strong> Developers and machine learning engineers can download the model with the <strong>pre-trained weights</strong> and <strong>fine-tune it for downstream tasks</strong> such as <span class="exturl" data-url="aHR0cHM6Ly9iZHRlY2h0YWxrcy5jb20vMjAyMy8wMS8xNi93aGF0LWlzLXJsaGYv">instruction following<i class="fa fa-external-link-alt"></i></span>.</p>
<p>The model is provided input from the <strong>fine-tuning dataset</strong>. It then <strong>predicts the next tokens and compares its output with the ground truth.</strong> It then <strong>adjusts the weights(gradient)</strong> to correct its predictions. By doing this over and over, the LLM becomes fine-tuned to the downstream task.</p>
<p><img data-src="/images/posts/NLP-series/LLM-fintune.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>(Source: <span class="exturl" data-url="aHR0cHM6Ly9iZHRlY2h0YWxrcy5jb20vMjAyMy8wNS8yMi93aGF0LWlzLWxvcmEv">What is low-rank adaptation (LoRA)?<i class="fa fa-external-link-alt"></i></span>)</p>
<h3 id="The-idea-of-LoRA"><a href="#The-idea-of-LoRA" class="headerlink" title="The idea of LoRA"></a>The idea of LoRA</h3><p>Now, let’s make a small modification to the fine-tuning process. In this new method, we <strong>freeze the original weights of the model and don’t modify them during the fine-tuning process.</strong> Instead, we apply the modifications to a <strong>separate set of weights</strong> and we add their new values to the original parameters. Let’s call these two sets <strong>“pre-trained” and “fine-tuned” weights</strong>.</p>
<blockquote>
<p><strong>Separating the pre-trained and fine-tuned parameters is an important part of LoRA.</strong><br><img data-src="/images/posts/NLP-series/LoRA-1.png" 
style="width: 70%; margin: 15px auto;"></p>
</blockquote>
<h4 id="Low-rank-adaptation"><a href="#Low-rank-adaptation" class="headerlink" title="Low-rank adaptation"></a>Low-rank adaptation</h4><p>Before moving on to LoRA, let’s think about our <strong>model parameters as very large matrices</strong>. If you remember your linear algebra class, <strong>matrices can form vector spaces</strong>. In this case, we’re talking about a <strong>very large vector space with many dimensions</strong> that models language.</p>
<p>Every matrix has a <strong>“rank”</strong>, which is <strong>the number of linearly independent columns it has</strong>. If a column is linearly independent, it means that <strong>it can’t be represented as a combination of other columns in the matrix</strong>. On the other hand, a dependent column is one that can be represented as a combination of one or more columns in the same matrix. You can remove dependent columns from a matrix without losing information.</p>
<p>LoRA, proposed in a <span class="exturl" data-url="aHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzIxMDYuMDk2ODU=">paper<i class="fa fa-external-link-alt"></i></span> by researchers at Microsoft, suggests that when fine-tuning an LLM for a downstream task, <strong>you don’t need the full-rank weight matrix.</strong> They proposed that you could preserve most of the learning capacity of the model while <strong>reducing the dimension of the downstream parameters.</strong> (This is why it makes sense to separate the pre-trained and fine-tuned weights.)</p>
<hr>
<p><img data-src="/images/posts/NLP-series/LoRA-2.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>Basically, in LoRA, you create <strong>two downstream weight matrices</strong>. One <strong>transforms the input parameters from the original dimension to the low-rank dimension</strong>. And the second matrix <strong>transforms the low-rank data to the output dimensions of the original model</strong>.</p>
<p>During training, <strong>modifications are made to the LoRA parameters</strong>, which are now much fewer than the original weights. This is why they can be trained much faster and at a fraction of the cost of doing full fine-tuning. <strong>At inference time, the output of LoRA is added to the pre-trained parameters to calculate the final values.</strong></p>
<h4 id="More-detail"><a href="#More-detail" class="headerlink" title="More detail"></a>More detail</h4><p><img data-src="/images/posts/NLP-series/LoRA-3.png" 
style="width: 70%; margin: 15px auto;"></p>
<ul>
<li>For a pre-trained weight matrix $W_0$, its update is constrained by representing the latter with a low-rank decomposition:</li>
</ul>
<p>$$\begin{equation}<br>   W_0 + \Delta{W} &#x3D; W_0 + BA<br>\end{equation}$$</p>
<ul>
<li>During training, <strong>$W_0$ is frozen</strong> and does not receive gradient updates, while <strong>A and B contain trainable parameters</strong>.</li>
<li>For $h&#x3D;W_0$, the modified forward pass yields:</li>
</ul>
<p>$$\begin{equation}<br>   h &#x3D; W_0x + \Delta{W}x &#x3D; W_0x + BAx<br>\end{equation}$$</p>
<ul>
<li><p>A random <strong>Gaussian initialization is used for A</strong> and <strong>zero is used for B</strong>, so $\Delta{W}&#x3D;BA$ is zero at the beginning of training. (The method of initializing the weights)</p>
</li>
<li><p>One of the advantages is that when deployed in production, we can <strong>explicitly compute and store $W&#x3D;W_0+BA$</strong> and perform inference as usual. <strong>No additional latency</strong> compared to other methods, such as appending more layers.</p>
</li>
</ul>
<p>You can see the implementation and more detail of LoRA in <span class="exturl" data-url="aHR0cHM6Ly95b3V0dS5iZS9kQS1OaEN0cnJWRQ==">this video<i class="fa fa-external-link-alt"></i></span><br>and <span class="exturl" data-url="aHR0cHM6Ly9ibG9nLm1sNi5ldS9sb3ctcmFuay1hZGFwdGF0aW9uLWEtdGVjaG5pY2FsLWRlZXAtZGl2ZS03ODJkZWM5OTU3NzI=">this article<i class="fa fa-external-link-alt"></i></span>.</p>
<h2 id="What-is-the-purpose-and-function-of-LoRA"><a href="#What-is-the-purpose-and-function-of-LoRA" class="headerlink" title="What is the purpose and function of LoRA?"></a>What is the purpose and function of LoRA?</h2><ul>
<li>Purpose<ul>
<li>The purpose of LoRA is to make it easier and more efficient to fine-tune LLMs for downstream tasks.</li>
</ul>
</li>
<li>Function<ul>
<li>The function of LoRA is to decompose the LLM into a low-rank representation and then adapt this representation to the target task.</li>
<li>Here are the benefits:</li>
</ul>
</li>
<li>Reduced number of parameters: The low-rank representation has a much smaller number of parameters than the original LLM, which can make it faster to train and easier to deploy.</li>
<li>Improved performance: The low-rank representation is able to capture the most important features of the LLM, which can lead to improved performance on the downstream task.</li>
</ul>
<h2 id="The-applications-of-LoRA"><a href="#The-applications-of-LoRA" class="headerlink" title="The applications of LoRA"></a>The applications of LoRA</h2><ul>
<li>Fine-tuning large language models for downstream tasks:<br>LoRA can be used to fine-tune large language models (LLMs) for a variety of downstream tasks, such as question answering, summarization, and translation. This can <strong>make LLMs more accessible and easier to use for a wider range of applications.</strong></li>
<li>Improving the efficiency of machine learning models:<br> LoRA can be used to improve the efficiency of machine learning models by reducing the number of parameters. This can <strong>make models faster to train and easier to deploy</strong>.</li>
<li>Compressing large datasets:<br>LoRA can be used to <strong>compress large datasets by representing them in a low-rank format</strong>. This can make datasets easier to <strong>store and transmit</strong>.</li>
<li>Improving the security of machine learning models:<br>LoRA can be used to improve the security of machine learning models by making them more resistant to adversarial attacks.</li>
</ul>
<h1 id="Differences-between-adapter-and-LoRA"><a href="#Differences-between-adapter-and-LoRA" class="headerlink" title="Differences between adapter and LoRA"></a>Differences between adapter and LoRA</h1><table>
   <thead>
      <th>Feature</th>
      <th>Adapter</th>
      <th>LoRA</th>
   </thead>
   <tbody>
      <tr>
         <td>Approach</td>
         <td>Adds additional layers to the pretrained model	</td>
         <td>Decomposes the pretrained model into a low-rank representation
</td>
      </tr>
      <tr>
         <td>Parameters</td>
         <td>Adds a small number of parameters to the pretrained model</td>
         <td>Reduces the number of parameters in the pretrained model</td>
      </tr>
      <tr>
         <td>Performance</td>
         <td>Effective for a variety of downstream tasks</td>
         <td>Particularly effective for tasks that require a large number of parameters</td>
      </tr>
      <tr>
         <td>Speed</td>
         <td>Can be faster to train than LoRA</td>
         <td>Can be faster at inference time</td>
      </tr>
      <tr>
         <td>Memory usage</td>
         <td>Can use more memory than LoRA</td>
         <td></td>
      </tr>
   </tbody>
</table>

<p>In general, adapters are a good choice for tasks that require a small number of parameters and can be trained quickly, while LoRA is a good choice for tasks that require a large number of parameters and need to be fast at inference time.</p>
<h1 id="References"><a href="#References" class="headerlink" title="References"></a>References</h1><ul>
<li><span class="exturl" data-url="aHR0cHM6Ly93d3cuYW5hbHl0aWNzdmlkaHlhLmNvbS9ibG9nLzIwMjMvMDQvdHJhaW5pbmctYW4tYWRhcHRlci1mb3Itcm9iZXJ0YS1tb2RlbC1mb3Itc2VxdWVuY2UtY2xhc3NpZmljYXRpb24tdGFzay8jOn46dGV4dD1BZGFwdGVycyUyMGFyZSUyMGxpZ2h0d2VpZ2h0JTIwYWx0ZXJuYXRpdmVzJTIwdG8sbW9kdWxhciUyMGFwcHJvYWNoJTIwdG8lMjB0cmFuc2ZlciUyMGxlYXJuaW5nLg==">(Recommend) Training an Adapter for RoBERTa Model for Sequence Classification Task<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9hZGFwdGVyaHViLm1sLw==">AdapterHub<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9odWdnaW5nZmFjZS5jby9kb2NzL2RpZmZ1c2Vycy90cmFpbmluZy9sb3JhIzp+OnRleHQ9TG93JTJEUmFuayUyMEFkYXB0YXRpb24lMjBvZiUyMExhcmdlJTIwTGFuZ3VhZ2UlMjBNb2RlbHMlMjAoTG9SQSklMjBpcyx0cmFpbnMlMjB0aG9zZSUyMG5ld2x5JTIwYWRkZWQlMjB3ZWlnaHRzLg==">Low-Rank Adaptation of Large Language Models (LoRA) (Huggingface)<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9zaC10c2FuZy5tZWRpdW0uY29tL2JyaWVmLXJldmlldy1sb3JhLWxvdy1yYW5rLWFkYXB0YXRpb24tb2YtbGFyZ2UtbGFuZ3VhZ2UtbW9kZWxzLWZhZjVkZGQ1ODAyZiM6fjp0ZXh0PUxvUkElMkMlMjBMb3clMkRSYW5rJTIwTExNJTIwRmluZSUyRFR1bmluZyUyQyUyMFJlZHVjZSUyMFJlcXVpcmVkJTIwTWVtb3J5JnRleHQ9TG93JTJEUmFuayUyMEFkYXB0YXRpb24lMkMlMjBvciUyMExvUkEsdHJhaW5hYmxlJTIwcGFyYW1ldGVycyUyMGZvciUyMGRvd25zdHJlYW0lMjB0YXNrcy4=">Brief Review — LoRA: Low-Rank Adaptation of Large Language Models<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9iZHRlY2h0YWxrcy5jb20vMjAyMy8wNS8yMi93aGF0LWlzLWxvcmEv">What is low-rank adaptation (LoRA)?<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzIxMDYuMDk2ODU=">LoRA: Low-Rank Adaptation of Large Language Models (Paper)<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9ibG9nLm1sNi5ldS9sb3ctcmFuay1hZGFwdGF0aW9uLWEtdGVjaG5pY2FsLWRlZXAtZGl2ZS03ODJkZWM5OTU3NzI=">(Recommend) Low Rank Adaptation: A Technical Deep Dive<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly95b3V0dS5iZS9kQS1OaEN0cnJWRQ==">Low-rank Adaption of Large Language Models: Explaining the Key Concepts Behind LoRA (Video)<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly93d3cueW91dHViZS5jb20vd2F0Y2g/dj1VczVaRnAxNlBhVQ==">Fine-tuning LLMs with PEFT and LoRA<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9odWdnaW5nZmFjZS5jby9ibG9nL3BlZnQ=">PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware<i class="fa fa-external-link-alt"></i></span></li>
</ul>
]]></content>
      <tags>
        <tag>ML</tag>
        <tag>AI</tag>
        <tag>NLP</tag>
      </tags>
  </entry>
  <entry>
    <title>[NLP][ML] Transformer (3) - More conputational detail</title>
    <url>/posts/1255359463/</url>
    <content><![CDATA[<h1 id="Overview"><a href="#Overview" class="headerlink" title="Overview"></a>Overview</h1><p>In this article, I will focus more on the computing detail in transformer.<br>It will cover self-attention, prallel processing, multi-head self-attention, positional encoding and so on.</p>
<p><img data-src="/images/posts/NLP-series/transformer-2.png" 
style="width: 70%; margin: 15px auto;"></p>
<span id="more"></span>

<h1 id="Self-Attention"><a href="#Self-Attention" class="headerlink" title="Self-Attention"></a>Self-Attention</h1><h2 id="Idea"><a href="#Idea" class="headerlink" title="Idea"></a>Idea</h2><ul>
<li>Input:<ul>
<li>$x_1$, …, $x_4$, is a sequence.</li>
<li>Each input first goes through an <strong>embedding(convert to vectors)</strong>, multiplied by a weight matrix to become $a_1$, …, $a_4$. These $a_1$, …, $a_4$ are then passed into a Self-attention layer</li>
</ul>
</li>
<li>Each input is multiplied by different vectors:<ul>
<li>$q$: query (to match against others)<ul>
<li>$q_i$ &#x3D; $W^qa_i$</li>
</ul>
</li>
<li>$k$: key (to be matched)<ul>
<li>$k_i$ &#x3D; $W^ka_i$</li>
</ul>
</li>
<li>$v$: value, information to be extracted<ul>
<li>$v_i$ &#x3D; $W^va_i$</li>
</ul>
</li>
</ul>
</li>
<li>The weights $W^q$, $W^k$, $W^v$ are <strong>learned, initially randomly initialized</strong>.</li>
</ul>
<h2 id="Method"><a href="#Method" class="headerlink" title="Method"></a>Method</h2><p><img data-src="/images/posts/NLP-series/transformer-6.png" 
style="width: 70%; margin: 15px auto;"></p>
<p><img data-src="/images/posts/NLP-series/transformer-7.png" 
style="width: 70%; margin: 15px auto;"></p>
<ol>
<li>Take each query <strong>$q$ and perform attention on each key $k$</strong> (using two vectors to output a score(attention score)), which is essentially calculating the <strong>similarity of $q$ and $k$ (Similarity)</strong>.<ul>
<li>Scaled Dot-Product: $S(q_1, k_1)$ yields $\alpha_{1,1}$, $S(q_1, k_2)$ yields $\alpha_{1,2}$, and so on.</li>
<li>$\alpha_{1,i}$ &#x3D; $  q_1 \cdot k_i &#x2F; \sqrt{d}$ </li>
<li>$d$ represents the <strong>dimensions of $q$ and $k$</strong>. This is a <strong>trick used by the authors in the paper</strong>.</li>
</ul>
</li>
<li>Followed by <strong>Softmax normalization to normalize the values</strong>.</li>
<li>Multiply the obtained $\hat{\alpha}$ with $v$ to get $b$, which is <strong>equivalent to a weighted sum</strong>.</li>
<li>The obtained $b_1$ in the figure is the <strong>first vector (word or character)</strong> of the sought sequence.</li>
<li>Each output vector incorporates information from the <strong>entire sequence</strong>.</li>
</ol>
<h1 id="Prallel-Processing"><a href="#Prallel-Processing" class="headerlink" title="Prallel Processing"></a>Prallel Processing</h1><p><img data-src="/images/posts/NLP-series/transformer-8.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>$$q_i &#x3D; W^qa_i$$<br>$$k_i &#x3D; W^ka_i$$<br>$$v_i &#x3D; W^va_i$$</p>
<hr>
<p><img data-src="/images/posts/NLP-series/transformer-9.png" 
style="width: 70%; margin: 15px auto;"></p>
<ol>
<li>Consider $a_1$, …, $a_4$ as a matrix $I$. Multiply it by the weight matrix $W^q$ to obtain $q_1$, …, $q_4$, forming another matrix $Q$. </li>
<li>The same process applies to matrices $K$ and $V$, formed by multiplying $q$, $k$, and $a$ to get<br>$\alpha_{1,1}$ &#x3D; $k^T_1 \cdot q_1$,<br>$\alpha_{1,2}$ &#x3D; $k^T_2 \cdot q_1$,<br>…<br>Stack $k_1$, …, $k_4$ to form matrix $K$, then multiply it by $q_1$ stacked with $q_2$, …, $q_4$ to form matrix $Q$, resulting in a matrix $A$ composed of $\alpha$ values, which is the <strong>Attention</strong>. </li>
<li>After applying Softmax, it becomes $\hat{A}$. In each time step, attention exists between each pair of vectors.</li>
</ol>
<hr>
<p><img data-src="/images/posts/NLP-series/transformer-10.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>By calculating the <strong>weighted sum of $V$ and $\hat{A}$</strong>, you obtain $b$, and the matrix composed of b forms the <strong>output matrix $O$</strong>.”</p>
<hr>
<h2 id="What-self-attention-layer-do"><a href="#What-self-attention-layer-do" class="headerlink" title="What self-attention layer do"></a>What self-attention layer do</h2><p><img data-src="/images/posts/NLP-series/transformer-11-1.png" 
style="width: 70%; margin: 15px auto;"></p>
<p><img data-src="/images/posts/NLP-series/transformer-11-2.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>By converting it into matrix multiplication, you can utilize the <strong>GPU to accelerate the computation</strong>.</p>
<h1 id="Multi-head-Self-attention"><a href="#Multi-head-Self-attention" class="headerlink" title="Multi-head Self-attention"></a>Multi-head Self-attention</h1><p><img data-src="/images/posts/NLP-series/transformer-12.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>Taking 2 heads as an example:</p>
<ul>
<li>Having <strong>2 heads</strong> means splitting $q, k, v$ into two sets of $q, k, v$. And $q_{i,1}$ will only be multiplied with $k_{i,1}$ to obtain $\alpha_{i,1}$, finally calculating $b_{i,1}$. </li>
<li>Afterward, concatenate $b_{i,1}, b_{i,2}$, apply a transformation, and perform dimension reduction to obtain the final $b_i$.</li>
<li><strong>Each head focuses on different information</strong>; some only care about local information (neighborhood data), while others concentrate on global (long-term) information, and so on.</li>
</ul>
<h1 id="Positional-Encoding"><a href="#Positional-Encoding" class="headerlink" title="Positional Encoding"></a>Positional Encoding</h1><p><img data-src="/images/posts/NLP-series/transformer-13.png" 
style="width: 40%; margin: 15px auto;"></p>
<p>In the attention mechanism, the order of words in the input sentence doesn’t matter.</p>
<hr>
<p><img data-src="/images/posts/NLP-series/transformer-14.png" 
style="width: 70%; margin: 15px auto;"></p>
<ul>
<li>Without positional information &#x3D;&gt; Therefore, there is a <strong>unique position vector $e_i$</strong>, which is not learned but set by humans.</li>
<li>Other methods: <strong>Using one-hot encoding</strong> to represent $p_i$ as $x_i$ to denote its position.</li>
</ul>
<h1 id="Seq2seq-with-Attention"><a href="#Seq2seq-with-Attention" class="headerlink" title="Seq2seq with Attention"></a>Seq2seq with Attention</h1><p><img data-src="/images/posts/NLP-series/transformer-15.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>The original seq2seq model consists of two RNNs, an Encoder and a Decoder, and can be applied to machine translation.</p>
<p>In the diagram above, the Encoder originally contained bidirectional RNNs, while the Decoder contained a unidirectional RNN. In the diagram below, <strong>both(bi&#x2F;unidirectional RNN) have been replaced with Self-Attention layers</strong>, achieving the same purpose and enabling <strong>parallel processing</strong>.</p>
<p><img data-src="/images/posts/NLP-series/transformer-16.png" 
style="width: 70%; margin: 15px auto;"></p>
<h1 id="Look-into-the-detail-of-Transformer-Model"><a href="#Look-into-the-detail-of-Transformer-Model" class="headerlink" title="Look into the detail of Transformer Model"></a>Look into the detail of Transformer Model</h1><p><img data-src="/images/posts/NLP-series/transformer-2.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>Using Chinese to English translation for example.</p>
<h2 id="Encoder-Part"><a href="#Encoder-Part" class="headerlink" title="Encoder Part:"></a>Encoder Part:</h2><ol>
<li>The input goes through <strong>Input Embedding</strong>, which considers <strong>positional information</strong> and is augmented with manually set Positional Encoding. It then enters the block that <strong>repeats N times</strong>.</li>
</ol>
<hr>
<p><img data-src="/images/posts/NLP-series/transformer-17.png" 
style="width: 70%; margin: 15px auto;"></p>
<ol start="2">
<li>Multi-head:<br>Within the Encoder, it utilizes <strong>Multi-head Attention,</strong> which means there <strong>are multiple sets of $q$, $k$, $z$</strong>. Inside this mechanism, individual $qkv$ multiplications with $a$ are performed, leading to the calculation of $\alpha$, ultimately resulting in $b$.</li>
</ol>
<hr>
<p><img data-src="/images/posts/NLP-series/transformer-18.png" 
style="width: 70%; margin: 15px auto;"></p>
<ol start="3">
<li>Add &amp; Norm (residual connection):<br>The <strong>input</strong> of Multi-head Attention, denoted as <strong>$a$</strong>, is added to the <strong>output $b$</strong>, resulting in <strong>$b^\prime$</strong>. Following this, <strong>Layer Normalization</strong> is performed. </li>
<li>Once the calculations are completed, the result is passed through the <strong>forward propagation</strong>, followed by another <strong>Add &amp; Norm</strong> step.</li>
</ol>
<h2 id="Decoder-Part"><a href="#Decoder-Part" class="headerlink" title="Decoder Part"></a>Decoder Part</h2><p><img data-src="/images/posts/NLP-series/transformer-18-2.png" 
style="width: 70%; margin: 15px auto;"></p>
<ol>
<li><p>The Decoder <strong>input is the output from the previous time step</strong>. It goes through output embedding, considering positional information, and is augmented with manually set positional encoding. It then enters the block that repeats n times.</p>
</li>
<li><p><strong>Masked Multi-head Attention</strong>:<br>Attention is performed, where <strong>“Masked” indicates attending only to the already generated sequence</strong>. This is followed by an Add &amp; Norm layer.</p>
</li>
<li><p>Next, it undergoes a Multi-head Attention layer, attending to the <strong>previous output of the Encoder</strong>, followed by another Add &amp; Norm layer.</p>
</li>
<li><p>After the computations, it is passed to the <strong>Feed Forward forward propagation</strong>. Subsequently, <strong>Linear and Softmax</strong> operations are applied to generate the <strong>final output</strong>.</p>
</li>
</ol>
<hr>
<p>Last but not least, I provide the definition and purpose of encoder and decoder (in the previous article):</p>
<blockquote>
<p>Encoder-Decoder Architecture:<br>   The Transformer’s architecture is divided into an <strong>encoder and a decoder</strong>. The <strong>encoder processes the input sequence, capturing its contextual information</strong>, while the <strong>decoder generates the output sequence</strong>. This architecture is widely used in tasks like machine translation.</p>
</blockquote>
<h1 id="Attention-Visualization"><a href="#Attention-Visualization" class="headerlink" title="Attention Visualization"></a>Attention Visualization</h1><h2 id="single-head"><a href="#single-head" class="headerlink" title="single-head"></a>single-head</h2><p><img data-src="/images/posts/NLP-series/transformer-19.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>The relationships between words. The thicker the line, the more related of these words.</p>
<h2 id="multi-head"><a href="#multi-head" class="headerlink" title="multi-head"></a>multi-head</h2><p><img data-src="/images/posts/NLP-series/transformer-20.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>The results obtained by pairing different sets of $q$ and $k$ vectors differ, indicating that different sets of $q$ and $k$ possess distinct information. This signifies that various sets of $q$ and $k$ hold different types of information, with some focusing on local aspects (below) and others on global aspects (above).</p>
<h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h1><ul>
<li><span class="exturl" data-url="aHR0cHM6Ly93d3cueW91dHViZS5jb20vd2F0Y2g/dj1lTWx4NWZGTm9ZYw==">3Blue1Brown<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9pdGhlbHAuaXRob21lLmNvbS50dy9hcnRpY2xlcy8xMDI4MDM5Mg==">iThome - Day 27 Transformer (Recommend)<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9pdGhlbHAuaXRob21lLmNvbS50dy9hcnRpY2xlcy8xMDI4MTI0Mg==">iThome - Day 28 Self-Attention (Recommend)<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9oYWNrbWQuaW8vQGFibGl1L0JrWG16REJtcg==">Transformer 李宏毅深度學習 (Recommend)<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9zcGVlY2guZWUubnR1LmVkdS50dy9+aHlsZWUvbWwvbWwyMDIxLWNvdXJzZS1kYXRhL3NlcTJzZXFfdjkucGRm">Transformer 李宏毅老師簡報<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly93d3cueW91dHViZS5jb20vY2hhbm5lbC9VQzJnZ2p0dXVXdnhySEhIaWFESDFkbFE=">李宏毅老師YouTube channel<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzE3MDYuMDM3NjI=">Attention is all you need (paper)<i class="fa fa-external-link-alt"></i></span></li>
</ul>
]]></content>
      <tags>
        <tag>ML</tag>
        <tag>AI</tag>
        <tag>NLP</tag>
      </tags>
  </entry>
  <entry>
    <title>[NLP][ML] Transformer (2) - Attention &amp; Summary</title>
    <url>/posts/2443192075/</url>
    <content><![CDATA[<h1 id="Overview"><a href="#Overview" class="headerlink" title="Overview"></a>Overview</h1><p>Self-attention allows the model to <strong>weigh the importance of different parts of an input sequence against each other</strong>, capturing <strong>relationships</strong> and dependencies between elements within the sequence. This is particularly powerful for tasks involving sequential or contextual information, such as language translation, text generation, and more.</p>
<p>What Self-Attention wants to do is to replace what RNN can do<br>Its output&#x2F;input is the same as RNN, and its biggest advantages are:</p>
<ul>
<li>Can parallelize operations</li>
<li>Each output vector has seen the entire input sequence. So there is no need to stack several layers like CNN.</li>
</ul>
<span id="more"></span>
<h1 id="Difference-between-attention-and-self-attention"><a href="#Difference-between-attention-and-self-attention" class="headerlink" title="Difference between attention and self-attention"></a>Difference between attention and self-attention</h1><p><strong>attention is a broader concept of selectively focusing on information</strong>, while <strong>self-attention is a specific implementation of this concept</strong> where elements within the same sequence are attended to. Self-attention is a fundamental building block of the Transformer architecture, allowing it to capture relationships and dependencies within sequences effectively.</p>
<h1 id="Attention-Machanism"><a href="#Attention-Machanism" class="headerlink" title="Attention (Machanism)"></a>Attention (Machanism)</h1><p>main idea:<br>use <strong>triples</strong><br>$$&lt;Q,K,V&gt;$$</p>
<p>Represents the <strong>attention mechanism</strong>, expresses the <strong>similarity between Query and Key</strong>, and then assigns the value of <strong>Value according to the similarity</strong><br><strong>formula</strong>:<br>$$Attention(Q,K,V) &#x3D; softmax(\frac{QK^T}{\sqrt{d_k}})V$$</p>
<h1 id="Self-Attention-Layer-Key"><a href="#Self-Attention-Layer-Key" class="headerlink" title="Self-Attention(Layer) (Key)"></a>Self-Attention(Layer) (<strong>Key</strong>)</h1><h2 id="Computing-process"><a href="#Computing-process" class="headerlink" title="Computing process"></a>Computing process</h2><ol>
<li><p>Now we suppose to enter $a_1$~$a_4$ Four vector, and Self-Attention needs to output another row of $b$ vectors, and each $b$ is generated after considering all $a$</p>
</li>
<li><p>To figure out $b_1$, the first step is based on $a_1$, find <strong>other vectors related to $a_1$</strong> in this sequence. We use the “$\alpha$” to represent the <strong>similarity</strong> of each vector related to $a_1$.</p>
</li>
<li><p>It must be mentioned here that there are <strong>3 very important values in the Self-Attention mechanism</strong>: <strong>Query, Key, Value</strong>. Respectively represent <strong>the value used to match, the value to be matched, and the extracted information</strong>.</p>
</li>
<li><p>As for determining the <strong>correlation between two vectors</strong>, the most commonly used method is the <strong>dot product (here is scaled dot product)</strong>. It takes two vectors as input and multiplies them with two different matrices. The left vector is multiplied by the matrix $W^q$ (Query matrix), and the right vector is multiplied by the matrix $W^k$ (Key matrix). The values of $W^q$ and $W^k$ are both <strong>randomly initialized and obtained through training</strong>.</p>
</li>
<li><p>Next, after obtaining the two vectors, $q$, $k$, the <strong>dot product is computed between them</strong>. After summing up all the dot products, a <strong>scalar (magnitude)</strong> is obtained. This scalar is represented as $\alpha$, which we consider as <strong>the degree of correlation between the two vectors.</strong></p>
</li>
</ol>
<p><img data-src="/images/posts/NLP-series/transformer-6.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>Next, we apply what was just introduced to Self-Attention.</p>
<ol>
<li>First, we calculate the relationships “$\alpha$” between $a_1$ and $a_2$, $a_3$, $a_4$ individually. <ul>
<li>We multiply $a_1$ by $W^q$ to obtain $q_1$. </li>
<li>Then, we multiply $a_2$, $a_3$, $a_4$ by $W^k$ respectively and compute the inner products to determine the relationship “$\alpha$” between $a_1$ and each vector. </li>
<li>Applying the Softmax function yields $\alpha’$. </li>
<li>With this $\alpha’$, we can extract crucial information from this sequence!</li>
</ul>
</li>
</ol>
<hr>
<p><img data-src="/images/posts/NLP-series/transformer-7.png" 
style="width: 70%; margin: 15px auto;"></p>
<ol start="2">
<li>How to extract important information using $\alpha’$? The steps are as follows:</li>
</ol>
<ul>
<li>First, multiply $a_1$ ~ $a_4$ by $W^v$ to obtain new vectors, denoted as $v_1$, $v_2$, $v_3$ and $v_4$, respectively (where $W^v$ is the Value matrix).</li>
<li>Next, multiply each vector here, $v_1$ ~ $v_4$, by $\alpha’$, and then sum them to obtain the output $b_1$ (formula written in the top-right corner of the image).</li>
</ul>
<p>If a certain vector receives a higher score - for instance, if the relationship between $a_1$ and $a_2$ is strong, leading to a large value for $\alpha_{1,2}’$ - then after performing the weighted sum, the value of $b_1$ obtained could be very close to $v_2$.</p>
<p>Now that we know how to compute $b_1$, it naturally follows that we can deduce $b_1$, $b_2$, $b_3$, and $b_4$ using the same method. With this, we have completed the explanation of the internal computation process of Self-Attention.</p>
<p>Last but not least, the <strong>similarity matrix($\alpha_{i,j}$)</strong> is just the <strong>attention</strong> in the self-attention layer, i.e. importance or relevance to other elements in the same sequence..</p>
<h1 id="Summary"><a href="#Summary" class="headerlink" title="Summary"></a>Summary</h1><h2 id="Attention-Score-Weight-and-Output"><a href="#Attention-Score-Weight-and-Output" class="headerlink" title="Attention (Score, Weight and Output)"></a>Attention (Score, Weight and Output)</h2><ul>
<li>$ Attention Score &#x3D; QK^T $</li>
<li>$ Attention Weights &#x3D; Softmax(\frac{Attention Score}{\sqrt{d_k}}) $</li>
<li>$ Attention Output &#x3D; (Attention Weights)V $</li>
</ul>
<h2 id="Key-Components-of-Transformers"><a href="#Key-Components-of-Transformers" class="headerlink" title="Key Components of Transformers"></a>Key Components of Transformers</h2><ol>
<li><p>Self-Attention Mechanism:<br>The core innovation of the Transformer is the self-attention mechanism, which allows the model to <strong>weigh the importance of different words in a sequence relative to each other</strong>. It computes attention scores for each word by considering its <strong>relationships with all other words</strong> in the same sequence. Self-attention enables capturing context and dependencies between words <strong>regardless of their distance</strong>.</p>
</li>
<li><p>Multi-Head Attention:<br>To capture different types of relationships, the Transformer employs multi-head attention. <strong>Multiple sets of self-attention mechanisms (attention heads) run in parallel</strong>, and their outputs are concatenated and linearly transformed to create a more comprehensive representation.</p>
</li>
<li><p>Positional Encodings:<br>Since the Transformer does not inherently understand <strong>the order of words in a sequence</strong> (unlike recurrent networks(RNN)), positional encodings are added to the input embeddings. These encodings provide <strong>information about the positions of words within the sequence</strong>.</p>
</li>
<li><p>Encoder-Decoder Architecture:<br>The Transformer’s architecture is divided into an <strong>encoder and a decoder</strong>. The <strong>encoder processes the input sequence, capturing its contextual information</strong>, while the <strong>decoder generates the output sequence</strong>. This architecture is widely used in tasks like machine translation.</p>
</li>
<li><p>Residual Connections and Layer Normalization:<br>To address the <strong>vanishing gradient problem</strong>, residual connections (skip connections) are used around each sub-layer in the encoder and decoder. Layer normalization is also applied to <strong>stabilize</strong> the training process.</p>
</li>
<li><p>Position-wise Feed-Forward Networks:<br>After the self-attention layers, each position’s representation is passed through a position-wise feed-forward neural network, which <strong>adds non-linearity to the model</strong>.</p>
</li>
<li><p>Scaled Dot-Product Attention:<br>The self-attention mechanism involves computing the <strong>dot product of query, key, and value vectors</strong>. To control the scale of the dot products and <strong>avoid large gradients</strong>, the dot products are divided by the <strong>square root of the dimension of the key vectors</strong>.</p>
</li>
<li><p>Masked Self-Attention in Decoding:<br>During decoding, the self-attention mechanism is modified to <strong>ensure that each position can only attend to previous positions</strong>. This masking <strong>prevents the model from “cheating” by looking ahead in the output sequence</strong>.</p>
</li>
<li><p>Transformer Variants:<br> The original Transformer model has inspired various extensions and improvements, such as BERT, GPT, and more. BERT focuses on pretraining language representations, while GPT is designed for autoregressive text generation.</p>
</li>
</ol>
<p>The Transformer architecture has become the foundation for many state-of-the-art NLP models due to its ability to <strong>capture context, parallelize computations, and handle long-range dependencies effectively</strong>. It has led to significant advancements in machine translation, text generation, sentiment analysis, and various other NLP tasks.</p>
<h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h1><ul>
<li><span class="exturl" data-url="aHR0cHM6Ly93d3cueW91dHViZS5jb20vd2F0Y2g/dj1lTWx4NWZGTm9ZYw==">3Blue1Brown<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9pdGhlbHAuaXRob21lLmNvbS50dy9hcnRpY2xlcy8xMDI4MDM5Mg==">iThome - Day 27 Transformer (Recommend)<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9pdGhlbHAuaXRob21lLmNvbS50dy9hcnRpY2xlcy8xMDI4MTI0Mg==">iThome - Day 28 Self-Attention (Recommend)<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9oYWNrbWQuaW8vQGFibGl1L0JrWG16REJtcg==">Transformer 李宏毅深度學習 (Recommend)<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9zcGVlY2guZWUubnR1LmVkdS50dy9+aHlsZWUvbWwvbWwyMDIxLWNvdXJzZS1kYXRhL3NlcTJzZXFfdjkucGRm">Transformer 李宏毅老師簡報<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly93d3cueW91dHViZS5jb20vY2hhbm5lbC9VQzJnZ2p0dXVXdnhySEhIaWFESDFkbFE=">李宏毅老師YouTube channel<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzE3MDYuMDM3NjI=">Attention is all you need (paper)<i class="fa fa-external-link-alt"></i></span></li>
</ul>
]]></content>
      <tags>
        <tag>ML</tag>
        <tag>AI</tag>
        <tag>NLP</tag>
      </tags>
  </entry>
  <entry>
    <title>AWS Global Infrastructure</title>
    <url>/posts/3777980008/</url>
    <content><![CDATA[<h1 id="Exploring-the-AWS-Global-Infrastructure"><a href="#Exploring-the-AWS-Global-Infrastructure" class="headerlink" title="Exploring the AWS Global Infrastructure"></a>Exploring the AWS Global Infrastructure</h1><h2 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h2><p>In today’s digital landscape, ensuring the fault tolerance, stability, and high availability of applications is paramount. Amazon Web Services (AWS) provides a robust global infrastructure designed to meet these needs. This article explores the key components of the AWS global infrastructure, including Regions, Availability Zones, Local Zones, and Points of Presence.</p>
<span id="more"></span>

<h2 id="AWS-Regions-Geographical-Isolation-for-Fault-Tolerance"><a href="#AWS-Regions-Geographical-Isolation-for-Fault-Tolerance" class="headerlink" title="AWS Regions: Geographical Isolation for Fault Tolerance"></a>AWS Regions: Geographical Isolation for Fault Tolerance</h2><h3 id="Isolation-and-Data-Residency"><a href="#Isolation-and-Data-Residency" class="headerlink" title="Isolation and Data Residency"></a>Isolation and Data Residency</h3><p>AWS Regions are isolated geographic areas that enhance fault tolerance and stability. Each Region operates independently, and resources are not automatically replicated across Regions. This design ensures that data stored in one Region stays within that Region unless explicitly replicated, aiding compliance with regulatory requirements and optimizing network latency.</p>
<h3 id="Service-Availability"><a href="#Service-Availability" class="headerlink" title="Service Availability"></a>Service Availability</h3><p>Not all AWS services are available in every Region. To check which services are offered in a specific Region, you can refer to the <span class="exturl" data-url="aHR0cHM6Ly9hd3MuYW1hem9uLmNvbS9hYm91dC1hd3MvZ2xvYmFsLWluZnJhc3RydWN0dXJlL3JlZ2lvbmFsLXByb2R1Y3Qtc2VydmljZXMv">AWS Region Table<i class="fa fa-external-link-alt"></i></span>.</p>
<h2 id="Availability-Zones-Building-Blocks-of-Resilient-Applications"><a href="#Availability-Zones-Building-Blocks-of-Resilient-Applications" class="headerlink" title="Availability Zones: Building Blocks of Resilient Applications"></a>Availability Zones: Building Blocks of Resilient Applications</h2><h3 id="Structure-and-Fault-Isolation"><a href="#Structure-and-Fault-Isolation" class="headerlink" title="Structure and Fault Isolation"></a>Structure and Fault Isolation</h3><p>Each AWS Region consists of multiple Availability Zones (AZs). An AZ includes one or more data centers designed to be independent failure zones. These zones are physically separated, reducing the risk of simultaneous failure due to localized events.</p>
<h3 id="Power-and-Connectivity"><a href="#Power-and-Connectivity" class="headerlink" title="Power and Connectivity"></a>Power and Connectivity</h3><p>Availability Zones have their own power supplies and networking connections, further enhancing fault isolation. AWS recommends distributing applications across multiple AZs to achieve high availability and resilience.</p>
<h2 id="Local-Zones-Reducing-Latency-for-End-Users"><a href="#Local-Zones-Reducing-Latency-for-End-Users" class="headerlink" title="Local Zones: Reducing Latency for End-Users"></a>Local Zones: Reducing Latency for End-Users</h2><h3 id="Purpose-and-Use-Cases"><a href="#Purpose-and-Use-Cases" class="headerlink" title="Purpose and Use Cases"></a>Purpose and Use Cases</h3><p>AWS Local Zones extend AWS Regions by bringing services closer to large population centers. This reduces latency for end-users, making them ideal for applications requiring real-time processing, such as media content creation and gaming.</p>
<h3 id="Supported-Services-and-Connectivity"><a href="#Supported-Services-and-Connectivity" class="headerlink" title="Supported Services and Connectivity"></a>Supported Services and Connectivity</h3><p>Local Zones support a variety of AWS services, including Amazon EC2, Amazon VPC, and Amazon EBS. They provide a high-bandwidth, secure connection to other AWS services in the Region, ensuring seamless integration and performance.</p>
<h2 id="Data-Centers-The-Backbone-of-AWS-Infrastructure"><a href="#Data-Centers-The-Backbone-of-AWS-Infrastructure" class="headerlink" title="Data Centers: The Backbone of AWS Infrastructure"></a>Data Centers: The Backbone of AWS Infrastructure</h2><h3 id="High-Availability-and-Redundancy"><a href="#High-Availability-and-Redundancy" class="headerlink" title="High Availability and Redundancy"></a>High Availability and Redundancy</h3><p>AWS data centers are the physical locations where data resides and processing occurs. Designed with high availability in mind, they use custom network equipment and protocols. Core applications are deployed in an N+1 configuration, ensuring load balancing and failover capabilities.</p>
<h2 id="Points-of-Presence-Enhancing-Content-Delivery"><a href="#Points-of-Presence-Enhancing-Content-Delivery" class="headerlink" title="Points of Presence: Enhancing Content Delivery"></a>Points of Presence: Enhancing Content Delivery</h2><h3 id="Content-Delivery-Network-CDN"><a href="#Content-Delivery-Network-CDN" class="headerlink" title="Content Delivery Network (CDN)"></a>Content Delivery Network (CDN)</h3><p>AWS uses Points of Presence (PoPs) to deliver content with low latency through services like Amazon CloudFront and Amazon Route 53. These PoPs include Edge Locations and Regional Edge Caches, which cache content closer to users, improving performance and reducing the load on origin servers.</p>
<h2 id="Key-Takeaways"><a href="#Key-Takeaways" class="headerlink" title="Key Takeaways"></a>Key Takeaways</h2><ul>
<li><strong>Regions</strong>: Choose based on compliance and latency requirements.</li>
<li><strong>Availability Zones</strong>: Utilize multiple AZs for fault isolation and redundancy.</li>
<li><strong>Local Zones</strong>: Reduce latency for latency-sensitive applications.</li>
<li><strong>Points of Presence</strong>: Enhance content delivery and performance.</li>
</ul>
<p>By leveraging the AWS global infrastructure, you can build robust, high-performing, and resilient applications that meet your business needs.</p>
<h2 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>Understanding the components of the AWS global infrastructure is crucial for designing effective cloud solutions. By strategically utilizing Regions, Availability Zones, Local Zones, and Points of Presence, you can optimize your applications for performance, reliability, and compliance.</p>
<p>For more information on AWS global infrastructure, visit the <span class="exturl" data-url="aHR0cHM6Ly9hd3MuYW1hem9uLmNvbS9hYm91dC1hd3MvZ2xvYmFsLWluZnJhc3RydWN0dXJlLw==">AWS Global Infrastructure page<i class="fa fa-external-link-alt"></i></span>.</p>
]]></content>
      <tags>
        <tag>cloud</tag>
        <tag>aws</tag>
      </tags>
  </entry>
  <entry>
    <title>Understanding CIDR - A Guide to Classless Inter-Domain Routing</title>
    <url>/posts/1789986274/</url>
    <content><![CDATA[<h1 id="Foreword"><a href="#Foreword" class="headerlink" title="Foreword"></a>Foreword</h1><p>Classless Inter-Domain Routing (CIDR) revolutionized IP address allocation and routing on the internet. By moving away from the rigid class-based system (Classes A, B, and C), CIDR introduced a more flexible and efficient method for managing IP spaces. This article delves into what CIDR is, how it works, and how to compute CIDR blocks for network planning.</p>
<span id="more"></span>

<h1 id="What-is-CIDR"><a href="#What-is-CIDR" class="headerlink" title="What is CIDR?"></a>What is CIDR?</h1><p>CIDR stands for <strong>Classless Inter-Domain Routing</strong>. It’s a method used for <strong>allocating IP addresses and routing Internet Protocol packets</strong>. CIDR allows for variable-length subnet masking which enables a more efficient allocation of IP addresses. It’s designed to replace the older system based on classes (A, B, C) to improve address space allocation and enhance routing scalability on the internet.</p>
<h1 id="Key-Concepts-of-CIDR"><a href="#Key-Concepts-of-CIDR" class="headerlink" title="Key Concepts of CIDR"></a>Key Concepts of CIDR</h1><ul>
<li>IP Address: A unique numerical label assigned to devices connected to a network that uses the Internet Protocol for communication. (e.g.192.168.1.0)</li>
<li>Subnet Mask: Defines a <strong>range</strong> of IP addresses considered to be in <strong>the same network segment</strong>. (e.g. mask 255.255.255.0 is represented as &#x2F;24 in CIDR)</li>
<li>CIDR Notation: A compact representation of an IP address and its associated routing prefix in a format like 192.168.1.0&#x2F;24.</li>
</ul>
<h1 id="How-CIDR-Works"><a href="#How-CIDR-Works" class="headerlink" title="How CIDR Works"></a>How CIDR Works</h1><p>CIDR introduces flexibility in the allocation of IP addresses by <strong>varying the length of the subnet portion of the address</strong>. Unlike the fixed subnet masks of the class-based system, CIDR notation allows the network boundary to be set anywhere, enabling both smaller and larger blocks of addresses to be allocated as needed.</p>
<h1 id="Computing-CIDR"><a href="#Computing-CIDR" class="headerlink" title="Computing CIDR"></a>Computing CIDR</h1><p>To compute a CIDR block, you need the starting IP address and the size of the network (i.e., how many addresses you need).</p>
<h2 id="Example"><a href="#Example" class="headerlink" title="Example"></a>Example</h2><p>Suppose you have an IP address of 192.168.1.0 and need to support 254 devices. You would start with the base IP address 192.168.1.0 and then use a <strong>subnet mask that supports 254 devices</strong>. The CIDR block 192.168.1.0&#x2F;24 uses a subnet mask of <strong>255.255.255.0</strong>, allowing for 256 addresses total (the last part bits), which after accounting for the <strong>network and broadcast addresses</strong>, leaves 254 usable addresses for devices.</p>
<h2 id="Calculating-Subnets-and-Hosts"><a href="#Calculating-Subnets-and-Hosts" class="headerlink" title="Calculating Subnets and Hosts"></a>Calculating Subnets and Hosts</h2><ul>
<li>Subnets: The number of available subnets can be calculated based on the number of bits borrowed for subnetting, with more bits allowing for more subnets.</li>
<li>Hosts: The formula</li>
</ul>
<p>$$ 2^{(32−\text{subnet mask length})}−2 $$</p>
<p>calculates the number of usable host addresses in a subnet, subtracting 2 for the network and broadcast addresses.</p>
<h1 id="Extra"><a href="#Extra" class="headerlink" title="Extra"></a>Extra</h1><h2 id="Network-Address"><a href="#Network-Address" class="headerlink" title="Network Address"></a>Network Address</h2><blockquote>
<p>The network address represents the start of an IP address range assigned to a network. It is used to identify the network itself.</p>
</blockquote>
<p>The network address is calculated by applying the subnet mask to any IP address within the network, resulting in the lowest possible address in the range. In binary terms, the network address is formed by performing a bitwise AND operation between any IP address in the network and the subnet mask. This address is not assignable to any individual device within the network because it is used to identify the network as a whole.</p>
<h2 id="Broadcast-Address"><a href="#Broadcast-Address" class="headerlink" title="Broadcast Address"></a>Broadcast Address</h2><blockquote>
<p>The broadcast address is the last address in a network range and is used to send data to all devices within that network. </p>
</blockquote>
<p>When a packet is sent to the broadcast address, it is delivered to all hosts in the network rather than a single recipient. The broadcast address is determined by inverting the subnet mask (turning all subnet mask 0 bits into 1s) and performing a bitwise OR operation with the network address. Like the network address, the broadcast address is not assignable to any device, as its purpose is to facilitate the broadcasting of messages to all devices on the network.</p>
<h2 id="Subnet-Mask"><a href="#Subnet-Mask" class="headerlink" title="Subnet Mask"></a>Subnet Mask</h2><p>A subnet mask is a 32-bit number that masks an IP address and divides the IP address into network address and host address. Subnet masks are made up of two parts:</p>
<ol>
<li>The network part, which identifies a particular network and is represented by the binary 1s in the mask.</li>
<li>The host part, which identifies a specific device (host) on that network and is represented by the binary 0s in the mask.</li>
</ol>
<p>For example, in the subnet mask 255.255.255.0 or in CIDR notation &#x2F;24, the first 24 bits are the network part (all 1s in binary), and the last 8 bits are the host part (all 0s in binary). This means any IP address with the same first 24 bits belongs to the same network, and the last 8 bits can vary to represent different devices within that network.</p>
<h3 id="Detailed-Example"><a href="#Detailed-Example" class="headerlink" title="Detailed Example"></a>Detailed Example</h3><p>Let’s consider the network 192.168.1.0&#x2F;24:</p>
<ul>
<li>IP Address Range: 192.168.1.0 to 192.168.1.255</li>
<li>Subnet Mask: 255.255.255.0 or &#x2F;24 in CIDR notation</li>
<li>Network Address: 192.168.1.0 (the first address in the range, represents the network itself)</li>
<li>Broadcast Address: 192.168.1.255 (the last address in the range, used to broadcast to all devices on the network)</li>
</ul>
<p>In this case, the subnet mask &#x2F;24 indicates that the first 24 bits (the first three octets) are the network part, and the last 8 bits (the last octet) are for hosts. This allows for up to 256 IP addresses (from 192.168.1.0 to 192.168.1.255), but since the first and last addresses are reserved for the network and broadcast addresses, respectively, it leaves 254 addresses available for devices.</p>
<h1 id="Resources"><a href="#Resources" class="headerlink" title="Resources"></a>Resources</h1><ul>
<li><span class="exturl" data-url="aHR0cHM6Ly96aC53aWtpcGVkaWEub3JnL3poLXR3LyVFNiU5NyVBMCVFNyVCMSVCQiVFNSU4OCVBQiVFNSU5RiU5RiVFOSU5NyVCNCVFOCVCNyVBRiVFNyU5NCVCMQ==">Wikipedia<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9hd3MuYW1hem9uLmNvbS90dy93aGF0LWlzL2NpZHIv">什麼是 CIDR？(AWS)<i class="fa fa-external-link-alt"></i></span></li>
</ul>
]]></content>
      <tags>
        <tag>cloud</tag>
        <tag>networking</tag>
      </tags>
  </entry>
  <entry>
    <title>[AWS-CPE] Database and Storage</title>
    <url>/posts/3982951460/</url>
    <content><![CDATA[<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>Storage refers to the preservation of digital data in a form that can be accessed by a computer system. It comes in various types, such as hard drives, solid-state drives, USB flash drives, and cloud storage. These storage devices can hold data temporarily or permanently.</p>
<p>Databases, on the other hand, are structured systems for organizing, storing, and retrieving data. They allow for efficient data management and can handle large amounts of information systematically. Databases can be relational, where data is stored in tables and relationships can be defined between these tables, or non-relational (NoSQL), which are more flexible and can store unstructured data. The management of databases is handled through database management systems (DBMS), which provide tools for creating, querying, updating, and administering the database.</p>
<span id="more"></span>

<h1 id="Outline"><a href="#Outline" class="headerlink" title="Outline"></a>Outline</h1><p>In this article, we will talk about the database and storage service in AWS. This article is based on the courses on the AWS SkillBuilder platform. (Module5)</p>
<ul>
<li>Summarize the basic concept of storage and databases.</li>
<li>Describe the benefits of Amazon Elastic Block Store (Amazon EBS).</li>
<li>Describe the benefits of Amazon Simple Storage Service (Amazon S3).</li>
<li>Describe the benefits of Amazon Elastic File System (Amazon EFS).</li>
<li>Summarize various storage solutions.</li>
<li>Describe the benefits of Amazon Relational Database Service (Amazon RDS).</li>
<li>Describe the benefits of Amazon DynamoDB.</li>
<li>Summarize various database services.</li>
</ul>
<h1 id="Amazon-EBS-Instance-Stores-and-Amazon-Elastic-Block-Store"><a href="#Amazon-EBS-Instance-Stores-and-Amazon-Elastic-Block-Store" class="headerlink" title="Amazon EBS (Instance Stores and Amazon Elastic Block Store)"></a>Amazon EBS (Instance Stores and Amazon Elastic Block Store)</h1><h2 id="Instance-Stores-provide"><a href="#Instance-Stores-provide" class="headerlink" title="Instance Stores provide"></a>Instance Stores provide</h2><p>  <strong>temporary block-level storage for Amazon EC2 (Elastic Compute Cloud) instances.</strong> This storage is physically <strong>attached to the host machine</strong> where the instance runs, offering high performance and low latency. However, data on instance stores is ephemeral; it <strong>persists only as long as the instance is running</strong> and is lost if the instance is stopped, terminated, or if the underlying physical drive fails.</p>
<p>Block Store: (changed only by block)<br><img data-src="/images/posts/cloud/database-and-storage/block.png"></img> </p>
<p>Instance Store:<br><img data-src="/images/posts/cloud/database-and-storage/instance-store.png"></img></p>
<h2 id="Amazon-Elastic-Block-Store-Amazon-EBS"><a href="#Amazon-Elastic-Block-Store-Amazon-EBS" class="headerlink" title="Amazon Elastic Block Store (Amazon EBS)"></a>Amazon Elastic Block Store (Amazon EBS)</h2><p>  EBS, in contrast, offers <strong>persistent block storage volumes</strong> for use with EC2 instances. EBS volumes are <strong>network-attached and persist independently of the life of an instance</strong>. This means that EBS volumes can be detached from one instance and attached to another, and their data remains intact even after the instance is stopped or terminated. EBS provides various volume types that cater to different use cases, such as high throughput or high IOPS (input&#x2F;output operations per second), and is designed for both high availability and durability.</p>
<p>EBS Settings:<br><img data-src="/images/posts/cloud/database-and-storage/ebs-config.png"></img></p>
<p>EBS:<br><img data-src="/images/posts/cloud/database-and-storage/ebs.png"></img></p>
<h2 id="Snapshot"><a href="#Snapshot" class="headerlink" title="Snapshot"></a>Snapshot</h2><ul>
<li>A snapshot in the context of cloud computing, particularly with Amazon Elastic Block Store (Amazon EBS), is a point-in-time backup of an EBS volume. It captures the exact state of a volume at the time the snapshot is taken and is stored in Amazon S3 (Simple Storage Service) for durability. Snapshots are incremental, meaning only the blocks on the device that have changed after your most recent snapshot are saved, which optimizes both the time required to create the snapshot and the space used on storage.</li>
<li>Snapshots are widely used for data backup, archiving, and disaster recovery purposes. They can also be used to create new EBS volumes, clone existing volumes, or transfer data across AWS regions. Snapshots are a key feature in ensuring data durability and recoverability in cloud environments.</li>
<li><img data-src="/images/posts/cloud/database-and-storage/snapshot.png"></img></li>
</ul>
<h2 id="Quick-Quiz"><a href="#Quick-Quiz" class="headerlink" title="Quick Quiz"></a>Quick Quiz</h2><ul>
<li><img data-src="/images/posts/cloud/database-and-storage/ebs-quiz.png"></img></li>
</ul>
<h1 id="Amazon-S3-Amazon-Simple-Storage-Service"><a href="#Amazon-S3-Amazon-Simple-Storage-Service" class="headerlink" title="Amazon S3 (Amazon Simple Storage Service)"></a>Amazon S3 (Amazon Simple Storage Service)</h1><p>Amazon S3 (Simple Storage Service) is a scalable, high-speed, web-based cloud storage service designed for online backup and archiving of data and application programs. S3 provides an object storage format, which differs from traditional file systems or block storage. Data is stored as objects within buckets, and each object consists of the file itself and metadata containing a globally unique identifier and other information. S3 is widely recognized for its durability, availability, and scalability.<br><img data-src="/images/posts/cloud/database-and-storage/obj-storage.png"></img></p>
<ul>
<li>Key features of S3 include:<ul>
<li>Buckets: Containers for storage of any amount of data at any time. Each bucket is identified by a unique, user-defined name.</li>
<li>Objects: Files or blobs of data that are stored in buckets. Objects are identified within a bucket by a unique, user-assigned key.</li>
<li>Scalability: Ability to store and retrieve any amount of data, from small files to large datasets, at any time.</li>
<li>Data Availability and Durability: Designed for 99.999999999% (11 9’s) of durability, ensuring data protection against losses.</li>
<li>Security: Supports various mechanisms to control access to data, including AWS Identity and Access Management (IAM), bucket policies, and Access Control Lists (ACLs).</li>
<li>Versioning: Allows multiple variants of an object to be stored in the same bucket, useful for backup and recovery.<br><img width="60%" data-src="/images/posts/cloud/database-and-storage/aws-s3-store.png"></img></li>
</ul>
</li>
</ul>
<p>S3 is widely used in a variety of applications such as website hosting, data backup, and a backend storage for services such as Amazon EC2, Amazon EBS, and Amazon RDS.</p>
<h2 id="Types-of-AWS-S3"><a href="#Types-of-AWS-S3" class="headerlink" title="Types of AWS S3"></a>Types of AWS S3</h2><ol>
<li>S3 Standard:<br>Use Case: Frequently accessed data.<br>Characteristics: Offers high durability, availability, and performance with low latency and high throughput.</li>
<li>S3 Intelligent-Tiering:<br>Use Case: Data with unknown or changing access patterns.<br>Characteristics: Automatically moves data between two access tiers — frequent and infrequent access — based on changing access patterns, optimizing costs without performance impact.</li>
<li>S3 Standard-Infrequent Access (S3 Standard-IA):<br>Use Case: Less frequently accessed data, but requires rapid access when needed.<br>Characteristics: Lower storage cost compared to S3 Standard, but with a retrieval fee.</li>
<li>S3 One Zone-Infrequent Access (S3 One Zone-IA):<br>Use Case: For data that is infrequently accessed and does not require the multiple Availability Zone data resilience.<br>Characteristics: Stores data in a single Availability Zone, making it less expensive than S3 Standard-IA.</li>
<li>S3 Glacier:<br>Use Case: Archiving data that is rarely accessed.<br>Characteristics: Offers very low storage cost but has higher retrieval times and fees. Suitable for data archiving and long-term backup.</li>
<li>S3 Glacier Deep Archive:<br>Use Case: Long-term storage of data that is accessed once or twice a year.<br>Characteristics: AWS’s lowest-cost storage class and supports long-term retention and digital preservation for data that may be accessed once or twice a year.</li>
</ol>
<h2 id="Quick-Quiz-1"><a href="#Quick-Quiz-1" class="headerlink" title="Quick Quiz"></a>Quick Quiz</h2><ul>
<li><img data-src="/images/posts/cloud/database-and-storage/s3-quiz.png"></img></li>
</ul>
<h1 id="Amazon-EFS-Amazon-Elastic-File-System"><a href="#Amazon-EFS-Amazon-Elastic-File-System" class="headerlink" title="Amazon EFS (Amazon Elastic File System)"></a>Amazon EFS (Amazon Elastic File System)</h1><p>AWS EFS is a cloud-based file storage service for applications that require <strong>shared file storage accessible by multiple EC2 instances</strong>. EFS is scalable and elastic, providing a simple interface that allows storage capacity to <strong>grow or shrink automatically</strong> as files are added or removed. It uses the NFS (Network File System) protocol and can be mounted on several instances simultaneously, making it ideal for applications that need common data access for multiple servers, such as content management systems and web serving.</p>
<ul>
<li>Differences between AWS EFS and AWS EBS:</li>
</ul>
<ol>
<li>Storage Type:<br>EFS is a file-level storage service that operates on the NFS protocol, allowing simultaneous access from multiple EC2 instances.<br>EBS is a block-level storage service attached to a single EC2 instance at a time, used like a traditional drive.</li>
<li>Scalability:<br>EFS automatically scales its capacity up or down as you add or remove files, with no need for manual intervention or provisioning.<br>EBS requires manual management of capacity. You must choose the volume size ahead of time, though you can resize it, the process is not automatic.</li>
<li>Use Case:<br>EFS is ideal for use cases where multiple instances need to access a common file system concurrently, such as web applications and content management systems.<br>EBS is better suited for use cases requiring dedicated, single-instance storage, such as databases or any application needing consistent block-level storage.</li>
<li>Performance:<br>EFS offers scalable performance, but with slightly higher latency due to its network-based nature.<br>EBS provides high performance with lower latency, especially with provisioned IOPS volumes for I&#x2F;O-intensive applications.</li>
<li>Data Durability and Availability:<br>EFS is designed to be highly durable and available, storing data across multiple Availability Zones automatically.<br>EBS volumes are tied to a specific Availability Zone; however, you can take snapshots and replicate them to other zones for higher availability and durability.</li>
</ol>
<h1 id="Amazon-RDS-Amazon-Relational-Database-Service"><a href="#Amazon-RDS-Amazon-Relational-Database-Service" class="headerlink" title="Amazon RDS (Amazon Relational Database Service)"></a>Amazon RDS (Amazon Relational Database Service)</h1><p>AWS RDS is a managed relational database service that simplifies the setup, operation, and scaling of a relational database in the cloud. It provides cost-efficient and resizable capacity while automating time-consuming administration tasks such as hardware provisioning, database setup, patching, and backups. RDS allows you to focus on your applications so you can give them the fast performance, high availability, security, and compatibility they need.</p>
<h2 id="Amazon-RDS-database-engines"><a href="#Amazon-RDS-database-engines" class="headerlink" title="Amazon RDS database engines"></a>Amazon RDS database engines</h2><p>Amazon RDS is available on six database engines, which optimize for memory, performance, or input&#x2F;output (I&#x2F;O). Supported database engines include:</p>
<ul>
<li>Amazon Aurora</li>
<li>PostgreSQL</li>
<li>MySQL</li>
<li>MariaDB</li>
<li>Oracle Database</li>
<li>Microsoft SQL Server</li>
</ul>
<h2 id="Amazon-Aurora"><a href="#Amazon-Aurora" class="headerlink" title="Amazon Aurora"></a>Amazon Aurora</h2><p>Amazon Aurora is an enterprise-class relational database. It is compatible with MySQL and PostgreSQL relational databases. It is up to five times faster than standard MySQL databases and up to three times faster than standard PostgreSQL databases.</p>
<p>Amazon Aurora helps to reduce your database costs by reducing unnecessary input&#x2F;output (I&#x2F;O) operations, while ensuring that your database resources remain reliable and available. </p>
<p>Consider Amazon Aurora if your workloads require high availability. It replicates six copies of your data across three Availability Zones and continuously backs up your data to Amazon S3.</p>
<h1 id="Amazon-DynamoDB"><a href="#Amazon-DynamoDB" class="headerlink" title="Amazon DynamoDB"></a>Amazon DynamoDB</h1><p>AWS DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. DynamoDB lets you offload the administrative burdens of operating and scaling a distributed database, so you do not have to worry about <strong>hardware provisioning, setup and configuration, replication, software patching, or cluster scaling</strong>.</p>
<p><img width="60%" data-src="/images/posts/cloud/database-and-storage/dydb.png"></img></p>
<h2 id="Features"><a href="#Features" class="headerlink" title="Features"></a>Features</h2><p>Amazon DynamoDB is a key-value database service. It delivers single-digit millisecond performance at any scale. Here are 2 important features of Amazon DynamoDB.</p>
<ul>
<li>Serverless<ul>
<li>DynamoDB is serverless, which means that you do not have to provision, patch, or manage servers.</li>
<li>You also do not have to install, maintain, or operate software.</li>
</ul>
</li>
<li>Automatic Scaling<ul>
<li>As the size of your database shrinks or grows, DynamoDB automatically scales to adjust for changes in capacity while maintaining consistent performance. </li>
<li>This makes it a suitable choice for use cases that require high performance while scaling.</li>
</ul>
</li>
</ul>
<h2 id="Amazon-RDS-VS-Amazon-DynamoDB"><a href="#Amazon-RDS-VS-Amazon-DynamoDB" class="headerlink" title="Amazon RDS VS. Amazon DynamoDB"></a>Amazon RDS VS. Amazon DynamoDB</h2><p><img width="60%" data-src="/images/posts/cloud/database-and-storage/rds-dydb1.png"></img><br><img width="60%" data-src="/images/posts/cloud/database-and-storage/rds-dydb2.png"></img></p>
<ul>
<li>Database Type<ul>
<li>RDS: Relational database service supporting SQL queries and complex transactions, suitable for structured data.</li>
<li>DynamoDB: NoSQL database service optimized for high performance and scalability, handling key-value and document data.</li>
</ul>
</li>
<li>Use Cases<ul>
<li>RDS: Ideal for applications requiring complex queries and transactional integrity, such as business software and reporting systems.</li>
<li>DynamoDB: Best for applications needing fast access, massive scalability, like mobile apps, gaming platforms, and IoT systems.</li>
</ul>
</li>
<li>Performance and Scalability<ul>
<li>RDS: Scalable vertically (upgrading server resources); performance relies on SQL optimization.</li>
<li>DynamoDB: Automatically scalable horizontally, offering consistent single-digit millisecond latency at any scale.</li>
</ul>
</li>
<li>Management<ul>
<li>RDS: Requires some management for scaling and backups.</li>
<li>DynamoDB: Fully managed, with automatic scaling, partitioning, and replication.</li>
</ul>
</li>
<li>Pricing<ul>
<li>RDS: Charges based on compute, storage, and additional features like backups and replication.</li>
<li>DynamoDB: Pricing based on throughput, storage, and optional features like data streaming.</li>
</ul>
</li>
</ul>
<p>Conclusion: Choose RDS for complex relational databases and DynamoDB for high-performance, scalable, simple data structure needs. Each has its strengths depending on the application’s requirements.</p>
<h2 id="Quick-Quiz-2"><a href="#Quick-Quiz-2" class="headerlink" title="Quick Quiz"></a>Quick Quiz</h2><p><img data-src="/images/posts/cloud/database-and-storage/dydb-quiz.png"></img></p>
<h1 id="Amazon-Redshift"><a href="#Amazon-Redshift" class="headerlink" title="Amazon Redshift"></a>Amazon Redshift</h1><p>AWS Redshift is a <strong>fully managed, petabyte-scale data warehouse service</strong> in the cloud. It provides fast query performance by using <strong>columnar</strong> storage technology to improve I&#x2F;O efficiency and parallelizing queries across multiple nodes. </p>
<h2 id="What-is-Data-Warehouse"><a href="#What-is-Data-Warehouse" class="headerlink" title="What is Data Warehouse?"></a>What is Data Warehouse?</h2><p>A Data Warehouse is a system used for <strong>reporting and data analysis</strong>, and is considered a core component of business intelligence. It is designed to enable and support business decisions by <strong>consolidating, aggregating, and organizing data from multiple sources into a central repository</strong>. This repository is structured specifically for query and analysis, often using historical data derived from transaction data, but it can include data from other sources.</p>
<p>Data warehouses support various data analysis tools, such as online analytical processing (OLAP) and data mining. The primary function of a data warehouse is to enable data analysis and support decision-making. It acts as a central repository where information is stored and can be retrieved from multiple sources, making it a valuable resource for gaining insight into the functioning and performance of an organization over time.</p>
<h1 id="AWS-DMS-AWS-Database-Migration-Service"><a href="#AWS-DMS-AWS-Database-Migration-Service" class="headerlink" title="AWS DMS (AWS Database Migration Service)"></a>AWS DMS (AWS Database Migration Service)</h1><p>AWS Database Migration Service (AWS DMS) is a cloud service that makes it easier for you to migrate relational databases, data warehouses, NoSQL databases, and other types of data stores to AWS. DMS can also be used to migrate data between on-premises databases, between different AWS cloud services, or between a combination of cloud and on-premises setups.</p>
<p><img data-src="/images/posts/cloud/database-and-storage/dms.png"></img></p>
<h2 id="2-Processes-Heterogeneous-Database"><a href="#2-Processes-Heterogeneous-Database" class="headerlink" title="2 Processes (Heterogeneous Database)"></a>2 Processes (Heterogeneous Database)</h2><p>AWS Schema Conversion Tool to make the schema structures and database code to match the target DB<br><img data-src="/images/posts/cloud/database-and-storage/dms-proc1.png"></img></p>
<p>AWS DMS to migrate<br><img data-src="/images/posts/cloud/database-and-storage/dms.png"></img></p>
<h2 id="Use-Cases"><a href="#Use-Cases" class="headerlink" title="Use Cases"></a>Use Cases</h2><ul>
<li>Development and test database migrations<ul>
<li>Enabling developers to test applications against production data without affecting production users</li>
<li><img data-src="/images/posts/cloud/database-and-storage/dms-dev-mig.png"></img></li>
</ul>
</li>
<li>Database consolidation<ul>
<li>Combining several databases into a single database</li>
<li><img data-src="/images/posts/cloud/database-and-storage/dms-con.png"></img></li>
</ul>
</li>
<li>Continuous replication<ul>
<li>Sending ongoing copies of your data to other target sources instead of doing a one-time migration</li>
<li><img data-src="/images/posts/cloud/database-and-storage/dms-rep.png"></img></li>
</ul>
</li>
</ul>
<h1 id="Additional-Database-Services"><a href="#Additional-Database-Services" class="headerlink" title="Additional Database Services"></a>Additional Database Services</h1><p>Selecting the appropriate database and storage solutions is crucial for meeting specific business needs without compromising on functionality. AWS offers a range of specialized database services tailored to unique business requirements.</p>
<h2 id="Specialized-AWS-Database-Services"><a href="#Specialized-AWS-Database-Services" class="headerlink" title="Specialized AWS Database Services"></a>Specialized AWS Database Services</h2><ul>
<li><p>Amazon DocumentDB<br>Ideal for content-heavy applications such as content management systems, catalogs, and user profiles, Amazon DocumentDB supports complex document storage needs.</p>
</li>
<li><p>Amazon Neptune<br>Amazon Neptune is a graph database designed for managing intricate data relationships efficiently, suitable for social networks, recommendation systems, and fraud detection.</p>
</li>
<li><p>Amazon Managed Blockchain and Amazon QLDB<br>For applications requiring immutable records, such as in supply chain or financial sectors, Amazon QLDB provides a secure, immutable ledger system, whereas Amazon Managed Blockchain introduces a decentralized blockchain solution.</p>
</li>
<li><p>Performance Enhancement Options<br>Amazon ElastiCache<br>Improves database read times significantly, offering caching solutions like Memcached and Redis to enhance performance without maintenance overhead.</p>
</li>
<li><p>DAX (DynamoDB Accelerator)<br>A native caching layer for DynamoDB, DAX boosts read performance dramatically, making it ideal for applications requiring speedy data retrieval.</p>
</li>
</ul>
<h1 id="Summary"><a href="#Summary" class="headerlink" title="Summary"></a>Summary</h1><ul>
<li>Elastic Block Store (EBS): Provides persistent block storage volumes for EC2 instances, ensuring local storage that is not ephemeral.</li>
<li>Amazon S3: A robust object storage service that allows users to store and retrieve vast amounts of data easily via a user interface or API.</li>
<li>Database Options: Explored relational databases available on AWS for structured data management and DynamoDB for key-value pair, non-relational workloads.</li>
<li>Elastic File System (EFS): Offers a simple, scalable file storage solution for use with AWS Cloud services and on-premises resources.</li>
<li>Amazon Redshift: Serves as a fully managed data warehouse that provides powerful and fast data analysis across your datasets.</li>
<li>Database Migration Service (DMS): Facilitates the migration of your databases to AWS, ensuring seamless data transfer with minimal downtime.</li>
<li>Specialized Storage Services: Briefly covered less commonly known services like Amazon DocumentDB, Neptune, QLDB, and Amazon Managed Blockchain for specific use cases.</li>
<li>Caching Solutions: Discussed the use of Amazon ElastiCache and DynamoDB Accelerator to improve the performance of database reads.</li>
</ul>
<h1 id="Quiz"><a href="#Quiz" class="headerlink" title="Quiz"></a>Quiz</h1><ul>
<li><img data-src="/images/posts/cloud/database-and-storage/q1.png"></img></li>
<li><img data-src="/images/posts/cloud/database-and-storage/q2.png"></img></li>
<li><img data-src="/images/posts/cloud/database-and-storage/q3.png"></img></li>
<li><img data-src="/images/posts/cloud/database-and-storage/q4.png"></img></li>
<li><img data-src="/images/posts/cloud/database-and-storage/q5.png"></img></li>
</ul>
<h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference:"></a>Reference:</h1><ul>
<li>AWS SkillBuilder CPE Course Module5</li>
</ul>
]]></content>
      <tags>
        <tag>cloud</tag>
        <tag>aws</tag>
      </tags>
  </entry>
  <entry>
    <title>[AWS-CPE] Security</title>
    <url>/posts/137330004/</url>
    <content><![CDATA[<style>
    img{
        width: 70%;
    }
</style>

<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>In the AWS security overview, the focus is on the shared responsibility model, which is a fundamental aspect of managing security within the AWS Cloud. In this model, AWS is responsible for securing the cloud infrastructure, including data centers, hardware, and software layers. On the other hand, customers are responsible for securing the workloads they run on the AWS Cloud. This division of responsibilities is designed to ensure comprehensive security both in the cloud and of the cloud. We also hints at discussing further security services, mechanisms, and features provided by AWS to enhance cloud security in subsequent sections.</p>
<span id="more"></span>

<p>Shared Responsibility Model:<br><img data-src="/images/posts/cloud/security/share.png"></img></p>
<h1 id="Objective"><a href="#Objective" class="headerlink" title="Objective"></a>Objective</h1><ul>
<li>Explain the benefits of the shared responsibility model.</li>
<li>Describe multi-factor authentication (MFA).</li>
<li>Differentiate between the AWS Identity and Access Management (IAM) security levels.</li>
<li>Explain the main benefits of AWS Organizations.</li>
<li>Describe security policies at a basic level.</li>
<li>Summarize the benefits of compliance with AWS.</li>
<li>Explain additional AWS security services at a basic level.</li>
</ul>
<h1 id="AWS-Shared-Responsibility-Model"><a href="#AWS-Shared-Responsibility-Model" class="headerlink" title="AWS Shared Responsibility Model"></a>AWS Shared Responsibility Model</h1><p>In the AWS security model, responsibilities are clearly divided between AWS and its customers through the Shared Responsibility Model, ensuring effective security on both sides:</p>
<p><img data-src="/images/posts/cloud/security/share-ec2.png"></img></p>
<h2 id="AWS-Responsibilities-‘of’-the-cloud"><a href="#AWS-Responsibilities-‘of’-the-cloud" class="headerlink" title="AWS Responsibilities: (‘of’ the cloud)"></a>AWS Responsibilities: (‘of’ the cloud)</h2><ul>
<li>Physical security: AWS secures the physical infrastructure of the cloud, including data centers, hardware, and foundational cloud services like network and hypervisor.</li>
<li>Infrastructure management: AWS handles the security of the cloud infrastructure and reinvents technologies to enhance security and efficiency, validated by third-party audits.</li>
</ul>
<h2 id="Customer-Responsibilities-‘in’-the-cloud"><a href="#Customer-Responsibilities-‘in’-the-cloud" class="headerlink" title="Customer Responsibilities: (‘in’ the cloud)"></a>Customer Responsibilities: (‘in’ the cloud)</h2><ul>
<li>System management: Customers manage and secure their operating systems and applications.</li>
<li>Data control: Customers have full control over their data. They decide who accesses the data and manage the encryption.</li>
<li>Patch management: Customers are responsible for keeping their systems patched and updated.</li>
</ul>
<h2 id="Division-of-Responsibility"><a href="#Division-of-Responsibility" class="headerlink" title="Division of Responsibility:"></a>Division of Responsibility:</h2><ul>
<li>The boundary between AWS and customer responsibilities is the operating system level. AWS secures everything below the OS, and customers manage everything above it, including the OS itself.</li>
</ul>
<h2 id="Security-Tools-and-Compliance"><a href="#Security-Tools-and-Compliance" class="headerlink" title="Security Tools and Compliance:"></a>Security Tools and Compliance:</h2><ul>
<li>AWS provides tools for data security and compliance, allowing customers to set data access permissions and utilize encryption to protect their data, even if breaches occur.<br>This model ensures that while AWS provides a secure infrastructure, customers have the autonomy and responsibility to manage their systems and data securely.</li>
</ul>
<h2 id="Quick-Quiz"><a href="#Quick-Quiz" class="headerlink" title="Quick Quiz"></a>Quick Quiz</h2><p><img data-src="/images/posts/cloud/security/share-quiz.png"></img></p>
<h1 id="User-Permissions-and-Access-IAM-AWS-Identity-and-Access-Management"><a href="#User-Permissions-and-Access-IAM-AWS-Identity-and-Access-Management" class="headerlink" title="User Permissions and Access, IAM (AWS Identity and Access Management)"></a>User Permissions and Access, IAM (AWS Identity and Access Management)</h1><p>Let’s  look at AWS Identity and Access Management (IAM), using the analogy of a coffee shop to clarify how permissions and roles are managed within AWS. </p>
<p>IAM Overview:<br><img data-src="/images/posts/cloud/security/iam.png"></img></p>
<h3 id="Shared-Responsibility"><a href="#Shared-Responsibility" class="headerlink" title="Shared Responsibility"></a>Shared Responsibility</h3><ul>
<li>Security on AWS is a shared responsibility between AWS and the customer.<ul>
<li>AWS secures the infrastructure.</li>
<li>Customers manage their data and applications.</li>
</ul>
</li>
</ul>
<h3 id="Root-Account"><a href="#Root-Account" class="headerlink" title="Root Account"></a>Root Account</h3><ul>
<li>The <strong>root user</strong> owns the AWS account and has full permissions.</li>
<li><strong>Multi-factor authentication (MFA)</strong> is recommended immediately upon account creation to enhance security.</li>
</ul>
<h3 id="AWS-Identity-and-Access-Management-IAM"><a href="#AWS-Identity-and-Access-Management-IAM" class="headerlink" title="AWS Identity and Access Management (IAM)"></a>AWS Identity and Access Management (IAM)</h3><ul>
<li><strong>IAM Users</strong>: Initially have no permissions; specific permissions must be explicitly granted.</li>
<li><strong>IAM Policies</strong>: JSON documents define what actions users can and cannot take.</li>
<li><strong>IAM Groups</strong>: Allow collective permission management for users with similar access needs.</li>
<li><strong>IAM Roles</strong>: Provide <strong>temporary access</strong> to resources, without needing a username and password, useful for varying daily tasks or external users. (When someone assumes an IAM role, they abandon all permissions that they had under a previous role and assume the permissions of the new role.)</li>
</ul>
<p>IAM Policy:<br><img data-src="/images/posts/cloud/security/policy.png"></img><br>IAM Group:<br>≈</p>
<h3 id="Principle-of-Least-Privilege"><a href="#Principle-of-Least-Privilege" class="headerlink" title="Principle of Least Privilege"></a>Principle of Least Privilege</h3><ul>
<li>Users should have only the access necessary to perform their job functions, nothing more.</li>
</ul>
<h3 id="Application-in-AWS"><a href="#Application-in-AWS" class="headerlink" title="Application in AWS"></a>Application in AWS</h3><ul>
<li>Policies control access by specifying allowed or denied actions.</li>
<li>Roles can be assumed temporarily, mimicking changing responsibilities in a coffee shop, like different roles for staff depending on the day’s needs.</li>
</ul>
<h1 id="AWS-Organizations"><a href="#AWS-Organizations" class="headerlink" title="AWS Organizations"></a>AWS Organizations</h1><p>As businesses grow and their usage of AWS expands, managing multiple AWS accounts efficiently becomes crucial. AWS Organizations is a service designed to streamline this process by providing tools for centralized account management, ensuring better control over billing, compliance, and security across various accounts. Here’s a concise overview of the primary features and benefits of using AWS Organizations.</p>
<h2 id="Key-Features-of-AWS-Organizations"><a href="#Key-Features-of-AWS-Organizations" class="headerlink" title="Key Features of AWS Organizations"></a>Key Features of AWS Organizations</h2><ul>
<li><p><strong>Centralized Management:</strong></p>
<ul>
<li>All AWS accounts (e.g., Accounts A, B, C, F, G) are combined into a single organization for centralized governance.</li>
</ul>
</li>
<li><p><strong>Consolidated Billing:</strong></p>
<ul>
<li>One primary account manages billing for all member accounts, enabling easier tracking and payment.</li>
<li>Bulk discounts are available through consolidated purchases, providing cost savings.</li>
</ul>
</li>
<li><p><strong>Hierarchical Account Groupings:</strong></p>
<ul>
<li>Accounts can be organized into Organizational Units (OUs) based on specific needs such as security, compliance, or budget.</li>
<li>Examples include grouping accounts by business unit (BUs) or regulatory compliance requirements.</li>
</ul>
</li>
<li><p><strong>Service Control Policies (SCPs):</strong></p>
<ul>
<li>SCPs allow administrators to define and enforce permission limits across the organization.</li>
<li>These policies determine the AWS services, resources, and API actions that users and roles in member accounts can access.</li>
</ul>
</li>
</ul>
<p>Using AWS Organizations helps avoid the complexity and potential security risks associated with managing multiple AWS accounts independently, often referred to as an “AWS account spaghetti.” It simplifies access control, billing, and compliance, ensuring a structured and secure cloud environment as companies scale.</p>
<ul>
<li><img data-src="/images/posts/cloud/security/org-s1.png"></img></li>
<li><img data-src="/images/posts/cloud/security/org-s2.png"></img></li>
<li><img data-src="/images/posts/cloud/security/org-s3.png"></img></li>
</ul>
<h2 id="Quick-Quiz-1"><a href="#Quick-Quiz-1" class="headerlink" title="Quick Quiz"></a>Quick Quiz</h2><p><img data-src="/images/posts/cloud/security/org-quiz.png"></img></p>
<h1 id="Compliance-and-Auditing-in-AWS"><a href="#Compliance-and-Auditing-in-AWS" class="headerlink" title="Compliance and Auditing in AWS"></a>Compliance and Auditing in AWS</h1><p>Navigating compliance and auditing in AWS involves understanding the shared responsibilities and utilizing the right tools to ensure that your applications meet industry standards and regulations. This section explains how AWS supports your compliance efforts and what responsibilities fall on you as a user.</p>
<h2 id="Key-Points-on-Compliance-in-AWS"><a href="#Key-Points-on-Compliance-in-AWS" class="headerlink" title="Key Points on Compliance in AWS"></a><strong>Key Points on Compliance in AWS</strong></h2><ul>
<li><p><strong>Industry Standards and Audits</strong>:</p>
<ul>
<li>Similar to physical businesses (e.g., a coffee shop with health inspections), AWS users must meet applicable regulations like GDPR for EU consumer data, or HIPAA for US healthcare applications.</li>
</ul>
</li>
<li><p><strong>AWS Compliance Framework</strong>:</p>
<ul>
<li><strong>Inheritance of Security Practices</strong>: AWS provides a secure infrastructure and network following industry best practices, which customers inherit.</li>
<li><strong>Assurance Programs</strong>: AWS complies with numerous assurance programs, reducing the compliance burden for users.</li>
</ul>
</li>
<li><p><strong>Regional Compliance Considerations</strong>:</p>
<ul>
<li><strong>Data Residency</strong>: Users can choose AWS Regions that comply with specific data residency requirements to meet compliance.</li>
<li><strong>Data Replication Control</strong>: AWS does not replicate data across regions without user permission, aiding in regulatory compliance.</li>
</ul>
</li>
<li><p><strong>Data Ownership and Control</strong>:</p>
<ul>
<li>Users retain full control over their data stored in AWS.</li>
<li><strong>Encryption Flexibility</strong>: Various encryption options are available to secure data as per specific standards.</li>
</ul>
</li>
<li><p><strong>Utilizing AWS Tools and Resources</strong>:</p>
<ul>
<li><strong>AWS Artifact</strong>: Provides access to third-party compliance reports and documentation.</li>
<li><strong>AWS Compliance Center</strong>: Serves as a central hub for finding compliance information and resources, including risk and security whitepapers.</li>
</ul>
</li>
<li><p><strong>Shared Responsibility Model</strong>:</p>
<ul>
<li><strong>AWS’s Role</strong>: Secures the platform and provides compliance documentation.</li>
<li><strong>User’s Role</strong>: Responsible for securing the applications and data architectures built on AWS.</li>
</ul>
</li>
</ul>
<h3 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h3><p>By understanding and leveraging AWS’s compliance features, while actively managing your responsibilities, you ensure that your applications remain secure and compliant. AWS offers various tools and resources to facilitate compliance with industry standards.</p>
<ul>
<li><img data-src="/images/posts/cloud/security/comp-reg.png"></img></li>
<li><img data-src="/images/posts/cloud/security/comp-quiz.png"></img></li>
</ul>
<h1 id="Denial-of-Service-Attacks"><a href="#Denial-of-Service-Attacks" class="headerlink" title="Denial-of-Service Attacks"></a>Denial-of-Service Attacks</h1><p>Distributed Denial of Service (DDoS) attacks aim to overwhelm and incapacitate your applications by flooding them with unwanted traffic. Understanding how to defend against these attacks within AWS is crucial for maintaining service availability and security. Below, we outline the basics of DDoS attacks and the robust defense mechanisms AWS offers to protect your infrastructure.<br><img data-src="/images/posts/cloud/security/attack.png"></img></p>
<h2 id="Key-Points-on-DDoS-Attacks-and-AWS-Defense-Mechanisms"><a href="#Key-Points-on-DDoS-Attacks-and-AWS-Defense-Mechanisms" class="headerlink" title="Key Points on DDoS Attacks and AWS Defense Mechanisms"></a><strong>Key Points on DDoS Attacks and AWS Defense Mechanisms</strong></h2><ul>
<li><p><strong>Understanding DDoS Attacks</strong>:</p>
<ul>
<li>The goal is to disrupt service by overwhelming the application with excessive traffic.</li>
<li>Attacks use distributed networks of compromised computers (zombie bots) to execute large-scale assaults.</li>
<li><img data-src="/images/posts/cloud/security/ddos.png"></img></li>
</ul>
</li>
<li><p><strong>Examples of DDoS Attacks</strong>:</p>
<ul>
<li><strong>UDP Flood</strong>: Exploits server response to large requests sent to services like the National Weather Service with a spoofed return address.<ul>
<li><img data-src="/images/posts/cloud/security/udp-flood.png"></img></li>
</ul>
</li>
<li><strong>HTTP Level Attacks</strong>: Mimics legitimate user requests in high volumes to block service access to genuine users.<ul>
<li><img data-src="/images/posts/cloud/security/http-att.png"></img></li>
</ul>
</li>
<li><strong>Slowloris</strong>: Simulates a slow connection, occupying server resources for extended periods without completing requests.<ul>
<li><img data-src="/images/posts/cloud/security/slow.png"></img></li>
</ul>
</li>
</ul>
</li>
<li><p><strong>AWS Defense Tools and Strategies</strong>:</p>
<ul>
<li><strong>Security Groups</strong>: Filter incoming traffic to allow only legitimate requests, blocking irrelevant protocol traffic at the AWS network level.<ul>
<li><img data-src="/images/posts/cloud/security/sec-group.png"></img></li>
</ul>
</li>
<li><strong>Elastic Load Balancer (ELB)</strong>: Manages HTTP traffic, ensuring complete requests before passing them to servers, enhancing protection against Slowloris and other similar attacks.<ul>
<li><img data-src="/images/posts/cloud/security/elb.png"></img></li>
</ul>
</li>
<li><strong>AWS Shield and AWS WAF</strong>: Provide additional layers of security with a web application firewall that detects and blocks malicious traffic based on evolving threat signatures.<ul>
<li><img data-src="/images/posts/cloud/security/shield.png"></img></li>
<li><img data-src="/images/posts/cloud/security/waf-good.png"></img></li>
<li><img data-src="/images/posts/cloud/security/waf-bad.png"></img></li>
</ul>
</li>
</ul>
</li>
</ul>
<h2 id="Advantages-of-AWS-Infrastructure"><a href="#Advantages-of-AWS-Infrastructure" class="headerlink" title="Advantages of AWS Infrastructure:"></a><strong>Advantages of AWS Infrastructure</strong>:</h2><ul>
<li>The massive scale of AWS infrastructure provides a natural defense against volume-based attacks, making it financially and logistically challenging for attackers to overwhelm.</li>
<li>AWS’s comprehensive tools and services are designed not only for functionality but also for robust security against common and sophisticated DDoS attacks.</li>
</ul>
<h2 id="Conclusion-1"><a href="#Conclusion-1" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>A well-architected system on AWS, enhanced with specific AWS services like AWS Shield Advanced, offers strong defenses against DDoS attacks, ensuring your applications remain secure and available even under attack.</p>
<h1 id="Additional-Security-Services"><a href="#Additional-Security-Services" class="headerlink" title="Additional Security Services"></a>Additional Security Services</h1><p>As a business grows, securing data—both at rest and in transit—becomes crucial. AWS offers a robust set of tools and services designed to protect data integrity and prevent unauthorized access, similar to safeguarding physical assets in a coffee shop. Below, we summarize the fundamental AWS security measures that help protect your data effectively.</p>
<h2 id="Key-AWS-Security-Measures"><a href="#Key-AWS-Security-Measures" class="headerlink" title="Key AWS Security Measures"></a><strong>Key AWS Security Measures</strong></h2><ul>
<li><p><strong>Encryption Fundamentals</strong>:</p>
<ul>
<li><strong>Encryption at Rest</strong>: Ensures data is secure when stored and not actively being accessed or moved.<ul>
<li><strong>Example</strong>: DynamoDB server-side encryption integrates with AWS Key Management Service (KMS) for secure key management.</li>
</ul>
</li>
<li><strong>Encryption in Transit</strong>: Protects data actively moving from one location to another, ensuring it remains inaccessible to unauthorized users.<ul>
<li><strong>Example</strong>: SSL connections and service certificates protect data transfers, such as between an AWS Redshift instance and a SQL client.</li>
</ul>
</li>
</ul>
</li>
<li><p><strong>AWS Security Services</strong>:</p>
<ul>
<li><strong>Amazon Inspector</strong>:<ul>
<li>Automates security assessments to check compliance with best practices and identifies vulnerabilities.</li>
<li><strong>Components</strong> include network configuration analysis, an Amazon agent on EC2 instances, and a comprehensive security assessment service.</li>
<li>Findings are displayed in the Amazon Inspector console and can also be accessed via API for remediation.</li>
</ul>
</li>
<li><strong>Amazon GuardDuty</strong>:<ul>
<li>Provides threat detection by analyzing metadata from AWS CloudTrail events, Amazon VPC Flow Logs, and DNS logs.</li>
<li>Utilizes threat intelligence, anomaly detection, and machine learning to identify potential security threats accurately.</li>
<li>Operates independently of other AWS services, ensuring no impact on performance or availability.</li>
<li><img data-src="/images/posts/cloud/security/guard.png"></img></li>
</ul>
</li>
</ul>
</li>
<li><p><strong>Additional Resources</strong>:</p>
<ul>
<li>AWS also offers Advanced Shield and Security Hub among other security services.</li>
<li>Users are encouraged to explore the Resources section for more detailed information on these tools.</li>
</ul>
</li>
</ul>
<h2 id="Conclusion-2"><a href="#Conclusion-2" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>AWS provides a comprehensive suite of security tools that mirror the best practices of physical asset protection in a modern digital environment. From encryption techniques that secure data at rest and in transit to sophisticated services like Amazon Inspector and GuardDuty, AWS equips users with the necessary tools to maintain high security and compliance standards.</p>
<h1 id="Summary-AWS-Security-Management"><a href="#Summary-AWS-Security-Management" class="headerlink" title="Summary - AWS Security Management"></a>Summary - AWS Security Management</h1><p>Security in AWS is underpinned by a <strong>shared responsibility model</strong>, where AWS secures the cloud infrastructure, and users are responsible for their data and applications. Below is a breakdown of key components and strategies for effectively managing security within AWS.</p>
<h2 id="Key-Concepts-in-AWS-Security"><a href="#Key-Concepts-in-AWS-Security" class="headerlink" title="Key Concepts in AWS Security"></a><strong>Key Concepts in AWS Security</strong></h2><ul>
<li><p><strong>Shared Responsibility Model</strong>:</p>
<ul>
<li><strong>AWS Responsibilities</strong>: Secures the cloud infrastructure.</li>
<li><strong>User Responsibilities</strong>: Manages security within the cloud, including data and application security.</li>
</ul>
</li>
<li><p><strong>Identity and Access Management (IAM)</strong>:</p>
<ul>
<li><strong>Users</strong>: Individuals with credentials but no permissions by default.</li>
<li><strong>Groups</strong>: Collections of users.</li>
<li><strong>Roles</strong>: Used to grant temporary permissions and credentials.</li>
<li><strong>Policies</strong>: Define permissions, explicitly allowing or denying actions.</li>
<li><strong>Identity Federation</strong>: Allows integration of corporate identities with AWS, facilitating single sign-on.</li>
<li><strong>Multi-Factor Authentication (MFA)</strong>: Particularly recommended for the root user, who has extensive permissions.</li>
</ul>
</li>
<li><p><strong>AWS Organizations</strong>:</p>
<ul>
<li>Manages multiple accounts hierarchically.</li>
<li>Useful for isolating workloads, teams, or applications across different accounts.</li>
</ul>
</li>
<li><p><strong>Compliance</strong>:</p>
<ul>
<li><strong>Third-Party Auditors</strong>: Verify AWS’s adherence to various compliance programs.</li>
<li><strong>AWS Compliance Center and AWS Artifact</strong>: Provide access to compliance information and documents.</li>
</ul>
</li>
<li><p><strong>DDoS Protection</strong>:</p>
<ul>
<li>Tools like <strong>Elastic Load Balancing (ELB)</strong>, <strong>Security Groups</strong>, <strong>AWS Shield</strong>, and <strong>AWS Web Application Firewall (WAF)</strong> help mitigate DDoS attacks.</li>
</ul>
</li>
<li><p><strong>Encryption</strong>:</p>
<ul>
<li><strong>Data Ownership</strong>: Users own their data and are responsible for its encryption both in transit and at rest.</li>
</ul>
</li>
</ul>
<h2 id="Security-Best-Practices"><a href="#Security-Best-Practices" class="headerlink" title="Security Best Practices"></a><strong>Security Best Practices</strong></h2><ul>
<li>Adhere to the <strong>least privilege principle</strong> in IAM.</li>
<li>Employ <strong>encryption</strong> consistently for data protection.</li>
<li>Utilize AWS <strong>security services</strong> to enhance defense mechanisms.</li>
</ul>
<h2 id="Conclusion-3"><a href="#Conclusion-3" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>AWS prioritizes security highly, offering robust tools and guidelines to protect user environments. It is vital for users to understand and implement AWS’s recommended security measures, customize them as per their specific needs, and stay informed via AWS documentation.</p>
<h1 id="Quiz"><a href="#Quiz" class="headerlink" title="Quiz"></a>Quiz</h1><ul>
<li><img data-src="/images/posts/cloud/security/q1.png"></img></li>
<li><img data-src="/images/posts/cloud/security/q2.png"></img></li>
<li><img data-src="/images/posts/cloud/security/q3.png"></img></li>
<li><img data-src="/images/posts/cloud/security/q4.png"></img></li>
<li><img data-src="/images/posts/cloud/security/q5.png"></img></li>
</ul>
<h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference:"></a>Reference:</h1><ul>
<li>AWS SkillBuilder CPE Course Module6</li>
</ul>
]]></content>
      <tags>
        <tag>cloud</tag>
        <tag>aws</tag>
      </tags>
  </entry>
  <entry>
    <title>[AWS-CPE] Monitoring and Analytics</title>
    <url>/posts/2332420025/</url>
    <content><![CDATA[<style>
    img{
        width: 70%;
    }
</style>

<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>Effective monitoring is crucial for managing any business, whether it’s overseeing a coffee shop or monitoring cloud-based resources in AWS. Below, we detail how monitoring functions as a critical component of both daily business operations and technology management.</p>
<span id="more"></span>

<h1 id="Objectives"><a href="#Objectives" class="headerlink" title="Objectives"></a>Objectives</h1><ul>
<li>Summarize approaches to monitoring your AWS environment.</li>
<li>Describe the benefits of Amazon CloudWatch.</li>
<li>Describe the benefits of AWS CloudTrail.</li>
<li>Describe the benefits of AWS Trusted Advisor.</li>
</ul>
<h1 id="Amazon-CloudWatch"><a href="#Amazon-CloudWatch" class="headerlink" title="Amazon CloudWatch"></a>Amazon CloudWatch</h1><p>(monitoring resources)<br>As business operations, such as a coffee shop, become more complex, maintaining visibility and proactive management of systems becomes crucial. Amazon CloudWatch provides an extensive solution for monitoring AWS resources and applications, analogous to keeping a coffee shop’s equipment and operations running smoothly.</p>
<h2 id="Key-Features-of-Amazon-CloudWatch"><a href="#Key-Features-of-Amazon-CloudWatch" class="headerlink" title="Key Features of Amazon CloudWatch"></a><strong>Key Features of Amazon CloudWatch</strong></h2><ul>
<li><p><strong>Real-Time Monitoring and Metrics</strong>:</p>
<ul>
<li>Tracks various metrics related to AWS resources and applications, similar to monitoring espresso counts in a coffee machine.</li>
<li><strong>Example</strong>: Monitoring the CPU utilization of an EC2 instance or tracking custom metrics like the number of espressos made.</li>
</ul>
</li>
<li><p><strong>Alerts and Alarms</strong>:</p>
<ul>
<li><strong>CloudWatch Alarms</strong>: Set thresholds that, when exceeded, trigger notifications or actions, such as alerting staff to clean an espresso machine after 100 uses.</li>
<li>Integrated with <strong>Amazon SNS</strong> for notifications, enabling actions like sending SMS messages to manage urgent tasks.</li>
</ul>
</li>
<li><p><strong>Dashboards</strong>:</p>
<ul>
<li>Visual displays that aggregate metrics in real-time, allowing for a comprehensive view of all monitored resources.</li>
<li>Useful for proactive monitoring and quick assessments of multiple resources at a glance.</li>
</ul>
</li>
</ul>
<p><img data-src="/images/posts/cloud/monitoring/cloud-watch.png"></img></p>
<h2 id="Benefits-of-Using-CloudWatch"><a href="#Benefits-of-Using-CloudWatch" class="headerlink" title="Benefits of Using CloudWatch"></a><strong>Benefits of Using CloudWatch</strong></h2><ul>
<li><p><strong>Centralized Metric Access</strong>:</p>
<ul>
<li>Gathers and displays metrics and logs from all AWS resources, applications, and on-premises servers in one location, enhancing system-wide visibility.</li>
</ul>
</li>
<li><p><strong>Insight Across Infrastructure</strong>:</p>
<ul>
<li>Correlates and visualizes metrics and logs, facilitating quick identification and resolution of issues.</li>
<li>Reduces mean time to resolution (MTTR) and improves total cost of ownership (TCO), analogous to optimizing machine maintenance in a coffee shop to free up resources for customer service.</li>
</ul>
</li>
<li><p><strong>Operational Optimization</strong>:</p>
<ul>
<li>Aggregates usage data across resources like EC2 instances to gain operational insights and optimize utilization.</li>
<li>Helps focus staff efforts on core activities rather than routine maintenance.</li>
</ul>
</li>
</ul>
<h2 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>Amazon CloudWatch equips businesses with the tools necessary to monitor, alert, and manage their AWS resources effectively. By analogizing with a coffee shop’s day-to-day operations, we can see how CloudWatch’s capabilities are essential for maintaining operational efficiency and ensuring resource optimization.</p>
<h1 id="AWS-CloudTrail"><a href="#AWS-CloudTrail" class="headerlink" title="AWS CloudTrail"></a>AWS CloudTrail</h1><p>(monitoring transactions)<br>AWS CloudTrail is a vital tool for auditing and compliance in cloud environments, similar to how a cash register tracks and records every transaction in a store. This service ensures that all actions taken within AWS are logged, providing a reliable means of monitoring and verifying all changes made to the resources.</p>
<h2 id="Key-Features-and-Benefits-of-AWS-CloudTrail"><a href="#Key-Features-and-Benefits-of-AWS-CloudTrail" class="headerlink" title="Key Features and Benefits of AWS CloudTrail"></a><strong>Key Features and Benefits of AWS CloudTrail</strong></h2><ul>
<li><p><strong>Comprehensive Auditing</strong>:</p>
<ul>
<li>Every API request made within AWS is logged by CloudTrail, whether it’s launching an EC2 instance, modifying a DynamoDB table, or altering user permissions.</li>
<li>Details recorded include the identity of the requester, the time of the API call, IP address, and both the response and new state of the system after the request.</li>
</ul>
</li>
<li><p><strong>Security and Compliance</strong>:</p>
<ul>
<li><strong>Root-Level Monitoring</strong>: Tracks changes made by high-level administrators, crucial for ensuring that critical settings, like security group configurations, remain unaltered.</li>
<li><strong>Audit Trails for Compliance</strong>: Provides auditors with definitive proof that no unauthorized changes have been made, especially to security settings.</li>
</ul>
</li>
<li><p><strong>Data Integrity and Storage</strong>:</p>
<ul>
<li>Logs are stored indefinitely in secure S3 buckets, utilizing tamper-proof features like Vault Lock to ensure data integrity and provide a clear lineage of all actions recorded.</li>
</ul>
</li>
</ul>
<p><img data-src="/images/posts/cloud/monitoring/cloud-trail.png"></img></p>
<h2 id="Conclusion-1"><a href="#Conclusion-1" class="headerlink" title="Conclusion"></a><strong>Conclusion</strong></h2><p>AWS CloudTrail is akin to a digital cash register for cloud environments, meticulously recording each transaction and ensuring that all data is accounted for and secure. This capability is essential for maintaining stringent compliance standards and proving system integrity to auditors, making it an indispensable tool for businesses operating in AWS.</p>
<h2 id="Quick-Quiz"><a href="#Quick-Quiz" class="headerlink" title="Quick Quiz"></a>Quick Quiz</h2><p><img data-src="/images/posts/cloud/monitoring/trail-quiz.png"></img></p>
<h1 id="AWS-Trusted-Advisor"><a href="#AWS-Trusted-Advisor" class="headerlink" title="AWS Trusted Advisor"></a>AWS Trusted Advisor</h1><p>AWS Trusted Advisor acts like an automated consultant for your AWS environment, analyzing your setup against best practices across five key areas. It provides insights and recommendations to enhance efficiency, security, and performance, much like an external advisor might offer suggestions to improve a business’s operations.</p>
<h2 id="Key-Features-of-AWS-Trusted-Advisor"><a href="#Key-Features-of-AWS-Trusted-Advisor" class="headerlink" title="Key Features of AWS Trusted Advisor"></a><strong>Key Features of AWS Trusted Advisor</strong></h2><ul>
<li><p><strong>Five Pillars of Evaluation</strong>:</p>
<ul>
<li><strong>Cost Optimization</strong>: Identifies underutilized resources to reduce expenses.</li>
<li><strong>Performance</strong>: Assesses configurations to ensure optimal performance.</li>
<li><strong>Security</strong>: Highlights security vulnerabilities like weak password policies or insufficient MFA.</li>
<li><strong>Fault Tolerance</strong>: Checks for adequate backup measures and balanced deployment across AZs.</li>
<li><strong>Service Limits</strong>: Alerts on approaching or reached AWS service limits.</li>
</ul>
</li>
<li><p><strong>Dashboard and Alerts</strong>:</p>
<ul>
<li><strong>Interface</strong>: Provides a dashboard that categorizes issues into three levels: action recommended (red), investigation recommended (orange), and no problems (green).</li>
<li><strong>Alert System</strong>: Can send email notifications to relevant contacts as issues are detected.</li>
</ul>
</li>
</ul>
<p><img data-src="/images/posts/cloud/monitoring/advisor.png"></img></p>
<h2 id="Practical-Examples-and-Recommendations"><a href="#Practical-Examples-and-Recommendations" class="headerlink" title="Practical Examples and Recommendations"></a><strong>Practical Examples and Recommendations</strong></h2><ul>
<li><p><strong>Cost Optimization Checks</strong>:</p>
<ul>
<li>Might indicate idle RDS instances or underutilized EC2 instances and EBS volumes, suggesting downsizing or termination to save costs.</li>
</ul>
</li>
<li><p><strong>Security Checks</strong>:</p>
<ul>
<li>May alert on urgent security issues like open security groups, advocating for immediate action to close vulnerabilities.</li>
</ul>
</li>
<li><p><strong>Fault Tolerance Advice</strong>:</p>
<ul>
<li>Could point out the absence of EBS volume snapshots, recommending creating backups to prevent data loss.</li>
</ul>
</li>
<li><p><strong>Handling Service Limits</strong>:</p>
<ul>
<li>Informs when nearing or hitting AWS service caps, advising when to request limit increases.</li>
</ul>
</li>
</ul>
<h2 id="Conclusion-2"><a href="#Conclusion-2" class="headerlink" title="Conclusion"></a><strong>Conclusion</strong></h2><p>AWS Trusted Advisor is a comprehensive tool that provides critical insights and actionable advice across multiple aspects of your AWS environment. By enabling Trusted Advisor and setting up its alert systems, you can proactively manage and optimize your resources, ensuring a secure, efficient, and reliable AWS infrastructure.</p>
<h1 id="Summary"><a href="#Summary" class="headerlink" title="Summary"></a>Summary</h1><p>Effective management of cloud environments requires a deep understanding of system operations and resource usage. AWS provides several tools to help businesses maintain efficient, secure, and compliant applications by offering insights into system behavior, user actions, and resource optimization.</p>
<h2 id="Overview-of-Key-AWS-Monitoring-and-Analytical-Tools"><a href="#Overview-of-Key-AWS-Monitoring-and-Analytical-Tools" class="headerlink" title="Overview of Key AWS Monitoring and Analytical Tools"></a><strong>Overview of Key AWS Monitoring and Analytical Tools</strong></h2><ul>
<li><p><strong>Amazon CloudWatch</strong>:</p>
<ul>
<li>Provides near real-time monitoring of <strong>AWS resources and applications</strong>.</li>
<li>Allows tracking of metrics over time to optimize system performance.</li>
<li>Alerts users to conditions that require attention, enhancing proactive management.</li>
</ul>
</li>
<li><p><strong>AWS CloudTrail</strong>:</p>
<ul>
<li>Tracks <strong>user activities and API usage</strong> within AWS.</li>
<li>Offers comprehensive auditing capabilities by recording who accessed what, when, and from where.</li>
<li>Answers most auditing queries, except for the reasons behind the actions.</li>
</ul>
</li>
<li><p><strong>AWS Trusted Advisor</strong>:</p>
<ul>
<li>Delivers a dashboard summarizing over 40 common concerns related to cost, performance, security, and resilience.</li>
<li>Provides <strong>actionable insights</strong> to help optimize resource usage and improve system efficiency.</li>
</ul>
</li>
</ul>
<h2 id="Additional-Tools-and-Resources"><a href="#Additional-Tools-and-Resources" class="headerlink" title="Additional Tools and Resources"></a><strong>Additional Tools and Resources</strong></h2><ul>
<li>While CloudWatch, CloudTrail, and Trusted Advisor are fundamental, AWS offers a variety of other monitoring and analytical tools to suit diverse business needs.</li>
</ul>
<h2 id="Conclusion-3"><a href="#Conclusion-3" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>Understanding and utilizing AWS monitoring tools like CloudWatch, CloudTrail, and Trusted Advisor is crucial for any business looking to secure, optimize, and streamline their cloud operations. These tools not only provide valuable insights into system performance and user activities but also help in maintaining compliance and improving overall operational efficiency.</p>
<h1 id="Quiz"><a href="#Quiz" class="headerlink" title="Quiz"></a>Quiz</h1><ul>
<li><img data-src="/images/posts/cloud/monitoring/q1.png"></img></li>
<li><img data-src="/images/posts/cloud/monitoring/q2.png"></img></li>
<li><img data-src="/images/posts/cloud/monitoring/q3.png"></img></li>
</ul>
<h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h1><ul>
<li>AWS SkillBuilder CPE Module7</li>
</ul>
]]></content>
      <tags>
        <tag>cloud</tag>
        <tag>aws</tag>
      </tags>
  </entry>
  <entry>
    <title>Understanding Cloud Computing - Public, Private, and Hybrid Clouds</title>
    <url>/posts/3108822142/</url>
    <content><![CDATA[<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>Cloud computing has become a cornerstone for deploying modern IT solutions, offering scalable, efficient, and versatile options for businesses and individuals alike. With the evolution of cloud technology, three primary deployment models have emerged: public, private, and hybrid clouds. Each model offers distinct features, benefits, and considerations, making it crucial to understand their differences to choose the most suitable option for your needs. In this article, we’ll delve into the nuances of public, private, and hybrid clouds, helping you navigate the cloud landscape.</p>
<span id="more"></span>

<h1 id="Public-Cloud"><a href="#Public-Cloud" class="headerlink" title="Public Cloud"></a>Public Cloud</h1><p>The public cloud is a model where computing services are <strong>offered by third-party providers</strong> over the Internet, making them available to anyone who wants to use or purchase them. This model is characterized by its scalability, reliability, and flexibility, allowing users to access a wide range of resources and services on a pay-as-you-go basis. Public clouds are ideal for businesses that need to <strong>scale their computing resources up or down quickly</strong> or that do not want to invest in heavy upfront costs for infrastructure.</p>
<p>Below are the features of public cloud:</p>
<ul>
<li>Characteristics<ul>
<li>Multi-tenancy: Shared resources among multiple users.</li>
<li>Scalability: Easily adjusts to demand.</li>
<li>Cost-Effectiveness: Pay-as-you-go model reduces upfront investments.</li>
<li>Maintenance: Handled by the provider, reducing the workload on users.</li>
</ul>
</li>
<li>Benefits<ul>
<li>Quick, easy setup and scalability.</li>
<li>Access to a broad range of services and applications.</li>
<li>No need for physical hardware investments.</li>
</ul>
</li>
<li>Challenges<ul>
<li>Lesser control over security and privacy.</li>
<li>Performance may vary due to the shared resource model.</li>
</ul>
</li>
<li>Examples:<ul>
<li>Amazon Web Services (AWS)</li>
<li>Microsoft Azure</li>
<li>Google Cloud Platform (GCP)</li>
</ul>
</li>
</ul>
<h1 id="Private-Cloud"><a href="#Private-Cloud" class="headerlink" title="Private Cloud"></a>Private Cloud</h1><p>Private cloud refers to cloud computing resources used <strong>exclusively by a single business or organization.</strong> The private cloud can be hosted on the <strong>organization’s own premises or by a third-party service provider but remains within the control of the organization it serves.</strong> This model offers enhanced security and control, making it suitable for businesses with <strong>strict regulatory compliance requirements or those that handle sensitive data.</strong></p>
<p>Below are the features of private cloud:</p>
<ul>
<li>Characteristics<ul>
<li>Single-tenancy: Dedicated resources for a single organization.</li>
<li>Customization: Tailorable to specific business needs.</li>
<li>Control: More direct oversight over security and compliance.</li>
</ul>
</li>
<li>Benefits<ul>
<li>Enhanced security and privacy.</li>
<li>Full control over the cloud environment.</li>
<li>Potentially more cost-effective for steady, predictable workloads.</li>
</ul>
</li>
<li>Challenges<ul>
<li>Requires more significant initial investment and expertise.</li>
<li>Maintenance and management can be resource-intensive.</li>
</ul>
</li>
<li>Examples:<ul>
<li>VMware vSphere</li>
<li>OpenStack</li>
<li>Microsoft Azure Stack</li>
</ul>
</li>
</ul>
<blockquote>
<p>Note: The distinction between private cloud hosted by a third-party provider and public cloud lies in <strong>the architecture, resource allocation, and control</strong> rather than the physical location or ownership of the infrastructure. </p>
</blockquote>
<h1 id="Hybrid-Cloud"><a href="#Hybrid-Cloud" class="headerlink" title="Hybrid Cloud"></a>Hybrid Cloud</h1><p>Hybrid cloud combines public and private cloud elements, with technology enabling data and applications to be shared between them. This approach allows businesses to <strong>keep critical applications and sensitive data in a private cloud for security while leveraging the robust scalability and cost-efficiency of the public cloud for less sensitive operations.</strong> Hybrid cloud is ideal for organizations looking for flexibility, more deployment options, and optimization of their workloads across different environments.</p>
<p>Below are the features of hybrid cloud:</p>
<ul>
<li>Characteristics<ul>
<li>Interoperability: Smooth integration between public and private sectors.</li>
<li>Flexibility: Workloads are movable based on needs and costs.</li>
<li>Optimization: Places workloads in the best environment for performance, cost, and compliance.</li>
</ul>
</li>
<li>Benefits<ul>
<li>Merges the scalability of public clouds with the security of private clouds.</li>
<li>Supports dynamic or fluctuating workloads effectively.</li>
<li>Ideal for meeting both operational flexibility and compliance requirements.</li>
</ul>
</li>
<li>Challenges<ul>
<li>Complexity in setup and ongoing management.</li>
<li>Necessitates strategic planning for seamless operation across environments.</li>
</ul>
</li>
<li>Examples:<ul>
<li>IBM Cloud</li>
<li>Oracle Cloud at Customer</li>
<li>AWS Outposts</li>
</ul>
</li>
</ul>
<h1 id="Comparison"><a href="#Comparison" class="headerlink" title="Comparison"></a>Comparison</h1><table>
<thead>
<tr>
<th>Feature</th>
<th>Public Cloud</th>
<th>Private Cloud</th>
<th>Hybrid Cloud</th>
</tr>
</thead>
<tbody><tr>
<td><strong>Accessibility</strong></td>
<td>Over the Internet, available to all</td>
<td>Restricted to a single organization</td>
<td>Combination, with controlled access</td>
</tr>
<tr>
<td><strong>Cost</strong></td>
<td>Pay-as-you-go, less upfront cost</td>
<td>Higher initial cost and maintenance</td>
<td>Varies, can optimize cost based on usage</td>
</tr>
<tr>
<td><strong>Control</strong></td>
<td>Limited control over infrastructure</td>
<td>High control over resources</td>
<td>Balances control and flexibility</td>
</tr>
<tr>
<td><strong>Security</strong></td>
<td>Good, but shared with other tenants</td>
<td>Enhanced, as resources are not shared</td>
<td>Customizable, can be tailored for each part</td>
</tr>
<tr>
<td><strong>Scalability</strong></td>
<td>Highly scalable, on-demand resources</td>
<td>Scalable, but within private resources</td>
<td>Highly scalable, leverages both models</td>
</tr>
<tr>
<td><strong>Suitability</strong></td>
<td>Small to large businesses needing scalability</td>
<td>Businesses with strict data control needs</td>
<td>Businesses requiring flexibility and control</td>
</tr>
<tr>
<td><strong>Maintenance</strong></td>
<td>Managed by the provider</td>
<td>Managed by the organization or provider</td>
<td>Combination, depending on where hosted</td>
</tr>
</tbody></table>
<h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h1><ul>
<li><span class="exturl" data-url="aHR0cHM6Ly93d3cuYnVzaW5lc3N0ZWNod2Vla2x5LmNvbS9vcGVyYXRpb25hbC1lZmZpY2llbmN5L2Nsb3VkLWNvbXB1dGluZy9wcml2YXRlLWNsb3VkLXZzLXB1YmxpYy1jbG91ZC8=">Private Cloud vs Public Cloud: Which Cloud Computing deployment model is best for your business?<i class="fa fa-external-link-alt"></i></span></li>
</ul>
]]></content>
      <tags>
        <tag>cloud computing</tag>
      </tags>
  </entry>
  <entry>
    <title>Understanding AWS Networking: VGW, VPC Endpoints, and Security Groups</title>
    <url>/posts/3294285798/</url>
    <content><![CDATA[<h1 id="Understanding-AWS-Networking-VGW-VPC-Endpoints-and-Security-Groups"><a href="#Understanding-AWS-Networking-VGW-VPC-Endpoints-and-Security-Groups" class="headerlink" title="Understanding AWS Networking: VGW, VPC Endpoints, and Security Groups"></a>Understanding AWS Networking: VGW, VPC Endpoints, and Security Groups</h1><p>When working with Amazon Web Services (AWS), understanding the various networking components is crucial for designing secure and efficient architectures. In this article, we will explore three key concepts: Virtual Private Gateway (VGW), VPC Endpoints, and Security Groups.</p>
<span id="more"></span>

<h2 id="Virtual-Private-Gateway-VGW"><a href="#Virtual-Private-Gateway-VGW" class="headerlink" title="Virtual Private Gateway (VGW)"></a>Virtual Private Gateway (VGW)</h2><h3 id="What-is-VGW"><a href="#What-is-VGW" class="headerlink" title="What is VGW?"></a>What is VGW?</h3><p>A <strong>Virtual Private Gateway (VGW)</strong> is a component that enables communication between your Amazon Virtual Private Cloud (VPC) and on-premises networks or other VPCs. It acts as the VPN concentrator on the AWS side of the VPN connection.</p>
<h3 id="Key-Aspects-of-VGW"><a href="#Key-Aspects-of-VGW" class="headerlink" title="Key Aspects of VGW"></a>Key Aspects of VGW</h3><ul>
<li><strong>Functionality</strong>: VGW is used to set up a VPN connection between your VPC and your on-premises data center or another VPC.</li>
<li><strong>Use Cases</strong>: It is typically used for hybrid cloud setups where resources are spread across on-premises infrastructure and AWS.</li>
<li><strong>Components</strong>: A VPN connection includes two tunnels between your VGW and the on-premises VPN appliance, providing redundancy.</li>
</ul>
<h3 id="Benefits-of-VGW"><a href="#Benefits-of-VGW" class="headerlink" title="Benefits of VGW"></a>Benefits of VGW</h3><ul>
<li><strong>Secure Connectivity</strong>: Provides secure connections between AWS and on-premises networks.</li>
<li><strong>Redundancy</strong>: Multiple VPN tunnels ensure high availability.</li>
<li><strong>Scalability</strong>: Easily scalable to meet growing network demands.</li>
</ul>
<h2 id="VPC-Endpoints"><a href="#VPC-Endpoints" class="headerlink" title="VPC Endpoints"></a>VPC Endpoints</h2><h3 id="What-are-VPC-Endpoints"><a href="#What-are-VPC-Endpoints" class="headerlink" title="What are VPC Endpoints?"></a>What are VPC Endpoints?</h3><p><strong>VPC Endpoints</strong> allow you to privately connect your VPC to supported AWS services and VPC endpoint services powered by AWS PrivateLink without requiring an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.</p>
<h3 id="Types-of-VPC-Endpoints"><a href="#Types-of-VPC-Endpoints" class="headerlink" title="Types of VPC Endpoints"></a>Types of VPC Endpoints</h3><ul>
<li><strong>Interface Endpoints</strong>: These use AWS PrivateLink and provide private connectivity to services like Amazon S3, DynamoDB, and others via a network interface in your VPC.</li>
<li><strong>Gateway Endpoints</strong>: These are used specifically for S3 and DynamoDB and route traffic to these services without traversing the internet.</li>
</ul>
<h3 id="Benefits-of-VPC-Endpoints"><a href="#Benefits-of-VPC-Endpoints" class="headerlink" title="Benefits of VPC Endpoints"></a>Benefits of VPC Endpoints</h3><ul>
<li><strong>Enhanced Security</strong>: Ensures that traffic between your VPC and AWS services does not leave the AWS network.</li>
<li><strong>Simplified Network Architecture</strong>: Reduces the need for an internet gateway or NAT device.</li>
<li><strong>Cost-Efficiency</strong>: Can reduce data transfer costs compared to internet-based communication.</li>
</ul>
<h2 id="Security-Groups"><a href="#Security-Groups" class="headerlink" title="Security Groups"></a>Security Groups</h2><h3 id="What-are-Security-Groups"><a href="#What-are-Security-Groups" class="headerlink" title="What are Security Groups?"></a>What are Security Groups?</h3><p><strong>Security Groups</strong> are virtual firewalls that control inbound and outbound traffic to AWS resources within a VPC.</p>
<h3 id="Key-Characteristics-of-Security-Groups"><a href="#Key-Characteristics-of-Security-Groups" class="headerlink" title="Key Characteristics of Security Groups"></a>Key Characteristics of Security Groups</h3><ul>
<li><strong>Functionality</strong>: Act at the instance level and control traffic based on protocols, ports, and source&#x2F;destination IP addresses.</li>
<li><strong>Rules</strong>: Define rules to allow or deny traffic. For inbound rules, specify the allowed traffic to the instance. For outbound rules, specify the allowed traffic from the instance.</li>
<li><strong>Statefulness</strong>: Security groups are stateful, meaning if you allow an inbound request from an IP, the response is automatically allowed regardless of outbound rules.</li>
</ul>
<h3 id="Benefits-of-Security-Groups"><a href="#Benefits-of-Security-Groups" class="headerlink" title="Benefits of Security Groups"></a>Benefits of Security Groups</h3><ul>
<li><strong>Granular Control</strong>: Fine-grained control over inbound and outbound traffic.</li>
<li><strong>Stateful Rules</strong>: Simplifies rule management by automatically allowing response traffic.</li>
<li><strong>Enhanced Security</strong>: Helps protect instances from unwanted traffic.</li>
</ul>
<h2 id="Quick-Quiz"><a href="#Quick-Quiz" class="headerlink" title="Quick Quiz"></a>Quick Quiz</h2><p>Which techniques should you use to secure an Amazon Relational Database Service (Amazon RDS) database? (Select THREE.)<br> [] AWS Identity and Access Management (IAM) policies to define access at the table, [] row, and column levels<br> [v] Security groups to control network access to individual instances<br> [] An Amazon Virtual Private Cloud (Amazon VPC) gateway endpoint to prevent traffic [] from traversing the internet<br> [] A virtual private gateway (VGW) to filter traffic from restricted networks<br> [v] A virtual private cloud (VPC) to provide instance isolation and firewall protection<br> [v] Encryption to protect sensitive data</p>
<h3 id="Reason-of-False-Selections"><a href="#Reason-of-False-Selections" class="headerlink" title="Reason of False Selections"></a>Reason of False Selections</h3><ol>
<li>AWS Identity and Access Management (IAM) policies to define access at the table, row, and column levels<ol>
<li>Reason: While IAM policies are crucial for managing permissions and access controls, RDS uses database-specific authentication mechanisms for access control at more granular levels such as table, row, and column. IAM is typically used for controlling access to the RDS instance itself rather than the data within it.</li>
</ol>
</li>
<li>An Amazon Virtual Private Cloud (Amazon VPC) gateway endpoint to prevent traffic from traversing the internet<ol>
<li>Reason: VPC endpoints are useful for connecting to AWS services privately without going through the public internet. However, they are not specifically designed for securing RDS instances. The primary use of VPC endpoints is for connecting to services like S3 or DynamoDB within a VPC, not for instance-level security.</li>
</ol>
</li>
<li>A virtual private gateway (VGW) to filter traffic from restricted networks<ol>
<li>Reason: A VGW is used to establish VPN connections between your VPC and on-premises networks. While it provides a secure connection for hybrid environments, it is not specifically tailored to securing RDS instances directly. VGWs are more about connecting different network environments rather than providing instance-level security.</li>
</ol>
</li>
</ol>
<h2 id="Extra-VGW-vs-IGW-Internet-Gate-Way"><a href="#Extra-VGW-vs-IGW-Internet-Gate-Way" class="headerlink" title="Extra: VGW vs. IGW(Internet Gate Way)"></a>Extra: VGW vs. IGW(Internet Gate Way)</h2><p>On-Premises Network          Internet<br>       |                       |<br>       |                       |<br>      VPN                      |<br>       |                       |<br>    VGW|                       |IGW<br>      |&#x2F;                      |&#x2F;<br>   +——-+               +——–+<br>   |  VPC  |               |  VPC   |<br>   +——-+               +——–+<br>   |       |               |        |<br> Instances              Instances</p>
<h2 id="Extra-VGW-vs-IGW-vs-VPC-gateway-endpoint"><a href="#Extra-VGW-vs-IGW-vs-VPC-gateway-endpoint" class="headerlink" title="Extra VGW vs. IGW vs. VPC gateway endpoint"></a>Extra VGW vs. IGW vs. VPC gateway endpoint</h2><ul>
<li>Internet Gateway (IGW): Provides internet access for VPC instances.</li>
<li>Virtual Private Gateway (VGW): Facilitates secure connections between VPC and on-premises networks.</li>
<li>VPC Gateway Endpoint: Allows private access to supported AWS services from within the VPC.</li>
</ul>
<h2 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>Understanding VGW, VPC Endpoints, and Security Groups is essential for designing secure and efficient network architectures in AWS. VGWs provide secure connectivity between VPCs and on-premises networks, VPC Endpoints allow private connections to AWS services, and Security Groups offer granular control over instance traffic. Leveraging these components effectively can significantly enhance the security and performance of your AWS infrastructure.</p>
<p>By mastering these AWS networking fundamentals, you can ensure that your applications and data remain secure and accessible, while optimizing costs and simplifying network management.</p>
<h2 id="References"><a href="#References" class="headerlink" title="References"></a>References</h2><p><span class="exturl" data-url="aHR0cHM6Ly9nb2RsZW9uLmdpdGh1Yi5pby9ibG9nL0FXUy9BV1MtU09BLVZQQy8=">AWS SOA 學習筆記 - VPC(Virtual Private Cloud)<i class="fa fa-external-link-alt"></i></span></p>
]]></content>
      <tags>
        <tag>cloud computing</tag>
      </tags>
  </entry>
  <entry>
    <title>Understanding Networking in AWS</title>
    <url>/posts/3613028504/</url>
    <content><![CDATA[<h2 id="The-Foundational-Role-of-a-VPC-in-AWS-Cloud-Networking"><a href="#The-Foundational-Role-of-a-VPC-in-AWS-Cloud-Networking" class="headerlink" title="The Foundational Role of a VPC in AWS Cloud Networking"></a>The Foundational Role of a VPC in AWS Cloud Networking</h2><p>A <strong>Virtual Private Cloud (VPC)</strong> in Amazon Web Services (AWS) serves as the cornerstone of your AWS network infrastructure. It enables you to launch AWS resources in a logically isolated virtual network that you define. This virtual network closely mirrors a traditional network you would operate in your data center, benefiting from the scalability and flexibility of AWS’s cloud infrastructure.</p>
<p><img data-src="/images/posts/cloud/vpc/structure.png"></img><br>source: <span class="exturl" data-url="aHR0cHM6Ly9kb2NzLmF3cy5hbWF6b24uY29tL3ZwYy9sYXRlc3QvdXNlcmd1aWRlL3ZwYy1leGFtcGxlLXByaXZhdGUtc3VibmV0cy1uYXQuaHRtbA==">https://docs.aws.amazon.com/vpc/latest/userguide/vpc-example-private-subnets-nat.html<i class="fa fa-external-link-alt"></i></span></p>
<span id="more"></span>

<h3 id="Key-Components-and-Concepts-of-VPC"><a href="#Key-Components-and-Concepts-of-VPC" class="headerlink" title="Key Components and Concepts of VPC"></a>Key Components and Concepts of VPC</h3><p><strong>1. Subnets:</strong></p>
<ul>
<li>Subnets are subdivisions within your VPC, each representing a range of IP addresses within the VPC. They help organize and isolate resources within different availability zones for high availability.</li>
</ul>
<p><strong>2. Internet Gateway:</strong></p>
<ul>
<li>An Internet Gateway is a horizontally scaled, redundant, and highly available VPC component that allows communication between instances in your VPC and the internet. It serves as a target in your VPC route tables for internet-routable traffic.</li>
</ul>
<p><strong>3. Route Tables:</strong></p>
<ul>
<li>Route tables contain a set of rules, called routes, that determine where network traffic is directed. Each subnet in your VPC must be associated with a route table, which controls the routing for the subnet.</li>
</ul>
<p><strong>4. Security Groups:</strong></p>
<ul>
<li>Security Groups act as virtual firewalls for your instances to control inbound and outbound traffic. They are stateful, meaning they track the state of network connections passing through them, allowing responses to outbound traffic automatically.</li>
</ul>
<p><strong>5. Network ACLs (Access Control Lists):</strong></p>
<ul>
<li>Network ACLs provide an additional layer of security at the subnet level, controlling inbound and outbound traffic to and from subnets. Unlike security groups, network ACLs are stateless, meaning they evaluate each packet that crosses the subnet boundary.</li>
</ul>
<p><strong>6. VPC Peering:</strong></p>
<ul>
<li>VPC Peering allows you to route traffic between VPCs using private IP addresses, facilitating communication between VPCs in the same or different AWS accounts and regions.</li>
</ul>
<p><strong>7. Transit Gateway:</strong></p>
<ul>
<li>AWS Transit Gateway acts as a central hub to connect multiple VPCs and on-premises networks. It simplifies network management by consolidating and controlling the routing between these environments.</li>
</ul>
<p><strong>8. AWS PrivateLink:</strong></p>
<ul>
<li>AWS PrivateLink enables private connectivity between VPCs, AWS services, and on-premises applications without exposing traffic to the public internet. This enhances security by keeping traffic within the AWS network.</li>
</ul>
<h3 id="Connecting-Your-AWS-Networking-Environment-to-the-Internet"><a href="#Connecting-Your-AWS-Networking-Environment-to-the-Internet" class="headerlink" title="Connecting Your AWS Networking Environment to the Internet"></a>Connecting Your AWS Networking Environment to the Internet</h3><p>To connect your VPC to the internet, you need to:</p>
<p><strong>1. Attach an Internet Gateway:</strong></p>
<ul>
<li>Attach an Internet Gateway to your VPC. This provides a target for internet-bound traffic from instances within your VPC.</li>
</ul>
<p><strong>2. Update Route Tables:</strong></p>
<ul>
<li>Add routes to your route tables that direct internet-bound traffic to the Internet Gateway.</li>
</ul>
<p><strong>3. Assign Public IP Addresses:</strong></p>
<ul>
<li>Ensure that instances that need to communicate with the internet have public IP addresses or Elastic IP addresses.</li>
</ul>
<p><strong>4. Configure Security Groups and Network ACLs:</strong></p>
<ul>
<li>Modify the inbound and outbound rules of your security groups and network ACLs to allow the necessary traffic to flow to and from the internet.</li>
</ul>
<h3 id="Isolating-Resources-Within-Your-AWS-Networking-Environment"><a href="#Isolating-Resources-Within-Your-AWS-Networking-Environment" class="headerlink" title="Isolating Resources Within Your AWS Networking Environment"></a>Isolating Resources Within Your AWS Networking Environment</h3><p>Isolation of resources within your AWS networking environment can be achieved through several strategies:</p>
<p><strong>1. Subnets:</strong></p>
<ul>
<li>Create public and private subnets to separate resources that need direct internet access from those that don’t. For example, public subnets can house web servers, while private subnets can house database servers.</li>
</ul>
<p><strong>2. Security Groups:</strong></p>
<ul>
<li>Configure security groups to control traffic to and from specific instances based on IP address, protocol, and port.</li>
</ul>
<p><strong>3. Network ACLs:</strong></p>
<ul>
<li>Use network ACLs to provide an additional layer of stateless traffic filtering at the subnet level.</li>
</ul>
<p><strong>4. VPC Peering and Transit Gateways:</strong></p>
<ul>
<li>Use VPC Peering or Transit Gateways to connect isolated VPCs while maintaining control over the traffic flow between them.</li>
</ul>
<h3 id="Creating-a-VPC-with-Subnets-an-Internet-Gateway-Route-Tables-and-Security-Groups"><a href="#Creating-a-VPC-with-Subnets-an-Internet-Gateway-Route-Tables-and-Security-Groups" class="headerlink" title="Creating a VPC with Subnets, an Internet Gateway, Route Tables, and Security Groups"></a>Creating a VPC with Subnets, an Internet Gateway, Route Tables, and Security Groups</h3><p>Here is a step-by-step outline to create a VPC and configure its components:</p>
<p><strong>1. Create a VPC:</strong></p>
<ul>
<li>Define your VPC with a specified IPv4 CIDR block.</li>
</ul>
<p><strong>2. Create Subnets:</strong></p>
<ul>
<li>Create public and private subnets within different availability zones to enhance fault tolerance.</li>
</ul>
<p><strong>3. Attach an Internet Gateway:</strong></p>
<ul>
<li>Attach an Internet Gateway to your VPC to allow internet access.</li>
</ul>
<p><strong>4. Update Route Tables:</strong></p>
<ul>
<li>Create route tables and associate them with your subnets. Add a route in the public subnet’s route table to direct internet-bound traffic to the Internet Gateway.</li>
</ul>
<p><strong>5. Configure Security Groups:</strong></p>
<ul>
<li>Create and configure security groups to control inbound and outbound traffic to your instances. For example, allow HTTP and HTTPS traffic for web servers, and limit SSH access to specific IP addresses.</li>
</ul>
<p><strong>6. Deploy Instances:</strong></p>
<ul>
<li>Launch EC2 instances within your subnets, applying the appropriate security groups and configuring their IP addresses as needed.</li>
</ul>
<p>By following these steps and understanding these concepts, you can effectively set up and manage a secure and scalable networking environment in AWS.</p>
<h3 id="References"><a href="#References" class="headerlink" title="References"></a>References</h3><ul>
<li><span class="exturl" data-url="aHR0cHM6Ly9kb2NzLmF3cy5hbWF6b24uY29tL3ZwYy9sYXRlc3QvdXNlcmd1aWRlL3doYXQtaXMtYW1hem9uLXZwYy5odG1s">AWS Documentation on Amazon VPC<i class="fa fa-external-link-alt"></i></span>.</li>
<li><span class="exturl" data-url="aHR0cHM6Ly9kb2NzLmF3cy5hbWF6b24uY29tL3ZwYy9sYXRlc3QvdXNlcmd1aWRlL3ZwYy1leGFtcGxlLXByaXZhdGUtc3VibmV0cy1uYXQuaHRtbA==">Example: VPC with servers in private subnets and NAT<i class="fa fa-external-link-alt"></i></span></li>
</ul>
]]></content>
      <tags>
        <tag>aws</tag>
        <tag>cloud computing</tag>
      </tags>
  </entry>
  <entry>
    <title>Connecting Networks in AWS: A Comprehensive Guide</title>
    <url>/posts/136966596/</url>
    <content><![CDATA[<h1 id="Connecting-Networks-in-AWS-A-Comprehensive-Guide"><a href="#Connecting-Networks-in-AWS-A-Comprehensive-Guide" class="headerlink" title="Connecting Networks in AWS: A Comprehensive Guide"></a>Connecting Networks in AWS: A Comprehensive Guide</h1><p>In today’s hybrid cloud environments, seamlessly connecting on-premises networks with the AWS Cloud, as well as interconnecting Virtual Private Clouds (VPCs) within AWS, is crucial for building scalable and efficient cloud architectures. In this blog, we will explore how to achieve these connections, scale VPCs, and integrate VPCs with supported AWS services.</p>
<p><img data-src="/images/posts/cloud/vpc-connect/connectivity-overview.png"></img><br>source: <span class="exturl" data-url="aHR0cHM6Ly9kb2NzLmF3cy5hbWF6b24uY29tL3ZwYy9sYXRlc3QvdXNlcmd1aWRlL2V4dGVuZC1pbnRyby5odG1s">https://docs.aws.amazon.com/vpc/latest/userguide/extend-intro.html<i class="fa fa-external-link-alt"></i></span></p>
<span id="more"></span>

<h2 id="Connecting-an-On-Premises-Network-to-the-AWS-Cloud"><a href="#Connecting-an-On-Premises-Network-to-the-AWS-Cloud" class="headerlink" title="Connecting an On-Premises Network to the AWS Cloud"></a>Connecting an On-Premises Network to the AWS Cloud</h2><p>Connecting your on-premises network to the AWS Cloud can be accomplished using several methods, each with its own set of benefits:</p>
<h3 id="AWS-Direct-Connect"><a href="#AWS-Direct-Connect" class="headerlink" title="AWS Direct Connect"></a>AWS Direct Connect</h3><p>AWS Direct Connect establishes a dedicated network connection from your premises to AWS. This private connection can offer lower latency, higher bandwidth, and a more consistent network experience compared to internet-based connections.</p>
<p><strong>Steps to set up AWS Direct Connect:</strong></p>
<ol>
<li><strong>Request a Connection:</strong> Sign up for AWS Direct Connect and request a dedicated connection.</li>
<li><strong>Set Up the Physical Connection:</strong> Connect your on-premises router to an AWS Direct Connect location using an Ethernet cable.</li>
<li><strong>Create a Virtual Interface:</strong> Define a virtual interface to establish a logical connection to the desired VPC or AWS service.</li>
<li><strong>Configure Routing:</strong> Set up Border Gateway Protocol (BGP) to route traffic between your on-premises network and AWS.</li>
</ol>
<h3 id="AWS-Site-to-Site-VPN"><a href="#AWS-Site-to-Site-VPN" class="headerlink" title="AWS Site-to-Site VPN"></a>AWS Site-to-Site VPN</h3><p>AWS Site-to-Site VPN allows you to create a secure, encrypted connection over the internet between your on-premises network and your AWS VPC.</p>
<p><strong>Steps to set up AWS Site-to-Site VPN:</strong></p>
<ol>
<li><strong>Create a Customer Gateway:</strong> Define your on-premises router as a customer gateway in the AWS Management Console.</li>
<li><strong>Create a Virtual Private Gateway:</strong> Attach a virtual private gateway to your VPC.</li>
<li><strong>Establish the VPN Connection:</strong> Set up the VPN connection between the customer gateway and the virtual private gateway.</li>
<li><strong>Configure Routing:</strong> Update your route tables to direct traffic through the VPN connection.</li>
</ol>
<h2 id="Connecting-VPCs-in-the-AWS-Cloud"><a href="#Connecting-VPCs-in-the-AWS-Cloud" class="headerlink" title="Connecting VPCs in the AWS Cloud"></a>Connecting VPCs in the AWS Cloud</h2><p>Connecting VPCs within AWS can be done using several methods, such as VPC Peering, AWS Transit Gateway, and PrivateLink. Here, we will focus on VPC Peering.</p>
<h3 id="VPC-Peering"><a href="#VPC-Peering" class="headerlink" title="VPC Peering"></a>VPC Peering</h3><p>VPC Peering allows you to connect two VPCs privately using AWS’s network, enabling you to route traffic between them using private IP addresses.</p>
<p><strong>Steps to set up VPC Peering:</strong></p>
<ol>
<li><strong>Create a Peering Connection:</strong> In the AWS Management Console, navigate to the VPC Dashboard and create a peering connection between the VPCs.</li>
<li><strong>Accept the Peering Request:</strong> The owner of the peer VPC must accept the peering connection request.</li>
<li><strong>Update Route Tables:</strong> Modify the route tables of both VPCs to route traffic through the peering connection.</li>
<li><strong>Update Security Groups:</strong> Adjust the security group rules to allow traffic between the peered VPCs.</li>
</ol>
<h2 id="Scaling-VPCs-in-the-AWS-Cloud"><a href="#Scaling-VPCs-in-the-AWS-Cloud" class="headerlink" title="Scaling VPCs in the AWS Cloud"></a>Scaling VPCs in the AWS Cloud</h2><p>As your cloud infrastructure grows, scaling your VPCs becomes essential. AWS provides several mechanisms to scale VPCs effectively:</p>
<h3 id="Subnet-Scaling"><a href="#Subnet-Scaling" class="headerlink" title="Subnet Scaling"></a>Subnet Scaling</h3><p>Divide your VPC into multiple subnets, each representing different availability zones (AZs). This allows for high availability and fault tolerance.</p>
<p><strong>Steps to scale using subnets:</strong></p>
<ol>
<li><strong>Plan Subnet IP Ranges:</strong> Allocate IP address ranges for each subnet.</li>
<li><strong>Create Subnets:</strong> In the AWS Management Console, create subnets in different AZs.</li>
<li><strong>Distribute Resources:</strong> Distribute your resources (e.g., EC2 instances) across these subnets for load balancing and redundancy.</li>
</ol>
<h3 id="Elastic-Load-Balancing-ELB"><a href="#Elastic-Load-Balancing-ELB" class="headerlink" title="Elastic Load Balancing (ELB)"></a>Elastic Load Balancing (ELB)</h3><p>ELB distributes incoming application traffic across multiple targets, such as EC2 instances, containers, and IP addresses, in one or more AZs.</p>
<p><strong>Steps to set up ELB:</strong></p>
<ol>
<li><strong>Create a Load Balancer:</strong> In the AWS Management Console, navigate to the EC2 Dashboard and create a load balancer.</li>
<li><strong>Configure Load Balancer:</strong> Specify the load balancer settings, such as listeners and security groups.</li>
<li><strong>Register Targets:</strong> Add your EC2 instances or other targets to the load balancer.</li>
<li><strong>Update Route Tables:</strong> Ensure that the route tables direct traffic to the load balancer.</li>
</ol>
<h2 id="Connecting-VPCs-to-Supported-AWS-Services"><a href="#Connecting-VPCs-to-Supported-AWS-Services" class="headerlink" title="Connecting VPCs to Supported AWS Services"></a>Connecting VPCs to Supported AWS Services</h2><p>AWS offers various services that can be integrated with your VPCs to enhance functionality and security. Some of these services include AWS Lambda, Amazon RDS, and Amazon S3.</p>
<h3 id="AWS-PrivateLink"><a href="#AWS-PrivateLink" class="headerlink" title="AWS PrivateLink"></a>AWS PrivateLink</h3><p>AWS PrivateLink enables you to privately access AWS services and third-party services from your VPC without using public IP addresses.</p>
<p><strong>Steps to set up AWS PrivateLink:</strong></p>
<ol>
<li><strong>Create a VPC Endpoint:</strong> In the AWS Management Console, create a VPC endpoint for the desired service.</li>
<li><strong>Configure Security Groups:</strong> Adjust security group rules to allow traffic between your VPC and the service endpoint.</li>
<li><strong>Update Route Tables:</strong> Add routes to direct traffic to the VPC endpoint.</li>
</ol>
<h3 id="AWS-Transit-Gateway"><a href="#AWS-Transit-Gateway" class="headerlink" title="AWS Transit Gateway"></a>AWS Transit Gateway</h3><p>AWS Transit Gateway connects multiple VPCs and on-premises networks through a central hub. This simplifies network architecture and management.</p>
<p><strong>Steps to set up AWS Transit Gateway:</strong></p>
<ol>
<li><strong>Create a Transit Gateway:</strong> In the AWS Management Console, create a transit gateway.</li>
<li><strong>Attach VPCs:</strong> Attach your VPCs to the transit gateway.</li>
<li><strong>Configure Route Tables:</strong> Set up transit gateway route tables to manage traffic flow.</li>
<li><strong>Establish VPN Connections:</strong> Optionally, create VPN connections between the transit gateway and your on-premises networks.</li>
</ol>
<p>By following these guidelines, you can create a robust and scalable network architecture that seamlessly connects your on-premises environments, VPCs, and AWS services, ensuring high availability, security, and performance.</p>
<h2 id="References"><a href="#References" class="headerlink" title="References"></a>References</h2><p><span class="exturl" data-url="aHR0cHM6Ly9kb2NzLmF3cy5hbWF6b24uY29tL3ZwYy9sYXRlc3QvdXNlcmd1aWRlL2V4dGVuZC1pbnRyby5odG1s">Connect your VPC to other networks<i class="fa fa-external-link-alt"></i></span></p>
]]></content>
      <tags>
        <tag>aws</tag>
        <tag>cloud computing</tag>
      </tags>
  </entry>
  <entry>
    <title>DataLake vs. Data Warehouse vs. Data Mart</title>
    <url>/posts/970127527/</url>
    <content><![CDATA[<h1 id="Understanding-the-Differences-Data-Lake-Data-Warehouse-and-Data-Mart"><a href="#Understanding-the-Differences-Data-Lake-Data-Warehouse-and-Data-Mart" class="headerlink" title="Understanding the Differences: Data Lake, Data Warehouse, and Data Mart"></a>Understanding the Differences: Data Lake, Data Warehouse, and Data Mart</h1><p>In today’s data-driven world, organizations leverage various data management systems to harness their data effectively. Among these systems, Data Lakes, Data Warehouses, and Data Marts are pivotal in supporting data storage, processing, and analysis. This article explores the key differences among these three systems, helping you understand their unique roles in data management.</p>
<span id="more"></span>

<p><img data-src="/images/posts/datascience/relation.png"></img><br>source: <span class="exturl" data-url="aHR0cHM6Ly9tZWRpdW0uY29tL0BkYXZpZC5hbHZhcmVzLjYyL2RhdGFsYWtlLWRhdGF3YXJlaG91c2UtZGF0YW1hcnQtd2l0aC1iaWdxdWVyeS0zMmY2YzM3MzVhOWQ=">https://medium.com/@david.alvares.62/datalake-datawarehouse-datamart-with-bigquery-32f6c3735a9d<i class="fa fa-external-link-alt"></i></span></p>
<h2 id="What-is-a-Data-Lake"><a href="#What-is-a-Data-Lake" class="headerlink" title="What is a Data Lake?"></a>What is a Data Lake?</h2><p>A Data Lake is a centralized repository designed to store a vast amount of raw data in its native format, including structured, semi-structured, and unstructured data. Data Lakes support high-scale elasticity and massive processing power, making them ideal for big data analytics where data exploration and discovery are required.</p>
<h3 id="Characteristics-of-a-Data-Lake"><a href="#Characteristics-of-a-Data-Lake" class="headerlink" title="Characteristics of a Data Lake:"></a>Characteristics of a Data Lake:</h3><ul>
<li><strong>Data Types:</strong> Supports all types of data (structured, semi-structured, unstructured).</li>
<li><strong>Processing:</strong> Data is kept in its raw form, and transformation occurs when needed (schema-on-read).</li>
<li><strong>Flexibility:</strong> Highly adaptable to changes and capable of storing vast amounts of data.</li>
<li><strong>Use Cases:</strong> Big data processing, real-time analytics, machine learning.</li>
</ul>
<h2 id="What-is-a-Data-Warehouse"><a href="#What-is-a-Data-Warehouse" class="headerlink" title="What is a Data Warehouse?"></a>What is a Data Warehouse?</h2><p>A Data Warehouse is a system used for reporting and data analysis. It is a central repository for integrated data from one or more disparate sources. Data Warehouses store current and historical data in one single place, which is used for creating analytical reports for knowledge workers throughout the enterprise.</p>
<h3 id="Characteristics-of-a-Data-Warehouse"><a href="#Characteristics-of-a-Data-Warehouse" class="headerlink" title="Characteristics of a Data Warehouse:"></a>Characteristics of a Data Warehouse:</h3><ul>
<li><strong>Data Types:</strong> Primarily structured data.</li>
<li><strong>Processing:</strong> Data is processed (ETL - Extract, Transform, Load) before entering the warehouse.</li>
<li><strong>Performance:</strong> Optimized for fast query performance and complex analytical queries.</li>
<li><strong>Use Cases:</strong> Business intelligence, reporting, complex queries.</li>
</ul>
<h2 id="What-is-a-Data-Mart"><a href="#What-is-a-Data-Mart" class="headerlink" title="What is a Data Mart?"></a>What is a Data Mart?</h2><p>A Data Mart is a subset of a data warehouse and is oriented to a specific business line or team. Unlike a data warehouse, which covers the entire organization, a data mart is limited to certain aspects.</p>
<h3 id="Characteristics-of-a-Data-Mart"><a href="#Characteristics-of-a-Data-Mart" class="headerlink" title="Characteristics of a Data Mart:"></a>Characteristics of a Data Mart:</h3><ul>
<li><strong>Data Types:</strong> Structured data.</li>
<li><strong>Scope:</strong> Focused on a specific department or business area.</li>
<li><strong>Performance:</strong> Optimized for quick response times on specific queries.</li>
<li><strong>Use Cases:</strong> Department-specific reporting and analysis.</li>
</ul>
<h2 id="Comparative-Analysis"><a href="#Comparative-Analysis" class="headerlink" title="Comparative Analysis"></a>Comparative Analysis</h2><p>The table below provides a comparison to help delineate the differences among these three data architectures:</p>
<table>
<thead>
<tr>
<th>Feature</th>
<th>Data Lake</th>
<th>Data Warehouse</th>
<th>Data Mart</th>
</tr>
</thead>
<tbody><tr>
<td><strong>Purpose</strong></td>
<td>Data exploration and large-scale analytics</td>
<td>Structured data analysis and reporting</td>
<td>Specific business function analysis</td>
</tr>
<tr>
<td><strong>Data Types</strong></td>
<td>All types (structured, semi, unstructured)</td>
<td>Primarily structured data</td>
<td>Structured data</td>
</tr>
<tr>
<td><strong>Processing</strong></td>
<td>Schema-on-read (transform on demand)</td>
<td>Schema-on-write (pre-transformed)</td>
<td>Typically pre-transformed data from a data warehouse</td>
</tr>
<tr>
<td><strong>Scope</strong></td>
<td>Enterprise-wide</td>
<td>Enterprise-wide</td>
<td>Department-specific</td>
</tr>
<tr>
<td><strong>Storage Cost</strong></td>
<td>Low</td>
<td>High</td>
<td>Moderate</td>
</tr>
<tr>
<td><strong>Complexity</strong></td>
<td>High (due to diverse data types)</td>
<td>Moderate</td>
<td>Low</td>
</tr>
<tr>
<td><strong>Best For</strong></td>
<td>Big data projects, ML, real-time analytics</td>
<td>Historical data analysis, BI</td>
<td>Focused BI tasks within departments</td>
</tr>
</tbody></table>
<h2 id="Relationships-Among-Data-Lakes-Data-Warehouses-and-Data-Marts"><a href="#Relationships-Among-Data-Lakes-Data-Warehouses-and-Data-Marts" class="headerlink" title="Relationships Among Data Lakes, Data Warehouses, and Data Marts"></a>Relationships Among Data Lakes, Data Warehouses, and Data Marts</h2><p>Understanding the relationships among Data Lakes, Data Warehouses, and Data Marts is crucial for structuring an effective data strategy.</p>
<h3 id="Hierarchical-Relationship"><a href="#Hierarchical-Relationship" class="headerlink" title="Hierarchical Relationship"></a>Hierarchical Relationship</h3><ol>
<li><p><strong>Data Lake to Data Warehouse</strong>:</p>
<ul>
<li>A <strong>Data Lake</strong> stores all raw data, both structured and unstructured. It is the initial repository for all incoming data.</li>
<li>A <strong>Data Warehouse</strong> is curated from the data lake. Data here is cleaned and structured, optimized for efficient querying and analysis.</li>
</ul>
</li>
<li><p><strong>Data Warehouse to Data Mart</strong>:</p>
<ul>
<li>A <strong>Data Warehouse</strong> contains integrated data from multiple sources for the entire organization.</li>
<li><strong>Data Marts</strong> are subsets of data warehouses tailored to specific departments, facilitating faster and more relevant data access.</li>
</ul>
</li>
</ol>
<h3 id="Use-Case-Relationship"><a href="#Use-Case-Relationship" class="headerlink" title="Use Case Relationship"></a>Use Case Relationship</h3><ul>
<li><strong>Data Lake</strong>: Ideal for massive, raw datasets used in data exploration and big data projects.</li>
<li><strong>Data Warehouse</strong>: Best for regular, consistent reporting and structured data analysis across the organization.</li>
<li><strong>Data Mart</strong>: Suited for targeted, department-specific analysis, enabling quick access to relevant data.</li>
</ul>
<h3 id="Operational-Efficiency"><a href="#Operational-Efficiency" class="headerlink" title="Operational Efficiency"></a>Operational Efficiency</h3><ul>
<li>Using all three—Data Lakes, Data Warehouses, and Data Marts—enhances data management across different levels, from raw data collection to specific business analytics.</li>
</ul>
<p>This structure ensures organizations can manage vast data efficiently, supporting diverse business needs from exploratory analytics to precise departmental reporting.</p>
<h2 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>Each of these systems serves distinct purposes and is best suited for different aspects of data management. A Data Lake is ideal for raw, large-scale data exploration, a Data Warehouse is suited for enterprise-wide insights from structured data, and a Data Mart is optimal for department-specific analysis. Understanding these differences can help organizations choose the right architecture to meet their data management and analytical needs.</p>
]]></content>
      <tags>
        <tag>datascience</tag>
      </tags>
  </entry>
  <entry>
    <title>Understanding 3 types of t-Tests</title>
    <url>/posts/1220979308/</url>
    <content><![CDATA[<h1 id="Understanding-3-types-of-t-Tests"><a href="#Understanding-3-types-of-t-Tests" class="headerlink" title="Understanding 3 types of t-Tests"></a>Understanding 3 types of t-Tests</h1><p>When it comes to determining the significance of findings in data science, t-tests are a commonly used statistical method. They help compare means and assess whether the differences between groups or conditions are statistically significant. In this article, we’ll explore the three main types of t-tests: independent (two-sample), single-sample, and paired-sample t-tests. We’ll delve into their concepts, hypotheses, and mathematical representations to provide a clear understanding of when and how to use each type.</p>
<span id="more"></span>

<h2 id="The-Concept-of-t-Tests"><a href="#The-Concept-of-t-Tests" class="headerlink" title="The Concept of t-Tests"></a>The Concept of t-Tests</h2><p>A t-test is a statistical hypothesis test that follows a Student’s t-distribution under the null hypothesis. The primary goal is to determine <strong>if there is a significant difference between the means of two groups or between a sample mean and a known value</strong>. The t-test relies on several key concepts:</p>
<h3 id="Central-Limit-Theorem"><a href="#Central-Limit-Theorem" class="headerlink" title="Central Limit Theorem"></a>Central Limit Theorem</h3><p>The Central Limit Theorem (CLT) is a fundamental principle in statistics that states that <strong>the distribution of the sample mean will approximate a normal distribution as the sample size becomes large, regardless of the shape of the population distribution</strong>. This property allows us to make inferences about population parameters using sample data. In the context of t-tests, the CLT justifies using the t-distribution when sample sizes are <strong>relatively small</strong> and the <strong>population standard deviation is unknown</strong>. (because of CLT, we can use confidence interval to justify t-test)</p>
<h3 id="Student’s-t-Distribution"><a href="#Student’s-t-Distribution" class="headerlink" title="Student’s t-Distribution"></a>Student’s t-Distribution</h3><p>The t-distribution is similar to the normal distribution but has heavier tails, which means it is more prone to producing values that fall far from its mean. This characteristic makes it particularly useful for small sample sizes. As the sample size increases, the t-distribution approaches the normal distribution.</p>
<h3 id="Degrees-of-Freedom"><a href="#Degrees-of-Freedom" class="headerlink" title="Degrees of Freedom"></a>Degrees of Freedom</h3><p>Degrees of freedom (df) refer to <strong>the number of independent values in a calculation that are free to vary</strong>. In a t-test, the degrees of freedom <strong>depend on the sample size(s)</strong> and are used to determine the critical value from the t-distribution.</p>
<h2 id="1-Independent-Two-Sample-t-Test"><a href="#1-Independent-Two-Sample-t-Test" class="headerlink" title="1. Independent (Two-Sample) t-Test"></a>1. Independent (Two-Sample) t-Test</h2><h3 id="Concept"><a href="#Concept" class="headerlink" title="Concept"></a>Concept</h3><p>The independent t-test, also known as the two-sample t-test, is used to compare the means of <strong>two independent groups</strong>. This test determines whether the means of the two groups are significantly different from each other.</p>
<h3 id="Hypothesis"><a href="#Hypothesis" class="headerlink" title="Hypothesis"></a>Hypothesis</h3><ul>
<li><strong>Null Hypothesis ($H_0$)</strong>: The means of the two groups are equal ($\mu_1 &#x3D; \mu_2$).</li>
<li><strong>Alternative Hypothesis ($H_A$)</strong>: The means of the two groups are not equal ($\mu_1 \neq \mu_2$).</li>
</ul>
<h3 id="Mathematical-Representation"><a href="#Mathematical-Representation" class="headerlink" title="Mathematical Representation"></a>Mathematical Representation</h3><p>The formula for the independent t-test is:</p>
<p>$$ t &#x3D; \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $$</p>
<p>Where:</p>
<ul>
<li>$\bar{X}_1$ and $\bar{X}_2$ are the sample means of groups 1 and 2.</li>
<li>$s_1^2$ and $s_2^2$ are the sample variances of groups 1 and 2.</li>
<li>$n_1$ and $n_2$ are the sample sizes of groups 1 and 2.</li>
</ul>
<h2 id="2-Single-Sample-t-Test"><a href="#2-Single-Sample-t-Test" class="headerlink" title="2. Single-Sample t-Test"></a>2. Single-Sample t-Test</h2><h3 id="Concept-1"><a href="#Concept-1" class="headerlink" title="Concept"></a>Concept</h3><p>The single-sample t-test is used to compare the mean of <strong>a single sample to a known value or a theoretical mean</strong>. This test assesses whether the sample mean significantly differs from the hypothesized population mean.</p>
<h3 id="Hypothesis-1"><a href="#Hypothesis-1" class="headerlink" title="Hypothesis"></a>Hypothesis</h3><ul>
<li><strong>Null Hypothesis ($H_0$)</strong>: The sample mean is equal to the population mean ($\mu &#x3D; \mu_0$).</li>
<li><strong>Alternative Hypothesis ($H_A$)</strong>: The sample mean is not equal to the population mean ($\mu \neq \mu_0$).</li>
</ul>
<h3 id="Mathematical-Representation-1"><a href="#Mathematical-Representation-1" class="headerlink" title="Mathematical Representation"></a>Mathematical Representation</h3><p>The formula for the single-sample t-test is:</p>
<p>$$ t &#x3D; \frac{\bar{X} - \mu_0}{\frac{s}{\sqrt{n}}} $$</p>
<p>Where:</p>
<ul>
<li>$\bar{X}$ is the sample mean.</li>
<li>$\mu_0$ is the hypothesized population mean.</li>
<li>$s$ is the sample standard deviation.</li>
<li>$n$ is the sample size.</li>
</ul>
<h2 id="3-Paired-Sample-t-Test"><a href="#3-Paired-Sample-t-Test" class="headerlink" title="3. Paired-Sample t-Test"></a>3. Paired-Sample t-Test</h2><h3 id="Concept-2"><a href="#Concept-2" class="headerlink" title="Concept"></a>Concept</h3><p>The paired-sample t-test, also known as the dependent t-test, is used to <strong>compare the means of two related groups</strong>. This test is often used in pre-test&#x2F;post-test scenarios or when comparing measurements taken on the <strong>same subjects under different conditions</strong>.</p>
<h3 id="Hypothesis-2"><a href="#Hypothesis-2" class="headerlink" title="Hypothesis"></a>Hypothesis</h3><ul>
<li><strong>Null Hypothesis ($H_0$)</strong>: The mean difference between the paired observations is zero ($\mu_D &#x3D; 0 $).</li>
<li><strong>Alternative Hypothesis ($H_A$)</strong>: The mean difference between the paired observations is not zero ($\mu_D \neq 0$).</li>
<li>$\mu_D &#x3D; \mu_1 - \mu_2$</li>
</ul>
<h3 id="Mathematical-Representation-2"><a href="#Mathematical-Representation-2" class="headerlink" title="Mathematical Representation"></a>Mathematical Representation</h3><p>The formula for the paired-sample t-test is:</p>
<p>$$ t &#x3D; \frac{\bar{D}}{\frac{s_D}{\sqrt{n}}} $$</p>
<p>Where:</p>
<ul>
<li>$\bar{D}$ is the mean of the differences between paired observations.</li>
<li>$s_D$ is the standard deviation of the differences.</li>
<li>$n$ is the number of paired observations.</li>
</ul>
<h2 id="Python-Code-for-t-Tests"><a href="#Python-Code-for-t-Tests" class="headerlink" title="Python Code for t-Tests"></a>Python Code for t-Tests</h2><p>To perform these t-tests in Python, you can use the <code>scipy.stats</code> library, which provides convenient functions for each type of t-test. Below is the sample code for each t-test:</p>
<h3 id="Independent-Two-Sample-t-Test"><a href="#Independent-Two-Sample-t-Test" class="headerlink" title="Independent (Two-Sample) t-Test"></a>Independent (Two-Sample) t-Test</h3><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">import</span> numpy <span class="keyword">as</span> np</span><br><span class="line"><span class="keyword">from</span> scipy <span class="keyword">import</span> stats</span><br><span class="line"></span><br><span class="line"><span class="comment"># Sample data for two independent groups</span></span><br><span class="line">group1 = np.array([<span class="number">12</span>, <span class="number">14</span>, <span class="number">16</span>, <span class="number">18</span>, <span class="number">20</span>])</span><br><span class="line">group2 = np.array([<span class="number">22</span>, <span class="number">24</span>, <span class="number">26</span>, <span class="number">28</span>, <span class="number">30</span>])</span><br><span class="line"></span><br><span class="line"><span class="comment"># Perform independent (two-sample) t-test</span></span><br><span class="line">t_stat, p_value = stats.ttest_ind(group1, group2)</span><br><span class="line"><span class="built_in">print</span>(<span class="string">f&quot;Independent t-test: t-statistic = <span class="subst">&#123;t_stat&#125;</span>, p-value = <span class="subst">&#123;p_value&#125;</span>&quot;</span>)</span><br></pre></td></tr></table></figure>

<h3 id="Single-Sample-t-Test"><a href="#Single-Sample-t-Test" class="headerlink" title="Single-Sample t-Test"></a>Single-Sample t-Test</h3><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">import</span> numpy <span class="keyword">as</span> np</span><br><span class="line"><span class="keyword">from</span> scipy <span class="keyword">import</span> stats</span><br><span class="line"></span><br><span class="line"><span class="comment"># Sample data</span></span><br><span class="line">data = np.array([<span class="number">12</span>, <span class="number">14</span>, <span class="number">16</span>, <span class="number">18</span>, <span class="number">20</span>])</span><br><span class="line"></span><br><span class="line"><span class="comment"># Population mean to compare against</span></span><br><span class="line">population_mean = <span class="number">15</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Perform single-sample t-test</span></span><br><span class="line">t_stat, p_value = stats.ttest_1samp(data, population_mean)</span><br><span class="line"><span class="built_in">print</span>(<span class="string">f&quot;Single-sample t-test: t-statistic = <span class="subst">&#123;t_stat&#125;</span>, p-value = <span class="subst">&#123;p_value&#125;</span>&quot;</span>)</span><br></pre></td></tr></table></figure>

<h3 id="Paired-Sample-t-Test"><a href="#Paired-Sample-t-Test" class="headerlink" title="Paired-Sample t-Test"></a>Paired-Sample t-Test</h3><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">import</span> numpy <span class="keyword">as</span> np</span><br><span class="line"><span class="keyword">from</span> scipy <span class="keyword">import</span> stats</span><br><span class="line"></span><br><span class="line"><span class="comment"># Sample data for paired observations</span></span><br><span class="line">before_treatment = np.array([<span class="number">12</span>, <span class="number">14</span>, <span class="number">16</span>, <span class="number">18</span>, <span class="number">20</span>])</span><br><span class="line">after_treatment = np.array([<span class="number">22</span>, <span class="number">24</span>, <span class="number">26</span>, <span class="number">28</span>, <span class="number">30</span>])</span><br><span class="line"></span><br><span class="line"><span class="comment"># Perform paired-sample t-test</span></span><br><span class="line">t_stat, p_value = stats.ttest_rel(before_treatment, after_treatment)</span><br><span class="line"><span class="built_in">print</span>(<span class="string">f&quot;Paired-sample t-test: t-statistic = <span class="subst">&#123;t_stat&#125;</span>, p-value = <span class="subst">&#123;p_value&#125;</span>&quot;</span>)</span><br></pre></td></tr></table></figure>

<h2 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>Each type of t-test serves a specific purpose and is suited for different kinds of data and hypotheses. By understanding the concepts, hypotheses, and mathematical representations of the independent t-test, single-sample t-test, and paired-sample t-test, you can choose the appropriate test for your data analysis project. Using these tests effectively will enable you to draw meaningful conclusions and validate your findings with statistical significance.</p>
]]></content>
      <tags>
        <tag>datascience</tag>
        <tag>statistic</tag>
      </tags>
  </entry>
  <entry>
    <title>[Git][Version Control] Git - Introduction</title>
    <url>/posts/2571684277/</url>
    <content><![CDATA[<h1 id="Foreword"><a href="#Foreword" class="headerlink" title="Foreword"></a>Foreword</h1><p>Here is the series of Git full tutorial. I will cover the important and advanced topics that you should know in git. By reading this series, you are able to know the basic(but important) concept of Git and the technique of using git. Hope you can gain a lot ! </p>
<span id="more"></span>

<h1 id="Series-Outline"><a href="#Series-Outline" class="headerlink" title="Series Outline"></a>Series Outline</h1><p>I will follow the outline below in this series.</p>
<ul>
<li>Introduction (this post)</li>
<li>Setup</li>
<li>Version Control &amp; Commands (Basics)</li>
<li>Branching &amp; Merging</li>
<li>Conflict</li>
<li>Collaboration &amp; GitHub</li>
<li>Advanced Topics</li>
</ul>
<hr>
<h1 id="Introduction-to-Git-A-Distributed-Version-Control-System"><a href="#Introduction-to-Git-A-Distributed-Version-Control-System" class="headerlink" title="Introduction to Git: A Distributed Version Control System"></a>Introduction to Git: A Distributed Version Control System</h1><p>Git is a powerful and widely used distributed version control system designed to track changes in software development projects. It provides developers with the ability to collaborate on projects efficiently, manage codebases effectively, and easily track and revert changes.</p>
<h2 id="Distributed-Version-Control-System"><a href="#Distributed-Version-Control-System" class="headerlink" title="Distributed Version Control System"></a>Distributed Version Control System</h2><p>In traditional software development, changes to a codebase were often managed through file-based systems or centralized version control systems. However, these approaches had limitations in terms of collaboration, scalability, and flexibility. Git emerged as a solution to these challenges, offering a distributed and decentralized approach to version control.</p>
<p>The special point of Git is that it supports distributed version control and can also easily share codes among multiple nodes.</p>
<p><img data-src="/images/posts/Git-Series/dist-ver-ctr.png" 
style="width: 70%; margin: 15px auto;"></p>
<h2 id="Branch"><a href="#Branch" class="headerlink" title="Branch"></a>Branch</h2><p>Git manages code based on a tree structure, allowing branches to be created, merged, and deleted, and code versions to be saved at any time.</p>
<p><img data-src="/images/posts/Git-Series/branch.png" 
style="width: 70%; margin: 15px auto;"></p>
<h2 id="Snapshot"><a href="#Snapshot" class="headerlink" title="Snapshot"></a>Snapshot</h2><p>Git saves full snapshots of files, rather than differences, making it easy to restore all historical versions of files.</p>
<p><img data-src="/images/posts/Git-Series/snapshot.jpeg" 
style="width: 70%; margin: 15px auto;"></p>
<hr>
<h1 id="Recommended-Resources"><a href="#Recommended-Resources" class="headerlink" title="Recommended Resources"></a>Recommended Resources</h1><p><a href="%22https://www.youtube.com/watch?v=8JJ101D3knE%22">Mosh Git Tutorail</a></p>
]]></content>
      <categories>
        <category>Git 系列文</category>
      </categories>
      <tags>
        <tag>Git</tag>
        <tag>Version Control</tag>
      </tags>
  </entry>
  <entry>
    <title>[Git][Version Control] Git - Setup</title>
    <url>/posts/731175240/</url>
    <content><![CDATA[<h1 id="Foreword"><a href="#Foreword" class="headerlink" title="Foreword"></a>Foreword</h1><p>Setting up the environment of Git and some recommended GUI tools.</p>
<span id="more"></span>

<h1 id="Install"><a href="#Install" class="headerlink" title="Install"></a>Install</h1><ol>
<li><p>Install Git with the link below: (with your OS)<br><a href="%22https://git-scm.com/%22">Git official website</a></p>
</li>
<li><p>Type this command to check if it is installed successfully:</p>
<figure class="highlight shell"><table><tr><td class="code"><pre><span class="line">git --version</span><br></pre></td></tr></table></figure></li>
</ol>
<h1 id="Configure-Git-type-commands-in-your-shell"><a href="#Configure-Git-type-commands-in-your-shell" class="headerlink" title="Configure Git (type commands in your shell)"></a>Configure Git (type commands in your shell)</h1><ol>
<li><p>Set your user name by running the following command:</p>
<figure class="highlight shell"><table><tr><td class="code"><pre><span class="line">git config --global user.name &quot;Your Name&quot;</span><br></pre></td></tr></table></figure>
</li>
<li><p>Set your email address by running the following command:</p>
<figure class="highlight shell"><table><tr><td class="code"><pre><span class="line">git config --global user.email youremail@example.com</span><br></pre></td></tr></table></figure></li>
<li><p>Set up SSH key (for remote repository access. e.g. GitHub)</p>
<figure class="highlight shell"><table><tr><td class="code"><pre><span class="line">ssh-keygen -t rsa -b 4096 -C &quot;your_email@example.com&quot;</span><br></pre></td></tr></table></figure></li>
</ol>
<ul>
<li><p>Press Enter to accept the default file location and passphrase (or set your own if desired).<br>Once the key is generated, run the following command to add it to the SSH agent</p>
<figure class="highlight shell"><table><tr><td class="code"><pre><span class="line">ssh-add ~/.ssh/id_rsa</span><br></pre></td></tr></table></figure>
</li>
<li><p>Copy the public key to your clipboard by running the following command</p>
<figure class="highlight shell"><table><tr><td class="code"><pre><span class="line">cat ~/.ssh/id_rsa.pub</span><br></pre></td></tr></table></figure>
</li>
<li><p>Add the copied public key to your Git hosting platform account (e.g., GitHub, GitLab) by following their documentation.</p>
</li>
<li><p>View your ssh keys</p>
<figure class="highlight shell"><table><tr><td class="code"><pre><span class="line">cd ~/.ssh</span><br></pre></td></tr></table></figure>
<figure class="highlight shell"><table><tr><td class="code"><pre><span class="line">ls</span><br></pre></td></tr></table></figure>
<ul>
<li>Use a text editor or command-line tools (such as cat or less) to view the contents of the .pub files and see the public key information: <figure class="highlight shell"><table><tr><td class="code"><pre><span class="line">cat id_rsa.pub</span><br></pre></td></tr></table></figure></li>
</ul>
</li>
</ul>
<ol start="4">
<li>You can see your configuration by typing the following commands:<figure class="highlight shell"><table><tr><td class="code"><pre><span class="line">git config --list</span><br></pre></td></tr></table></figure></li>
</ol>
<h1 id="Recommended-GUI-tools"><a href="#Recommended-GUI-tools" class="headerlink" title="Recommended GUI tools"></a>Recommended GUI tools</h1><p>Both commands and GUI tools are important to Git manupulation.<br>And Here are some recommended Git GUI tools that you can use.</p>
<ul>
<li><a href="%22https://www.sourcetreeapp.com/%22">Sourcetree</a></li>
<li><a href="%22https://www.gitkraken.com/%22">GitKraken</a></li>
</ul>
]]></content>
      <categories>
        <category>Git 系列文</category>
      </categories>
      <tags>
        <tag>Git</tag>
        <tag>Version Control</tag>
      </tags>
  </entry>
  <entry>
    <title>[SD][ML] ControlNet in StableDiffusion: A Comprehensive Guide</title>
    <url>/posts/2736514643/</url>
    <content><![CDATA[<h1 id="Overview"><a href="#Overview" class="headerlink" title="Overview"></a>Overview</h1><p>ControlNet is an innovative addition to the StableDiffusion model that enhances the model’s ability to generate high-quality images with <strong>specific control over the content and structure</strong>. This guide delves into the fundamentals of ControlNet, how it works, implementation details, and training your own ControlNet. The following sections provide an in-depth look at each aspect of this powerful tool.</p>
<img data-src="/images/posts/SD-series/controlnet/1.0.png"> 
source: https://arxiv.org/abs/2302.05543

<span id="more"></span>

<h1 id="What-is-ControlNet"><a href="#What-is-ControlNet" class="headerlink" title="What is ControlNet?"></a>What is ControlNet?</h1><blockquote>
<p>ControlNet is a neural network structure that adds conditional control to StableDiffusion models.</p>
</blockquote>
<p>It allows users to influence the generation process of the diffusion model by <strong>injecting additional information</strong>, such as <strong>edge maps, poses, or depth maps</strong>, into the model. This ensures that the generated images adhere more closely to the desired specifications, leading to more accurate and contextually relevant outputs.</p>
<p>Use OpenCV to capture the cany edge from an image:<br><img data-src="/images/posts/SD-series/controlnet/cany.png"><br>source: <span class="exturl" data-url="aHR0cHM6Ly9odWdnaW5nZmFjZS5jby9kb2NzL2RpZmZ1c2Vycy91c2luZy1kaWZmdXNlcnMvY29udHJvbG5ldA==">https://huggingface.co/docs/diffusers/using-diffusers/controlnet<i class="fa fa-external-link-alt"></i></span></p>
<p>Use cany edge as a condition input and use ControlNet to generate a new image:<br><img data-src="/images/posts/SD-series/controlnet/cany_result.png"><br>source: <span class="exturl" data-url="aHR0cHM6Ly9odWdnaW5nZmFjZS5jby9kb2NzL2RpZmZ1c2Vycy91c2luZy1kaWZmdXNlcnMvY29udHJvbG5ldA==">https://huggingface.co/docs/diffusers/using-diffusers/controlnet<i class="fa fa-external-link-alt"></i></span></p>
<h1 id="ControlNet-Structure-1-0"><a href="#ControlNet-Structure-1-0" class="headerlink" title="ControlNet Structure (1.0)"></a>ControlNet Structure (1.0)</h1><blockquote>
<p>ControlNet is a neural network structure to control diffusion models by adding extra conditions.</p>
</blockquote>
<img data-src="/images/posts/SD-series/controlnet/1.0.png"> 
source: https://arxiv.org/abs/2302.05543

<ul>
<li>It duplicates the weights of neural network blocks into two copies: one “locked” and one “trainable.”</li>
<li>The “trainable” copy adapts to your specific condition, while the “locked” copy maintains the integrity of your original model.</li>
<li>This approach ensures that training with a small dataset of image pairs won’t compromise the production-ready diffusion models.</li>
<li>The “zero convolution” is a 1×1 convolution with weights and biases initialized to zero.</li>
<li>Prior to training, all zero convolutions produce zeros, preventing any initial distortion from ControlNet.</li>
<li>No layers are trained from scratch; instead, you are fine-tuning, keeping your original model intact.</li>
<li>This method allows for training on small-scale or even personal devices.</li>
</ul>
<hr>
<p>After initialization, the untrained ControlNet parameters should be as follows:</p>
<!-- Math equation conflict: https://theme-next.js.org/docs/third-party-services/math-equations -->
<p>$$<br>\left\{<br>    \begin{array}{l}<br>        \mathcal{Z}\left(\boldsymbol{c} ; \Theta_{\mathrm{z}1}\right) &#x3D; 0 \\<br>        \\<br>        \mathcal{F}\left(x + \mathcal{Z}\left(\boldsymbol{c} ; \Theta_{\mathrm{z}1}\right); \Theta_{\mathrm{c}}\right) \\ &#x3D; \mathcal{F}\left(x ; \Theta_{\mathrm{c}}\right) \\ &#x3D; \mathcal{F}(x ; \Theta) \\<br>        \\<br>        \mathcal{Z}\left(\mathcal{F}\left(x + \mathcal{Z}\left(\boldsymbol{c} ; \Theta_{\mathrm{z}1}\right); \Theta_{\mathrm{c}}\right); \Theta_{\mathrm{z}2}\right) \\ &#x3D; \mathcal{Z}\left(\mathcal{F}\left(x ; \Theta_{\mathrm{c}}\right); \Theta_{\mathrm{z}2}\right) &#x3D; 0<br>    \end{array}<br>\right.<br>$$</p>
<ul>
<li>That is to say, when ControlNet is untrained, the output is 0, so the numbers added to the original network are also 0. </li>
<li>This has no impact on the original network, ensuring that the performance of the original network is fully preserved. </li>
<li>Subsequent training of ControlNet only optimizes the original network, which can be considered equivalent to fine-tuning the network.</li>
</ul>
<h1 id="ControlNet-in-StableDiffusion"><a href="#ControlNet-in-StableDiffusion" class="headerlink" title="ControlNet in StableDiffusion"></a>ControlNet in StableDiffusion</h1><img data-src="/images/posts/SD-series/controlnet/sd_ctn.png"> 
source: https://arxiv.org/pdf/2302.05543

<p>The previous section described how ControlNet controls individual neural network blocks. In the paper, the authors used Stable Diffusion as an example to explain how ControlNet can control large networks. The following figure shows that the process of controlling Stable Diffusion involves copying and training the encoder while using skip connections in the decoder.</p>
<p>Before proceeding, it is important to note:<br>Stable Diffusion has a preprocessing step where a 512×512 image is converted to a 64×64 image before training(for computing efficient purpose). To ensure that the control conditions are also mapped to the 64×64 conditional space, a small network $\mathcal{E}$ is added during training to convert the image space conditions to feature map conditions.<br>$$c_f &#x3D; \mathcal{E}(c_i)$$<br>This network $\mathcal{E}$ is a four-layer convolutional neural network with 4×4 kernels, a stride of 2, and channels 16, 32, 64, 128, initialized with Gaussian weights. This network is jointly trained with the entire ControlNet.</p>
<h1 id="Workflow-Wrapup"><a href="#Workflow-Wrapup" class="headerlink" title="Workflow Wrapup"></a>Workflow Wrapup</h1><ol>
<li>Input Processing: ControlNet takes an additional input, such as a sketch or a pose, along with the standard textual input.</li>
<li>Feature Extraction: It extracts features from the control signal using a separate encoder.</li>
<li>Conditional Guidance: These features are then combined with the features extracted from the text prompt.</li>
<li>Diffusion Process: The combined features guide the diffusion process, ensuring that the generated image aligns with both the text and the control input.</li>
<li>This mechanism allows for precise control over the content and structure of the generated images, making ControlNet a powerful tool for various applications, from art creation to realistic image synthesis.</li>
</ol>
<h1 id="How-to-implement"><a href="#How-to-implement" class="headerlink" title="How to implement?"></a>How to implement?</h1><ol>
<li><p>Install the required libraries:</p>
 <figure class="highlight python"><table><tr><td class="code"><pre><span class="line">pip install diffusers transformers</span><br></pre></td></tr></table></figure>
</li>
<li><p>Load the StableDiffusion and ControlNet models:</p>
 <figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">from</span> diffusers <span class="keyword">import</span> StableDiffusionPipeline, ControlNetModel</span><br><span class="line"></span><br><span class="line"><span class="comment"># Load the models</span></span><br><span class="line">stable_diffusion_model = StableDiffusionPipeline.from_pretrained(<span class="string">&quot;CompVis/stable-diffusion-v1-4&quot;</span>)</span><br><span class="line">controlnet_model = ControlNetModel.from_pretrained(<span class="string">&quot;lllyasviel/controlnet&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Integrate ControlNet with StableDiffusion</span></span><br><span class="line">stable_diffusion_model.unet.load_additional_model(controlnet_model)</span><br></pre></td></tr></table></figure>
</li>
<li><p>Prepare the inputs</p>
 <figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">from</span> PIL <span class="keyword">import</span> Image</span><br><span class="line"><span class="keyword">import</span> torch</span><br><span class="line"></span><br><span class="line"><span class="comment"># Load and preprocess the control image (e.g., an edge map or pose)</span></span><br><span class="line">control_image = Image.<span class="built_in">open</span>(<span class="string">&quot;path_to_control_image.jpg&quot;</span>).convert(<span class="string">&quot;RGB&quot;</span>)</span><br><span class="line">control_tensor = torch.tensor(control_image).unsqueeze(<span class="number">0</span>)  <span class="comment"># Convert to tensor and add batch dimension</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Prepare the text prompt</span></span><br><span class="line">prompt = <span class="string">&quot;A futuristic cityscape with towering skyscrapers&quot;</span></span><br></pre></td></tr></table></figure>
</li>
<li><p>Generate the image</p>
 <figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Generate the image with ControlNet guidance</span></span><br><span class="line">generated_image = stable_diffusion_model(prompt=prompt, control_image=control_tensor)</span><br><span class="line">generated_image.save(<span class="string">&quot;output_image.jpg&quot;</span>)</span><br><span class="line"><span class="comment"># display(generated_image)</span></span><br></pre></td></tr></table></figure></li>
</ol>
<p>This code demonstrates the basic implementation of ControlNet with StableDiffusion, highlighting the simplicity and flexibility of integrating control signals into the image generation process.</p>
<h1 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h1><p>ControlNet offers a significant advancement in the field of text-to-image generation, providing precise control over the generated content. Its ability to integrate various control signals into the diffusion process makes it an invaluable tool for artists, designers, and researchers. By understanding its working principles, implementing it in your projects, and even training your own ControlNet, you can harness the full potential of this innovative technology to create stunning and accurate visual content.</p>
<h1 id="References"><a href="#References" class="headerlink" title="References"></a>References</h1><ul>
<li><span class="exturl" data-url="aHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzIzMDIuMDU1NDM=">Adding Conditional Control to Text-to-Image Diffusion Models<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9naXRodWIuY29tL2xsbHlhc3ZpZWwvQ29udHJvbE5ldA==">ControlNet GitHub<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9zdGFibGUtZGlmZnVzaW9uLWFydC5jb20vY29udHJvbG5ldC8=">ControlNet: A Complete Guide<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9qdWVqaW4uY24vcG9zdC83MjEwMzY5NjcxNjU2NTA1Mzk5">ControlNet原理解析 | 读论文<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9odWdnaW5nZmFjZS5jby9kb2NzL2RpZmZ1c2Vycy9hcGkvcGlwZWxpbmVzL2NvbnRyb2xuZXQ=">ControlNet HuggingFace<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly9odWdnaW5nZmFjZS5jby9ibG9nL3RyYWluLXlvdXItY29udHJvbG5ldA==">Train your ControlNet with diffusers 🧨<i class="fa fa-external-link-alt"></i></span></li>
</ul>
]]></content>
      <tags>
        <tag>ML</tag>
        <tag>AI</tag>
        <tag>StableDiffusion</tag>
      </tags>
  </entry>
  <entry>
    <title>[.NET][C#][Design Pattern] - Repository Pattern</title>
    <url>/posts/1822258245/</url>
    <content><![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>這篇文章主要介紹何為Repository Pattern, 並如何在.NET C#中實作<br>並結合Unit Of Work Pattern, 讓程式對Repository解耦<br>這個design pattern可以說是踏入軟體架構的敲門磚<br>也算是非常常用的pattern, 實用性非常高</p>
<span id="more"></span>

<h2 id="定義：Repository-Pattern-Unit-Of-Work-Pattern"><a href="#定義：Repository-Pattern-Unit-Of-Work-Pattern" class="headerlink" title="定義：Repository Pattern, Unit Of Work Pattern"></a>定義：Repository Pattern, Unit Of Work Pattern</h2><p>IG貼文<br><span class="exturl" data-url="aHR0cHM6Ly93d3cuaW5zdGFncmFtLmNvbS9wL0NnWm5TNjRoMjVtLz91dG1fc291cmNlPWlnX3dlYl9jb3B5X2xpbms=">Repository Pattern Post<i class="fa fa-external-link-alt"></i></span></p>
<p>GitHub連結：<br><span class="exturl" data-url="aHR0cHM6Ly9naXRodWIuY29tL21hby1jb2RlL1JlcG9zaXRvcnlQYXR0ZXJu">Sample Code<i class="fa fa-external-link-alt"></i></span></p>
<p>Repository Pattern主要就是由Repository這個元素所組成<br>我個人都會結合Unit Of Work Pattern一起使用</p>
<blockquote>
<p><strong>[Repository]</strong> :<br>    Act like <strong>a collection of object in memory.</strong></p>
</blockquote>
<blockquote>
<p><strong>[Unit of work]</strong> :<br>    Maintain a list of objects affected by a business transaction and <strong>coordinate th writting out of changes.</strong></p>
</blockquote>
<p>以上是對兩個主要元素的定義<br>簡單來說 <ins>Repository就像是一群存在記憶體中的Objects</ins><br>而 <ins>Unit of work則是針對這些<strong>被改變</strong>的Objects, 統一在這裡做處理<ins></p>
<h2 id="更簡單的解釋"><a href="#更簡單的解釋" class="headerlink" title="更簡單的解釋"></a>更簡單的解釋</h2><p><strong>UnitOfWork在這裡就像是DB, Repository就像是Table.</strong><br><strong>如果以EntityFramework來看, 就是Context和Entity的關係.</strong><br>EF本身也有做Repository和UnitOfWork.</p>
<p><strong>把每一次的operation看作一個unit of work, 等到operation結束才Complete.</strong><br>這樣一來, <strong>當我們的Controller或Service需要很多Repository來操作時, 就只需要依賴UnitOfWork, 簡化程式碼與依賴.</strong><br><strong>另一個好處是可以管理Repository之間之間的情況, 做到DB的Atomic operation.</strong><br>舉例來說, 當Repository1儲存成功, 而Repository2儲存失敗, 以一個完整的Atomic operation來說, 一個失敗, operation就算是失敗.<br>但在這裡, Repository之間是沒有聯繫的, 因此資料會處於一種dirty state, 就是一個進去, 但一個失敗了.<br>沒辦法回到最初的狀況再debug, UnitOfWork就是來處理這種問題, 當Complete時失敗, 並不會真的動到DB, 而是只動到Repository. 可以整個<strong>roll back處理</strong></p>
<blockquote>
<p>補充：Atomic operation是資料庫<strong>ACID</strong>四個特性中的其中一個, 而使用<strong>Transaction(交易)</strong>, 就是實現Atomicity的方法之一, 簡單一句話來說就是「全有，或全無」</p>
</blockquote>
<h2 id="好處"><a href="#好處" class="headerlink" title="好處"></a>好處</h2><ol>
<li>使用Repository主要是可以<strong>對ORM的框架解耦</strong>, 對Context和Entities的依賴度不會太高.<br>舉例來說我今天使用Entity Framework, 開發了幾個月, 老闆突然要使用ADO.NET<br>這時如果沒有使用Repository, 而是在程式碼裡面大量使用Context與Entity<br>那麼要修改程式碼就是一件大工程了….</li>
<li>使用Unit Of Work Pattern, 可以<strong>統一管理Repository的變動</strong>, 再<strong>統一Save到Database</strong>,<br>這麼做可以確保每一次的Operation為全有或全無, 避免資料庫處於一個Dirty State</li>
<li>下面引用我在IG上的照片, 裡面還有提到一個好處是更好做單元測試.</li>
</ol>
<blockquote>
<p>不管是No-SQL或RDB都可以套用此模式, 但MongoDB需要另外建立Replica-Set</p>
</blockquote>
<p><img data-src="/images/posts/repository-pattern/benifits.jpg" 
style="width: 70%; margin: 15px auto;"><br>可以看到, repository是存放在unit of work內, 我們的程式是去依賴unit of work,<br>unit of work再透過repository去操作ORM, ORM再去操作我們的資料庫.</p>
<h2 id="實作"><a href="#實作" class="headerlink" title="實作"></a>實作</h2><p>先附上UML<br><img data-src="/images/posts/repository-pattern/uml.jpg" 
style="width: 70%; margin: 15px auto;"><br>稍微解釋一下, 這裡會有一個Generic Repository, 負責處理基本的Get, Find, Romove等.<br>而其他Concrete Repository再去繼承和實作, 內部就針對自己的Repository做特殊的Query或Command.</p>
<p>接下來會使用.NET去實作裡面的各個細節<br>先附上專案結構和ERD<br><img data-src="/images/posts/repository-pattern/structure.jpg" 
style="width: 70%; margin: 15px auto;"><br>這裡主要是以書本和作者為範例</p>
<p>Models裡面是用EntityFramework migration過來的Context和Entities model<br>我們主要實作Repository和UnitOfWork的部分</p>
<h3 id="IRepository-Repository"><a href="#IRepository-Repository" class="headerlink" title="IRepository, Repository"></a>IRepository, Repository</h3><p><strong>這裡是generic的repository</strong></p>
<figure class="highlight c#"><figcaption><span>IRepository</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> System;</span><br><span class="line"><span class="keyword">using</span> System.Linq.Expressions;</span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> <span class="title">RepositoryPatternPractice.Repositories</span></span><br><span class="line">&#123;</span><br><span class="line">	<span class="comment">//Interface here act like a protocol or a license</span></span><br><span class="line">	<span class="comment">//Generic Interface -&gt; Interface&lt;T&gt;</span></span><br><span class="line">	<span class="comment">//where keyword can set some restriction on the generic</span></span><br><span class="line">	<span class="keyword">public</span> <span class="keyword">interface</span> <span class="title">IRepository</span>&lt;<span class="title">TEntity</span>&gt; <span class="keyword">where</span> <span class="title">TEntity</span>: <span class="keyword">class</span></span><br><span class="line">	&#123;</span><br><span class="line">		<span class="comment">//Three main group of functions</span></span><br><span class="line"></span><br><span class="line">		<span class="comment">//Finding objects</span></span><br><span class="line">		<span class="function">TEntity <span class="title">Get</span>(<span class="params"><span class="built_in">int</span> id</span>)</span>;</span><br><span class="line">		<span class="function">IEnumerable&lt;TEntity&gt; <span class="title">GetAll</span>()</span>;</span><br><span class="line">		<span class="function">IEnumerable&lt;TEntity&gt; <span class="title">Find</span>(<span class="params">Expression&lt;Func&lt;TEntity, <span class="built_in">bool</span>&gt;&gt; predicate</span>)</span>;</span><br><span class="line"></span><br><span class="line">		<span class="comment">/*</span></span><br><span class="line"><span class="comment">			Func&lt;input, output&gt; -&gt; 委派物件(將函數當作物件的容器)</span></span><br><span class="line"><span class="comment">				eg. Func&lt;int, int&gt; fn = n=&gt;n*n;</span></span><br><span class="line"><span class="comment">			Expression (eg. LINQ)</span></span><br><span class="line"><span class="comment">				turn an lambda to an expression tree</span></span><br><span class="line"><span class="comment">				and the LINQ can input a lambda expression (put in a generic delegate)</span></span><br><span class="line"><span class="comment">		 */</span></span><br><span class="line"></span><br><span class="line">		<span class="comment">//Adding object</span></span><br><span class="line">		<span class="function"><span class="keyword">void</span> <span class="title">Add</span>(<span class="params">TEntity entity</span>)</span>;</span><br><span class="line">		<span class="function"><span class="keyword">void</span> <span class="title">AddRange</span>(<span class="params">IEnumerable&lt;TEntity&gt; entities</span>)</span>;</span><br><span class="line"></span><br><span class="line">		<span class="comment">//Removing object</span></span><br><span class="line">		<span class="function"><span class="keyword">void</span> <span class="title">Remove</span>(<span class="params">TEntity entity</span>)</span>;</span><br><span class="line">		<span class="function"><span class="keyword">void</span> <span class="title">RemoveRange</span>(<span class="params">IEnumerable&lt;TEntity&gt; entities</span>)</span>;</span><br><span class="line">	&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<blockquote>
<p>!! 要注意的是, <strong>Repository不應該有直接Update或是Save Database的方法</strong>, 主要是語義問題. !!<br>Repository act like a collection of objects in memory.<br>這些改變Database的動作, 應該交由UnitOfWork, 把這些objects save到database.<br>所以應該是UnitOfWork透過Repository撈出來, 然後修改, 再透過UnitOfWork Save.<br>可以使用Transaction機制, 將Save Changes的動作統一執行, 這樣一來才可以確保Atomic Operation</p>
</blockquote>
<hr>
<figure class="highlight c#"><figcaption><span>Repository</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> System;</span><br><span class="line"><span class="keyword">using</span> System.Linq.Expressions;</span><br><span class="line"><span class="keyword">using</span> Microsoft.EntityFrameworkCore;</span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> <span class="title">RepositoryPatternPractice.Repositories</span></span><br><span class="line">&#123;</span><br><span class="line">	<span class="keyword">public</span> <span class="keyword">class</span> <span class="title">Repository</span>&lt;<span class="title">TEntity</span>&gt; : <span class="title">IRepository</span>&lt;<span class="title">TEntity</span>&gt; <span class="keyword">where</span> <span class="title">TEntity</span>: <span class="keyword">class</span></span><br><span class="line">	&#123;</span><br><span class="line">        <span class="comment">//the context here is generic, so it has nothing to do with my application</span></span><br><span class="line">        <span class="comment">//so you can DI some specific contexts</span></span><br><span class="line">        <span class="comment">//protected because the specific repository can use it</span></span><br><span class="line">        <span class="keyword">protected</span> <span class="keyword">readonly</span> DbContext Context;</span><br><span class="line"></span><br><span class="line">		<span class="function"><span class="keyword">public</span> <span class="title">Repository</span>(<span class="params">DbContext context</span>)</span></span><br><span class="line">		&#123;</span><br><span class="line">            Context = context;</span><br><span class="line">		&#125;</span><br><span class="line"></span><br><span class="line">        <span class="function"><span class="keyword">public</span> TEntity <span class="title">Get</span>(<span class="params"><span class="built_in">int</span> id</span>)</span></span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">return</span> Context.Set&lt;TEntity&gt;().Find(id);</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="comment">//don&#x27;t return IQueryable!!</span></span><br><span class="line">        <span class="comment">//Repository should encapsulate the query</span></span><br><span class="line">        <span class="comment">//so on the Service or Controller won&#x27;t get too much pressure</span></span><br><span class="line">        <span class="function"><span class="keyword">public</span> <span class="title">IEnumerable</span>&lt;<span class="title">TEntity</span>&gt; <span class="title">GetAll</span>()</span></span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">return</span> Context.Set&lt;TEntity&gt;().ToList();</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="function"><span class="keyword">public</span> <span class="title">IEnumerable</span>&lt;<span class="title">TEntity</span>&gt; <span class="title">Find</span>(<span class="params">Expression&lt;Func&lt;TEntity, <span class="built_in">bool</span>&gt;&gt; predicate</span>)</span></span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">return</span> Context.Set&lt;TEntity&gt;().Where(predicate);</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Add</span>(<span class="params">TEntity entity</span>)</span></span><br><span class="line">        &#123;</span><br><span class="line">            Context.Set&lt;TEntity&gt;().Add(entity);</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">AddRange</span>(<span class="params">IEnumerable&lt;TEntity&gt; entities</span>)</span></span><br><span class="line">        &#123;</span><br><span class="line">            Context.Set&lt;TEntity&gt;().AddRange(entities);</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Remove</span>(<span class="params">TEntity entity</span>)</span></span><br><span class="line">        &#123;</span><br><span class="line">            Context.Set&lt;TEntity&gt;().Remove(entity);</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">RemoveRange</span>(<span class="params">IEnumerable&lt;TEntity&gt; entities</span>)</span></span><br><span class="line">        &#123;</span><br><span class="line">            Context.Set&lt;TEntity&gt;().RemoveRange(entities);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
<hr>
<h3 id="IBookRepository-BookRepository"><a href="#IBookRepository-BookRepository" class="headerlink" title="IBookRepository, BookRepository"></a>IBookRepository, BookRepository</h3><p><strong>這裡開始是concrete的repository</strong></p>
<figure class="highlight c#"><figcaption><span>IBookRepository</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> System;</span><br><span class="line"><span class="keyword">using</span> RepositoryPatternPractice.Models;</span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> <span class="title">RepositoryPatternPractice.Repositories</span></span><br><span class="line">&#123;</span><br><span class="line">	<span class="comment">//derive from my generic Repository interface</span></span><br><span class="line">	<span class="comment">//C# allow this interface chain</span></span><br><span class="line">	<span class="keyword">public</span> <span class="keyword">interface</span> <span class="title">IBookRepository</span> : <span class="title">IRepository</span>&lt;<span class="title">Book</span>&gt;</span><br><span class="line">	&#123;</span><br><span class="line">		<span class="function">IEnumerable&lt;Book&gt; <span class="title">GetTopSellingBooks</span>(<span class="params"><span class="built_in">int</span> count</span>)</span>;</span><br><span class="line">		<span class="function">IEnumerable&lt;Book&gt; <span class="title">GetBooksByAuthor</span>(<span class="params">Author author</span>)</span>;</span><br><span class="line">	&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<figure class="highlight c#"><figcaption><span>BookRepository</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> System;</span><br><span class="line"><span class="keyword">using</span> RepositoryPatternPractice.Models;</span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> <span class="title">RepositoryPatternPractice.Repositories</span></span><br><span class="line">&#123;</span><br><span class="line">	<span class="comment">//inheritance, implementation</span></span><br><span class="line">	<span class="keyword">public</span> <span class="keyword">class</span> <span class="title">BookRepository</span> : <span class="title">Repository</span>&lt;<span class="title">Book</span>&gt;, <span class="title">IBookRepository</span></span><br><span class="line">	&#123;</span><br><span class="line">        <span class="comment">//Property</span></span><br><span class="line">        <span class="keyword">public</span> MypostgresContext MypostgresContext</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">get</span> &#123; <span class="keyword">return</span> Context <span class="keyword">as</span> MypostgresContext; &#125;   </span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">		<span class="function"><span class="keyword">public</span> <span class="title">BookRepository</span>(<span class="params">MypostgresContext context</span>) : <span class="title">base</span>(<span class="params">context</span>)</span></span><br><span class="line">		&#123;</span><br><span class="line">		&#125;</span><br><span class="line"></span><br><span class="line">        IEnumerable&lt;Book&gt; IBookRepository.GetBooksByAuthor(Author author)</span><br><span class="line">        &#123;</span><br><span class="line">           <span class="keyword">return</span> MypostgresContext.Books.Where(x=&gt;x.AuthorId==author.AuthorId).ToList();</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        IEnumerable&lt;Book&gt; IBookRepository.GetTopSellingBooks(<span class="built_in">int</span> count)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">return</span> MypostgresContext.Books.OrderByDescending(b =&gt; b.Price).Take(count).ToList();</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<hr>
<h3 id="IAuthorRepository-AuthorRepository"><a href="#IAuthorRepository-AuthorRepository" class="headerlink" title="IAuthorRepository, AuthorRepository"></a>IAuthorRepository, AuthorRepository</h3><figure class="highlight c#"><figcaption><span>IAuthorRepository</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> System;</span><br><span class="line"><span class="keyword">using</span> RepositoryPatternPractice.Models;</span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> <span class="title">RepositoryPatternPractice.Repositories</span></span><br><span class="line">&#123;</span><br><span class="line">	<span class="keyword">public</span> <span class="keyword">interface</span> <span class="title">IAuthorRepository</span> : <span class="title">IRepository</span>&lt;<span class="title">Author</span>&gt;</span><br><span class="line">	&#123;</span><br><span class="line">		<span class="function">Author <span class="title">GetAuthorByName</span>(<span class="params"><span class="built_in">string</span> name</span>)</span>;</span><br><span class="line">	&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<figure class="highlight c#"><figcaption><span>AuthorRepository</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> System;</span><br><span class="line"><span class="keyword">using</span> RepositoryPatternPractice.Models;</span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> <span class="title">RepositoryPatternPractice.Repositories</span></span><br><span class="line">&#123;</span><br><span class="line">	<span class="keyword">public</span> <span class="keyword">class</span> <span class="title">AuthorRepository</span> : <span class="title">Repository</span>&lt;<span class="title">Author</span>&gt;, <span class="title">IAuthorRepository</span></span><br><span class="line">	&#123;</span><br><span class="line">		<span class="keyword">public</span> MypostgresContext MypostgresContext</span><br><span class="line">        &#123;</span><br><span class="line">			<span class="keyword">get</span> &#123; <span class="keyword">return</span> Context <span class="keyword">as</span> MypostgresContext; &#125;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">		<span class="function"><span class="keyword">public</span> <span class="title">AuthorRepository</span>(<span class="params">MypostgresContext context</span>) : <span class="title">base</span>(<span class="params">context</span>)</span> </span><br><span class="line">		&#123;</span><br><span class="line">		&#125;</span><br><span class="line"></span><br><span class="line">		<span class="function"><span class="keyword">public</span> Author <span class="title">GetAuthorByName</span>(<span class="params"><span class="built_in">string</span> name</span>)</span></span><br><span class="line">		&#123;</span><br><span class="line">			<span class="keyword">return</span> MypostgresContext.Authors.Where(x=&gt;x.AuthorName.Equals(name)).ToList().FirstOrDefault();</span><br><span class="line">		&#125;</span><br><span class="line">	&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<h3 id="IUnitOfWork-UnitOfWork"><a href="#IUnitOfWork-UnitOfWork" class="headerlink" title="IUnitOfWork, UnitOfWork"></a>IUnitOfWork, UnitOfWork</h3><p><strong>這裡開始是unit of work的實作</strong></p>
<figure class="highlight c#"><figcaption><span>IUnitOfWork</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> System;</span><br><span class="line"><span class="keyword">using</span> RepositoryPatternPractice.Repositories;</span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> <span class="title">RepositoryPatternPractice</span></span><br><span class="line">&#123;</span><br><span class="line">	<span class="comment">//Unit of work</span></span><br><span class="line">	<span class="comment">//interfce chain with IDisposable, so the class implement this interface </span></span><br><span class="line">    <span class="comment">//need to implement the Dispose() method</span></span><br><span class="line">	<span class="keyword">public</span> <span class="keyword">interface</span> <span class="title">IUnitOfWork</span> : <span class="title">IDisposable</span></span><br><span class="line">	&#123;</span><br><span class="line">		<span class="comment">//Repository act the collection of objects in memory</span></span><br><span class="line">		IBookRepository Books &#123; <span class="keyword">get</span>; &#125;</span><br><span class="line">		IAuthorRepository Authors &#123; <span class="keyword">get</span>; &#125;</span><br><span class="line">		<span class="function"><span class="built_in">int</span> <span class="title">Complete</span>()</span>;</span><br><span class="line">	&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<figure class="highlight c#"><figcaption><span>UnitOfWork</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> System;</span><br><span class="line"><span class="keyword">using</span> RepositoryPatternPractice.Models;</span><br><span class="line"><span class="keyword">using</span> RepositoryPatternPractice.Repositories;</span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> <span class="title">RepositoryPatternPractice</span></span><br><span class="line">&#123;</span><br><span class="line">	<span class="keyword">public</span> <span class="keyword">class</span> <span class="title">UnitOfWork</span> : <span class="title">IUnitOfWork</span></span><br><span class="line">	&#123;</span><br><span class="line">		<span class="keyword">private</span> <span class="keyword">readonly</span> MypostgresContext _context;</span><br><span class="line"></span><br><span class="line">		<span class="keyword">public</span> IBookRepository Books &#123; <span class="keyword">get</span>; <span class="keyword">private</span> <span class="keyword">set</span>; &#125;</span><br><span class="line">		<span class="keyword">public</span> IAuthorRepository Authors &#123; <span class="keyword">get</span>; <span class="keyword">private</span> <span class="keyword">set</span>; &#125;</span><br><span class="line"></span><br><span class="line">		<span class="comment">//we will use this context across all repositories</span></span><br><span class="line">		<span class="function"><span class="keyword">public</span> <span class="title">UnitOfWork</span>(<span class="params">MypostgresContext context</span>)</span></span><br><span class="line">		&#123;</span><br><span class="line">			<span class="keyword">this</span>._context = context;</span><br><span class="line">			<span class="comment">//use the same context to initialize our repository</span></span><br><span class="line">			Books = <span class="keyword">new</span> BookRepository(_context);</span><br><span class="line">			Authors = <span class="keyword">new</span> AuthorRepository(_context);</span><br><span class="line">		&#125;</span><br><span class="line"></span><br><span class="line">		<span class="function"><span class="keyword">public</span> <span class="built_in">int</span> <span class="title">Complete</span>()</span></span><br><span class="line">        &#123;</span><br><span class="line">			<span class="keyword">return</span> <span class="keyword">this</span>._context.SaveChanges();</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">		<span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">Dispose</span>()</span></span><br><span class="line">		&#123;</span><br><span class="line">			_context.Dispose(); <span class="comment">//dispose the context</span></span><br><span class="line">		&#125;</span><br><span class="line">	&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>
<p>這裡可以發現, UnitOfWork將Repository作為自己的屬性, 這樣一來使用者就可以透過ＵnitOfWork點出Repository再做操作.<br>而Complete()方法, 就是拿來Save用的, 儲存所有異動<br>最後實作Dispose是因為可以透過using block釋放context<br>在每一次的operation, Repository應該使用同一個context (re-use)<br>但如果你有使用DI框架的話, 可以直接將Unit of work DI進去, 就不用使用using block來去釋放context. (主要是管理Transaction的session要釋放)</p>
<h2 id="使用"><a href="#使用" class="headerlink" title="使用"></a>使用</h2><p><strong>這裡是使用範例</strong></p>
<figure class="highlight c#"><figcaption><span>使用範例</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> System;</span><br><span class="line"><span class="keyword">using</span> RepositoryPatternPractice.Models;</span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> <span class="title">RepositoryPatternPractice</span></span><br><span class="line">&#123;</span><br><span class="line">   <span class="keyword">class</span> <span class="title">Program</span></span><br><span class="line">   &#123;</span><br><span class="line">        <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">void</span> <span class="title">Main</span>(<span class="params"><span class="built_in">string</span>[] args</span>)</span></span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">//example of using these interfaces and classes</span></span><br><span class="line">            <span class="comment">//using block like try/finally and call Dispose()</span></span><br><span class="line">            <span class="keyword">using</span> ( <span class="keyword">var</span> unitOfWork = <span class="keyword">new</span> UnitOfWork(<span class="keyword">new</span> MypostgresContext()) )</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="comment">//Example 1</span></span><br><span class="line">                <span class="keyword">var</span> books = unitOfWork.Books.GetAll();</span><br><span class="line">                Console.WriteLine(<span class="string">&quot;Initial State: &quot;</span>);</span><br><span class="line">                <span class="keyword">foreach</span> (Book b <span class="keyword">in</span> books)</span><br><span class="line">                &#123;</span><br><span class="line">                    Console.WriteLine(<span class="string">$&quot;book: <span class="subst">&#123;b.BookName&#125;</span>, author: <span class="subst">&#123;unitOfWork.Authors.Get(b.AuthorId).AuthorName&#125;</span>&quot;</span>);</span><br><span class="line">                &#125;</span><br><span class="line">                Console.WriteLine();</span><br><span class="line"></span><br><span class="line">                <span class="comment">//Example 2</span></span><br><span class="line">                unitOfWork.Books.AddRange(<span class="keyword">new</span> List&lt;Book&gt;() &#123;</span><br><span class="line">                    <span class="keyword">new</span> Book() &#123; BookName=<span class="string">&quot;AIGuide&quot;</span>, Price=<span class="number">300</span>, Author=unitOfWork.Authors.GetAuthorByName(<span class="string">&quot;Xuan&quot;</span>)&#125;,</span><br><span class="line">                    <span class="keyword">new</span> Book() &#123; BookName=<span class="string">&quot;PSGuide&quot;</span>, Price=<span class="number">230</span>, Author=unitOfWork.Authors.GetAuthorByName(<span class="string">&quot;Xuan&quot;</span>)&#125;</span><br><span class="line">                &#125;);</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">                <span class="comment">//Example3</span></span><br><span class="line">                unitOfWork.Books.Get(<span class="number">2</span>).BookName = <span class="string">&quot;C#dotNETGuide&quot;</span>;</span><br><span class="line"></span><br><span class="line">                unitOfWork.Complete();</span><br><span class="line"></span><br><span class="line">                Console.WriteLine(<span class="string">&quot;After changes: &quot;</span>);</span><br><span class="line">                books = unitOfWork.Books.GetAll();</span><br><span class="line">                <span class="keyword">foreach</span> (Book b <span class="keyword">in</span> books)</span><br><span class="line">                &#123;</span><br><span class="line">                    Console.WriteLine(<span class="string">$&quot;book: <span class="subst">&#123;b.BookName&#125;</span>, author: <span class="subst">&#123;unitOfWork.Authors.Get(b.AuthorId).AuthorName&#125;</span>&quot;</span>);</span><br><span class="line">                &#125;</span><br><span class="line">                Console.WriteLine();</span><br><span class="line">            &#125;</span><br><span class="line">            Console.ReadKey();</span><br><span class="line">        &#125;</span><br><span class="line">   &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<h2 id="結語"><a href="#結語" class="headerlink" title="結語"></a>結語</h2><p>如果想要做到非同步, 只要把Repository內部的實作改成非同步就好, 其餘都一樣.<br>其實Repository pattern, Unit of Work pattern很常會一起使用, 甚至搭配DI, 但這裡為求簡單就沒有使用DI了<br>希望大家看完這篇文章對於Repository Pattern和UnitOfWork Pattern更加理解</p>
]]></content>
      <categories>
        <category>OOP</category>
        <category>Design Pattern</category>
      </categories>
      <tags>
        <tag>OOP</tag>
        <tag>.NET C#</tag>
        <tag>design-pattern</tag>
      </tags>
  </entry>
  <entry>
    <title>[.NET][C#][Design Pattern] - Mediator Pattern (中介者模式)</title>
    <url>/posts/4072203086/</url>
    <content><![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>這篇文章主要介紹Mediator Pattern的定義以及簡單實作<br>介紹使用Mediator Pattern的好處, 並提供範例<br>在軟體架構中Mediator Patternc和Repository Pattern都是常用的pattern<br>通常.NET專案中會結合MediatR套件使用</p>
<span id="more"></span>

<h2 id="定義-Mediator-Pattern"><a href="#定義-Mediator-Pattern" class="headerlink" title="定義 Mediator Pattern"></a>定義 Mediator Pattern</h2><p>IG貼文：<br><span class="exturl" data-url="aHR0cHM6Ly93d3cuaW5zdGFncmFtLmNvbS9wL0NpTTBlenBMTzR6Lz9obD16aC10dw==">Mediator Pattern Post<i class="fa fa-external-link-alt"></i></span></p>
<p>GitHub連結：<br><span class="exturl" data-url="aHR0cHM6Ly9naXRodWIuY29tL21hby1jb2RlL01lZGlhdG9yUGF0dGVybg==">Sample Code<i class="fa fa-external-link-alt"></i></span></p>
<p>定義一個 Mediator 物件用來<strong>封裝一組物件的互動方式</strong>。<br>Mediator 藉由<strong>避免物件間相互直接的引用</strong>，從而降低它們之間的耦合程度，並且可以讓我們獨立地改變這些物件間的互動方式。</p>
<p>下面這兩張圖片可以簡單體現出Mediator Pattern在做的事</p>
<p><strong>沒有使用中介者, 物件彼此之間直接調用</strong><br><img data-src="/images/posts/mediator-pattern/origin-graph.png" 
style="width: 70%; margin: 15px auto;"></p>
<p><strong>使用中介者, 物件只依賴mediator來與其他物件溝通</strong><br><img data-src="/images/posts/mediator-pattern/star.png" 
style="width: 70%; margin: 15px auto;"></p>
<p>可以發現使用中介者模式之後, 物件都只依賴中介者來傳遞訊息, 而不是直接調用彼此, 以此達到解耦。<br>在現實生活中, 也有類似的例子, 例如開發團隊, 客服團隊, 產銷團隊, 設計團隊, 若彼此之間沒有一個統一的溝通,<br>那麼各個團隊耦合度很高, 開發跟設計要協調介面, 又要和產銷和客服討論如何贏得如何符合市場, 這樣一來分工太複雜。<br>所以需要一個<strong>產品經理來當中介者</strong>, 協助溝通各個團隊。</p>
<h2 id="優缺點"><a href="#優缺點" class="headerlink" title="優缺點"></a>優缺點</h2><ul>
<li>優點 <ol>
<li>降低物件之間的耦合性，讓物件容易重複使用。</li>
<li>物件之間一對多的關聯性變成一對一，提高系統靈活性，也讓整體容易維護及擴充。</li>
</ol>
</li>
<li>缺點 (Trade-off)<ol>
<li>同事類別過多時，中介者責任很大，會使系統提升一定程度的複雜性。</li>
</ol>
</li>
</ul>
<h2 id="UML與成員"><a href="#UML與成員" class="headerlink" title="UML與成員"></a>UML與成員</h2><p><img data-src="/images/posts/mediator-pattern/uml.png" 
style="width: 70%; margin: 15px auto;"></p>
<table>
    <thead>
        <th>成員</th>
        <th>定義</th>
    </thead>
    <tbody>
        <tr>
            <td>Mdiator</td>
            <td>抽象中介者, 定義註冊進入mediator以及轉發的方法</td>
        </tr>
        <tr>
            <td>ConcreteMediator</td>
            <td>具體中介者, 定義一個集合來管理同事, 所以聚合Colleague</td>
        </tr>
        <tr>
            <td>Colleague</td>
            <td>抽象同事, 可以保存中介者, 調用內部方法, 所以聚合Mediator</td>
        </tr>
        <tr>
            <td>ConcreteColleague</td>
            <td>具體同事, 當要跟其他物件溝通時, 利用內部的中介者進行轉發</td>
        </tr>
    </tbody>
</table>

<h2 id="實作"><a href="#實作" class="headerlink" title="實作"></a>實作</h2><p><strong>專案結構</strong><br><img data-src="/images/posts/mediator-pattern/structure.png" 
style="width: 70%; margin: 15px auto;"></p>
<p><strong>抽象Mediator</strong></p>
<figure class="highlight c#"><figcaption><span>Mediator(抽象)</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> System;</span><br><span class="line"><span class="keyword">using</span> MediatorPattern.colleagues;</span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> <span class="title">MediatorPattern.mediator</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="comment">// 團隊列舉</span></span><br><span class="line">    <span class="keyword">public</span> <span class="built_in">enum</span> teamType</span><br><span class="line">    &#123;</span><br><span class="line">        ENGINEERING, DESIGN, SERVICE, MARKETING</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 抽象mediator</span></span><br><span class="line">    <span class="keyword">public</span> <span class="keyword">abstract</span> <span class="keyword">class</span> <span class="title">Mediator</span></span><br><span class="line">	&#123;</span><br><span class="line">		<span class="function"><span class="keyword">public</span> <span class="keyword">abstract</span> <span class="keyword">void</span> <span class="title">Register</span>(<span class="params">teamType type, Colleague colleague</span>)</span>; <span class="comment">//註冊進入mediator</span></span><br><span class="line">		<span class="function"><span class="keyword">public</span> <span class="keyword">abstract</span> <span class="keyword">void</span> <span class="title">Relay</span>(<span class="params">teamType type, <span class="built_in">string</span> msg</span>)</span>; <span class="comment">//轉發;傳遞</span></span><br><span class="line">	&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<p><strong>PackageManager(具體Mediator)</strong></p>
<figure class="highlight c#"><figcaption><span>PackageManager(具體Mediator)</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> System;</span><br><span class="line"><span class="keyword">using</span> MediatorPattern.colleagues;</span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> <span class="title">MediatorPattern.mediator</span></span><br><span class="line">&#123;</span><br><span class="line">	<span class="comment">// 具體mediator</span></span><br><span class="line">	<span class="keyword">public</span> <span class="keyword">class</span> <span class="title">ProductManager</span> : <span class="title">Mediator</span></span><br><span class="line">	&#123;</span><br><span class="line">        <span class="comment">// 聚合Colleague, 一個儲存colleague物件的dictionary</span></span><br><span class="line">        <span class="keyword">private</span> Dictionary&lt;teamType, Colleague&gt; colleagues = <span class="keyword">new</span> Dictionary&lt;teamType, Colleague&gt;();</span><br><span class="line"></span><br><span class="line">        <span class="function"><span class="keyword">public</span> <span class="keyword">override</span> <span class="keyword">void</span> <span class="title">Register</span>(<span class="params">teamType type, Colleague colleague</span>)</span></span><br><span class="line">        &#123;</span><br><span class="line">            colleague.setMediator(<span class="keyword">this</span>);</span><br><span class="line">            <span class="keyword">this</span>.colleagues.Add(type, colleague);</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="function"><span class="keyword">public</span> <span class="keyword">override</span> <span class="keyword">void</span> <span class="title">Relay</span>(<span class="params">teamType type, <span class="built_in">string</span> msg</span>)</span></span><br><span class="line">        &#123;</span><br><span class="line">            Colleague toColleague = <span class="keyword">this</span>.colleagues[type];</span><br><span class="line">            toColleague.receive(msg);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<p><strong>抽象Colleague</strong></p>
<figure class="highlight c#"><figcaption><span>Colleague(抽象)</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> System;</span><br><span class="line"><span class="keyword">using</span> MediatorPattern.mediator;</span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> <span class="title">MediatorPattern.colleagues</span></span><br><span class="line">&#123;</span><br><span class="line">	<span class="keyword">public</span> <span class="keyword">abstract</span> <span class="keyword">class</span> <span class="title">Colleague</span></span><br><span class="line">	&#123;</span><br><span class="line">		<span class="keyword">protected</span> Mediator? Mediator &#123; <span class="keyword">get</span>; <span class="keyword">private</span> <span class="keyword">set</span>; &#125;</span><br><span class="line"></span><br><span class="line">		<span class="function"><span class="keyword">public</span> <span class="keyword">void</span> <span class="title">setMediator</span>(<span class="params">Mediator mediator</span>)</span></span><br><span class="line">		&#123;</span><br><span class="line">            <span class="keyword">this</span>.Mediator = mediator;</span><br><span class="line">		&#125;</span><br><span class="line"></span><br><span class="line">		<span class="comment">// 接收訊息後如何反應</span></span><br><span class="line">		<span class="function"><span class="keyword">public</span> <span class="keyword">abstract</span> <span class="keyword">void</span> <span class="title">receive</span>(<span class="params"><span class="built_in">string</span> msg</span>)</span>;</span><br><span class="line"></span><br><span class="line">		<span class="comment">// 傳送訊息給特定物件</span></span><br><span class="line">		<span class="function"><span class="keyword">public</span> <span class="keyword">abstract</span> <span class="keyword">void</span> <span class="title">send</span>(<span class="params">teamType type, <span class="built_in">string</span> msg</span>)</span>;</span><br><span class="line">	&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<p><strong>具體Colleague</strong></p>
<figure class="highlight c#"><table><tr><td class="code"><pre><span class="line"><span class="comment">//具體colleague</span></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">DesignTeam</span> : <span class="title">Colleague</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">override</span> <span class="keyword">void</span> <span class="title">receive</span>(<span class="params"><span class="built_in">string</span> msg</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        Console.WriteLine(<span class="string">&quot;設計團隊收到訊息: &quot;</span> + msg);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">override</span> <span class="keyword">void</span> <span class="title">send</span>(<span class="params">teamType type, <span class="built_in">string</span> msg</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        Console.WriteLine(<span class="string">&quot;設計團隊發送訊息: &quot;</span>+msg);</span><br><span class="line">        <span class="keyword">if</span> (<span class="keyword">this</span>.Mediator != <span class="literal">null</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">this</span>.Mediator.Relay(type, msg);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">EngineeringTeam</span> : <span class="title">Colleague</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">override</span> <span class="keyword">void</span> <span class="title">receive</span>(<span class="params"><span class="built_in">string</span> msg</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        Console.WriteLine(<span class="string">&quot;工程師團隊收到訊息: &quot;</span>+msg);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">override</span> <span class="keyword">void</span> <span class="title">send</span>(<span class="params">teamType type, <span class="built_in">string</span> msg</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        Console.WriteLine(<span class="string">&quot;工程師團隊發送訊息: &quot;</span>+msg);</span><br><span class="line">        <span class="keyword">if</span> (<span class="keyword">this</span>.Mediator != <span class="literal">null</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">this</span>.Mediator.Relay(type, msg);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">MarketingTeam</span> : <span class="title">Colleague</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">override</span> <span class="keyword">void</span> <span class="title">receive</span>(<span class="params"><span class="built_in">string</span> msg</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        Console.WriteLine(<span class="string">&quot;產銷團隊收到訊息: &quot;</span> + msg);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">override</span> <span class="keyword">void</span> <span class="title">send</span>(<span class="params">teamType type, <span class="built_in">string</span> msg</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        Console.WriteLine(<span class="string">&quot;產銷團隊發送訊息: &quot;</span>+msg);</span><br><span class="line">        <span class="keyword">if</span> (<span class="keyword">this</span>.Mediator != <span class="literal">null</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">this</span>.Mediator.Relay(type, msg);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> <span class="title">ServiceTeam</span> : <span class="title">Colleague</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">override</span> <span class="keyword">void</span> <span class="title">receive</span>(<span class="params"><span class="built_in">string</span> msg</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        Console.WriteLine(<span class="string">&quot;客服團隊收到訊息: &quot;</span> + msg);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">public</span> <span class="keyword">override</span> <span class="keyword">void</span> <span class="title">send</span>(<span class="params">teamType type, <span class="built_in">string</span> msg</span>)</span></span><br><span class="line">    &#123;</span><br><span class="line">        Console.WriteLine(<span class="string">&quot;客服團隊發送訊息: &quot;</span>+msg);</span><br><span class="line">        <span class="keyword">if</span> (<span class="keyword">this</span>.Mediator != <span class="literal">null</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">this</span>.Mediator.Relay(type, msg);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<p><strong>Client端使用</strong></p>
<figure class="highlight c#"><figcaption><span>Program.cs</span></figcaption><table><tr><td class="code"><pre><span class="line"><span class="keyword">using</span> MediatorPattern.colleagues;</span><br><span class="line"><span class="keyword">using</span> MediatorPattern.mediator;</span><br><span class="line"></span><br><span class="line"><span class="comment">// 具體mediator</span></span><br><span class="line">ProductManager pm = <span class="keyword">new</span> ProductManager();</span><br><span class="line"></span><br><span class="line"><span class="comment">//具體colleague</span></span><br><span class="line">Colleague ds = <span class="keyword">new</span> DesignTeam();</span><br><span class="line">Colleague eg = <span class="keyword">new</span> EngineeringTeam();</span><br><span class="line">Colleague mk = <span class="keyword">new</span> MarketingTeam();</span><br><span class="line">Colleague sv = <span class="keyword">new</span> ServiceTeam();</span><br><span class="line"></span><br><span class="line"><span class="comment">//註冊進入mediator</span></span><br><span class="line">pm.Register(teamType.DESIGN, ds);</span><br><span class="line">pm.Register(teamType.ENGINEERING, eg);</span><br><span class="line">pm.Register(teamType.MARKETING, mk);</span><br><span class="line">pm.Register(teamType.SERVICE, sv);</span><br><span class="line"></span><br><span class="line"><span class="comment">//執行</span></span><br><span class="line">ds.send(teamType.ENGINEERING, <span class="string">&quot;UI設計稿完成&quot;</span>);</span><br><span class="line">Console.WriteLine(String.Concat(Enumerable.Repeat(<span class="string">&quot;-&quot;</span>, <span class="number">10</span>)));</span><br><span class="line"></span><br><span class="line">eg.send(teamType.MARKETING, <span class="string">&quot;軟體開發完成&quot;</span>);</span><br><span class="line">Console.WriteLine(String.Concat(Enumerable.Repeat(<span class="string">&quot;-&quot;</span>, <span class="number">10</span>)));</span><br><span class="line"></span><br><span class="line">Console.ReadKey();</span><br></pre></td></tr></table></figure>

<p><strong>結果</strong><br><img data-src="/images/posts/mediator-pattern/result.png" 
style="width: 70%; margin: 15px auto;"></p>
<h2 id="疑問"><a href="#疑問" class="headerlink" title="疑問"></a>疑問</h2><ol>
<li>DI不也是讓物件間解耦嗎？他們有什麼不同？<ol>
<li>應用場景不同：Mediator模式主要用於解耦合多個相互協作的物件，特別是在複雜的互動場景中。DI則主要用於管理和注入依賴關係，以實現鬆耦合和可測試的程式碼。</li>
<li>解決不同問題：Mediator解決的是物件之間的協作和通信問題，而DI解決的是依賴關係管理的問題。</li>
<li>實現方式不同：Mediator需要一個中介者來協調物件之間的互動，通常由一個專門的Mediator類別來實現。DI則是一種設計模式，它可以與不同語言和框架一起使用，並不需要特定的中介者類別。</li>
</ol>
</li>
</ol>
<h2 id="結語"><a href="#結語" class="headerlink" title="結語"></a>結語</h2><ul>
<li>定義:<br>  定義一個 Mediator 物件用來封裝一組物件的互動方式。Mediator 藉由避免物件間相互直接的引用，從而降低它們之間的耦合程度，並且可以讓我們獨立地改變這些物件間的互動方式。</li>
<li>成員<ul>
<li>抽象Mediator: 定義方法, 轉接訊息</li>
<li>具體Mediator: 實作抽象Mediator, Colleague (擁有0~多個Colleague)</li>
<li>抽象Colleague: 定義方法, 聚合抽象Mediator (Colleague擁有Mediator)</li>
<li>具體Colleague: 實作抽象Mediator</li>
</ul>
</li>
<li>優點 <ol>
<li>降低物件之間的耦合性，讓物件容易重複使用。</li>
<li>物件之間一對多的關聯性變成一對一，提高系統靈活性，也讓整體容易維護及擴充。</li>
</ol>
</li>
<li>缺點 (Trade-off)<ol>
<li>同事類別過多時，中介者責任很大，會使系統提升一定程度的複雜性。</li>
</ol>
</li>
</ul>
]]></content>
      <categories>
        <category>OOP</category>
        <category>Design Pattern</category>
      </categories>
      <tags>
        <tag>OOP</tag>
        <tag>.NET C#</tag>
        <tag>design-pattern</tag>
      </tags>
  </entry>
  <entry>
    <title>[NLP] MLLM(多模態LLM) 論文簡要欣賞</title>
    <url>/posts/2312529159/</url>
    <content><![CDATA[<h1 id="Abstract-Overview"><a href="#Abstract-Overview" class="headerlink" title="Abstract &amp; Overview"></a>Abstract &amp; Overview</h1><p>這篇文章我們來欣賞一下在arXiv上的一篇論文：<br><span class="exturl" data-url="aHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzIzMDYuMTM1NDk=">A Servey on Multimodal Large Language Models<i class="fa fa-external-link-alt"></i></span><br>這篇論文主要在介紹與整理現行代表性的MLLM(Multimodal LLM)，並將其分類為4大體裁：<br>Multimodal Instruction Tuning(M-IT), Multimodal In-Context Learning(M-ICL), Multimodal Chain-of_Thought(M-CoT)以及LLM-Aided Visual Reasoning(LAVR)。<br>其中，前三者為MLLM的基礎，而最後一個則是以LLM為核心的multimodal system，類似於一個系統框架。</p>
<p>MLLM指的是在LLM的基礎上，從單模態走向多模態，從人工智慧的角度來看，MLLM比LLM向前又跨出了一步，原因如下：</p>
<ol>
<li>MLLM更符合人的感官世界，自然地接受多感官輸入</li>
<li>MLLM提供一個友好的介面，支持多模態輸入，使其易於與使用者交流</li>
<li>MLLM是一個更全面的問題解決者，雖然LLM可以解決NLP的問題，但MLLM通常可以支持更大範圍的任務</li>
</ol>
<p>下面我將針對這篇論文中較為<strong>重要的部分做簡要的欣賞與分析</strong> ，其他細節部分可以自行閱讀這篇精彩的論文！</p>
<span id="more"></span>
<h1 id="Prerequest-Note"><a href="#Prerequest-Note" class="headerlink" title="Prerequest &amp; Note"></a>Prerequest &amp; Note</h1><p>在開始之前，先介紹在論文中常出現的一些基本名詞</p>
<h2 id="IT-ICL-CoT"><a href="#IT-ICL-CoT" class="headerlink" title="IT, ICL, CoT"></a>IT, ICL, CoT</h2><p>讓我們先看一下這三者的定義：</p>
<ol>
<li><strong>Instruction Tuning:</strong><ul>
<li>Instruction tuning involves fine-tuning a pre-trained language model by providing <strong>specific instructions during the training process.</strong></li>
<li>These instructions can be in the form of <strong>prompts or demonstrations</strong>, guiding the model to generate responses in a desired manner.</li>
<li>It allows for controlled language generation and can be useful for generating responses tailored to specific tasks or contexts.</li>
</ul>
</li>
<li><strong>In Context Learning:</strong><ul>
<li>In context learning refers to training a language model on a <strong>large dataset with diverse contexts and topics.</strong></li>
<li>The model learns to <strong>understand</strong> and generate responses based on the context it is given, which allows for more <strong>contextually relevant and coherent responses.</strong></li>
<li>This approach is suitable for applications where the model needs to <strong>understand and adapt to various contexts</strong> during language generation.</li>
</ul>
</li>
<li><strong>Chain of Thought:</strong><ul>
<li>Chain of thought is a concept in natural language generation where the model maintains <strong>coherence throughout a conversation by remembering past interactions.</strong></li>
<li>The model keeps track of the conversation history, ensuring <strong>consistent</strong> responses and avoiding contradictions.</li>
<li>This is especially valuable in chatbot applications to provide more human-like and coherent conversations.</li>
</ul>
</li>
</ol>
<p>簡單整理一下三者的差異：</p>
<blockquote>
<ul>
<li>Instruction Tuning 著重於微調模型以遵循特定的指令或指導，使其更好地符合特定任務或需求的生成。</li>
<li>In-Context Learning 強調理解上下文，確保生成的結果與上下文相關且連貫。</li>
<li>Chain-of-Thought 強調模型生成文本時的連貫性，能夠記住過去的互動，使生成的文本像人類思維的連貫性一樣。<br>這三個方法或概念是獨立的，但可以搭配使用，創造更好的效果。</li>
</ul>
</blockquote>
<h2 id="Zero-Shot-Few-Shot-Learning"><a href="#Zero-Shot-Few-Shot-Learning" class="headerlink" title="Zero-Shot, Few-Shot Learning"></a>Zero-Shot, Few-Shot Learning</h2><p>先看一下兩者的定義：</p>
<ul>
<li><p><strong>Zero-Shot Learning:</strong></p>
<ul>
<li>Zero-shot learning is a learning paradigm where a model is trained to perform a task for which <strong>it has never seen any examples during training.</strong> </li>
<li>In other words, the model is expected to <strong>generalize its knowledge</strong> from the training data to <strong>new, unseen categories or tasks.</strong></li>
</ul>
</li>
<li><p><strong>Few-Shot Learning:</strong></p>
<ul>
<li>Few-shot learning is a similar concept, but it allows a model to be trained on a very small number of examples for each new category or task it encounters. * </li>
<li>Instead of needing a large amount of data per class, few-shot learning aims to generalize from a few examples. </li>
<li>It’s like teaching a model to learn from just a handful of samples. This is particularly useful in scenarios where collecting a substantial amount of data for each new class is impractical.</li>
</ul>
</li>
</ul>
<h2 id="Note"><a href="#Note" class="headerlink" title="Note"></a>Note</h2><p>這邊整理了一些重要的專有名詞</p>
<ul>
<li>Learning Paradigm<ul>
<li>A learning paradigm refers to a <strong>specific approach or framework used to train machine learning models.</strong> </li>
<li>It encompasses the fundamental principles, methods, and strategies that guide how models are constructed, trained, and evaluated. </li>
<li>Different learning paradigms offer distinct ways of addressing various types of problems and data.</li>
<li>E.g. Supervised Learning, Semi-Supervised Learning, Reinforcement Learning, Transfer Learning, etc.</li>
</ul>
</li>
</ul>
<p>簡單整理一下兩者的差異：</p>
<blockquote>
<ul>
<li>Zero-Shot Learning 利用先前學習的知識，執行全新、未看過的任務類別。</li>
<li>Few-Shot Learning 利用少量的範例(有答案)，去適應新的任務類別。</li>
</ul>
</blockquote>
<h1 id="M-IT"><a href="#M-IT" class="headerlink" title="M-IT"></a>M-IT</h1><p>Instruction指的是對於任務的描述，IT是一種利用instruction-formatted datasets來訓練LLM的技巧。利用這個技巧，LLM可以透過跟隨新的instruction來泛化未曾看過的新任務，進而實現zero-shot learning的成效。<br>下面是論文中引用來描繪instruction tuning與其他learning paradigms的圖示</p>
<p><img data-src="/images/posts/paper-study/MLLM-1.png" 
style="width: 70%; margin: 15px auto;"></p>
<hr>
<p>為了要從unimodality到mutimodality，需要調整的部分主要 <strong>分為兩個：”Data” 和 “Model”。</strong></p>
<ol>
<li><p>在Data方面，我們常 <strong>調整現存的benchmark datasets來取得M-IT datasets，或是使用self-instruction。</strong><br>下面的圖簡單呈現了M-IT data的模板<br><img data-src="/images/posts/paper-study/MLLM-2.png" 
style="width: 70%; margin: 15px auto;"></p>
</li>
<li><p>在Model方面，一個常見的做法是將 <strong>其他modalities的資訊注入進LLM，並把LLM當成是一個強大的推理者。</strong> 其他相關的著作常會直接將理解外來modality的能力嵌入進LLM；或是使用expert model把外來的modality轉換成LLM可以理解的natural language。</p>
</li>
</ol>
<hr>
<h2 id="Formulation-M-IT-M-IT-sample"><a href="#Formulation-M-IT-M-IT-sample" class="headerlink" title="Formulation M-IT &amp; M-IT sample"></a>Formulation M-IT &amp; M-IT sample</h2><p>這邊我簡單把論文中提到有關M-IT以及其目標函數的內容拉出來<br>M-IT sample其實可以被標示為三元素的數組：<br>$$(I, M, R)$$<br>分別代表instruction, multimodal input以及ground-truth response</p>
<p>MLLM利用給予的instruction和multimodal input預測answer<br>$$ A &#x3D; f(I, M; \theta)$$<br>其中，A代表預測的答案(answer)，$\theta$ 則代表模型中的parameters</p>
<p>而訓練的目標是要最小化loss function，而MLLM的目標是要預測下一個response的token，基於這點，loss function可以這麼表示：<br>$$\begin{equation}<br>    L(\theta) &#x3D; -\sum_{i&#x3D;1}^N\log{}p(R_i|I,R_{&lt;i};\theta)<br>\end{equation}$$<br>其中，</p>
<ul>
<li>N為ground-truth response的長度</li>
<li>使用log是因為將機率連乘轉為連加，以避免underflow</li>
<li>因為是loss function，所以加個負號，最小化loss function同時最大化預測下個token的機率</li>
<li>利用instruction與前i-1個ground truth response token預測第i個token</li>
</ul>
<hr>
<h2 id="Bridgin-the-gap-between-different-modalities"><a href="#Bridgin-the-gap-between-different-modalities" class="headerlink" title="Bridgin the gap between different modalities"></a>Bridgin the gap between different modalities</h2><p>在這篇論文中提到一個非常重要的問題：該怎麼連結不同的modalities？<br>主要有兩種方法：</p>
<ol>
<li>Learnable Interface: <ul>
<li>在LLM的其中一個模組或權重下去做調整，並插入在pre-trained visual encoder與LLM之間，作為一個可以用來訓練模態轉換的model。</li>
<li>連結不同的modalities的同時，凍結pre-trained model的parameters。</li>
<li>如何將視覺內容轉為LLM可以理解得文字格式</li>
</ul>
</li>
<li>Expert model: <ul>
<li>利用其他模型，轉換外來的modality成不需要訓練訓練的語言，例如image captioning model(不用訓練)。</li>
<li>可能不像learnable interface那樣彈性，且有information loss的風險</li>
</ul>
</li>
</ol>
<h1 id="M-ICL"><a href="#M-ICL" class="headerlink" title="M-ICL"></a>M-ICL</h1><p>先介紹兩個ICL的特點：</p>
<ol>
<li>supervised learning是從大量的資料中學習資料背後的模式，與傳統supervised learning不同的是，ICL的關鍵是”類推”，從少量的資料搭配一些選填的instruction，去外推新的問題與任務，因此使用few-shot learning的方式解決全新的問題。</li>
<li>ICL常常使用training-free的方式實作，因此可以在inference stage，很容易地整合進不同的框架。</li>
<li>IT與ICL十分相關，IT常常被拿來加強模型的ICL能力。</li>
</ol>
<hr>
<p>下面的圖示描繪了簡化過後的M-ICL query模板，其中，使用了兩個in-context範例和一個query，兩者用虛線隔開，模型的目的是要完成這個request。<br><img data-src="/images/posts/paper-study/MLLM-3.png" 
style="width: 70%; margin: 15px auto;"></p>
<h1 id="M-CoT"><a href="#M-CoT" class="headerlink" title="M-CoT"></a>M-CoT</h1><blockquote>
<p>CoT is “a series of intermediate reasoning steps”, which has been proven to be effective in complex reasoning tasks.</p>
</blockquote>
<ul>
<li>CoT主要是想讓我們在prompt LLMs時，讓他生成不只是最後的答案，而是要有推理的過程而引導到最終解答。</li>
<li>在M-CoT中有幾個重要的概念：modality bridging(解決modality gap), learning paradigms, chain configuration以及generation patterns。</li>
</ul>
<hr>
<h2 id="Modality-Bridging"><a href="#Modality-Bridging" class="headerlink" title="Modality Bridging"></a>Modality Bridging</h2><ol>
<li>使用Learnable Interface：<ul>
<li>這個方法使用一個learnable interface把visual embedding mapping到word embedding空間。</li>
<li>這個mapped embedding可以被視為是一個prompt，進而傳遞給LLMs，引出其M-CoT的能力</li>
<li>舉例來說，CoT-PT使用多個Meta-Net來作prompt tuning，Meta-Net將visual features轉換成階段性的prompt，其中，可以把Meta-Net想像為CoT-PT的其中一個模組。</li>
<li>Multimodal-CoT使用shared Transformer-based 架構，visual與textual特徵通過cross-attention進行交互。</li>
</ul>
</li>
<li>使用Expert Model：<ul>
<li>引用expert models來將visual input翻譯為texual description</li>
<li>儘管其非常直接且簡單，但在轉換的過程中可能會遇到information loss的問題</li>
</ul>
</li>
</ol>
<h2 id="Learning-Paradigms"><a href="#Learning-Paradigms" class="headerlink" title="Learning Paradigms"></a>Learning Paradigms</h2><p>learning paradigms也可以解釋為模型如何從資訊中獲取知識。大致可以分為 <strong>三種方式來習得M-CoT的能力</strong></p>
<ol>
<li>fintuning：<br>通常需要針對M-CoT的datasets</li>
<li>training-free few-shot learning：<br>同常需要手作一些in-context的範例讓模型去學如何推理</li>
<li>training-free zero-shot learning：<br>直接prompt就可以，不需要其他明顯的指引，舉例來說”Let’s think frame by frame”</li>
</ol>
<p>其中對於sample size的要求由上而下遞減</p>
<h2 id="Chain-Configuration"><a href="#Chain-Configuration" class="headerlink" title="Chain Configuration"></a>Chain Configuration</h2><p>簡單來說就是要什麼時後該停止推裡，主要有adaptive和pre-defined formation兩種方法。</p>
<ul>
<li>Adaptive:<br>要求LLMs自己決定何時該停止reasoing chains</li>
<li>Pre-defined formation<br>使用者事先設定好reasoning chains的長度</li>
</ul>
<h2 id="Generation-Pattern"><a href="#Generation-Pattern" class="headerlink" title="Generation Pattern"></a>Generation Pattern</h2><p>Reasoning chain是如何建構的？<br>主要有兩種可能</p>
<ol>
<li>Infilling-based pattern：<br>需要在上下文中去做演繹(前幾步與後幾步)，去填補邏輯漏洞</li>
<li>predicting-based pattern：<br>利用已知的條件、instruction以及過去推理的資訊去擴充reasoning chain<br>不管是哪種模式，都必須要求生成的文本必須是連續且正確的</li>
</ol>
<h1 id="LAVR"><a href="#LAVR" class="headerlink" title="LAVR"></a>LAVR</h1><p>將LLMs作為helpers，以及其他不同的角色，建構出一個支持特定任務或genral-purpose的visual reasoning system<br>跟傳統的visual reasoning system相比，LAVR有一些更好的特點：</p>
<ol>
<li>Strong generalization ability：<br>擁有眾多知識並以大量資料及訓練的模型，可以容易地泛化未看過的任務，並在zero-shot&#x2F;few-shot擁有不錯的性能。</li>
<li>Emergent abilities：<br>其定義為，在小的模型不會出現，而在大模型會出現的能力，當模型到一定的scale會湧現的能力，例如能夠看到圖片表面下的意義，像是能夠理解會和一個迷因是好笑的。</li>
<li>Better interactvity and control：<br>LLM-based system提供一個更好使用與控制的使用者介面，例如使用自然語言的query進行互動。<br>下面我將介紹論文中提到的，LAVR不同的training paradigm以及LLM在這個系統中扮演的角色</li>
</ol>
<h2 id="Training-Paradigms"><a href="#Training-Paradigms" class="headerlink" title="Training Paradigms"></a>Training Paradigms</h2><p>主要有兩種：</p>
<ol>
<li>training-free：<ul>
<li>few-shot models：<br>   需要少量的hand-crafted in-context sample，去指引LLMs產生程式或一連串的執行步驟， <strong>這些程式或執行步驟是作為其他對應模型或外部工具與模組的instructions</strong></li>
<li>zero-shot models：<br>   依賴LLMs的語言相關知識與推理能力，例如CAT使用LLMs去refine影像的caption，使其更符合使用者的需求</li>
</ul>
</li>
<li>finetuning：<br>主要是想要激活LLMs在LAVR中的planning abilities(對應到工具的使用)，以及instruction-following abilities。</li>
</ol>
<h2 id="Functions"><a href="#Functions" class="headerlink" title="Functions"></a>Functions</h2><p>LLMs在LAVR system中扮演的主要角色，主要有三種</p>
<blockquote>
<ol>
<li>LLM as a Controller</li>
<li>LLM as a Decision Maker</li>
<li>LLM as a Semantics Refiner</li>
</ol>
</blockquote>
<p>前兩者(controller &amp; decision maker)與CoT相關，因為複雜的任務需要被拆解為intermediate simpler tasks。</p>
<p>當LLMs作為controller時，任務常常是在single round完成的，而multi-round更常見於decision maker</p>
<h3 id="LLM-as-a-Controller"><a href="#LLM-as-a-Controller" class="headerlink" title="LLM as a Controller"></a>LLM as a Controller</h3><ol>
<li>Break down a complex task into simpler sub-tasks<br>常運用LLMs的CoT能力</li>
<li>Assigns these tasks to appropriate tools&#x2F;modules</li>
</ol>
<h3 id="LLM-as-a-Decision-Maker"><a href="#LLM-as-a-Decision-Maker" class="headerlink" title="LLM as a Decision Maker"></a>LLM as a Decision Maker</h3><p>在這個case中，複雜的任務會以multi-round的方式被解決，decision maker通常要滿足以下能力</p>
<ol>
<li>總結目前的上下文以及歷史資訊，並判斷目前的資訊是否足夠推導出最終解答</li>
<li>整理並總結答案，並且用user-friendly的方式呈現給使用者</li>
</ol>
<h3 id="LLM-as-a-Semantics-Refiner"><a href="#LLM-as-a-Semantics-Refiner" class="headerlink" title="LLM as a Semantics Refiner"></a>LLM as a Semantics Refiner</h3><p>使用LLMs豐富的語言與語義知識，去對最終解答做加強</p>
<h1 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h1><p>這篇論文整理的現行MLLM的資訊，並給出了主要的幾個方向，包括包括3個常見的技巧(M-IT, M-ICL, M-CoT)，和一個task-solving systems的廣泛框架(LAVR)。<br>論文中還有許多evaluation的方式，以及現在研究需要被填充的gap，其他更詳細的內容還有待讀者親自去欣賞這篇論文！</p>
<h1 id="References"><a href="#References" class="headerlink" title="References"></a>References</h1><ul>
<li><span class="exturl" data-url="aHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzIzMDYuMTM1NDk=">https://arxiv.org/abs/2306.13549<i class="fa fa-external-link-alt"></i></span></li>
<li><span class="exturl" data-url="aHR0cHM6Ly96aHVhbmxhbi56aGlodS5jb20vcC82Mzk2NjQ2MTU=">https://zhuanlan.zhihu.com/p/639664615<i class="fa fa-external-link-alt"></i></span></li>
</ul>
]]></content>
      <categories>
        <category>MLLM論文</category>
      </categories>
      <tags>
        <tag>ML</tag>
        <tag>AI</tag>
        <tag>NLP</tag>
        <tag>paper-study</tag>
      </tags>
  </entry>
</search>