sphnix updates
erdogant committed Apr 28, 2020
1 parent f4c8d52 commit 3d54ab7
Showing 22 changed files with 248 additions and 146 deletions.
Binary file added docs/figs/example_fig1a.png
Binary file added docs/figs/example_fig1b.png
Binary file modified docs/pages/doctrees/Algorithm.doctree
Binary file not shown.
Binary file modified docs/pages/doctrees/Examples.doctree
Binary file not shown.
Binary file modified docs/pages/doctrees/Save and Load.doctree
Binary file not shown.
Binary file modified docs/pages/doctrees/distfit.distfit.doctree
Binary file not shown.
Binary file modified docs/pages/doctrees/environment.pickle
Binary file not shown.
20 changes: 7 additions & 13 deletions docs/pages/html/Algorithm.html
@@ -97,10 +97,7 @@
<li class="toctree-l2"><a class="reference internal" href="#residual-sum-of-squares-rss">Residual Sum of Squares (RSS)</a></li>
<li class="toctree-l2"><a class="reference internal" href="#goodness-of-fit">Goodness-of-fit</a></li>
<li class="toctree-l2"><a class="reference internal" href="#predictions">Predictions</a></li>
<li class="toctree-l2"><a class="reference internal" href="#output-parameters">Output parameters</a><ul>
<li class="toctree-l3"><a class="reference internal" href="#fit-transform">fit_transform</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="#output-variables">Output variables</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="Performance.html">Performance</a></li>
@@ -270,23 +267,21 @@ <h2>Predictions<a class="headerlink" href="#predictions" title="Permalink to thi
</dd>
</dl>
</div>
<div class="section" id="output-parameters">
<h2>Output parameters<a class="headerlink" href="#output-parameters" title="Permalink to this headline"></a></h2>
<div class="section" id="output-variables">
<h2>Output variables<a class="headerlink" href="#output-variables" title="Permalink to this headline"></a></h2>
<p><code class="docutils literal notranslate"><span class="pre">distfit</span></code> provides several output variables.
It all starts with the initialization:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Initialize model and select popular distributions</span>
<span class="n">dist</span> <span class="o">=</span> <span class="n">distfit</span><span class="p">(</span><span class="n">alpha</span><span class="o">=</span><span class="mf">0.01</span><span class="p">)</span>
</pre></div>
</div>
<div class="section" id="fit-transform">
<h3>fit_transform<a class="headerlink" href="#fit-transform" title="Permalink to this headline"></a></h3>
<p>The object now holds variables that are set to their defaults, except for the <code class="docutils literal notranslate"><span class="pre">alpha</span></code> parameter, which is the only one provided. For more details, see the <strong>returns</strong> in the docstrings at <a class="reference internal" href="distfit.distfit.html#distfit.distfit.distfit" title="distfit.distfit.distfit"><code class="xref py py-func docutils literal notranslate"><span class="pre">distfit.distfit.distfit()</span></code></a>. In the next step, input data <em>X</em> can be provided:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Fit the distributions on the input data X</span>
<span class="n">dist</span><span class="o">.</span><span class="n">fit_transform</span><span class="p">(</span><span class="n">X</span><span class="p">)</span>
</pre></div>
</div>
<p>The object can now be fed with input data X, which adds more output variables to the object.
Feeding the object is done with the <code class="docutils literal notranslate"><span class="pre">fit</span></code> and <code class="docutils literal notranslate"><span class="pre">transform</span></code> functions. Instead of using the two functions separately, it can be done with <code class="docutils literal notranslate"><span class="pre">fit_transform</span></code>: <a class="reference internal" href="distfit.distfit.html#distfit.distfit.distfit.fit_transform" title="distfit.distfit.distfit.fit_transform"><code class="xref py py-func docutils literal notranslate"><span class="pre">distfit.distfit.distfit.fit_transform()</span></code></a>.</p>
<p>The object can now be fed with data <em>X</em> using the <code class="docutils literal notranslate"><span class="pre">fit</span></code> and <code class="docutils literal notranslate"><span class="pre">transform</span></code> functions, which adds more output variables to the object.
Instead of using the two functions separately, it can also be performed with <code class="docutils literal notranslate"><span class="pre">fit_transform</span></code>: <a class="reference internal" href="distfit.distfit.html#distfit.distfit.distfit.fit_transform" title="distfit.distfit.distfit.fit_transform"><code class="xref py py-func docutils literal notranslate"><span class="pre">distfit.distfit.distfit.fit_transform()</span></code></a>.</p>
<p>The fit_transform function outputs the variables <em>summary</em>, <em>distributions</em> and <em>model</em>.</p>
<dl class="simple">
<dt>dist.summary</dt><dd><p>The summary of the fits across the distributions.</p>
@@ -307,9 +302,9 @@ <h3>fit_transform<a class="headerlink" href="#fit-transform" title="Permalink to
</pre></div>
</div>
<dl class="simple">
<dt><strong>dist.distributions</strong> is a list containing the extracted <code class="docutils literal notranslate"><span class="pre">pdfs</span></code> from <code class="docutils literal notranslate"><span class="pre">scipy</span></code></dt><dd><p>The collected distributions.</p>
<dt><strong>dist.distributions</strong> is a list containing the extracted pdfs from <code class="docutils literal notranslate"><span class="pre">scipy</span></code></dt><dd><p>The collected distributions.</p>
</dd>
<dt><strong>dist.model</strong> contains information regarding the best scoring <code class="docutils literal notranslate"><span class="pre">pdf</span></code>:</dt><dd><ul class="simple">
<dt><strong>dist.model</strong> contains information regarding the best scoring pdf:</dt><dd><ul class="simple">
<li><p>dist.model[‘RSS’]</p></li>
<li><p>dist.model[‘name’]</p></li>
<li><p>dist.model[‘distr’]</p></li>
@@ -321,7 +316,6 @@ <h3>fit_transform<a class="headerlink" href="#fit-transform" title="Permalink to
</dd>
</dl>
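<p>To make the idea of a "best scoring pdf" concrete, here is a minimal standard-library sketch of ranking candidate distributions by their residual sum of squares against a histogram of the data. It is not distfit's implementation: the helper names (<code>histogram_density</code>, <code>rss</code>), the bin count, and the two candidates are assumptions made for illustration, and only the <code>summary</code> and <code>model</code> names mirror the variables described above.</p>

```python
import random
from statistics import NormalDist, fmean, stdev


def histogram_density(data, bins=50):
    """Return bin centers and the normalized histogram density (hypothetical helper)."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # Clamp the maximum value into the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    density = [c / (len(data) * width) for c in counts]
    return centers, density


def rss(observed, expected):
    """Residual sum of squares between two equally long sequences."""
    return sum((o - e) ** 2 for o, e in zip(observed, expected))


# Example data, comparable to the normal sample used in the examples.
random.seed(0)
X = [random.gauss(10, 3) for _ in range(2000)]
centers, density = histogram_density(X)

# Two assumed candidates: a fitted normal and a uniform over the data range.
fitted_norm = NormalDist(fmean(X), stdev(X))
uniform_height = 1.0 / (max(X) - min(X))
candidates = {
    "norm": [fitted_norm.pdf(c) for c in centers],
    "uniform": [uniform_height for _ in centers],
}

# Rank candidates by RSS; the lowest RSS is the best scoring pdf.
summary = sorted(
    ({"name": name, "RSS": rss(density, pdf)} for name, pdf in candidates.items()),
    key=lambda row: row["RSS"],
)
model = summary[0]
print(model["name"], model["RSS"])
```

<p>In this sketch <code>summary</code> plays the role of the per-distribution fit overview and <code>model</code> holds the entry with the lowest RSS.</p>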
</div>
</div>
</div>


79 changes: 49 additions & 30 deletions docs/pages/html/Examples.html
@@ -99,8 +99,8 @@
<p class="caption"><span class="caption-text">Examples</span></p>
<ul class="current">
<li class="toctree-l1 current"><a class="current reference internal" href="#">Examples</a><ul>
<li class="toctree-l2"><a class="reference internal" href="#learn-new-model-with-gridsearch-and-train-test-set">Learn new model with gridsearch and train-test set</a></li>
<li class="toctree-l2"><a class="reference internal" href="#learn-new-model-on-the-entire-data-set">Learn new model on the entire data set</a></li>
<li class="toctree-l2"><a class="reference internal" href="#fit-distribution">Fit distribution</a></li>
<li class="toctree-l2"><a class="reference internal" href="#make-predictions">Make predictions</a></li>
</ul>
</li>
</ul>
@@ -175,39 +175,58 @@
<hr class="docutils" id="code-directive" />
<div class="section" id="examples">
<h1>Examples<a class="headerlink" href="#examples" title="Permalink to this headline"></a></h1>
<div class="section" id="learn-new-model-with-gridsearch-and-train-test-set">
<h2>Learn new model with gridsearch and train-test set<a class="headerlink" href="#learn-new-model-with-gridsearch-and-train-test-set" title="Permalink to this headline"></a></h2>
<p>AAA</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Import library</span>
<span class="kn">import</span> <span class="nn">distfit</span>

<span class="c1"># Load example data set</span>
<span class="n">X</span><span class="p">,</span><span class="n">y_true</span> <span class="o">=</span> <span class="n">distfit</span><span class="o">.</span><span class="n">load_example</span><span class="p">()</span>

<span class="c1"># Retrieve URLs of malicious and normal urls:</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">distfit</span><span class="o">.</span><span class="n">fit_transform</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">y_true</span><span class="p">,</span> <span class="n">pos_label</span><span class="o">=</span><span class="s1">&#39;bad&#39;</span><span class="p">,</span> <span class="n">train_test</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">gridsearch</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>

<span class="c1"># The test error will be shown</span>
<span class="n">results</span> <span class="o">=</span> <span class="n">distfit</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
<div class="section" id="fit-distribution">
<h2>Fit distribution<a class="headerlink" href="#fit-distribution" title="Permalink to this headline"></a></h2>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Example data</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">normal</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">2000</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">5</span><span class="p">,</span><span class="mi">6</span><span class="p">,</span><span class="mi">10</span><span class="p">,</span><span class="mi">11</span><span class="p">,</span><span class="mi">12</span><span class="p">,</span><span class="mi">18</span><span class="p">,</span><span class="mi">20</span><span class="p">]</span>

<span class="c1"># Initialize</span>
<span class="n">dist</span> <span class="o">=</span> <span class="n">distfit</span><span class="p">()</span>
<span class="n">dist</span><span class="o">.</span><span class="n">fit_transform</span><span class="p">(</span><span class="n">X</span><span class="p">)</span>

<span class="c1"># Make prediction</span>
<span class="n">dist</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">y</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils align-center" id="id1">
<caption><span class="caption-text">Distribution fit</span><a class="headerlink" href="#id1" title="Permalink to this table"></a></caption>
<colgroup>
<col style="width: 100%" />
</colgroup>
<tbody>
<tr class="row-odd"><td><p><a class="reference internal" href="_images/example_fig1a.png"><img alt="fig1a" src="_images/example_fig1a.png" style="width: 492.0px; height: 420.8px;" /></a></p></td>
</tr>
</tbody>
</table>
</div>
<div class="section" id="learn-new-model-on-the-entire-data-set">
<h2>Learn new model on the entire data set<a class="headerlink" href="#learn-new-model-on-the-entire-data-set" title="Permalink to this headline"></a></h2>
<p>BBBB</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Import library</span>
<span class="kn">import</span> <span class="nn">distfit</span>

<span class="c1"># Load example data set</span>
<span class="n">X</span><span class="p">,</span><span class="n">y_true</span> <span class="o">=</span> <span class="n">distfit</span><span class="o">.</span><span class="n">load_example</span><span class="p">()</span>

<span class="c1"># Retrieve URLs of malicious and normal urls:</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">distfit</span><span class="o">.</span><span class="n">fit_transform</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">y_true</span><span class="p">,</span> <span class="n">pos_label</span><span class="o">=</span><span class="s1">&#39;bad&#39;</span><span class="p">,</span> <span class="n">train_test</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="n">gridsearch</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>

<span class="c1"># The train error will be shown. Such results are heavily biased as the model also learned on this set of data</span>
<span class="n">results</span> <span class="o">=</span> <span class="n">distfit</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">model</span><span class="p">)</span>
<div class="section" id="make-predictions">
<h2>Make predictions<a class="headerlink" href="#make-predictions" title="Permalink to this headline"></a></h2>
<p>Predictions can be made with the <code class="docutils literal notranslate"><span class="pre">predict</span></code> function.
Because of the multiple-testing correction, samples can fall outside the boundary
of the distribution's confidence interval and still not be marked as significant.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Example data</span>
<span class="n">X</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">normal</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">2000</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">5</span><span class="p">,</span><span class="mi">6</span><span class="p">,</span><span class="mi">10</span><span class="p">,</span><span class="mi">11</span><span class="p">,</span><span class="mi">12</span><span class="p">,</span><span class="mi">18</span><span class="p">,</span><span class="mi">20</span><span class="p">]</span>

<span class="c1"># Initialize</span>
<span class="n">dist</span> <span class="o">=</span> <span class="n">distfit</span><span class="p">(</span><span class="n">distr</span><span class="o">=</span><span class="s1">&#39;full&#39;</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.01</span><span class="p">)</span>
<span class="n">dist</span><span class="o">.</span><span class="n">fit_transform</span><span class="p">(</span><span class="n">X</span><span class="p">)</span>

<span class="c1"># Make prediction</span>
<span class="n">dist</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">y</span><span class="p">)</span>
</pre></div>
</div>
<table class="docutils align-center" id="id2">
<caption><span class="caption-text">Plot distribution with predictions</span><a class="headerlink" href="#id2" title="Permalink to this table"></a></caption>
<colgroup>
<col style="width: 100%" />
</colgroup>
<tbody>
<tr class="row-odd"><td><p><a class="reference internal" href="_images/example_fig1b.png"><img alt="fig1b" src="_images/example_fig1b.png" style="width: 492.0px; height: 420.8px;" /></a></p></td>
</tr>
</tbody>
</table>
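<p>The multiple-testing effect described above can be reproduced with a small standard-library sketch. It makes several assumptions that do not come from distfit itself: the fitted model is taken to be exactly Normal(10, 3), p-values are two-sided, and the correction is Benjamini-Hochberg. The function name <code>benjamini_hochberg</code> is hypothetical.</p>

```python
from statistics import NormalDist


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


alpha = 0.01
fitted = NormalDist(mu=10, sigma=3)  # assumption: stands in for the fitted model
y = [3, 4, 5, 6, 10, 11, 12, 18, 20]

# Boundaries of the (1 - alpha) confidence interval of the distribution.
lower = fitted.inv_cdf(alpha / 2)
upper = fitted.inv_cdf(1 - alpha / 2)

# Two-sided p-value per sample, then corrected for the number of tests.
p_raw = [2 * min(fitted.cdf(v), 1 - fitted.cdf(v)) for v in y]
p_adj = benjamini_hochberg(p_raw)

outside = [v for v in y if v < lower or v > upper]
significant = [v for v, p in zip(y, p_adj) if p < alpha]
print("outside interval:", outside)
print("significant after correction:", significant)
```

<p>Under these assumptions, 18 lies outside the interval but only 20 stays significant after correction, because each adjusted p-value is scaled by the number of samples tested.</p>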
</div>
</div>
