Merged
noahho committed Jan 8, 2025
1 parent d5aee30 commit 9c725e4
Showing 5 changed files with 21 additions and 34 deletions.
39 changes: 12 additions & 27 deletions site/getting_started/api/index.html
@@ -510,15 +510,6 @@
<nav class="md-nav" aria-label="Current Limitations">
<ul class="md-nav__list">

<li class="md-nav__item">
<a href="#data-privacy-and-security" class="md-nav__link">
<span class="md-ellipsis">
Data Privacy and Security
</span>
</a>

</li>

<li class="md-nav__item">
<a href="#size-limitations" class="md-nav__link">
<span class="md-ellipsis">
@@ -2705,9 +2696,7 @@ <h3 id="usage-cost-calculation">Usage Cost Calculation<a class="headerlink" href
<p>The cost for each API request is calculated as:
<div class="language-python highlight"><pre><span></span><code><span id="__span-2-1"><a id="__codelineno-2-1" name="__codelineno-2-1" href="#__codelineno-2-1"></a><span class="n">api_cost</span> <span class="o">=</span> <span class="p">(</span><span class="n">num_train_rows</span> <span class="o">+</span> <span class="n">num_test_rows</span><span class="p">)</span> <span class="o">*</span> <span class="n">num_cols</span> <span class="o">*</span> <span class="n">n_estimators</span>
</span></code></pre></div></p>
<p>Where <code>n_estimators</code> is by default:
- 4 for classification tasks
- 8 for regression tasks</p>
<p>Where <code>n_estimators</code> is by default 4 for classification tasks and 8 for regression tasks.</p>
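<p>As a minimal sketch, the cost formula above can be evaluated locally before sending a request. The helper name <code>estimate_api_cost</code> and its <code>task</code> argument are hypothetical and not part of the client; only the formula and the default <code>n_estimators</code> values come from the text above.</p>

```python
# Default n_estimators values as stated in the documentation above.
DEFAULT_N_ESTIMATORS = {"classification": 4, "regression": 8}


def estimate_api_cost(num_train_rows, num_test_rows, num_cols,
                      task="classification", n_estimators=None):
    """Hypothetical helper: apply the documented cost formula locally."""
    if n_estimators is None:
        n_estimators = DEFAULT_N_ESTIMATORS[task]
    return (num_train_rows + num_test_rows) * num_cols * n_estimators


# 1,000 training rows, 200 test rows, 10 columns, classification default:
# (1000 + 200) * 10 * 4 = 48000
print(estimate_api_cost(1000, 200, 10))
```

<p>Passing <code>n_estimators</code> explicitly overrides the task default, which makes it easy to compare the cost of different ensemble sizes.</p>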
<h3 id="monitoring-usage">Monitoring Usage<a class="headerlink" href="#monitoring-usage" title="Permanent link">&para;</a></h3>
<p>Track your API usage through response headers:</p>
<table>
@@ -2733,7 +2722,6 @@ <h3 id="monitoring-usage">Monitoring Usage<a class="headerlink" href="#monitorin
</tbody>
</table>
<h2 id="current-limitations">Current Limitations<a class="headerlink" href="#current-limitations" title="Permanent link">&para;</a></h2>
<h3 id="data-privacy-and-security">Data Privacy and Security<a class="headerlink" href="#data-privacy-and-security" title="Permanent link">&para;</a></h3>
<div class="admonition warning">
<p class="admonition-title">Important Data Guidelines</p>
<ul>
@@ -2748,21 +2736,18 @@ <h3 id="size-limitations">Size Limitations<a class="headerlink" href="#size-limi
<ol>
<li>
<p>Maximum total cells per request must be below 100,000:
<div class="language-python highlight"><pre><span></span><code><span id="__span-3-1"><a id="__codelineno-3-1" name="__codelineno-3-1" href="#__codelineno-3-1"></a><span class="n">max_cells</span> <span class="o">=</span> <span class="p">(</span><span class="n">num_train_rows</span> <span class="o">+</span> <span class="n">num_test_rows</span><span class="p">)</span> <span class="o">*</span> <span class="n">num_cols</span>
<div class="language-text highlight"><pre><span></span><code><span id="__span-3-1"><a id="__codelineno-3-1" name="__codelineno-3-1" href="#__codelineno-3-1"></a>(num_train_rows + num_test_rows) * num_cols &lt; 100,000
</span></code></pre></div></p>
</li>
<li>
<p>For regression with full output (<code>return_full_output=True</code>), the number of test samples must be below 500:
<div class="language-python highlight"><pre><span></span><code><span id="__span-4-1"><a id="__codelineno-4-1" name="__codelineno-4-1" href="#__codelineno-4-1"></a><span class="k">if</span> <span class="n">task</span> <span class="o">==</span> <span class="s1">&#39;regression&#39;</span> <span class="ow">and</span> <span class="n">return_full_output</span> <span class="ow">and</span> <span class="n">num_test_samples</span> <span class="o">&gt;</span> <span class="mi">500</span><span class="p">:</span>
</span><span id="__span-4-2"><a id="__codelineno-4-2" name="__codelineno-4-2" href="#__codelineno-4-2"></a> <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s2">&quot;Cannot return full output for regression with &gt;500 test samples&quot;</span><span class="p">)</span>
</span></code></pre></div></p>
<p>For regression with full output turned on (<code>return_full_output=True</code>), the number of test samples must be below 500.</p>
</li>
</ol>
<p>These limits will be increased in future releases.</p>
<p>These limits will be relaxed in future releases.</p>
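<p>A client-side pre-check of the two size limits could look like the sketch below. The function <code>check_request_size</code> is hypothetical and not part of the client. It reads "below" as a strict inequality, which is an assumption about how the server treats the boundary values.</p>

```python
MAX_CELLS = 100_000               # total cells must be below this value
MAX_FULL_OUTPUT_TEST_ROWS = 500   # test samples must be below this value


def check_request_size(num_train_rows, num_test_rows, num_cols,
                       task="classification", return_full_output=False):
    """Hypothetical pre-check: return a list of violated size limits."""
    problems = []
    total_cells = (num_train_rows + num_test_rows) * num_cols
    if total_cells >= MAX_CELLS:
        problems.append(f"{total_cells} cells, must be below {MAX_CELLS}")
    # The test-sample limit applies only to regression with full output.
    if (task == "regression" and return_full_output
            and num_test_rows >= MAX_FULL_OUTPUT_TEST_ROWS):
        problems.append(
            f"{num_test_rows} test samples, must be below "
            f"{MAX_FULL_OUTPUT_TEST_ROWS} for full regression output"
        )
    return problems


# 2,000 rows in total with 60 columns is 120,000 cells, so one limit is hit.
print(check_request_size(1500, 500, 60))
```

<p>An empty list means the request fits within both documented limits.</p>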
<h3 id="managing-user-data">Managing User Data<a class="headerlink" href="#managing-user-data" title="Permanent link">&para;</a></h3>
<p>You can access and manage your personal information:</p>
<div class="language-python highlight"><pre><span></span><code><span id="__span-5-1"><a id="__codelineno-5-1" name="__codelineno-5-1" href="#__codelineno-5-1"></a><span class="kn">from</span> <span class="nn">tabpfn_client</span> <span class="kn">import</span> <span class="n">UserDataClient</span>
</span><span id="__span-5-2"><a id="__codelineno-5-2" name="__codelineno-5-2" href="#__codelineno-5-2"></a><span class="nb">print</span><span class="p">(</span><span class="n">UserDataClient</span><span class="o">.</span><span class="n">get_data_summary</span><span class="p">())</span>
<div class="language-python highlight"><pre><span></span><code><span id="__span-4-1"><a id="__codelineno-4-1" name="__codelineno-4-1" href="#__codelineno-4-1"></a><span class="kn">from</span> <span class="nn">tabpfn_client</span> <span class="kn">import</span> <span class="n">UserDataClient</span>
</span><span id="__span-4-2"><a id="__codelineno-4-2" name="__codelineno-4-2" href="#__codelineno-4-2"></a><span class="nb">print</span><span class="p">(</span><span class="n">UserDataClient</span><span class="o">.</span><span class="n">get_data_summary</span><span class="p">())</span>
</span></code></pre></div>
<h2 id="error-handling">Error Handling<a class="headerlink" href="#error-handling" title="Permanent link">&para;</a></h2>
<p>The API uses standard HTTP status codes:</p>
@@ -2788,12 +2773,12 @@ <h2 id="error-handling">Error Handling<a class="headerlink" href="#error-handlin
</tr>
</tbody>
</table>
<p>Example error response:
<div class="language-json highlight"><pre><span></span><code><span id="__span-6-1"><a id="__codelineno-6-1" name="__codelineno-6-1" href="#__codelineno-6-1"></a><span class="p">{</span>
</span><span id="__span-6-2"><a id="__codelineno-6-2" name="__codelineno-6-2" href="#__codelineno-6-2"></a><span class="w"> </span><span class="nt">&quot;error&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;API_LIMIT_REACHED&quot;</span><span class="p">,</span>
</span><span id="__span-6-3"><a id="__codelineno-6-3" name="__codelineno-6-3" href="#__codelineno-6-3"></a><span class="w"> </span><span class="nt">&quot;message&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Usage limit exceeded&quot;</span><span class="p">,</span>
</span><span id="__span-6-4"><a id="__codelineno-6-4" name="__codelineno-6-4" href="#__codelineno-6-4"></a><span class="w"> </span><span class="nt">&quot;next_available_at&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;2024-01-07 00:00:00&quot;</span>
</span><span id="__span-6-5"><a id="__codelineno-6-5" name="__codelineno-6-5" href="#__codelineno-6-5"></a><span class="p">}</span>
<p>Example response when the usage limit is reached:
<div class="language-json highlight"><pre><span></span><code><span id="__span-5-1"><a id="__codelineno-5-1" name="__codelineno-5-1" href="#__codelineno-5-1"></a><span class="p">{</span>
</span><span id="__span-5-2"><a id="__codelineno-5-2" name="__codelineno-5-2" href="#__codelineno-5-2"></a><span class="w"> </span><span class="nt">&quot;error&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;API_LIMIT_REACHED&quot;</span><span class="p">,</span>
</span><span id="__span-5-3"><a id="__codelineno-5-3" name="__codelineno-5-3" href="#__codelineno-5-3"></a><span class="w"> </span><span class="nt">&quot;message&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Usage limit exceeded&quot;</span><span class="p">,</span>
</span><span id="__span-5-4"><a id="__codelineno-5-4" name="__codelineno-5-4" href="#__codelineno-5-4"></a><span class="w"> </span><span class="nt">&quot;next_available_at&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;2024-01-07 00:00:00&quot;</span>
</span><span id="__span-5-5"><a id="__codelineno-5-5" name="__codelineno-5-5" href="#__codelineno-5-5"></a><span class="p">}</span>
</span></code></pre></div></p>
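<p>One way to act on such a response is to work out how long to wait before retrying. The sketch below parses the example body shown above. The helper <code>seconds_until_available</code> is hypothetical, and it assumes <code>next_available_at</code> is a naive timestamp in the same timezone as the <code>now</code> value passed in, since the response does not state a timezone.</p>

```python
import json
from datetime import datetime

# Timestamp layout taken from the example response ("2024-01-07 00:00:00").
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def seconds_until_available(body, now):
    """Hypothetical helper: seconds to wait after an API_LIMIT_REACHED error.

    Returns None when the body is not a usage-limit error. Assumes `now` is a
    naive datetime in the same timezone as `next_available_at`.
    """
    payload = json.loads(body)
    if payload.get("error") != "API_LIMIT_REACHED":
        return None
    next_available = datetime.strptime(payload["next_available_at"],
                                       TIMESTAMP_FORMAT)
    # Never report a negative wait if the window has already reopened.
    return max((next_available - now).total_seconds(), 0.0)


example_body = """{
  "error": "API_LIMIT_REACHED",
  "message": "Usage limit exceeded",
  "next_available_at": "2024-01-07 00:00:00"
}"""

# One hour before the limit resets.
print(seconds_until_available(example_body, datetime(2024, 1, 6, 23, 0, 0)))
```

<p>Taking <code>now</code> as an argument keeps the timezone assumption visible to the caller and makes the function easy to test.</p>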


2 changes: 1 addition & 1 deletion site/getting_started/install/index.html
@@ -2529,7 +2529,7 @@

<h1>Installation</h1>

<p>You can access our models through our API (<a href="https://github.com/automl/tabpfn-client">https://github.com/automl/tabpfn-client</a>) or via our user interface built on top of the API (<a href="https://www.ux.priorlabs.ai/">https://www.ux.priorlabs.ai/</a>).</p>
<p>You can access our models through our API (<a href="https://github.com/automl/tabpfn-client">https://github.com/automl/tabpfn-client</a>), via our user interface built on top of the API (<a href="https://www.ux.priorlabs.ai/">https://www.ux.priorlabs.ai/</a>) or locally.</p>
<div class="tabbed-set tabbed-alternate" data-tabs="1:4"><input checked="checked" id="__tabbed_1_1" name="__tabbed_1" type="radio" /><input id="__tabbed_1_2" name="__tabbed_1" type="radio" /><input id="__tabbed_1_3" name="__tabbed_1" type="radio" /><input id="__tabbed_1_4" name="__tabbed_1" type="radio" /><div class="tabbed-labels"><label for="__tabbed_1_1">Python API Client (No GPU, Online)</label><label for="__tabbed_1_2">Python Local (GPU)</label><label for="__tabbed_1_3">Web Interface</label><label for="__tabbed_1_4">R</label></div>
<div class="tabbed-content">
<div class="tabbed-block">
12 changes: 7 additions & 5 deletions site/getting_started/intended_use/index.html
@@ -2631,26 +2631,28 @@ <h1 id="usage-tips">Usage tips<a class="headerlink" href="#usage-tips" title="Pe
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>For a simple example getting started with classification see <a href="../../tutorials/classification/">classification tutorial</a>.</p>
<p>We provide a comprehensive demo notebook that guides through installation and functionalities at <a href="https://tinyurl.com/tabpfn-colab-local">Interactive Colab Tutorial (with GPU usage)</a> and <a href="https://tinyurl.com/tabpfn-colab-online">Interactive Colab Tutorial (without GPU usage)</a>.</p>
<p>We provide two comprehensive demo notebooks that guide you through installation and functionality: one <a href="https://tinyurl.com/tabpfn-colab-online">Colab tutorial using the cloud</a> and one <a href="https://tinyurl.com/tabpfn-colab-local">Colab tutorial using the local GPU</a>.</p>
</div>
<h3 id="when-to-use-tabpfn">When to use TabPFN<a class="headerlink" href="#when-to-use-tabpfn" title="Permanent link">&para;</a></h3>
<p>TabPFN excels in handling small to medium-sized datasets with up to 10,000 samples and 500 features. For larger datasets, approaches such as CatBoost, XGB, or AutoGluon are likely to outperform TabPFN.</p>
<p>TabPFN excels in handling small to medium-sized datasets with up to 10,000 samples and 500 features. For larger datasets, methods such as CatBoost, XGBoost, or AutoGluon are likely to outperform TabPFN.</p>
<h3 id="intended-use-of-tabpfn">Intended Use of TabPFN<a class="headerlink" href="#intended-use-of-tabpfn" title="Permanent link">&para;</a></h3>
<p>While TabPFN provides a powerful drop-in replacement for traditional tabular data models, achieving top performance on real-world problems often requires domain expertise and the ingenuity of data scientists. Data scientists should continue to apply their skills in feature engineering, data cleaning, and problem framing to get the most out of TabPFN.</p>
<p>TabPFN is intended as a powerful drop-in replacement for traditional tabular data prediction tools, where top performance and fast training matter.
It still requires data scientists to prepare the data using their domain knowledge.
Data scientists will see benefits in performing feature engineering, data cleaning, and problem framing to get the most out of TabPFN.</p>
<h3 id="limitations-of-tabpfn">Limitations of TabPFN<a class="headerlink" href="#limitations-of-tabpfn" title="Permanent link">&para;</a></h3>
<ol>
<li>TabPFN's inference speed may be slower than highly optimized approaches like CatBoost.</li>
<li>TabPFN's memory usage scales linearly with dataset size, which can be prohibitive for very large datasets.</li>
<li>Our evaluation focused on datasets with up to 10,000 samples and 500 features; scalability to larger datasets requires further study.</li>
</ol>
<h3 id="computational-and-time-requirements">Computational and Time Requirements<a class="headerlink" href="#computational-and-time-requirements" title="Permanent link">&para;</a></h3>
<p>TabPFN is computationally efficient and can run on consumer hardware for most datasets. Training on a new dataset is recommended to run on a GPU as this speeds it up significantly. However, TabPFN is not optimized for real-time inference tasks.</p>
<p>TabPFN is computationally efficient and can run inference on consumer hardware for most datasets. Running training on a new dataset on a GPU is recommended, as this speeds it up significantly. TabPFN is not optimized for real-time inference tasks, although TabPFN V2 makes predictions much faster than V1.</p>
<h3 id="data-preparation">Data Preparation<a class="headerlink" href="#data-preparation" title="Permanent link">&para;</a></h3>
<p>TabPFN can handle raw data with minimal preprocessing. Provide the data in a tabular format, and TabPFN will automatically handle missing values, encode categorical variables, and normalize features. While TabPFN works well out-of-the-box, performance can be improved further with dataset-specific preprocessing.</p>
<h3 id="interpreting-results">Interpreting Results<a class="headerlink" href="#interpreting-results" title="Permanent link">&para;</a></h3>
<p>TabPFN's predictions come with uncertainty estimates, allowing you to assess the reliability of the results. You can use SHAP to interpret TabPFN's predictions and identify the most important features driving the model's decisions.</p>
<h3 id="hyperparameter-tuning">Hyperparameter Tuning<a class="headerlink" href="#hyperparameter-tuning" title="Permanent link">&para;</a></h3>
<p>TabPFN provides strong performance out-of-the-box without extensive hyperparameter tuning. If you have additional computational resources, you can further optimize TabPFN's performance using random hyperparameter tuning or the Post-Hoc Ensembling (PHE) technique.</p>
<p>TabPFN provides strong performance out-of-the-box without extensive hyperparameter tuning. If you have additional computational resources, you can automatically tune its hyperparameters using <a href="https://github.com/PriorLabs/tabpfn-extensions/tree/main/src/tabpfn_extensions/post_hoc_ensembles">post-hoc ensembling</a> or <a href="https://github.com/PriorLabs/tabpfn-extensions/tree/main/src/tabpfn_extensions/hpo">random tuning</a>.</p>



2 changes: 1 addition & 1 deletion site/search/search_index.json

Large diffs are not rendered by default.

Binary file modified site/sitemap.xml.gz
Binary file not shown.
