
Commit

Change landing docs
noahho committed Jan 8, 2025
1 parent 833ebe2 commit 1464766
Showing 6 changed files with 68 additions and 300 deletions.
189 changes: 34 additions & 155 deletions docs/docs.md
@@ -147,204 +147,83 @@ samples and 4 features, the runtime on GPU is typically less than 1 second. For
runtime on GPU is typically less than 10 seconds.
-->


## Why TabPFN
## TabPFN Integrations

<div class="grid cards" markdown>

- :material-speedometer:{ .lg .middle } **Rapid Training**
- :material-cloud-check:{ .lg .middle } **API Client**

---

TabPFN significantly reduces training time: in just a few seconds it outperforms traditional models that were tuned for hours. For instance, it surpasses an ensemble of the strongest baselines after 2.8 seconds of training, compared to 4 hours of tuning for that ensemble.
The fastest way to get started with TabPFN. Access our models through the cloud without requiring local GPU resources.

[comment]: <> ([:octicons-arrow-right-24: Learn More](#))
[:octicons-arrow-right-24: TabPFN Client](https://github.com/PriorLabs/tabpfn-client)

- :material-chart-line:{ .lg .middle } **Superior Accuracy**
- :material-application:{ .lg .middle } **User Interface**

---

TabPFN consistently outperforms state-of-the-art methods like gradient-boosted decision trees (GBDTs) on datasets with up to 10,000 samples. It achieves higher accuracy and better performance metrics across a range of classification and regression tasks.

- :material-shield-check:{ .lg .middle } **Robustness**

---
Visual interface for no-code interaction with TabPFN. Perfect for quick experimentation and visualization.

The model demonstrates robustness to various dataset characteristics, including uninformative features, outliers, and missing values, maintaining high performance where other methods struggle.
[:octicons-arrow-right-24: Access GUI](https://ux.priorlabs.ai/)

- :material-creation-outline:{ .lg .middle } **Generative Capabilities**
- :material-language-python:{ .lg .middle } **Python Package**

---

As a generative transformer-based model, TabPFN can be fine-tuned for specific tasks, generate synthetic data, estimate densities, and learn reusable embeddings. This makes it versatile for various applications beyond standard prediction tasks.

- :material-code-tags-check:{ .lg .middle } **Sklearn Interface**

---
Local installation for research and privacy-sensitive use cases, with GPU support and a scikit-learn-compatible interface.

TabPFN follows the interfaces provided by scikit-learn, making it easy to integrate into existing workflows and utilize familiar functions for fitting, predicting, and evaluating models.
[:octicons-arrow-right-24: TabPFN Local](https://github.com/PriorLabs/tabpfn)

- :material-file-excel-box:{ .lg .middle } **Minimal Preprocessing**
- :material-language-r:{ .lg .middle } **R Integration**

---

The model handles various types of raw data, including missing values and categorical variables, with minimal preprocessing. This reduces the burden on users to perform extensive data preparation.
Currently in development. Bringing TabPFN's capabilities to the R ecosystem for data scientists and researchers. Contact us for more information, or to get involved!

</div>
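As a quick, hedged illustration of the local package listed above: the sketch below assumes the `tabpfn` package is installed (`pip install tabpfn`) and that `TabPFNClassifier` exposes the usual scikit-learn `fit`/`predict` methods, as in the repository's own examples; the dataset and train/test split are illustrative only.

```python
# Minimal sketch of the local TabPFN package (assumes `pip install tabpfn`).
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

clf = TabPFNClassifier()   # constructed with defaults; device='auto' or 'cuda' can be passed explicitly
clf.fit(X_train, y_train)  # no separate hyperparameter-tuning step is required
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```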

## TabPFN Integrations

## Why TabPFN

<div class="grid cards" markdown>

- :material-cloud-check:{ .lg .middle } **API Client**
- :material-speedometer:{ .lg .middle } **Rapid Training**

---

The fastest way to get started with TabPFN. Access our models through the cloud without requiring local GPU resources.
TabPFN significantly reduces training time: in just a few seconds it outperforms traditional models that were tuned for hours. For instance, it surpasses an ensemble of the strongest baselines after 2.8 seconds of training, compared to 4 hours of tuning for that ensemble.

[:octicons-arrow-right-24: TabPFN Client](https://github.com/PriorLabs/tabpfn-client)
[comment]: <> ([:octicons-arrow-right-24: Learn More](#))

- :material-application:{ .lg .middle } **User Interface**
- :material-chart-line:{ .lg .middle } **Superior Accuracy**

---

Visual interface for no-code interaction with TabPFN. Perfect for quick experimentation and visualization.

[:octicons-arrow-right-24: Access GUI](https://ux.priorlabs.ai/)
TabPFN consistently outperforms state-of-the-art methods like gradient-boosted decision trees (GBDTs) on datasets with up to 10,000 samples. It achieves higher accuracy and better performance metrics across a range of classification and regression tasks.

- :material-language-python:{ .lg .middle } **Python Package**
- :material-shield-check:{ .lg .middle } **Robustness**

---

Local installation for research and privacy-sensitive use cases, with GPU support and a scikit-learn-compatible interface.
The model demonstrates robustness to various dataset characteristics, including uninformative features, outliers, and missing values, maintaining high performance where other methods struggle.

[:octicons-arrow-right-24: TabPFN Local](https://github.com/PriorLabs/tabpfn)
- :material-creation-outline:{ .lg .middle } **Generative Capabilities**

- :material-language-r:{ .lg .middle } **R Integration**
---

As a generative transformer-based model, TabPFN can be fine-tuned for specific tasks, generate synthetic data, estimate densities, and learn reusable embeddings. This makes it versatile for various applications beyond standard prediction tasks.

- :material-code-tags-check:{ .lg .middle } **Sklearn Interface**

---

Currently in development. Bringing TabPFN's capabilities to the R ecosystem for data scientists and researchers. Contact us for more information, or to get involved!
TabPFN follows the interfaces provided by scikit-learn, making it easy to integrate into existing workflows and utilize familiar functions for fitting, predicting, and evaluating models.

</div>
- :material-file-excel-box:{ .lg .middle } **Minimal Preprocessing**

<!---
#### Software Dependencies and Operating Systems
Python: Version >= 3.9
Operating Systems: The software has been tested on major operating systems including:
- Ubuntu 20.04, 22.04
- Windows 10, 11
- macOS 11.0 (Big Sur) and later
Git Version 2 or later ([https://git-scm.com/](https://git-scm.com/))
#### Software Dependencies (as specified in `requirements.txt`):
=== "TabPFN"
```
torch>=2.1 (Includes CUDA support in version 2.1 and later)
scikit-learn>=1.4.2
tqdm>=4.66.
numpy>=1.21.2
hyperopt==0.2.7 (Note: Earlier versions fail with numpy number generator change)
pre-commit>=3.3.3
einops>=0.6.0
scipy>=1.8.0
torchmetrics==1.2.0
pytest>=7.1.3
pandas[plot,output_formatting]>=2.0.3,<2.2 (Note: Version 2.2 has a bug with multi-index tables (https://github.com/pandas-dev/pandas/issues/57663), recheck when fixed)
pyyaml>=6.0.1
kditransform>=0.2.0
```
=== "TabPFN and Baselines"
```
torch>=2.1 (Includes CUDA support in version 2.1 and later)
scikit-learn>=1.4.2
tqdm>=4.66.
numpy>=1.21.2
hyperopt==0.2.7 (Note: Earlier versions fail with numpy number generator change)
pre-commit>=3.3.3
einops>=0.6.0
scipy>=1.8.0
torchmetrics==1.2.0
pytest>=7.1.3
pandas[plot,output_formatting]>=2.0.3,<2.2 (Note: Version 2.2 has a bug with multi-index tables (https://github.com/pandas-dev/pandas/issues/57663), recheck when fixed)
pyyaml>=6.0.1
kditransform>=0.2.0
seaborn==0.12.2
openml==0.14.1
numba>=0.58.1
shap>=0.44.1
# Baselines
lightgbm==3.3.5
xgboost>=2.0.0
catboost>=1.1.1
#auto-sklearn==0.14.5
#autogluon==0.4.0
# -- Quantile Baseline
quantile-forest==1.2.4
```
For GPU usage, CUDA 12.1 has been tested.
#### Non-Standard Hardware
GPU: A CUDA-enabled GPU is recommended for optimal performance, though the software can also run on a CPU.
## Example usage
=== "Classification"
```python
import numpy as np
import sklearn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier
# Create a classifier
clf = TabPFNClassifier(fit_at_predict_time=True)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
clf.fit(X_train, y_train)
preds = clf.predict_proba(X_test)
y_eval = np.argmax(preds, axis=1)
print('ROC AUC: ', sklearn.metrics.roc_auc_score(y_test, preds, multi_class='ovr'), 'Accuracy', sklearn.metrics.accuracy_score(y_test, y_eval))
```
=== "Regression"
```python
from tabpfn import TabPFNRegressor
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
import numpy as np
import sklearn
reg = TabPFNRegressor(device='auto')  # 'auto' selects a GPU when available, otherwise the CPU
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
reg.fit(X_train, y_train)
preds = reg.predict(X_test)
print('Mean Squared Error (MSE): ', sklearn.metrics.mean_squared_error(y_test, preds))
print('Mean Absolute Error (MAE): ', sklearn.metrics.mean_absolute_error(y_test, preds))
print('R-squared (R^2): ', sklearn.metrics.r2_score(y_test, preds))
```
-->
<br>
<br>
---

The model handles various types of raw data, including missing values and categorical variables, with minimal preprocessing. This reduces the burden on users to perform extensive data preparation.

</div>
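To make the scikit-learn interface card above concrete, here is a minimal hedged sketch of using TabPFN with standard scikit-learn tooling such as cross-validation; it assumes `TabPFNClassifier` behaves as a fully scikit-learn-compatible estimator, and the dataset is purely illustrative.

```python
# Hedged sketch: if TabPFNClassifier is a fully scikit-learn-compatible estimator,
# it can be dropped into standard sklearn utilities such as cross-validation.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score

from tabpfn import TabPFNClassifier

X, y = load_iris(return_X_y=True)  # 150 samples, well within TabPFN's supported range
scores = cross_val_score(TabPFNClassifier(), X, y, cv=5)
print("5-fold CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```

Because the estimator follows the scikit-learn contract, the same object should also work inside pipelines and grid searches without TabPFN-specific glue code.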
3 changes: 3 additions & 0 deletions docs/getting_started/install.md
@@ -1,3 +1,6 @@
**Client** [![colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1KKKOVJk-5N972ZRUeGmRXh8EibRIRGxA#scrollTo=o03aOVAw0Etg)
**Local** [![colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1SHa43VuHASLjevzO7y3-wPCxHY18-2H6#scrollTo=o03aOVAw0Etg&line=3&uniqifier=1)

You can access our models through our API (https://github.com/automl/tabpfn-client), via our user interface built on top of the API (https://www.ux.priorlabs.ai/), or locally.

=== "Python API Client (No GPU, Online)"
