A GPU-native, all-Python platform for tree-based machine learning.
Note: OpenBoost is in active development. APIs may change between releases. Use at your own risk.
For standard GBDT, use XGBoost/LightGBM — they're highly optimized C++.
For GBDT variants (probabilistic predictions, interpretable GAMs, custom algorithms), OpenBoost brings GPU acceleration to methods that were previously CPU-only and slow:
- NaturalBoost: 1.3-2x faster than NGBoost
- OpenBoostGAM: 10-40x faster than InterpretML EBM
Plus: ~20K lines of readable Python. Modify, extend, and build on — no C++ required.
| | XGBoost / LightGBM | OpenBoost |
|---|---|---|
| Code | 200K+ lines of C++ | ~20K lines of Python |
| GPU | Added later | Native from day one |
| Customize | Modify C++, recompile | Modify Python, reload |
OpenBoost provides primitives (histograms, binning, tree fitting) that you combine into algorithms:
- Standard GBDT — drop-in gradient boosting with multiple growth strategies, early stopping, and callbacks
- Distributional GBDT — predict full probability distributions with NGBoost-style natural gradient boosting
- Interpretable GAMs — explainable feature effects inspired by EBM
- DART — dropout regularization for reduced overfitting
- Linear-leaf models — linear models in tree leaves for better extrapolation
- Your own algorithms — custom losses, distributions, or entirely new methods
All run on GPU with the same Python code. All models support save()/load() persistence, and most support callbacks and early stopping.
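To make the distributional variant concrete: NGBoost-style methods boost the parameters of a predictive distribution using the natural gradient (the ordinary gradient preconditioned by the inverse Fisher information). A minimal numpy sketch for a Normal(μ, log σ) parameterization — a conceptual illustration only, not OpenBoost's actual internals:

```python
import numpy as np

def normal_natural_gradient(y, mu, log_sigma):
    """Natural gradient of the Normal NLL w.r.t. (mu, log_sigma).

    The Fisher information for this parameterization is diag(1/sigma^2, 2),
    so the natural gradient is the ordinary gradient scaled by its inverse.
    """
    sigma2 = np.exp(2 * log_sigma)
    # Ordinary NLL gradients
    g_mu = (mu - y) / sigma2
    g_ls = 1.0 - (y - mu) ** 2 / sigma2
    # Precondition by the inverse Fisher: diag(sigma^2, 1/2)
    return g_mu * sigma2, g_ls / 2.0

# Example: observation y=0 under mu=0, sigma=1 (log_sigma=0)
g_mu, g_ls = normal_natural_gradient(
    np.array([0.0]), np.array([0.0]), np.array([0.0])
)
```

Each boosting round would then fit one tree per parameter to these natural gradients, which is what makes the updates invariant to how the distribution is parameterized.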
High-level API:

```python
import openboost as ob

model = ob.GradientBoosting(n_trees=100, max_depth=6, random_state=42)
model.fit(X_train, y_train,
          eval_set=[(X_val, y_val)],
          callbacks=[ob.EarlyStopping(patience=10)])
predictions = model.predict(X_test)
```

sklearn-compatible:
```python
from openboost import OpenBoostRegressor
from sklearn.model_selection import GridSearchCV

# Works with GridSearchCV, Pipeline, cross_val_score, etc.
model = OpenBoostRegressor(n_estimators=100, random_state=42)
search = GridSearchCV(model, {"max_depth": [4, 6, 8]}, cv=5)
search.fit(X_train, y_train)

# Also available: OpenBoostClassifier, OpenBoostDARTRegressor,
# OpenBoostGAMRegressor, OpenBoostDistributionalRegressor
```

Hyperparameter suggestions:
```python
# Auto-suggest params based on dataset characteristics
params = ob.suggest_params(X_train, y_train, task='regression', style='core')
model = ob.GradientBoosting(**params)
```

Low-level API (full control over the training loop):
```python
import numpy as np
import openboost as ob

X_binned = ob.array(X_train)
pred = np.zeros(len(y_train), dtype=np.float32)
for round_idx in range(100):
    grad = 2 * (pred - y_train)    # your gradients (squared-error loss)
    hess = np.ones_like(grad) * 2  # your Hessians
    tree = ob.fit_tree(X_binned, grad, hess, max_depth=6)
    pred += 0.1 * tree(X_binned)
```

Installation:

```bash
pip install openboost

# With GPU support
pip install openboost[cuda]

# With sklearn integration
pip install openboost[sklearn]
```

Full docs, tutorials, and API reference: jxucoder.github.io/openboost
On standard GBDT, OpenBoost's GPU-native tree builder is 3.5-4.6x faster than XGBoost's GPU histogram method on an A100, with comparable accuracy:
| Task | Data | Trees | OpenBoost | XGBoost | Speedup |
|---|---|---|---|---|---|
| Regression | 2M x 80 | 300 | 10.0s | 45.5s | 4.6x |
| Binary | 2M x 80 | 300 | 11.8s | 40.9s | 3.5x |
Benchmark details:

- Hardware: NVIDIA A100 (Modal)
- Fairness controls: both receive raw numpy arrays (no pre-built DMatrix); `cuda.synchronize()` after OpenBoost `fit()`; both at default threading; XGBoost `max_bin=256` to match OpenBoost; JIT/GPU warmup before timing
- Metric: median of 3 trials, timing `fit()` only
- XGBoost config: `tree_method="hist"`, `device="cuda"`
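The fairness controls above follow a standard GPU benchmarking pattern: warm up first (JIT compilation, GPU initialization), synchronize after each fit so asynchronous GPU work is included in the measurement, and report a median. A generic sketch — `fit_fn` and `sync` are placeholders you supply, not OpenBoost API:

```python
import time
from statistics import median

def bench(fit_fn, sync=None, warmup=1, trials=3):
    """Time fit_fn fairly: warm up, then return the median of `trials` runs,
    synchronizing after each call so queued GPU work is counted."""
    for _ in range(warmup):          # absorb JIT / GPU init cost
        fit_fn()
        if sync:
            sync()
    times = []
    for _ in range(trials):
        t0 = time.perf_counter()
        fit_fn()
        if sync:
            sync()                   # e.g. cuda.synchronize
        times.append(time.perf_counter() - t0)
    return median(times)

# Example with a dummy CPU workload
t = bench(lambda: sum(range(100_000)))
```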
Reproduce with:

```bash
# Local (requires CUDA GPU)
uv run python benchmarks/bench_gpu.py --task all --scale medium

# On Modal A100
uv run modal run benchmarks/bench_gpu.py --task all --scale medium
```

Available scales: small (500K), medium (2M), large (5M), xlarge (10M).
Where OpenBoost really shines is on GBDT variants that don't exist in XGBoost/LightGBM:
| Model | vs. | Speedup |
|---|---|---|
| NaturalBoost (GPU) | NGBoost | 1.3-2x |
| OpenBoostGAM (GPU) | InterpretML EBM | 10-40x |
Note: Benchmarks reflect the current state of development and may change as both OpenBoost and comparison libraries evolve.
Train-many optimization: Industry workloads often train many models (hyperparameter tuning, cross-validation, per-segment models). XGBoost optimizes for training one model fast; OpenBoost plans native support for training many models efficiently.
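The per-segment case looks like this in practice — one independent model per customer segment, region, or product line. A sketch using scikit-learn's `GradientBoostingRegressor` as a stand-in for any GBDT model (synthetic data; the segment column is illustrative):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
segment = rng.integers(0, 3, size=300)            # e.g. region id 0..2
y = X[:, 0] * (segment + 1) + rng.normal(scale=0.1, size=300)

# The "train many models" workload: one fit per segment.
# Each fit is independent, so they could be batched or run in parallel.
models = {
    s: GradientBoostingRegressor(n_estimators=50, random_state=0)
       .fit(X[segment == s], y[segment == s])
    for s in np.unique(segment)
}
```

Today each `fit` pays its own setup cost; batching many such small fits on one GPU is the kind of workload the planned train-many optimization targets.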
OpenBoost implements and builds on ideas from these papers:
- Gradient Boosting: Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics.
- XGBoost: Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. KDD.
- LightGBM: Ke, G., et al. (2017). LightGBM: A Highly Efficient Gradient Boosting Decision Tree. NeurIPS.
- CatBoost: Prokhorenkova, L., et al. (2018). CatBoost: Unbiased Boosting with Categorical Features. NeurIPS.
- NGBoost: Duan, T., et al. (2020). NGBoost: Natural Gradient Boosting for Probabilistic Prediction. ICML.
- EBM: Nori, H., et al. (2019). InterpretML: A Unified Framework for Machine Learning Interpretability.
- DART: Rashmi, K. V., & Gilad-Bachrach, R. (2015). DART: Dropouts meet Multiple Additive Regression Trees. AISTATS.
License: Apache 2.0