skglm

— A fast and modular scikit-learn replacement for sparse GLMs —


skglm is a Python package that offers fast estimators for sparse Generalized Linear Models (GLMs) and is 100% compatible with scikit-learn. It is highly flexible and supports a wide range of GLMs: you can choose from skglm's ready-made estimators or compose your own by combining the available datafits and penalties.
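For example, a custom estimator can be built by pairing a datafit with a penalty. The snippet below is a minimal sketch based on the package's documented GeneralizedLinearEstimator, Quadratic datafit, and MCPenalty; the exact class names and parameters should be checked against the installed version.

import numpy as np
from skglm import GeneralizedLinearEstimator
from skglm.datafits import Quadratic
from skglm.penalties import MCPenalty

# Toy regression data
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 40))
y = X @ rng.standard_normal(40) + 0.1 * rng.standard_normal(100)

# Compose a sparse GLM: quadratic datafit + MCP (non-convex) penalty
estimator = GeneralizedLinearEstimator(
    datafit=Quadratic(),
    penalty=MCPenalty(alpha=0.1, gamma=3.0),
)
estimator.fit(X, y)
print(estimator.coef_)

The same pattern applies to other combinations, for instance a logistic datafit with an L1 penalty.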

Get a hands-on glimpse of skglm through the Getting started page.

Why skglm?

skglm is specifically designed to solve sparse GLMs. It supports many models missing from scikit-learn and delivers high performance.

There are several reasons to opt for skglm, among them:

Speed

Fast solvers able to tackle large datasets, either dense or sparse, with millions of features, up to 100 times faster than scikit-learn

Modularity

User-friendly API that enables composing custom estimators with any combination of its existing datafits and penalties

Extensibility

Flexible design that makes it simple to implement new datafits and penalties in just a few lines of code

Compatibility

Estimators fully compatible with the scikit-learn API and drop-in replacements for its GLM estimators
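Since skglm estimators follow the scikit-learn API, they can be dropped into existing scikit-learn workflows. Below is a small illustrative sketch, assuming skglm exposes a Lasso estimator at the top level as a drop-in for sklearn.linear_model.Lasso; verify the import against the installed version.

import numpy as np
from sklearn.model_selection import cross_val_score
from skglm import Lasso  # drop-in replacement for sklearn.linear_model.Lasso

# Toy sparse regression problem
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 500))
y = X[:, :10] @ rng.standard_normal(10) + 0.1 * rng.standard_normal(200)

# The skglm estimator plugs directly into scikit-learn utilities
scores = cross_val_score(Lasso(alpha=0.05), X, y, cv=5)
print(scores.mean())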

Installing skglm

skglm is available on PyPI. Get the latest version of the package by running

$ pip install -U skglm

It is also available on conda-forge and can be installed, for instance, with:

$ conda install -c conda-forge skglm

Once skglm is installed, take your first steps with the package via the Getting started section. Advanced topics and use-cases are covered in Tutorials.


Cite

skglm is the result of persevering research. It is licensed under the BSD 3-Clause license. You are free to use it, and if you do so, please cite

@inproceedings{skglm,
    title     = {Beyond L1: Faster and better sparse models with skglm},
    author    = {Q. Bertrand and Q. Klopfenstein and P.-A. Bannier
                 and G. Gidel and M. Massias},
    booktitle = {NeurIPS},
    year      = {2022},
}