A sparse attention framework for large language model decoding
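The framework itself is not described here; as a hedged illustration of the general idea, the sketch below shows top-k sparse attention for a single decoding step: only the k cached positions with the highest raw attention logits participate in the softmax, so per-token cost scales with k rather than with the full cache length. All names (`sparse_decode_attention`, `k`) are illustrative, not from the framework above.

```python
import numpy as np

def sparse_decode_attention(q, K, V, k=4):
    """One decoding step of generic top-k sparse attention.

    q: (d,) query for the new token; K, V: (n, d) cached keys/values.
    Only the k highest-scoring cached positions enter the softmax.
    """
    scores = K @ q                             # (n,) raw attention logits
    idx = np.argpartition(scores, -k)[-k:]     # indices of the top-k logits
    s = scores[idx] / np.sqrt(q.shape[0])      # scaled logits of survivors
    w = np.exp(s - s.max())                    # numerically stable softmax
    w /= w.sum()
    return w @ V[idx]                          # (d,) attention output
```

With `k` equal to the cache length this reduces exactly to dense attention, which makes the approximation easy to sanity-check.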
A universal solution that allows any language model to be deployed natively across a diverse set of hardware backends and in native applications.
Low-Latency, High-Performance LLM Serving
End-to-end compilation of ML applications with dynamic and irregular control flow and data structure accesses
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
Principled early-stopping approaches for hyperparameter optimization
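The specific approaches are not reproduced here; as a minimal sketch of one classic early-stopping scheme, successive halving, the code below trains every candidate configuration on a small budget, keeps the best fraction, and repeats with a larger budget. The names (`successive_halving`, `evaluate`, `eta`) are illustrative assumptions, not an API from the work above.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Early-stopping search: evaluate all configs on a small budget,
    keep the best 1/eta fraction, then retry with eta-times the budget.

    evaluate(config, budget) must return a score (higher is better).
    """
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        scored = sorted(((evaluate(c, budget), c) for c in survivors),
                        reverse=True)                     # best score first
        survivors = [c for _, c in scored[:max(1, len(survivors) // eta)]]
        budget *= eta                                     # grow the budget
    return survivors[0]
```

The design choice is that poor configurations are discarded after only `min_budget` units of training, so most of the total budget is spent on the few promising candidates.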
Automatically Discovering Fast Parallelization Strategies for DNN Training
The Tensor Algebra SuperOptimizer for Deep Learning
A Scalable Tree Boosting System
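As a hedged, from-scratch sketch of the core idea behind tree boosting (not the scalable system referenced above), the code below fits an additive ensemble of depth-1 regression trees ("stumps") for squared loss, where each new stump fits the residuals of the current ensemble. All function names are illustrative.

```python
import numpy as np

def fit_stump(x, r):
    """Best depth-1 regression tree on residuals r under squared loss."""
    best = None
    for t in np.unique(x):                      # try every split threshold
        left, right = r[x <= t], r[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(x <= t, left.mean(), right.mean())
        sse = ((r - pred) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, lv, rv = best
    return lambda xs, t=t, lv=lv, rv=rv: np.where(xs <= t, lv, rv)

def boost(x, y, rounds=50, lr=0.3):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    pred = np.full_like(y, y.mean(), dtype=float)
    trees = []
    for _ in range(rounds):
        stump = fit_stump(x, y - pred)          # fit the current residuals
        pred = pred + lr * stump(x)             # shrunken additive update
        trees.append(stump)
    base = y.mean()
    return lambda xs: base + lr * sum(t(xs) for t in trees)
```

The learning rate `lr` shrinks each stump's contribution, trading more boosting rounds for better generalization; production systems add regularized objectives, deeper trees, and out-of-core training on top of this basic loop.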