Research Projects

TidalDecode

A sparse attention framework for large language model decoding

Machine Learning Compilation for Large Language Models

A universal solution that allows any language model to be deployed natively on a diverse set of hardware backends and native applications

FlexFlow Serve

Low-Latency, High-Performance LLM Serving

Cortex

End-to-end compilation of ML applications with dynamic and irregular control flow and data structure accesses

Apache TVM Stack

TVM: An Automated End-to-End Optimizing Compiler for Deep Learning

Hyperband and ASHA

Principled early-stopping approaches for hyperparameter optimization

FlexFlow

Automatically Discovering Fast Parallelization Strategies for DNN Training

TASO

The Tensor Algebra SuperOptimizer for Deep Learning

XGBoost

A Scalable Tree Boosting System
