kubernetes-sigs/gateway-api-inference-extension

Gateway API Inference Extension

The Gateway API Inference Extension came out of wg-serving and is sponsored by SIG Network. This repo contains the extension's load balancing algorithm, ext-proc code, CRDs, and controllers.

This extension is intended to provide value to multiplexed LLM services on a shared pool of compute. See the proposal for more info.

Status

This project is currently in development.

For faster testing, our PoC is in the ./examples/ directory.

Getting Started

Install the CRDs into the cluster:

make install

Delete the APIs (CRDs) from the cluster:

make uninstall

Deploying the ext-proc image

Refer to this README for instructions on deploying the ext-proc image.

Contributing

Our community meeting is held weekly on Thursdays at 10 AM PDT; Zoom link here.

We currently use the #wg-serving Slack channel for communications.

Contributions are readily welcomed; follow the dev guide to start contributing!

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.