zhwu
- Karma: 31
- Created: 3 years ago
Recent Submissions
- 1. A collection of reproducible LLM inference engine benchmarks: SGLang vs. vLLM (github.com)
- 2. Efficient GPU Resource Management for ML Workloads Using SkyPilot, Kueue on GKE (github.com)
- 3. New Recipe: Serving Llama-2 with VLLM's OpenAI-Compatible API Server (github.com)
- 4. Train Your Own Vicuna on Llama-2 (github.com)
- 5. Guide on fine-tuning your own Vicuna on Llama-2 (twitter.com)
- 6. Serving LLM 24x Faster on the Cloud with VLLM and SkyPilot (blog.skypilot.co)
- 7. Biologists are moving to the clouds with SkyPilot from UC Berkeley (twitter.com)