inference (2 configs)

LLM inference in C/C++
CLAUDE.md + AGENTS.md, 130 lines, AI Infrastructure
A high-throughput and memory-efficient inference and serving engine for LLMs
CLAUDE.md + AGENTS.md, 100 lines, AI Infrastructure