Martin Shkreli

Main
	vLLM	PagedAttention	Efficient management of attention key and value memory
		Continuous batching
		Tensor parallelism
	DeepSpeed
		Model parallelism, inference-customized kernels, MoQ quantization
	TensorRT
	PowerInfer
	Accelerate	https://huggingface.co/docs/accelerate/index
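
Note on the PagedAttention entry: the idea of managing attention key and value memory in fixed-size blocks can be illustrated with a toy block table. This is a minimal sketch under assumptions, not vLLM's implementation. `BLOCK_SIZE`, `BlockAllocator`, `Sequence`, `append_token`, `physical_slot`, and `free_sequence` are all hypothetical names chosen for illustration, and no real tensors are stored.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # tokens per KV block (hypothetical value for illustration)


class BlockAllocator:
    """Hands out fixed-size physical block ids from a shared pool."""

    def __init__(self, num_blocks: int) -> None:
        self.free = list(range(num_blocks))

    def allocate(self) -> int:
        if not self.free:
            raise MemoryError("no free KV-cache blocks")
        return self.free.pop()

    def release(self, block_id: int) -> None:
        self.free.append(block_id)


@dataclass
class Sequence:
    """Tracks how many tokens a request holds and which blocks back them."""

    num_tokens: int = 0
    block_table: list = field(default_factory=list)  # logical -> physical block


def append_token(seq: Sequence, allocator: BlockAllocator) -> None:
    # A new physical block is claimed only when the last one is full,
    # so at most BLOCK_SIZE - 1 slots per sequence are ever left unused.
    if seq.num_tokens % BLOCK_SIZE == 0:
        seq.block_table.append(allocator.allocate())
    seq.num_tokens += 1


def physical_slot(seq: Sequence, token_index: int) -> tuple:
    # Translate a logical token position into (physical block, offset).
    block = seq.block_table[token_index // BLOCK_SIZE]
    return block, token_index % BLOCK_SIZE


def free_sequence(seq: Sequence, allocator: BlockAllocator) -> None:
    # Return every block to the pool once the request finishes.
    for block_id in seq.block_table:
        allocator.release(block_id)
    seq.block_table.clear()
    seq.num_tokens = 0
```

Because blocks are claimed on demand and need not be contiguous, a sequence never reserves memory for its maximum possible length up front, which is the property the "efficient management" description refers to.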
