Llama.cpp: Deterministic Inference Mode (CUDA): RMSNorm, MatMul, Attention (github.com)
3 points by diwank 4 hours ago