Self-hosted LLMs are the way.
I actually think that (presently) self-hosted LLMs are much worse for hallucination
Sure, but I tried this and they all suck. I only have 8GB of RAM, so I can only use the smaller versions of the models, and they're much, much worse at just making up random shit
Oof, ok, my apologies.
I am, admittedly, “GPU rich”; I have ~48GB of VRAM at my disposal on my main workstation, and 24GB on my gaming rig. Thus, I am using Q8 and Q6_L quantized GGUFs. Naturally, my experience with the “fidelity” of my LLM models re: hallucinations would be better.
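For anyone wondering why the 8GB vs. 48GB gap matters so much here: weight memory scales roughly as parameter count × bits per weight, which is what decides whether you can run a big model at a high-fidelity quant or are stuck with a small one. Here's a rough back-of-envelope sketch in Python; the bits-per-weight figures are approximate values for common llama.cpp quants, not exact, and they ignore KV-cache and activation overhead:

```python
# Rough estimate of weight memory for a quantized GGUF.
# Approximation: params * bits_per_weight / 8 bytes, ignoring
# KV cache, activations, and per-tensor metadata overhead.

def gguf_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate in-VRAM size of quantized weights, in GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Illustrative bits-per-weight for common llama.cpp quant types.
QUANTS = [("Q8_0", 8.5), ("Q6_K", 6.6), ("Q4_K_M", 4.8)]

for params in (8, 70):
    for name, bpw in QUANTS:
        print(f"{params}B model at {name}: ~{gguf_weights_gb(params, bpw):.1f} GB weights")
```

By this math an 8B model at Q8 is ~8.5GB of weights alone, already over an 8GB budget, while 48GB comfortably fits much larger models at Q6-Q8, which is where the difference in hallucination rates comes from.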