another AI question

Martin Cracauer cracauer at cons.org
Wed Apr 8 12:16:29 EDT 2026


Martin Cracauer wrote on Tue, Apr 07, 2026 at 05:50:21PM -0400: 
> The situation with LLMs on FreeBSD is not totally catastrophic.
> 
> The NVidia drivers are currently broken on my 5090, so I cannot
> compare Vulkan/FreeBSD to Linux/Cuda.

Made them work; you need this in /boot/loader.conf:

hw.nvidia.registry.EnableGpuFirmware=17
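A minimal sketch of applying that tunable idempotently (shown against a scratch copy; on a real system the file is /boot/loader.conf and the setting takes effect on the next boot):

```shell
# Append the GSP-firmware tunable only if it is not already set.
# Using a temp file here as a stand-in for /boot/loader.conf.
conf=$(mktemp)
grep -q '^hw\.nvidia\.registry\.EnableGpuFirmware=' "$conf" || \
    echo 'hw.nvidia.registry.EnableGpuFirmware=17' >> "$conf"
cat "$conf"
```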

Performance on bartowski/Qwen_Qwen3.5-27B-GGUF:Q6_K_L in llama.cpp is:
- FreeBSD Vulkan 49 tokens/second
- Linux CUDA 56 tokens/second
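For scale, the gap between those two numbers works out as follows (quick sketch, using only the tokens/second figures from the list above):

```shell
# Relative slowdown of FreeBSD/Vulkan (49 t/s) vs Linux/CUDA (56 t/s)
awk 'BEGIN { printf "%.1f%% slower\n", (1 - 49/56) * 100 }'
```

So FreeBSD/Vulkan gives up roughly 12.5% versus Linux/CUDA on this model.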

Will get Linux/Vulkan numbers when I have a chance.

But this is encouraging.  Windows was also 10% slower than Linux.

Martin

> But they work on my 2080ti with Vulkan and run both ollama and
> llama.cpp, accelerated.
> 
> My laptop with an "AMD Ryzen 7 PRO 4750U with Radeon Graphics" also
> runs Vulkan and accelerates ollama (although only by a factor of 3
> compared to CPU).  This combo does not run llama.cpp.
> 
> Now that NVidia drivers are running on at least one of my cards I'll
> give it another go to run CUDA through Linuxulator.

That go failed.  Still no CUDA under the Linuxulator.

Martin
-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin Cracauer <cracauer at cons.org>   http://www.cons.org/cracauer/


More information about the talk mailing list