• 5 Posts
  • 50 Comments
Joined 3 years ago
Cake day: July 1st, 2023




  • So I’m using an open-source model that leverages a 3090. I tried to make the app itself as agnostic as possible so I can plug any API-compatible server into each component (track metadata generation, actual song generation, AI DJ script, and so on). I figured making this thing as flexible as possible would be best overall.

    Edit: the model is ACE-Step 1.5
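The "plug any API-compatible server into each component" idea can be sketched roughly like this. This is a hypothetical illustration, not the author's actual code: the component names, ports, and model names are all placeholder assumptions; it just shows each pipeline stage resolving its own OpenAI-compatible endpoint from config.

```python
import json

# Hypothetical component -> endpoint map; base URLs and model names are
# placeholders for whatever backend (Ollama, vLLM, a local ACE-Step server,
# ...) is plugged into each stage.
ENDPOINTS = {
    "track_metadata": {"base_url": "http://localhost:11434/v1", "model": "llama3"},
    "dj_script":      {"base_url": "http://localhost:8000/v1",  "model": "mistral"},
    "song_gen":       {"base_url": "http://localhost:8001/v1",  "model": "ace-step-1.5"},
}

def build_chat_request(component: str, prompt: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for an OpenAI-style /chat/completions call."""
    cfg = ENDPOINTS[component]
    url = f"{cfg['base_url']}/chat/completions"
    body = json.dumps({
        "model": cfg["model"],
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, body
```

Because every backend speaks the same request shape, swapping a component's server is just a config change, which is what makes the design agnostic.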














  • omg, I’m an idiot. Your comment got me thinking and… I’ve been using Q4 without knowing it… I assumed Ollama ran the FP16 weights by default 😬
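A back-of-envelope check shows why the quant level is easy to miss on a 24 GB 3090. Assuming a hypothetical 8B-parameter model (the size here is an illustrative assumption, not from the thread), weight memory scales with bits per weight:

```python
def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (ignores KV cache and runtime overhead)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# FP16: 2 bytes/weight -> ~16 GB for 8B params; barely fits on a 24 GB card.
fp16_gb = weight_gb(8, 16)
# Q4-class quants run roughly 4.5-5 bits/weight -> ~4.5-5 GB for the same model.
q4_gb = weight_gb(8, 4.5)
```

So a Q4 quant leaves most of the 3090's VRAM free, which is why a default quantized pull can feel suspiciously comfortable if you think you're running FP16.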

    About vLLM: yeah, I see that you have to specify how much to offload manually, which I wasn’t a fan of. I have 4x 3090s in an ML server at the moment, but I’m using those for all AI workloads, so the VRAM is shared across TTS/STT/LLM/image gen.

    That’s basically why I really want auto offload.
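For context on the manual offload complaint: vLLM sizes its GPU memory claim up front at launch rather than adapting to whatever is free. A sketch of the relevant launch flags (the model name is a placeholder; this is a config fragment, not a tested command):

```shell
# vLLM claims a fixed fraction of VRAM and only offloads to CPU if told to,
# so on a shared 4x 3090 box you have to budget memory by hand.
vllm serve my-model \
  --gpu-memory-utilization 0.90 \
  --cpu-offload-gb 8
```

That fixed up-front budget is exactly what clashes with sharing the GPUs across TTS/STT/LLM/image-gen workloads.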