

I’m curious, do you have ADG enabled at all? How many steps do you use generally?


Yeah I completely agree on the lyrics it can generate! That being said, I actually haven’t tried the 4B LM version just yet. I should probably give that a shot…


So I’m using an open-source model running on a 3090. I tried to make the app itself as agnostic as possible so I can plug any API-compatible server into each component (track metadata generation, actual song generation, AI DJ script, and so on). I figured making this thing as flexible as possible would be best overall.
Edit: model is ACE-Step 1.5
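The "plug any API-compatible server into each component" idea can be sketched like this. Everything here is an assumption for illustration (the URLs, model names, and `Endpoint`/`endpoint_for` helpers are hypothetical, not the app's actual code):

```python
from dataclasses import dataclass

# Hypothetical sketch: each app component points at its own
# OpenAI-compatible endpoint, so any backend can be swapped in.
@dataclass
class Endpoint:
    base_url: str
    model: str

# Illustrative mapping only; real URLs/models would live in config.
COMPONENTS = {
    "track_metadata": Endpoint("http://localhost:11434/v1", "gemma3:27b"),
    "song_generation": Endpoint("http://localhost:8001/v1", "ace-step-1.5"),
    "dj_script": Endpoint("http://localhost:11434/v1", "gemma3:27b"),
}

def endpoint_for(component: str) -> Endpoint:
    """Look up which server/model backs a given component."""
    return COMPONENTS[component]
```

Keeping the mapping in one table is what makes the app "agnostic": swapping the song-generation backend is a one-line change.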


Yup, I have instrumental only stations (and they’re tagged as such so you can filter by those only)


Yeah I’m not disheartened, my mom said it was cool /s
In all seriousness, my wife and I think it’s cool and I literally use it (all day) during the week while I’m working


Sorry for the oversight! It’s basically an AI radio app (with stations playing different music styles). There’s even an AI DJ feature that’s audience aware (think weather callouts for listener locations, audience polls etc)
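The "audience aware" DJ bit could look something like the sketch below: listener context (weather, poll results) gets folded into the script prompt. This is a minimal illustration, not the app's actual implementation; the function and field names are made up:

```python
# Hypothetical sketch of an audience-aware DJ prompt builder.
def build_dj_prompt(station: str, weather: dict, poll: dict) -> str:
    """Fold listener context into a DJ script request."""
    lines = [
        f"You are the DJ for the '{station}' station.",
        f"Weather callout: it's {weather['temp_c']} degrees and "
        f"{weather['sky']} in {weather['city']}.",
        f"Poll update: '{poll['question']}' is leading with "
        f"'{poll['leader']}' at {poll['pct']}%.",
        "Write a short, upbeat segue into the next track.",
    ]
    return "\n".join(lines)
```

The prompt would then go to whatever LLM backs the DJ-script component.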


Richardwayne Garywayne


Things like meshtastic might be a good thing too


Needs more crayon and grunts


It’s one of the reasons I got solar!
My electric bill was higher than my loan payment so it just made sense for me.


I’m just gonna try vllm, seems like ik_llama.cpp doesn’t have a quick Docker method


IK sounds promising! Will check it out to see if it can run in a container


I’ll take a look at both tabby and vllm tomorrow.
Hopefully CPU offload is in the works so I can test those crazy models without too much fiddling in the future (the server also has 128 GB of RAM)


Unfortunately I didn’t set up NVLink, but ollama auto-splits things for models which require it.
I really just want a “set and forget” model server lol (that’s why I keep mentioning the auto offload).
Ollama integrates nicely with OWUI.
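For context, OWUI (and anything else OpenAI-compatible) talks to ollama through its `/v1/chat/completions` endpoint, which is part of why the pairing is so low-friction. A minimal sketch of the request body a client would POST (the model name is an example):

```python
import json

# Build an OpenAI-compatible chat request body, the format a client
# like Open WebUI sends to ollama at /v1/chat/completions.
def chat_payload(model: str, user_msg: str) -> str:
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
        "stream": False,
    }
    return json.dumps(body)

# e.g. POST chat_payload("gemma3:27b", "hi") to
# http://localhost:11434/v1/chat/completions
```

Because the request shape is standard, swapping ollama for vllm or another server is mostly just a base-URL change.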


omg, I’m an idiot. Your comment made me start thinking about things and… I’ve been using Q4 without knowing it… I assumed ollama ran the fp16 by default 😬
About vllm, yeah I see that you have to specify how much to offload manually, which I wasn’t a fan of. I have 4x 3090 in an ML server at the moment but I’m using those for all AI workloads, so the VRAM is shared for TTS/STT/LLM/Image Gen.
That’s basically why I really want auto offload.
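The Q4 default makes sense once you do the back-of-envelope weight math for a 27B model: fp16 is 2 bytes per parameter, a 4-bit quant is roughly 0.5 bytes (ignoring quantization overhead, KV cache, and activations):

```python
# Rough weight-only memory estimate for model files.
def weight_gb(params_billion: float, bytes_per_param: float) -> float:
    """Gigabytes needed just for the weights."""
    return params_billion * bytes_per_param

fp16 = weight_gb(27, 2.0)  # ~54 GB: won't fit on a single 24 GB 3090
q4 = weight_gb(27, 0.5)    # ~13.5 GB: fits, hence the Q4 default
```

So fp16 for a 27B model needs multiple GPUs (or offload) just for the weights, while Q4 fits comfortably on one card.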


Yeah, I’m currently running the Gemma 27B model locally. I recently took a look at vllm, but the only reason I didn’t want to switch is that it doesn’t have automatic offloading (seems that it’s a manual thing right now)


Just read the L1 post and I’m just now realizing this is mainly for running quants which I generally avoid
I guess I could spin it up just to mess around with it but probably wouldn’t replace my main model


Thanks, will check that out!


I’m currently using ollama to serve llms, what’s everyone using for these models?
I’m also using open webui as well and ollama seemed the easiest (at the time) to use in conjunction with that
ADG for ACE-Step is Adaptive Dual Guidance. Try turning that on + make sure you have thinking enabled; I saw a big difference with both of those on (more so thinking)
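Putting the suggestions above in one place as a settings sketch. The key names here are illustrative only, not ACE-Step's actual API; check its docs for the real parameter names:

```python
# Hypothetical settings dict; key names are made up for illustration.
settings = {
    "adg_enabled": True,  # Adaptive Dual Guidance, as suggested above
    "thinking": True,     # reportedly the bigger win of the two
    "steps": 50,          # example value; tune to taste
}
```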