# random
Did anyone figure out what it takes to run Mellum-4b-base smoothly? I've got a MacBook Pro M1 with 32 GB, and it's really slow to respond when asked via AI Chat in IntelliJ. I'm running it through LM Studio; responses inside LM Studio itself are okay, but from IntelliJ it takes forever to collect the context. Edit: I'm trying the MLX forks now. Maybe that helps.
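For anyone else testing: here's a minimal sketch of how I'm sanity-checking generation speed with mlx-lm outside of IntelliJ, to rule out the model itself being the bottleneck. The repo name is a guess; substitute whatever MLX conversion of Mellum-4b-base you actually downloaded.

```python
# Rough speed check with mlx-lm (pip install mlx-lm).
from mlx_lm import load, generate

# Assumed repo name -- use the actual MLX conversion you pulled.
model, tokenizer = load("mlx-community/Mellum-4b-base-8bit")

# Mellum is a base code model, so give it a raw code prefix to complete.
prompt = "def fibonacci(n):\n"

# verbose=True prints tokens/sec, which is the number that matters here.
completion = generate(model, tokenizer, prompt=prompt, max_tokens=64, verbose=True)
print(completion)
```

If tokens/sec looks fine here, the slowness is probably IntelliJ's context collection rather than inference on the M1.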