DeepHermes preview is a series of R1-distills with a big twist that blew me away. You can toggle the reasoning on and off by injecting a specific system prompt.
System prompts that coax CoT-style reasoning out of most models have been passed around hobbyist forums for a while, but they tended to be quite large, eating valuable context space. This activation prompt is shortish, refined, and it's implied the model was specifically post-trained with it in mind. I would love to read the technical paper on what they did differently.
You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.
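For anyone who wants to try it, here's a minimal sketch of the toggle against an OpenAI-compatible local server (llama.cpp, LM Studio, etc.). The endpoint URL and model name are placeholders for whatever your setup uses; the prompt string is just the activation prompt above.

```python
# Sketch: toggle DeepHermes reasoning by injecting the activation prompt.
# Assumes an OpenAI-compatible server at localhost:8080 and a placeholder
# model name -- adjust both for your own setup.
from openai import OpenAI

REASONING_PROMPT = (
    "You are a deep thinking AI, you may use extremely long chains of thought "
    "to deeply consider the problem and deliberate with yourself via systematic "
    "reasoning processes to help come to a correct solution prior to answering. "
    "You should enclose your thoughts and internal monologue inside <think> "
    "</think> tags, and then provide your solution or response to the problem."
)

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

def ask(question: str, reasoning: bool = False) -> str:
    messages = []
    if reasoning:
        # With the prompt present, the model emits a <think>...</think> block
        # before its answer; without it, it behaves like a normal model.
        messages.append({"role": "system", "content": REASONING_PROMPT})
    messages.append({"role": "user", "content": question})
    resp = client.chat.completions.create(
        model="deephermes-24b-preview",  # placeholder name
        messages=messages,
    )
    return resp.choices[0].message.content
```

Same weights either way; the only difference between the two modes is that one system message.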
I've been playing around with R1 CoT models for a few months now. They are great at examining many sides of a problem, comparing abstract concepts against each other, speculating on open-ended questions, and solving advanced multi-step STEM problems.
However, they fall short when you try to get the model to change personality or roleplay a scenario, or when you just want a straight short summary without 3000 tokens spent thinking about it first.
So I would find myself swapping between CoT models and a general-purpose Mistral Small depending on what I wanted, which was an annoying pain in the ass.
With DeepHermes it seems they took a good approach to solving this problem: associate the R1-distill reasoning with a specific opt-in system prompt instead of baking it into the base behavior.
Unfortunately, constantly editing the system prompt is annoying. I need to see if the engine I'm using offers a way to save system prompts across conversation profiles. If this kind of thing takes off, I think it would be cool to have a reasoning toggle button like some front ends for company LLMs already have.
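In the meantime a thin wrapper works. A hypothetical sketch of per-profile system prompts; the profile names and file layout here are made up, not any particular engine's format:

```python
# Hypothetical per-profile system prompts, so switching modes is one key
# lookup instead of hand-editing the prompt every conversation.
import json

PROFILES = {
    "reasoning": "<the DeepHermes activation prompt>",  # paste the full prompt
    "assistant": "You are a concise, helpful assistant. Keep answers short.",
    "roleplay":  "You are the ship's AI on a long-haul freighter. Stay in character.",
}

# Persist them so a front end (or this script) can reload them per profile.
with open("profiles.json", "w") as f:
    json.dump(PROFILES, f, indent=2)

def system_prompt_for(profile: str) -> str:
    return PROFILES.get(profile, PROFILES["assistant"])
```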
Agreed. I also shift between them. At a bare minimum, I use a thinking model to 'open up' the conversation and then often continue with a normal model, but it certainly depends on the topic.
A while ago we got RouteLLM, I think, which routed a request depending on its content, but the concept never got traction for some reason. Now it seems that ClosedAI and the other big names are putting some attention into it. Great to see DeepHermes and other open players out in front of the pack.
I don’t think it will take long before we have agentic frameworks activating different ‘modes’ of thinking depending on content/context, goals, etc. It would be great if a model could be triggered into several modes in a standard way.
I think the idea of exposing multiple different ways for LLMs to ‘process’ a given input in a standard way is promising.
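For instance, here's a toy version of that routing in the RouteLLM spirit: pick a mode from crude keyword heuristics and inject its system prompt. A real router would use a classifier model; these keywords are just placeholders.

```python
# Toy content-based mode router: choose which system prompt to inject based
# on the user's message. Keyword heuristics stand in for a real classifier.
MODE_PROMPTS = {
    "reasoning": "<the DeepHermes activation prompt>",
    "summary":   "Answer in three sentences or fewer. No preamble.",
    "general":   "You are a helpful assistant.",
}

def route(user_message: str) -> str:
    """Return the name of the mode whose system prompt should be used."""
    text = user_message.lower()
    if any(w in text for w in ("prove", "solve", "derive", "step by step")):
        return "reasoning"
    if any(w in text for w in ("summarize", "tl;dr", "in short")):
        return "summary"
    return "general"

mode = route("Solve for x: 3x^2 - 5x + 1 = 0")
system_prompt = MODE_PROMPTS[mode]  # -> the reasoning activation prompt
```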
I feel that after reasoning, we will train models to think emotionally in a more intricate way. By combining reasoning with a more advanced sense of individuality and richer emotion simulation, we may get a little closer to a breakthrough.
That’s pretty cool. I’ve tried a few of the distills, but I’ve mostly gone back to regular models.
How does it compare to the regular DeepSeek distills, though?
DeepHermes 24B's CoT thought patterns feel about on par with the official R1 distills I've tried. It's important to note, though, that my experience is limited to the DeepSeek R1 NeMo 12B distill, as that's what fit nice and fast on my card.
All the R1 distill thought-process internal monologue humanisms are there: "let me write that down", "if I remember correctly", "oh, but wait, that doesn't sound right, let's try again". The multiple "but wait, what if"s before ending the thought to examine multiple sides are there too. It spends about 2-5k tokens thinking. It tends to stay on track and catch minor mistakes or hallucinations.
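Side note: if your front end doesn't hide the thinking, splitting it out is easy since the model wraps it in <think> tags. A quick sketch, assuming one well-formed block per response (which matches what I've seen):

```python
import re

def split_thinking(output: str) -> tuple[str, str]:
    """Separate the <think>...</think> monologue from the final answer."""
    match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    if not match:
        return "", output.strip()   # model answered without a thinking block
    thoughts = match.group(1).strip()
    answer = output[match.end():].strip()
    return thoughts, answer
```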
Compared to the unofficial Mistral 24B distills, this is top tier for sure. I think it's toe to toe with ComputationDolphins' 24B R1 distill, and it's just a preview.