DeepHermes preview is a series of R1 distills with a big twist that blew me away: you can toggle the reasoning on and off by injecting a specific system prompt.

System prompts that coax CoT-style reasoning out of most models have been swapped around on hobbyist forums for a while, but they tended to be quite large, eating up valuable context space. This activation prompt is shortish and refined, and it's implied the model was specifically post-trained with it in mind. I would love to read a technical paper on what they did differently.

You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.
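For anyone who wants to try it, here is a rough sketch of what the toggle looks like against a local OpenAI-compatible endpoint (llama.cpp server, Ollama, LM Studio, etc.). The endpoint URL and model name below are placeholders for whatever you run locally; the only real part is the activation prompt quoted above.

```python
# Minimal sketch: toggle DeepHermes reasoning by including or omitting the
# activation prompt. The endpoint URL and model name are placeholders.
import re
import requests

REASONING_PROMPT = (
    "You are a deep thinking AI, you may use extremely long chains of thought "
    "to deeply consider the problem and deliberate with yourself via systematic "
    "reasoning processes to help come to a correct solution prior to answering. "
    "You should enclose your thoughts and internal monologue inside <think> "
    "</think> tags, and then provide your solution or response to the problem."
)

def ask(question: str, reasoning: bool = False, strip_think: bool = True) -> str:
    messages = []
    if reasoning:
        messages.append({"role": "system", "content": REASONING_PROMPT})
    messages.append({"role": "user", "content": question})

    resp = requests.post(
        "http://localhost:8080/v1/chat/completions",   # placeholder local endpoint
        json={"model": "deephermes-preview", "messages": messages},
        timeout=600,
    )
    text = resp.json()["choices"][0]["message"]["content"]
    if strip_think:
        # Drop the <think>...</think> block so only the final answer remains.
        text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return text

print(ask("Give me a two sentence summary of Hamlet."))            # no CoT
print(ask("Prove that sqrt(2) is irrational.", reasoning=True))    # long CoT
```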

I've been playing around with R1 CoT models for a few months now. They are great at examining many sides of a problem, comparing abstract concepts against each other, speculating on open-ended questions, and solving advanced multi-step STEM problems.

However, they fall short when you try to get the model to change personality or roleplay a scenario, or when you just want a short, straight summary without 3,000 tokens spent thinking about it first.

So I would find myself swapping between CoT models and general-purpose Mistral Small based on what kind of thing I wanted, which was an annoying pain in the ass.

With DeepHermes it seems they took a sensible step toward solving this problem: associate the R1-distill reasoning with a specific system prompt rather than with the model's default behavior.

Unfortunately, constantly editing the system prompt is annoying. I need to see if the engine I'm using offers a way to save system prompts across conversation profiles. If this kind of thing takes off, I think it would be cool to have a reasoning toggle button like some front ends offer for commercial LLMs.
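Until front ends grow a proper toggle, even a saved per-profile flag would do the job. Below is just a sketch of that idea; the profile names and JSON file are invented, and it assumes something like the `ask()` helper from the earlier sketch (or whatever your engine exposes) actually sends the prompt.

```python
# Sketch of per-profile reasoning toggles; profile names and file layout are invented.
import json
from pathlib import Path

PROFILES_FILE = Path("profiles.json")

# Each profile records whether the reasoning activation prompt should be prepended.
DEFAULT_PROFILES = {
    "quick-answers": {"reasoning": False},
    "deep-think":    {"reasoning": True},
}

def load_profiles() -> dict:
    """Load saved profiles, creating the file with defaults on first use."""
    if PROFILES_FILE.exists():
        return json.loads(PROFILES_FILE.read_text())
    PROFILES_FILE.write_text(json.dumps(DEFAULT_PROFILES, indent=2))
    return DEFAULT_PROFILES

def reasoning_enabled(profile_name: str) -> bool:
    return load_profiles()[profile_name]["reasoning"]

# Combined with the earlier sketch: ask(question, reasoning=reasoning_enabled("deep-think"))
```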

  • Sims@lemmy.ml · 4 days ago

    Agree. I also shift between them. At a bare minimum, I use a thinking model to ‘open up’ the conversation and then often continue with a normal model, but it certainly depends on the topic.

    A while ago we got ‘RouteLLM’, I think, which routed a request depending on its content, but the concept never got traction for some reason. Now it seems that closedai and other big names are putting some attention to it. Great to see DeepHermes and other open players out in front of the pack.

    I don’t think it will take long before agentic frameworks activate different ‘modes’ of thinking depending on content/context, goals, etc. It would be great if a model could be triggered into several modes in a standard way.
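    For what it’s worth, the routing idea doesn’t need much machinery to prototype. Here is a toy sketch of content-based routing in that spirit; the keyword heuristic and model names are made up, and real routers like RouteLLM train a classifier rather than matching keywords.

    ```python
    # Toy sketch of content-based routing: send "hard" prompts to a thinking
    # model and everything else to a fast general model. The keyword heuristic
    # and model names are placeholders, not how RouteLLM actually works.
    THINKING_MODEL = "deephermes-reasoning"
    FAST_MODEL = "mistral-small"

    REASONING_HINTS = ("prove", "derive", "step by step", "why does", "debug", "calculate")

    def route(prompt: str) -> str:
        """Pick which model should handle this prompt."""
        text = prompt.lower()
        if any(hint in text for hint in REASONING_HINTS) or len(text.split()) > 150:
            return THINKING_MODEL
        return FAST_MODEL

    print(route("Summarize this article in one paragraph."))           # -> mistral-small
    print(route("Prove that the sum of two even numbers is even."))    # -> deephermes-reasoning
    ```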

    • SmokeyDope@lemmy.world (OP) · edited, 2 days ago

      I think the idea of invoking multiple different ways for LLMs to ‘process’ a given input through a standard interface is promising.

      I feel that after reasoning, the next step will be training models to think emotionally in a more intricate way. By combining reasoning with a more advanced sense of individuality and better emotion simulation, we may get a little closer to a breakthrough.