QwQ-32B is a 32-billion-parameter language model that achieves performance comparable to DeepSeek-R1, which has 671 billion parameters, by scaling reinforcement learning

☆ Yσɠƚԋσʂ ☆@lemmy.ml · edited 7 hours ago

BaconIsAVeg@lemmy.ml · 6 hours ago

Settings don’t really matter if you’re comparing two models side by side, and yes I see the same indecision in your reply.

 .model ollama:qwq:32b

> Which is larger, 9.21 or 9.1999999999?
<think>
Okay, let's see... I need to figure out which number is bigger between 9.21 and 9.1999999999. Hmm, both numbers start with 9. So the first digit after the decimal point is important here.

Starting with the first number, 9.21. That's straightforward—it's nine and twenty-one hundredths. The digits after the decimal are 2 and 1. 

The second number is 9.1999999999. This looks like it's nine point one followed by a bunch of 9s. Let me write them out to compare digit by digit.

Let me line them up:

9.2100000000
9.1999999999

Wait, right? Because 9.21 is the same as 9.2100000000 if we add more decimal places. So now comparing each digit after the decimal:

First digit after the decimal: both have 2 vs. 1. Wait no—the first number's first decimal digit is 2, and the second one's first decimal digit is 1? Wait wait, hold on.

Wait, actually, let me check again. The first number is 9.21. So breaking it down:

- The integer part is 9.
- Decimal parts: tenths place is 2, hundredths is 1, and then the rest are zeros if we go further.

The second number is 9.1999999999. Let's see:

- Integer part also 9.
- Tenths place: 1 (since it's .1...), then the next digits are all 9s up to the 10th decimal place?

Wait, so tenths digit of first number is 2 vs. second one's tenths digit is 1? That already would make the first number larger because 2 is bigger than 1 in the tenths place. 

But maybe I'm miscounting the places? Let me confirm.

For 9.21: The first decimal digit (tenths) is 2, second (hundredths) is 1, then zeros beyond.

For 9.1999999999: Tenths place is 1, hundredths is 9, and then each subsequent digit up to the 10th decimal is also 9.

So when comparing the two numbers:

At the tenths place:
- 9.21 has a 2
- 9.199... has a 1

Since 2 > 1 in the tenths place, that means 9.21 is larger right there. The digits after that don't even matter because the first differing digit (the tenths) already determines it.

But maybe I should think if there's another way. Suppose someone might confuse the number of decimal places? Like thinking that more 9s make it closer to 9.2? Wait, but 0.1999999999 is actually equal to 0.2 when rounded to one decimal place because 0.1999999999... recurring would be exactly 0.2, but here the second number has finite decimals—it's not an infinite series.

Wait a second—wait, 0.1999999999 is just under 0.2, right? Because if you have infinitely many 9s after 0.1..., then it equals exactly 0.2. But in this case, the number given is 9.1999999999 with only ten decimal places (assuming that's how it's written). Let me count the digits after the decimal for the second number: 

After the decimal, starting from first digit:

1 (tenths), then nine more 9s? Let's see: 9.1 followed by 9, so total of 10 decimal places?

Yes, "9.1999999999" has 10 digits after the decimal. So that's tenths digit is 1, and then nine more 9s in hundredths to ten-billionths or whatever.

Therefore, comparing with 9.21 (which is 9.2100000000), the first difference is at the tenths place: 2 vs. 1. Therefore 9.21 is bigger. 

Alternatively, if someone thought maybe converting both to the same number of decimal places? Let's see:

9.21 can be written as 9.2100000000

Comparing term by term:

First digit after .: 2 vs 1 → 2 is bigger. So yeah, definitely.

Alternatively, subtract them to see the difference?

Calculate 9.21 - 9.1999999999 = ?

Let's do that subtraction:

9.2100000000  
-9.1999999999  
= 0.0100000001

Which is a positive number, so yes, 9.21 is larger by approximately 0.01 (exactly 0.0100000001).

Therefore the answer should be that 9.21 is larger.
</think>

The number 9.21 is larger than 9.1999999999. 

**Step-by-Step Explanation:**

1. **Compare the integer parts:** Both numbers have an integer part of 9, so move to the decimal parts.
   
2. **First decimal place (tenths):**
   - 9.21 has a tenths digit of **2**.
   - 9.1999999999 has a tenths digit of **1**.
   
   Since \(2 > 1\), the number 9.21 is already larger at this point. The remaining digits do not need to be compared further.

3. **Verification via subtraction:**
   - Subtracting \(9.1999999999\) from \(9.21\) gives:
     \[
     9.2100000000 - 9.1999999999 = 0.0100000001
     \]
   The positive result confirms that \(9.21\) is indeed larger.

**Answer:** 9.21 is larger than 9.1999999999.

> .copy

> .info
model                   ollama:qwq:32b
temperature             0.7
top_p                   0.9
use_tools               null
max_output_tokens       null
save_session            null
compress_threshold      4000
rag_reranker_model      null
rag_top_k               5
dry_run                 false
function_calling        true
stream                  true
save                    true
keybindings             vi
wrap                    auto
wrap_code               false
highlight               true
light_theme             false
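
For anyone who wants to check the model's answer independently, here is a minimal sketch using Python's `decimal` module, which does exact decimal arithmetic instead of binary floating point. The helper names `compare_decimals` and `first_differing_place` are made up for illustration; they mirror the two checks in the transcript (digit-by-digit comparison and subtraction) and assume non-negative inputs with the same integer part.

```python
from decimal import Decimal
from typing import Optional, Tuple


def compare_decimals(x: str, y: str) -> Tuple[str, Decimal]:
    """Return the larger of two decimal strings and the exact difference x - y."""
    dx, dy = Decimal(x), Decimal(y)
    larger = x if dx >= dy else y
    return larger, dx - dy


def first_differing_place(x: str, y: str) -> Optional[int]:
    """1-based position of the first fractional digit where x and y differ.

    Hypothetical helper: assumes both strings are non-negative and share
    the same integer part. Returns None if the fractional parts are equal.
    """
    fx = x.split(".")[1] if "." in x else ""
    fy = y.split(".")[1] if "." in y else ""
    width = max(len(fx), len(fy))
    # Pad the shorter fraction with zeros, e.g. 21 -> 2100000000
    fx, fy = fx.ljust(width, "0"), fy.ljust(width, "0")
    for place, (a, b) in enumerate(zip(fx, fy), start=1):
        if a != b:
            return place
    return None


larger, diff = compare_decimals("9.21", "9.1999999999")
print(larger, diff)                                    # 9.21 0.0100000001
print(first_differing_place("9.21", "9.1999999999"))   # 1 (the tenths place)
```

The difference matches the `0.0100000001` in the transcript, and the first differing digit is the tenths place, so the final answer is right regardless of how long the thinking section ran.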

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 6 hours ago

Ultimately what matters is whether it gets the correct answer or not. It’s interesting that yours wasn’t able to do the strawberry test while mine did it with a very short thinking cycle.

BaconIsAVeg@lemmy.ml · 6 hours ago

> Ultimately what matters is whether it gets the correct answer or not.

That’s… not true at all. It had the right answer to most of the questions I asked it, just as fast as R1, and yet it kept saying “but wait! maybe I’m wrong”. It’s a huge red flag when the CoT is just trying to 1000 monkeys a problem.

While it did manage to complete the strawberry problem when I adjusted the top_p/top_k, I had been using the previous values with other models I’ve tested and never had a CoT go that off-kilter before. And this is considering that even the 7B DeepSeek model was able to get the correct answer with 1/4 of the VRAM.
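
If you want to try the same thing, here is a rough sketch of passing sampling settings per request through Ollama's local REST API (`/api/generate` with an `options` object) rather than through a chat client. The default URL assumes a stock local install, and the `top_k` value of 40 is only a placeholder, since the values actually used are not given in this thread. `build_request` and `ask` are illustrative names, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="qwq:32b", temperature=0.7, top_p=0.9, top_k=40):
    """Build a POST request that sets the sampling options for one prompt.

    temperature and top_p default to the values in the .info dump above;
    top_k=40 is a placeholder, not a value taken from the thread.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a stream
        "options": {
            "temperature": temperature,
            "top_p": top_p,
            "top_k": top_k,
        },
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **sampling):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, **sampling)) as resp:
        return json.loads(resp.read())["response"]
```

Something like `ask("How many r's are in strawberry?", top_p=0.95, top_k=20)` would then let you rerun the same prompt under different settings and compare how long the CoT gets.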

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 5 hours ago

It’s true for me. I generally don’t read through the think part. I make the query, do something else, and then come back to see what the actual output is. Overall, I find it gives me way better answers than I got with the version of R1 I was able to get running locally. Turns out the settings do matter though.

QwQ-32B: Embracing the Power of Reinforcement Learning