- cross-posted to:
- linux@lemmy.ml
I need to catch up on training. I need an LLM that I can train on all my ebooks and digitized music, and that can answer questions like "what's that book where the girl goes to the thing and does that deed?"
Existing implementations can probably do that already; retrieval over your library (rather than actually retraining a model) is the usual way to get that kind of fuzzy lookup.
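A minimal sketch of that retrieval approach, assuming sentence-transformers is installed; the titles and blurbs here are placeholders, and a real setup would chunk the full ebook text:

```python
# Sketch: semantic search over book summaries with sentence-transformers.
# This retrieves rather than trains; titles and blurbs are made up.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# One blurb per book; a real setup would embed chunks of the full text.
books = {
    "Book A": "A girl travels to a distant city and pulls off a daring heist.",
    "Book B": "A detective untangles a murder in 1920s London.",
}
titles = list(books.keys())
corpus_embeddings = model.encode(list(books.values()), convert_to_tensor=True)

query = "the girl goes to the thing and does that deed"
query_embedding = model.encode(query, convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=1)
print(titles[hits[0][0]["corpus_id"]])  # best-matching title
```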
Interesting idea!
Fully open and accessible: Fully open-source release of model weights, training hyperparameters, datasets, and code, fostering innovation and collaboration within the AI community.
That’s actually pretty good. Seems to be open source as the OSI defines it, rather than the much more common “this model is open source, but the dataset is a secret”.
It knows everything about everything you ever received in the mail from your local grocery store.
Can it learn my local database of PDF books I illegally downloaded years ago? No!
That’s right! Isn’t it great?
Huh?
I see all these graphs about how much better this LLM is than another, but do those graphs actually translate to real-world usefulness?
I think the bigger issue is what constitutes actual open source. This is actually open source, and it performs well. If you're familiar with the space, that's a big deal.
I’m not familiar with the space but realised this was a big deal.
I feel like I need to shower after interacting with any of the other LMs.
Something fully open source will hopefully be embraced by the community and used for some interesting, useful, and value-producing things instead of just attracting venture capital.
I see, thank you.
Damn, they even chose a dataset with an open license.
The problem is… how do we run it if ROCm is still a mess for most of their GPUs? CPU time?
Well, it's not necessarily geared toward consumer devices; as mentioned in the writeup, it wasn't trained on consumer gear either.
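That said, nothing stops plain CPU inference through Hugging Face transformers, ROCm or not. A minimal sketch; the model ID below is a placeholder, not the actual repo name:

```python
# Sketch: CPU inference with Hugging Face transformers.
# "some-org/some-open-model" is a placeholder model ID.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "some-org/some-open-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # stays on CPU by default

inputs = tokenizer("What does fully open source mean for an LLM?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

It will be slow compared to GPU inference, but it works for trying the model out.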
Smart people, I beg of thee, explain! What can it do?
Edit: looks to be another text-based one, not image generation, right?
It's language-only, hence the "LM".