I’ll try it out! It’s been a hot minute, and it seems like there are new options all the time.
Yeah, I’ve had decent results running the 7B/8B models, particularly the fine-tuned ones for specific use cases. But as ya mentioned, they’re only really good within their scope for a single prompt or maybe a few follow-ups. I’ve seen little improvement with the 13B/14B models and find them mostly not worth the performance hit.
Then again, the US and China are basically the only players in this “game” atm. Hugging Face is trying hard to get the EU on board, and I’m sure we’ll see more contenders. But right now it’s a 2-player game.
Calling something illegal in spite of, or in the absence of, precedent is a time-honored tactic - though not a particularly persuasive one.
For perspective, all of the data centers in the US combined use 4% of total electric load.
It’s probably a vision model (like this) with custom instructions that direct it to focus on those factors. It’d be interesting to see the instructions.
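If I had to guess at the shape of it, it’s something like this - totally my own sketch, assuming the OpenAI chat completions API and a made-up system prompt, not whatever they actually run:

```python
# Totally a guess at the setup (assumptions: OpenAI's chat completions API,
# a made-up system prompt - the real instructions are unknown).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical custom instructions steering the vision model toward
# specific factors in the image.
SYSTEM_PROMPT = (
    "You are an assessor. Consider only lighting, composition, and focus. "
    "Ignore everything else and return a short score for each factor."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Assess this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        },
    ],
)
print(response.choices[0].message.content)
```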
I think it’s more likely a compound sigmoid (don’t Google that). LLMs are composed of distinct technologies working together, and as one of them hits the inflection point of its scaling curve, implementations pivot to get back on track. Notably, context windows are no longer an issue. The most recent pivot came just this week, allowing for a huge jump in performance, and there are more promising stepping stones coming into view. Is the “exponential” curve just a series of sigmoids stacked close enough together to look like one? In any case, the article’s correct - just adding more compute to the same exact implementation hasn’t enabled exponential scaling.
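Rough toy of what I mean by stacked sigmoids - my own sketch, nothing from the article - where a sum of staggered S-curves keeps roughly doubling for a while and then quietly flattens:

```python
# Toy illustration (my own sketch, not from the article): a stack of S-curves,
# each new one bigger than the last, keeps roughly doubling for a while and
# then plateaus once the newest curve saturates.
import numpy as np

def sigmoid(x, midpoint, scale=1.0):
    """Standard logistic function centered at `midpoint`."""
    return 1.0 / (1.0 + np.exp(-(x - midpoint) / scale))

x = np.linspace(0, 12, 241)

# Each "technology" contributes one S-curve; each new one has twice the
# amplitude of the previous and kicks in as the previous one flattens.
stacked = sum((2 ** k) * sigmoid(x, m) for k, m in enumerate((2, 4, 6, 8)))

reference = 0.4 * 2 ** (x / 2)  # a true exponential, for comparison

# While new curves keep arriving, the total keeps accelerating; after the
# last midpoint it levels off while the true exponential keeps climbing.
for xi, s, e in zip(x[::40], stacked[::40], reference[::40]):
    print(f"x={xi:5.1f}  stacked={s:6.2f}  exponential={e:6.2f}")
```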
There used to be very real hardware reasons that upload had much lower bandwidth than download. I have no idea if there still are.
Ditto, I was about to start waxing poetic about my bard.
Yeah, but they encourage confining it to a virtual machine with limited access.
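Something in the spirit of what they recommend - a rough sketch using a container rather than a full VM, assuming the docker Python SDK; the command is just a stand-in:

```python
# Rough sketch of the "limited access" idea, using a container rather than a
# full VM (assumptions: the docker Python SDK and the python:3.11-slim image;
# the command is just a stand-in for model-generated code).
import docker

client = docker.from_env()

output = client.containers.run(
    "python:3.11-slim",
    command=["python", "-c", "print('model-generated code runs in here')"],
    network_disabled=True,  # no network access
    mem_limit="512m",       # cap memory
    read_only=True,         # read-only root filesystem
    remove=True,            # clean the container up afterwards
)
print(output.decode())
```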
Huh. Grandpa Simpson was right. It did happen to me too.
Logic and Path-finding?
Shithole country.
Yeah, using image recognition on a screenshot of the desktop and directing a mouse around the screen with coordinates is definitely an intermediate implementation. Open Interpreter, Shell-GPT, LLM-Shell, and DemandGen make a little more sense to me for anything that can currently be done from a CLI, but I’ve never actually tested them.
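For reference, the screenshot-and-coordinates loop is basically this - a bare-bones sketch assuming pyautogui, with ask_vision_model as a placeholder for whatever model actually picks the target:

```python
# Bare-bones sketch of the screenshot + coordinates loop (assumptions:
# pyautogui is installed; ask_vision_model is a placeholder for whatever
# model actually picks the click target).
import pyautogui

def ask_vision_model(image, instruction):
    """Placeholder: send the screenshot to a vision model and get back the
    pixel coordinates of the element it wants to click."""
    raise NotImplementedError("wire up your model of choice here")

screenshot = pyautogui.screenshot()      # grab the current desktop
x, y = ask_vision_model(screenshot, "Click the Save button")

pyautogui.moveTo(x, y, duration=0.5)     # move the cursor to the target
pyautogui.click()                        # and click it
```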