2026, Year of Reinforcement Learning?

kingstnap a day ago

I think the next big thing will actually be test-time training. It will require another unbelievable increase in compute, but it will produce an even bigger jump than what thinking models provided.

Some food for thought: if you think AGI should dynamically learn and get better at arbitrary skills on the fly, then LLMs + SGD are already a sort of slow-moving AGI.

thtgrisdjdjdh 2 days ago

RL works only for verifiable rewards, since humans (thankfully) don't have a good theory of knowledge (epistemology).

There's only so far that these agents can go.