3 Comments
Imran

Just to be clear, the dollar figures are not comparable.

Altman - scaling to a new frontier of parameter sizes

Wenfeng - copying OpenAI outputs to train a similar model

Pan - using the DeepSeek technique to train a toy model

Chen - using the DeepSeek technique to train an even smaller toy model

Gale Pooley

The question is: do we really want to commit $500 billion to an AI cartel? It looks like there are many other ways to develop this technology that cost dramatically less.

swiley

It looks to me like they're just tuning Qwen2-VL on counting and a couple of other visual tasks.

I don't know if you've tried it, but it takes surprisingly few batches to fine-tune a language model.
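
For what it's worth, here's a minimal sketch of the "few batches" point. This is not the actual setup being discussed: gpt2 stands in for Qwen2-VL (which would need its multimodal processor and image inputs), and the counting examples are made up for illustration.

```python
# Toy fine-tuning loop: a few dozen batches on a narrow task
# is often enough to visibly shift a small model's behavior.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in model; swap for whatever you are tuning
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

# Hypothetical counting-style examples (made up for this sketch).
examples = [
    "Q: How many dots? .. A: 2",
    "Q: How many dots? .... A: 4",
    "Q: How many dots? ... A: 3",
]

opt = torch.optim.AdamW(model.parameters(), lr=5e-5)

for step in range(30):  # "surprisingly few" batches
    text = examples[step % len(examples)]
    batch = tok(text, return_tensors="pt")
    # Standard causal-LM objective: the labels are the inputs.
    out = model(**batch, labels=batch["input_ids"])
    out.loss.backward()
    opt.step()
    opt.zero_grad()
    print(f"step {step}: loss {out.loss.item():.3f}")
```

On a loop like this you can usually watch the loss collapse within the first dozen or so steps, which is the whole point: narrow task, tiny data, fast convergence.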
