Another week, another decimal point to the left.
Just to be clear, the dollar figures are not comparable:
Altman - scaling to a new frontier of parameter sizes
Wenfeng - copying OpenAI outputs to train a similar model
Pan - using the DeepSeek technique to train a toy model
Chen - using the DeepSeek technique to train an even smaller toy model
The question is: do we really want to commit $500 billion to an AI cartel? It looks like there are many other ways to develop this technology that cost dramatically less.
It looks to me like they're just tuning Qwen2-VL on counting and a couple of other visual tasks.
I don't know if you've tried it, but it takes surprisingly few batches to fine-tune a language model.