Wednesday, June 18, 2025
HomeCloud ComputingOpenAI’s o3 worth plunge modifications all the pieces for vibe coders

OpenAI’s o3 worth plunge modifications all the pieces for vibe coders

On June 10, OpenAI slashed the record worth of its flagship reasoning mannequin, o3, by roughly 80% from $10 per million enter tokens and $40 per million output tokens to $2 and $8, respectively. API resellers reacted instantly: Cursor now counts one o3 request the identical as a GPT-4o name, and Windsurf lowered the “o3-reasoning” tier to a single credit score as nicely. For Cursor customers, that’s a ten-fold value minimize in a single day.

Latency improved in parallel. OpenAI hasn’t printed new latency metrics; third-party dashboards nonetheless see time to first token (TTFT) within the 15s to 20s vary for lengthy prompts. Because of contemporary Nvidia GB200 clusters and a revamped scheduler that shards lengthy prompts throughout extra GPUs, o3 feels snappier in actual use. o3 continues to be slower than light-weight fashions, however not coffee-break sluggish.

Claude 4 is quick but sloppy

A lot of the neighborhood’s oxygen has gone to Claude 4. It’s undeniably fast, and its 200k context window feels luxurious. But, in day-to-day coding, I, together with many Reddit and Discord posters, hold tripping over Claude’s motion bias: It fortunately invents stubbed capabilities as an alternative of actual implementations, fakes unit exams, or rewrites mocks that have been instructed to go away alone. The velocity is nice; the follow-through usually isn’t.

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Most Popular

Recent Comments