When OpenAI’s new reasoning model o3 came out, Every’s CEO Dan Shipper and OpenAI’s Sam Altman agreed that AI is changing the future of learning: If you aren’t using it to learn every day, they said, you’re “not going to make it.”
OK, I thought, I’ve got a challenge for o3: Make me physically stronger. Ten times stronger, in fact.
It’s been a life goal of mine to improve my chin-ups. I started 2024 unable to do even one, and months of working out alone got me nowhere. It wasn’t until I started working with a calisthenics trainer, Silvia, that I finally, after half a dozen focused sessions, got my first shaky repetition.
Now I want to do ten.
What better way to test AI’s capacity for teaching people in the real world than to ask it to help me achieve a goal I’ve never even come close to?
The more I thought about it, the more I liked this plan. I’d pit GPT-4o against o3 and see which model gave me a better chance of progressing from one to 10 unassisted chin-ups. I wanted to know which one would be a better teacher: 4o, the fast and reliable model I’ve been using as my daily driver, or o3, the more advanced reasoning model. Would either be up to the task? Would one emerge victorious? Let’s find out.
What I’m going to judge GPT-4o and o3 on
I used GPT-4o, OpenAI’s older standard model, and o3 separately to generate a training plan from each. I created a rubric against which to evaluate the models, based on what I think matters when you’re trying to learn something in the real world: quick feedback so you don’t make the same mistake over and over again, advice that’s tailored to your specific situation, incremental progress, and the motivation to keep going.
- Responsiveness: How quickly do I get feedback?
- Personalization: Is the advice tailored to me?
- Progress: Does it help me get closer to my goal?
- Motivation: How excited am I to keep showing up and putting in the work?
To judge the LLMs’ training plans, I also needed to define what “good” looks like. I trust my trainer, and she’s already delivered real results—so her guidance and the techniques she uses with me will serve as my baseline, the standard against which I’ll measure everything else.
GPT-4o’s training plan
Alright, first up: GPT-4o. Language models are only as good as the context you give them, so I made sure to be specific. In my prompt, I included my age, height, weight, the number of chin-ups I can currently do, available equipment, and training schedule. I also attached videos of me doing one unassisted and one assisted chin-up.
What worked
It set an achievable target
4o started by telling me that going from one to 10 unassisted chin-ups in just one month was an unrealistic goal (I did this exercise a few days before OpenAI pushed the infamous update that made 4o disingenuously agreeable toward users). This tracks with what Silvia told me, and just like her, 4o gave me an interim goal of four to six chin-ups to keep me motivated. OpenAI’s flagship model is off to a great start.
It picked up on key details in the video
The video I uploaded records my clenched face and pursed lips in the struggle to get my chin the last few inches over the bar. If you saw it, you’d definitely notice how hard I was trying—and to my surprise, GPT-4o did too. It said that I was getting over the bar “with solid effort” and even called out the slight tension in my shoulders before I started the chin-up (ideal form would have me doing a chin-up from a dead hang; in other words, no tension in my shoulders). Props to the model for pulling out such granular detail, comparable to the advice of a personal trainer.
It structured the training well
The model split my training into two parts: strength and control on one day, volume and endurance on the other. Every week in the plan followed this structure, which lined up closely with how Silvia designs my workouts—a good sign.
What didn’t work