I was asked to help elaborate on this Twitter post, and I thought I'd share my attempt to contextualize Erik's idea for a larger audience.
A theory I have about product development is that you can totally do gradient descent all the time and build a really good product! The main reason teams that get stuck in a “local maxima” is that they believe that every change has to be a Pareto improvement.
— Erik Bernhardsson (@bernhardsson) November 1, 2022
So, imagine X is time and Y is good/bad. You're going along getting feedback and things are going great. Then they start to go really bad. Maybe you found some feature people like, and the other features you tested didn't excite customers.

It's very easy to think: I found the best thing, let's focus on this one thing. Then I measure, and stuff gets bad.
But if you do that, you're missing all the other ups and downs, and the general trend upwards.

In math terms, there's a concept called "overfitting." It comes up in calculus, when you're trying to fit an approximate curve to measurements, and in statistics, when you're trying to predict something (and therefore in AI all the time). Overfitting is the tendency to replicate the blue line too closely, chasing every individual peak and valley. The related trap is getting stuck at one of those peaks or valleys (a local maximum or minimum): the system either decides that spot is the best spot or stops considering the rest of the data.
What you really want to create is something more generalized so it captures that whole dotted line.
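To make that concrete, here's a toy sketch (the noisy "good/bad over time" data is invented purely for illustration): fit the same measurements once with a wiggly high-degree curve and once with a simple trend line. The wiggly fit scores better on the data it has already seen, which is exactly the trap.

```python
# A minimal overfitting sketch on made-up data: the high-degree fit is the
# "replicate the blue line too closely" failure, the straight line is the
# generalized "dotted line" that captures the overall direction.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 40)                                # X is time
y = 0.5 * x + np.sin(2 * x) + rng.normal(0, 0.5, x.size)  # Y is good/bad, with noise

wiggly = np.polyfit(x, y, deg=7)   # chases every peak and valley
trend = np.polyfit(x, y, deg=1)    # just the general upward trend

wiggly_err = np.abs(np.polyval(wiggly, x) - y).mean()
trend_err = np.abs(np.polyval(trend, x) - y).mean()
print(f"wiggly fit error on its own data: {wiggly_err:.2f} (looks great, generalizes badly)")
print(f"trend fit error on its own data:  {trend_err:.2f} (rougher, but it's the real direction)")
```

The wiggly fit "wins" on the measurements it was given; ask either fit what next month looks like, and the trend line is the one you want.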
There are lots of techniques to do this. I'm a bit fuzzy on the math of gradient descent specifically, but usually what they mean is that they artificially apply some small amount of effort in the opposite direction, based on the speed and direction of change (the vector, or the acceleration amount). If the change in direction is subtle, they follow the curve more precisely; if it's very sharp, they apply more effort in the opposite direction to smooth the curve.
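What's being described there is close to what ML people call gradient descent with momentum, so here's a hedged sketch of that idea on a toy one-dimensional curve I made up: the velocity term keeps some of the previous step's motion, which is what smooths out sharp changes in direction.

```python
# A sketch of gradient descent with momentum on an invented bumpy valley.
# The velocity term carries over most of the last step's motion, so small
# bumps and sharp wiggles get smoothed over instead of trapping the search.
import math

def f(x):
    return x ** 2 + 2.0 * math.sin(3.0 * x)   # one big valley with small bumps in it

def slope(x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2.0 * h)  # numerical estimate of the gradient

x, velocity = 4.0, 0.0
learning_rate, momentum = 0.05, 0.9  # step size, and how much past motion to keep

for _ in range(300):
    velocity = momentum * velocity - learning_rate * slope(x)
    x += velocity  # roll downhill, carried partly by accumulated speed

print(f"settled near x = {x:.3f}, f(x) = {f(x):.3f}")
```

With momentum at 0.9 the search tends to roll through the small bumps; set it to 0.0 and you get plain gradient descent, which is much more likely to park in the first dip it finds.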

Of course, this can go too far. That's why there are techniques like gradient descent: to find the appropriate amount of… for lack of better terms… faith or skepticism.
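In gradient descent, that dial has a name: the learning rate. Here's a tiny made-up example (the function and every number are just for illustration): too small a rate and you barely move (all skepticism), too large and you overshoot and bounce around harder every step (all faith).

```python
# Gradient descent on the toy function f(x) = x**2, whose slope at x is 2*x.
# The learning rate is the "how much faith" dial: each step moves against
# the slope by learning_rate * slope.
def descend(learning_rate, steps=50):
    x = 5.0
    for _ in range(steps):
        x -= learning_rate * 2 * x
    return x

for lr in (0.001, 0.1, 1.1):
    print(f"learning rate {lr:>5}: ends at x = {descend(lr):.4g}")
# 0.001 barely moves from 5.0, 0.1 settles near the minimum at 0,
# and 1.1 overshoots further every step and blows up.
```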

I would say it's basically the ideas of "be skeptical," "keep an open mind," and "always be vigilant" combined and backed by math, and it helps with finding the appropriate amounts of skepticism, vigilance, and faith for the situation.
Just because one customer is super passionate doesn't mean you should bet the whole company on it. Just because no one read your article the first time you posted it on Hacker News doesn't mean there's no interest. And just because you hit it big once doesn't mean you have all the answers and can replicate it.
(I'm specifically thinking about a particular person at Meta who thinks that because he won the lottery at 23, he has the winning formula to own the entire Metaverse, and who has lost 71% of the stock's value as a result of that kind of thinking.)
Gotta keep measuring the whole graph over time and constantly be learning and adapting.
In general, I think my spaceship metaphor helps me avoid some of these pitfalls without needing to do the math day-to-day, or when calculating the effort and duration of the next push before a new pivot. But this kind of thinking is critical when you're reviewing data and need to find the direction to pivot, or to decide whether a pivot is needed at all.
I originally picked up these concepts from an ML and data scientist (and organic farmer) friend of mine, Tyler Morgan, and asked him how far off my Jessified take on this was from reality.
Rather than edits, I got these very interesting additions:


I was floored when he said momentum is used to overcome plateaus (it's a key part of my spaceship metaphor), and the idea of using random points to pivot around sounds very much like how I suggest using randomness as a tool to run experiments to search for a new life.
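If I understood him right, "random points to pivot around" is in the family of random-restart search. Here's a rough sketch of that idea (the bumpy score function and all the numbers are invented): climb from several random starting points and keep the best result, so no single local peak gets to decide everything.

```python
# A random-restart hill climb over an invented bumpy landscape: many local
# peaks, one genuinely high point near x = 3. Each restart climbs greedily
# from a random spot; keeping the best of all restarts avoids betting
# everything on whichever peak happens to be nearest.
import math
import random

def score(x):
    return -(x - 3.0) ** 2 + 2.0 * math.sin(5.0 * x)   # bumpy, highest near x = 3

def hill_climb(x, steps=500, step_size=0.02):
    for _ in range(steps):
        candidate = x + random.choice([-step_size, step_size])
        if score(candidate) > score(x):   # greedy: only take moves that look better
            x = candidate
    return x

random.seed(0)
starts = [random.uniform(-10.0, 10.0) for _ in range(8)]   # the random pivot points
best = max((hill_climb(s) for s in starts), key=score)
print(f"best restart ended at x = {best:.2f} with score {score(best):.2f}")
```

Any single start usually parks on whatever little bump is nearest; eight cheap experiments together give a much better read on where the real peak is.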
In particular, I'm interested in defining a strong signal so that you can be more confident in making bold moves.
I like bold moves. In the past I have benefitted from bold-move adventures, but since my return to the game I have found it difficult to identify ones worth taking.