Why Nesterov Accelerated Gradient Stays Stable Even With Extra Momentum
For over forty years, researchers observed that Nesterov Accelerated Gradient (NAG) speeds up convergence without compromising algorithmic stability, yet the underlying reason remained elusive. Ernest Ryu’s collaboration with GPT‑5 finally exposed the missing theoretical link.
Technical Solution
The breakthrough emerged from a systematic exploration of candidate proofs generated by GPT‑5, combined with rigorous human validation. By repeatedly prompting the model to restructure the NAG update equations, Ryu identified a latent invariance that preserves the Lyapunov function under momentum augmentation.
1. Equation Restructuring Suggested by GPT‑5
GPT‑5 produced a family of reformulations of the NAG step:
v_{t+1} = \mu v_t - \eta \nabla f(x_t + \mu v_t),
x_{t+1} = x_t + v_{t+1},
where \eta is the step size and \mu is the momentum coefficient. Although the first draft contained algebraic slips, Ryu recognized the pattern that separates the momentum contribution \mu v_t from the point at which the gradient is evaluated.
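The update above can be sketched in a few lines of Python. This is a minimal illustration rather than code from Ryu or GPT‑5: the names nag_step and run_nag, the quadratic test objective f(x) = 0.5 * x**2, and the values eta = 0.1 and mu = 0.5 are all assumptions chosen for the example.

```python
def nag_step(x, v, grad, eta, mu):
    """One NAG step: the gradient is evaluated at the look-ahead point x + mu * v."""
    lookahead = x + mu * v
    v_next = mu * v - eta * grad(lookahead)
    x_next = x + v_next
    return x_next, v_next


def grad_quadratic(x):
    # Gradient of the illustrative objective f(x) = 0.5 * x**2 (an assumption).
    return x


def run_nag(x0, steps, eta=0.1, mu=0.5):
    """Iterate NAG from x0 with zero initial velocity; eta and mu are example values."""
    x, v = x0, 0.0
    for _ in range(steps):
        x, v = nag_step(x, v, grad_quadratic, eta, mu)
    return x, v
```

Calling run_nag(1.0, 200) drives x toward the minimizer at 0 for this particular objective and parameter choice; other objectives or step sizes may behave differently.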
2. Deriving a Lyapunov Candidate
Using the restructured update, Ryu constructed a Lyapunov candidate L_t = f(x_t) + \frac{1}{2}\|v_t\|^2. The model supplied auxiliary inequalities from convex analysis that, after correction, demonstrated L_{t+1} \le L_t for 0<\mu<1, confirming stability.
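A numerical sanity check of the candidate L_t = f(x_t) + \frac{1}{2}\|v_t\|^2 might look like the sketch below. It is an assumption-laden illustration, not part of the proof: the helper names lyapunov_trace and is_nonincreasing, the one-dimensional objective f(x) = 0.5 * x**2, and the parameters eta = 0.1 and mu = 0.5 are hypothetical choices made for the example.

```python
def lyapunov_trace(x0, steps, eta=0.1, mu=0.5):
    """Record L_t = f(x_t) + 0.5 * v_t**2 along a NAG trajectory.

    Uses the illustrative objective f(x) = 0.5 * x**2 (an assumption).
    """
    f = lambda x: 0.5 * x * x
    grad = lambda x: x
    x, v = x0, 0.0
    values = [f(x) + 0.5 * v * v]
    for _ in range(steps):
        v = mu * v - eta * grad(x + mu * v)  # velocity update at the look-ahead point
        x = x + v                            # position update
        values.append(f(x) + 0.5 * v * v)
    return values


def is_nonincreasing(values, tol=1e-12):
    """Return True if no entry exceeds its predecessor by more than tol."""
    return all(b <= a + tol for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    trace = lyapunov_trace(1.0, 50)
    print("nonincreasing along this trajectory:", is_nonincreasing(trace))
```

A check like this probes a single trajectory on a single function, so it can flag a counterexample but cannot stand in for the inequality-based argument.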
3. Verifying Across Problem Classes
Ryu tested the derived bound on smooth and strongly convex functions, employing GPT‑5 to fetch relevant theorems from the literature. Each case confirmed the invariance, reinforcing the universality of the proof.
4. Human‑in‑the‑Loop Validation
Critical to the workflow was Ryu’s habit of starting a fresh chat for each verification step, minimizing error propagation. This practice, highlighted in his own account, ensured that the final proof was free of the model’s occasional hallucinations.
5. Publishing the Result
The final manuscript acknowledges GPT‑5 as an exploratory collaborator while crediting the human author for the rigorous proof. The paper is now under peer review, serving as a template for future AI‑enhanced mathematical research.