Geodesics: Part 1: Intuition
1. Intuition
Part 1 develops the intuition for geodesics, following the approved Chapter 25 table of contents. The treatment is geometry-first and AI-facing.
1.1 Straightest vs shortest paths
Straightest vs shortest paths belongs to the canonical scope of Geodesics. The goal is to make curved-space reasoning concrete enough for ML practice without turning the section into a pure topology course.
Working scope for this subsection: curves, velocities, geodesic equation, exponential and logarithm maps, Christoffel symbols, sphere geodesics, parallel transport, and geodesic convexity preview. The recurring pattern is localize, linearize, measure, move, and return to the manifold.
Operational definition.
A geodesic is a curve whose acceleration vanishes under the connection; locally, it is the curved-space analogue of a straight line.
Worked reading.
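On a unit sphere, geodesics are great circles. The spherical interpolation formula stays on the sphere, while linear interpolation cuts through the ambient ball. Straightest and shortest can differ: the long way around a great circle is still a geodesic, but only the shorter arc is the shortest path between its endpoints.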
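A minimal sketch of this contrast, written in plain Python to stay dependency-free. The helper names `lerp` and `slerp` and the two example points are illustrative choices, not taken from the companion notebook.

```python
import math

def norm(x):
    return math.sqrt(sum(c * c for c in x))

def lerp(p, q, t):
    """Ambient straight-line interpolation; generally leaves the sphere."""
    return [(1.0 - t) * a + t * b for a, b in zip(p, q)]

def slerp(p, q, t):
    """Spherical interpolation between unit vectors p and q (assumed not antipodal)."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    omega = math.acos(dot)            # angle between p and q
    if omega < 1e-12:                 # nearly identical points: lerp is accurate enough
        return lerp(p, q, t)
    s = math.sin(omega)
    w_p = math.sin((1.0 - t) * omega) / s
    w_q = math.sin(t * omega) / s
    return [w_p * a + w_q * b for a, b in zip(p, q)]

p = [1.0, 0.0, 0.0]
q = [0.0, 1.0, 0.0]
mid_lerp = lerp(p, q, 0.5)
mid_slerp = slerp(p, q, 0.5)
print(norm(mid_lerp))    # about 0.7071: inside the ball
print(norm(mid_slerp))   # about 1.0: still on the sphere
```

The linear midpoint has norm below one, so it would need renormalizing before being treated as an embedding again; the spherical midpoint needs no repair.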
| Geometric object | Meaning | AI interpretation |
|---|---|---|
| Manifold | Curved space with local coordinates | Data manifold, latent space, constraint set, parameter space |
| Chart | Local coordinate map | Local representation or embedding coordinates |
| Tangent space | Linearized directions at a point | Local perturbations, gradients, velocities |
| Metric | Inner product on each tangent space | Geometry-aware length, angle, steepest descent |
| Geodesic | Straightest curved-space path | Latent interpolation, shortest motion, curved optimization path |
| Retraction | Practical map from a tangent step back to the manifold | Efficient constrained update in training loops |
Three examples of straightest vs shortest paths:
- Great-circle path between normalized embeddings.
- Hyperbolic path through hierarchy embeddings.
- Exponential-map step from a tangent vector.
Two non-examples clarify the boundary:
- Ambient straight-line interpolation between two sphere points.
- A shortest path on a discrete graph that is mislabeled as a smooth geodesic.
Proof or verification habit for straightest vs shortest paths:
Check the geodesic equation or use known symmetry of the manifold to characterize the path.
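For a sphere embedded in Euclidean space with the induced metric, the geodesic equation is equivalent to the ambient acceleration having no tangential component. The sketch below tests that numerically with central differences; the function names and the two test curves are hypothetical, chosen only for illustration.

```python
import math

def tangential_acceleration(curve, t, h=1e-4):
    """Norm of the tangential part of the ambient acceleration of a unit-sphere curve."""
    x0, x1, x2 = curve(t - h), curve(t), curve(t + h)
    acc = [(a - 2.0 * b + c) / (h * h) for a, b, c in zip(x0, x1, x2)]
    radial = sum(a * b for a, b in zip(acc, x1))       # component along the normal x1
    tang = [a - radial * b for a, b in zip(acc, x1)]   # what remains is tangential
    return math.sqrt(sum(c * c for c in tang))

def great_circle(t):
    return [math.cos(t), math.sin(t), 0.0]

def latitude_circle(t, z0=0.5):
    r = math.sqrt(1.0 - z0 * z0)
    return [r * math.cos(t), r * math.sin(t), z0]

great = tangential_acceleration(great_circle, 0.3)
lat = tangential_acceleration(latitude_circle, 0.3)
print(great)   # near zero: the great circle passes the check
print(lat)     # clearly nonzero: this latitude circle is not a geodesic
```

Both curves stay on the sphere, so staying on the manifold is not enough; only the great circle has vanishing tangential acceleration.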
- global object -> curved manifold or constraint set
- local object -> chart, tangent space, or coordinate patch
- linear operation -> derivative, gradient, velocity, Hessian approximation
- geometric measure -> metric, length, distance, curvature
- algorithmic move -> tangent step followed by geodesic or retraction
In AI systems, the distinction between straightest and shortest paths matters because learned representations and constrained parameter spaces are rarely globally flat. A local linear approximation may be useful, but it must be attached to the point where it is valid.
Geodesics make latent interpolation, representation distances, and motion planning respect the actual geometry.
Mini derivation lens.
- Choose a point on the manifold and name the local representation used near that point.
- Move the question into a chart, tangent space, or embedded constraint where first-order calculus is available.
- Compute the local object: derivative, tangent projection, metric-weighted gradient, path velocity, or retraction step.
- Translate the result back into coordinate-free language so the answer is not tied to one chart by accident.
- Check the invariant: the point remains on the manifold, the direction remains in the tangent space at that point, or the distance/gradient uses the stated metric.
Implementation lens.
A practical ML implementation should store both the ambient array representation and the geometric contract attached to it. For example, a normalized embedding is not just a vector; it is a point on a sphere. An orthogonal weight matrix is not just a matrix; it is a point on a Stiefel-type constraint. A covariance matrix is not just a symmetric array; it must stay positive definite.
The clean computational pattern is: encode the state, compute an ambient derivative if needed, convert it into a tangent or metric-aware object, take a small local step, and then return to the manifold with a geodesic formula or retraction. This is the same pattern used in the companion notebooks, just scaled down to visible two- and three-dimensional examples.
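The pattern can be sketched on the unit sphere. This is an illustrative loop under stated assumptions, not the notebook's code: the objective `f(x) = -<target, x>`, the learning rate, and all helper names are hypothetical, and the return step uses the normalization retraction rather than an exact geodesic.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project_to_tangent(x, g):
    """Remove the component of g along x, leaving a tangent vector at x."""
    radial = dot(g, x)
    return [gi - radial * xi for gi, xi in zip(g, x)]

def retract(x, step):
    """Normalization retraction: move in the ambient space, then rescale to unit length."""
    y = [xi + si for xi, si in zip(x, step)]
    n = math.sqrt(dot(y, y))
    return [yi / n for yi in y]

target = [0.0, 0.0, 1.0]   # hypothetical direction to align with
x = [1.0, 0.0, 0.0]        # encoded state: a point on the unit sphere
lr = 0.5
for _ in range(50):
    ambient_grad = [-t for t in target]             # ambient gradient of f(x) = -<target, x>
    riem_grad = project_to_tangent(x, ambient_grad) # convert to a tangent object
    x = retract(x, [-lr * g for g in riem_grad])    # local step, then return to the sphere

print(dot(x, x))        # stays at 1 up to rounding
print(dot(x, target))   # approaches 1 as x aligns with target
```

Skipping the projection or the retraction would still produce arrays of the right shape, which is why the norm check is worth keeping in tests.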
The important warning is that coordinate code can pass shape checks while still violating geometry. Differential geometry adds checks that are semantic: tangentness, smooth compatibility, metric choice, path validity, and constraint preservation.
Practical checklist:
- State the manifold and whether it is abstract, embedded, or quotient-like.
- State the local coordinates or tangent representation being used.
- Separate ambient vectors from tangent vectors.
- Name the metric before computing distances, angles, or gradients.
- Use geodesics or retractions when moving on the manifold.
- For ML claims, identify whether geometry is data geometry, parameter geometry, or statistical geometry.
Local diagnostic: Verify the path stays on the manifold and has the right initial velocity.
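Both halves of this diagnostic can be automated. The sketch below uses the closed-form exponential map of the unit sphere; the base point, the tangent vector, and the sampled times are arbitrary illustrative values.

```python
import math

def norm(x):
    return math.sqrt(sum(c * c for c in x))

def sphere_exp(p, v, t=1.0):
    """Point at time t on the unit-sphere geodesic leaving p with tangent velocity v."""
    speed = norm(v)
    if speed == 0.0:
        return list(p)
    c, s = math.cos(t * speed), math.sin(t * speed)
    return [c * a + s * b / speed for a, b in zip(p, v)]

p = [0.0, 0.0, 1.0]
v = [0.3, -0.4, 0.0]   # tangent at p because <p, v> = 0

# Diagnostic 1: the path stays on the sphere.
on_sphere_error = max(abs(norm(sphere_exp(p, v, t)) - 1.0) for t in (0.0, 0.5, 1.0, 2.0))

# Diagnostic 2: a central difference at t = 0 recovers the initial velocity v.
h = 1e-6
fd_velocity = [(a - b) / (2.0 * h)
               for a, b in zip(sphere_exp(p, v, h), sphere_exp(p, v, -h))]
velocity_error = max(abs(a - b) for a, b in zip(fd_velocity, v))

print(on_sphere_error, velocity_error)   # both should be tiny
```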
The companion notebook uses low-dimensional synthetic examples: circles, spheres, tangent projections, spherical interpolation, SPD matrices, and orthogonality constraints. These examples keep geometry visible while preserving the same update logic used in higher-dimensional ML systems.
| Compact ML phrase | Differential-geometric reading |
|---|---|
| local linearization | tangent-space approximation at a point |
| normalized embedding | point on a sphere with tangent constraints |
| natural gradient | Riemannian gradient under Fisher metric |
| orthogonal weights | point on a Stiefel-type manifold |
| latent interpolation | path that may need geodesic structure |
| covariance geometry | SPD manifold rather than arbitrary matrices |
A useful learning move is to compute everything first on a sphere. The sphere has visible curvature, simple tangent spaces, closed-form geodesics, and practical retractions. Once those are clear, Stiefel, Grassmann, SPD, and information-geometric examples become less mysterious.
For implementation, the main discipline is to avoid leaving the manifold silently. If a gradient step violates a constraint, either project the gradient into the tangent space before stepping or use a method whose update is intrinsic by design.
The final question for this subsection is whether a Euclidean formula is being used as an approximation, a coordinate expression, or a mistaken replacement for geometry. Differential geometry is the habit of telling those cases apart.
1.2 Great circles on spheres
Great circles on spheres belongs to the canonical scope of Geodesics. The goal is to make curved-space reasoning concrete enough for ML practice without turning the section into a pure topology course.
Operational definition.
A geodesic is a curve whose acceleration vanishes under the connection; locally, it is the curved-space analogue of a straight line.
Worked reading.
On a unit sphere, geodesics are great circles. The spherical interpolation formula stays on the sphere while linear interpolation cuts through the ambient ball.
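Because the geodesic is an arc of a great circle, the geodesic distance between two unit vectors is the angle between them, while the ambient distance is the chord. A small sketch, with illustrative function names and example points:

```python
import math

def great_circle_distance(p, q):
    """Arc length between unit vectors p and q along the connecting great circle."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))  # clamp against rounding
    return math.acos(dot)

def chord_distance(p, q):
    """Straight-line distance through the ambient ball."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

p = [1.0, 0.0, 0.0]
q = [0.0, 1.0, 0.0]
arc = great_circle_distance(p, q)
chord = chord_distance(p, q)
print(arc, chord)   # the arc is longer than the chord
```

On the unit sphere the two are related by chord = 2 sin(arc / 2), so they agree to first order for nearby points and diverge as the points separate. For very small angles `acos` loses precision, so a production version may prefer an `atan2`-based formula.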
Three examples of great circles on spheres:
- Great-circle path between normalized embeddings.
- Hyperbolic path through hierarchy embeddings.
- Exponential-map step from a tangent vector.
Two non-examples clarify the boundary:
- Ambient straight-line interpolation between two sphere points.
- A shortest path on a discrete graph that is mislabeled as a smooth geodesic.
Proof or verification habit for great circles on spheres:
Check the geodesic equation or use known symmetry of the manifold to characterize the path.
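The symmetry argument can be backed by a short direct check, sketched here for the unit sphere with the metric induced from the ambient space. The symbols $p$, $v$, and $\gamma$ are notation introduced for this sketch.

$$
\gamma(t) = \cos(t)\,p + \sin(t)\,v, \qquad \|p\| = \|v\| = 1, \quad \langle p, v \rangle = 0
$$

$$
\ddot{\gamma}(t) = -\gamma(t), \qquad \bigl(I - \gamma\gamma^{\top}\bigr)\,\ddot{\gamma} = -\gamma + \gamma\,\langle \gamma, \gamma \rangle = 0
$$

The ambient acceleration points along the sphere's normal, so its tangential part, which is the covariant acceleration for the induced metric, vanishes. The great circle therefore satisfies the geodesic equation.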
In AI systems, great circles on spheres matter because learned representations and constrained parameter spaces are rarely globally flat. A local linear approximation may be useful, but it must be attached to the point where it is valid.
Geodesics make latent interpolation, representation distances, and motion planning respect the actual geometry.
1.3 Why interpolation in latent space may be curved
Why interpolation in latent space may be curved belongs to the canonical scope of Geodesics. The goal is to make curved-space reasoning concrete enough for ML practice without turning the section into a pure topology course.
Operational definition.
The manifold hypothesis says high-dimensional observations often concentrate near a lower-dimensional structure.
Worked reading.
Images may live in pixel space, but small semantic changes such as pose or lighting often vary along far fewer directions than the number of pixels.
Three examples of why interpolation in latent space may be curved:
- Autoencoder latent spaces.
- Embedding neighborhoods with low local rank.
- Diffusion trajectories following learned score geometry.
Two non-examples clarify the boundary:
- Uniform noise in every ambient direction.
- A dataset whose classes occupy disconnected structures but are forced into one manifold.
Proof or verification habit for why interpolation in latent space may be curved:
Evidence is empirical, not theorem-level: estimate local dimension, reconstruction error, neighborhood stability, and tangent consistency.
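One of these estimates, local dimension, can be sketched with a tiny local PCA. The example is deliberately restricted to planar points so the covariance eigenvalues have a closed form; the threshold `rel_tol` and the two synthetic neighborhoods are hypothetical choices, and real data would need a higher-dimensional eigendecomposition and a noise-aware threshold.

```python
import math

def covariance_eigenvalues(points):
    """Eigenvalues (largest first) of the 2x2 sample covariance of planar points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    a = sum((x - mx) ** 2 for x, _ in points) / n
    c = sum((y - my) ** 2 for _, y in points) / n
    b = sum((x - mx) * (y - my) for x, y in points) / n
    disc = math.sqrt((a - c) ** 2 + 4.0 * b * b)
    return (a + c + disc) / 2.0, (a + c - disc) / 2.0

def local_dimension(points, rel_tol=0.01):
    """Count directions whose variance is a non-negligible fraction of the largest."""
    big, small = covariance_eigenvalues(points)
    return 2 if small > rel_tol * big else 1

# Neighborhood on a circle: locally one-dimensional despite living in the plane.
arc_patch = [(math.cos(0.7 + k * 0.01), math.sin(0.7 + k * 0.01)) for k in range(-5, 6)]
# Neighborhood filling a small square: genuinely two-dimensional.
grid_patch = [(i * 0.01, j * 0.01) for i in range(-2, 3) for j in range(-2, 3)]

dim_arc = local_dimension(arc_patch)
dim_grid = local_dimension(grid_patch)
print(dim_arc, dim_grid)
```

The arc's small eigenvalue is not exactly zero because the circle curves away from its tangent line; shrinking the neighborhood shrinks that residual.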
In AI systems, the possible curvature of latent interpolation matters because learned representations and constrained parameter spaces are rarely globally flat. A local linear approximation may be useful, but it must be attached to the point where it is valid.
This hypothesis motivates representation learning, dimensionality reduction, and geometry-aware generative modeling.
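A toy example shows why a straight line in latent coordinates need not be the shortest path once lengths are measured after decoding. The decoder here is a hypothetical stand-in (polar coordinates mapped to the plane), not a trained model; it is chosen because the true shortest path in data space is known.

```python
import math

def decode(z):
    """Hypothetical toy decoder: polar latent coordinates (r, theta) -> plane."""
    r, theta = z
    return (r * math.cos(theta), r * math.sin(theta))

def decoded_length(latent_path, n=1000):
    """Polyline length in data space of a latent path defined for t in [0, 1]."""
    pts = [decode(latent_path(i / n)) for i in range(n + 1)]
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

def straight_latent(t):
    """Straight line in latent coordinates from (1, 0) to (1, pi/2)."""
    return (1.0, t * math.pi / 2.0)

def curved_latent(t):
    """Latent preimage of the straight data-space segment between the same endpoints."""
    x, y = 1.0 - t, t
    return (math.hypot(x, y), math.atan2(y, x))

straight_len = decoded_length(straight_latent)
curved_len = decoded_length(curved_latent)
print(straight_len)   # about pi/2: the decoded path is a circular arc
print(curved_len)     # about sqrt(2): shorter, although curved in latent coordinates
```

The path that bends in latent coordinates is the shorter one after decoding. With a learned decoder the same comparison would use the decoder's Jacobian or decoded samples, and the shortest path would usually have to be found numerically.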
Local diagnostic: Ask whether the data are on, near, or only metaphorically described by a manifold.
1.4 Energy minimization and path length
Energy minimization and path length belongs to the canonical scope of Geodesics. The goal is to make curved-space reasoning concrete enough for ML practice without turning the section into a pure topology course.
Operational definition.
A Riemannian metric assigns an inner product to every tangent space smoothly.
Worked reading.
If the metric has coordinate matrix $G(x)$, then the length of a velocity $v$ at $x$ is $\sqrt{v^\top G(x)\, v}$.
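Integrating the length of the velocity along a curve gives the two functionals named in this subsection's title. The notation below ($g$ for the metric, $\gamma : [a, b] \to M$ for the curve) is introduced for this sketch.

$$
L(\gamma) = \int_a^b \sqrt{g_{\gamma(t)}\bigl(\dot{\gamma}(t), \dot{\gamma}(t)\bigr)}\;dt,
\qquad
E(\gamma) = \frac{1}{2}\int_a^b g_{\gamma(t)}\bigl(\dot{\gamma}(t), \dot{\gamma}(t)\bigr)\,dt
$$

$$
L(\gamma)^2 \;\le\; 2\,(b - a)\,E(\gamma)
$$

The inequality is Cauchy-Schwarz applied to the speed and the constant function 1, with equality exactly when the speed is constant. Length ignores how a path is parameterized, while energy penalizes uneven speed, so minimizing energy selects shortest paths traversed at constant speed.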
Three examples of energy minimization and path length:
- Euclidean metric on a sphere inherited from ambient space.
- Fisher metric on statistical models.
- Affine-invariant metric on SPD matrices.
Two non-examples clarify the boundary:
- A distance formula with no tangent-space inner product.
- A fixed Euclidean metric used after nonlinear reparameterization without checking geometry.
Proof or verification habit for energy minimization and path length:
Check symmetry, bilinearity, positive definiteness, and smooth variation with the base point.
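For a metric given as a coordinate matrix, symmetry and positive definiteness can be checked mechanically; bilinearity is automatic for a matrix, and smoothness has to be argued from the formula. The sketch below uses a hand-written Cholesky test and the round-sphere metric in polar coordinates as the example; the function names and tolerance are illustrative assumptions.

```python
import math

def is_spd(G, tol=1e-12):
    """True if G is symmetric and a Cholesky factorization succeeds with positive pivots."""
    n = len(G)
    if any(abs(G[i][j] - G[j][i]) > tol for i in range(n) for j in range(i)):
        return False
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = G[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:          # non-positive pivot: not positive definite
                    return False
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

def sphere_metric(theta):
    """Round-sphere metric in (polar angle theta, azimuth phi) coordinates."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def metric_norm(G, v):
    """Length of the velocity v under the metric matrix G."""
    n = len(v)
    return math.sqrt(sum(v[i] * G[i][j] * v[j] for i in range(n) for j in range(n)))

print(is_spd(sphere_metric(math.pi / 2)))                   # valid at the equator
print(is_spd(sphere_metric(0.0)))                           # fails at the pole
print(metric_norm(sphere_metric(math.pi / 6), [0.0, 1.0]))  # about 0.5
```

The failure at the pole reflects a breakdown of the coordinate chart rather than of the sphere's geometry, which is one reason to state which chart a metric matrix belongs to.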
In AI systems, energy minimization and path length matter because learned representations and constrained parameter spaces are rarely globally flat. A local linear approximation may be useful, but it must be attached to the point where it is valid.
The metric determines what steepest descent, distance, and regularization mean for a representation or parameter space.
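The difference between length and energy shows up when the same path is traversed at different speeds. The sketch below discretizes a quarter circle with chords, using the metric induced from the plane; the two parameterizations and the helper name are illustrative choices.

```python
import math

def discrete_length_and_energy(curve, n=2000):
    """Approximate length and energy of a curve on [0, 1] using n chords."""
    dt = 1.0 / n
    pts = [curve(i * dt) for i in range(n + 1)]
    chords = [math.dist(a, b) for a, b in zip(pts, pts[1:])]
    length = sum(chords)
    energy = 0.5 * sum(c * c for c in chords) / dt
    return length, energy

def constant_speed(t):
    a = 0.5 * math.pi * t
    return (math.cos(a), math.sin(a))

def uneven_speed(t):
    a = 0.5 * math.pi * t * t      # same quarter circle, slow start and fast finish
    return (math.cos(a), math.sin(a))

len_const, en_const = discrete_length_and_energy(constant_speed)
len_uneven, en_uneven = discrete_length_and_energy(uneven_speed)
print(len_const, len_uneven)   # both about pi/2
print(en_const, en_uneven)     # about pi^2/8 versus pi^2/6
```

Length is unchanged by the reparameterization, while energy is smallest for the constant-speed version. This is why optimizing energy is often the more convenient numerical route to geodesics.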
Local diagnostic: State the metric before computing lengths or gradients.
1.5 Geodesics as geometry-aware motion
Geodesics as geometry-aware motion belongs to the canonical scope of Geodesics. The goal is to make curved-space reasoning concrete enough for ML practice without turning the section into a pure topology course.
Operational definition.
A geodesic is a curve whose acceleration vanishes under the connection; locally, it is the curved-space analogue of a straight line.
Worked reading.
On a unit sphere, geodesics are great circles. The spherical interpolation formula stays on the sphere while linear interpolation cuts through the ambient ball.
Three examples of geodesics as geometry-aware motion:
- Great-circle path between normalized embeddings.
- Hyperbolic path through hierarchy embeddings.
- Exponential-map step from a tangent vector.
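The hyperbolic example can be made concrete in the Poincaré ball model. Choosing that model is an assumption, since the text does not name one; the distance formula itself is the standard closed form for it, and the sample points are arbitrary.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance in the Poincare ball model (points must have norm below 1)."""
    nu = sum(a * a for a in u)
    nv = sum(b * b for b in v)
    if nu >= 1.0 or nv >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * diff / ((1.0 - nu) * (1.0 - nv)))

d_center = poincare_distance([0.0, 0.0], [0.5, 0.0])
d_rim = poincare_distance([0.9, 0.0], [-0.9, 0.0])
print(d_center)   # about 1.0986, although the Euclidean gap is only 0.5
print(d_rim)      # about 5.8889, although the Euclidean gap is only 1.8
```

Distances grow rapidly toward the boundary, which is what leaves room for tree-like hierarchies: deep nodes can sit near the rim and still be far from one another.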
Two non-examples clarify the boundary:
- Ambient straight-line interpolation between two sphere points.
- A shortest path on a discrete graph that is mislabeled as a smooth geodesic.
Proof or verification habit for geodesics as geometry-aware motion:
Check the geodesic equation or use known symmetry of the manifold to characterize the path.
In AI systems, treating geodesics as geometry-aware motion matters because learned representations and constrained parameter spaces are rarely globally flat. A local linear approximation may be useful, but it must be attached to the point where it is valid.
Geodesics make latent interpolation, representation distances, and motion planning respect the actual geometry.
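In practice an exact geodesic step is often replaced by a cheaper retraction. On the unit sphere the two can be compared directly; the sketch below measures the gap between the exponential map and the normalization retraction for two step sizes. The function names, base point, and step sizes are illustrative.

```python
import math

def sphere_exp(p, v):
    """Exact geodesic step on the unit sphere from p along tangent vector v."""
    speed = math.sqrt(sum(c * c for c in v))
    if speed == 0.0:
        return list(p)
    return [math.cos(speed) * a + math.sin(speed) * b / speed for a, b in zip(p, v)]

def sphere_retract(p, v):
    """Normalization retraction: ambient step followed by rescaling to unit length."""
    y = [a + b for a, b in zip(p, v)]
    n = math.sqrt(sum(c * c for c in y))
    return [c / n for c in y]

def gap(step):
    """Distance between the geodesic step and the retraction for a given step size."""
    p = [0.0, 0.0, 1.0]
    v = [step, 0.0, 0.0]           # tangent at p
    return math.dist(sphere_exp(p, v), sphere_retract(p, v))

gap_large = gap(0.1)
gap_small = gap(0.01)
ratio = gap_large / gap_small
print(gap_large, gap_small, ratio)   # ratio near 1000: the gap shrinks cubically
```

A tenfold smaller step shrinks the gap about a thousandfold, so for small learning rates the retraction tracks the geodesic closely while avoiding trigonometric evaluations.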