Exercises Notebook
Converted from `exercises.ipynb` for web reading.
Limits and Continuity - Exercises
10 graded exercises covering the full section arc, from core calculus mechanics to ML-facing applications.
| Format | Description |
|---|---|
| Problem | Markdown cell with task description |
| Your Solution | Code cell for learner work |
| Solution | Reference solution with checks |
Difficulty: straightforward -> moderate -> challenging.
Code cell 2
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
try:
import seaborn as sns
sns.set_theme(style="whitegrid", palette="colorblind")
HAS_SNS = True
except ImportError:
plt.style.use("seaborn-v0_8-whitegrid")
HAS_SNS = False
mpl.rcParams.update({
"figure.figsize": (10, 6),
"figure.dpi": 120,
"font.size": 13,
"axes.titlesize": 15,
"axes.labelsize": 13,
"xtick.labelsize": 11,
"ytick.labelsize": 11,
"legend.fontsize": 11,
"legend.framealpha": 0.85,
"lines.linewidth": 2.0,
"axes.spines.top": False,
"axes.spines.right": False,
"savefig.bbox": "tight",
"savefig.dpi": 150,
})
np.random.seed(42)
print("Plot setup complete.")
Code cell 3
import numpy as np
import numpy.linalg as la
from scipy import integrate, special, stats
from math import factorial
import matplotlib.patches as patches
COLORS = {
"primary": "#0077BB",
"secondary": "#EE7733",
"tertiary": "#009988",
"error": "#CC3311",
"neutral": "#555555",
"highlight": "#EE3377",
}
HAS_MPL = True
np.set_printoptions(precision=8, suppress=True)
np.random.seed(42)
def header(title):
print("\n" + "=" * len(title))
print(title)
print("=" * len(title))
def check_true(name, cond):
ok = bool(cond)
print(f"{'PASS' if ok else 'FAIL'} - {name}")
return ok
def check_close(name, got, expected, tol=1e-8):
ok = np.allclose(got, expected, atol=tol, rtol=tol)
print(f"{'PASS' if ok else 'FAIL'} - {name}: got {got}, expected {expected}")
return ok
def centered_diff(f, x, h=1e-6):
return (f(x + h) - f(x - h)) / (2 * h)
def forward_diff(f, x, h=1e-6):
return (f(x + h) - f(x)) / h
def backward_diff(f, x, h=1e-6):
return (f(x) - f(x - h)) / h
def grad_check(f, x, analytic_grad, h=1e-6):
x = np.asarray(x, dtype=float)
analytic_grad = np.asarray(analytic_grad, dtype=float)
numeric_grad = np.zeros_like(x, dtype=float)
for idx in np.ndindex(x.shape):
x_plus = x.copy(); x_minus = x.copy()
x_plus[idx] += h; x_minus[idx] -= h
numeric_grad[idx] = (f(x_plus) - f(x_minus)) / (2 * h)
denom = la.norm(analytic_grad) + la.norm(numeric_grad) + 1e-12
return la.norm(analytic_grad - numeric_grad) / denom
def check(name, got, expected, tol=1e-8):
return check_close(name, got, expected, tol=tol)
print("Chapter helper setup complete.")
Exercise 1 (★): Basic Limit Computation
Compute the following limits analytically and verify numerically.
(a) $\lim_{x \to 3} \dfrac{x^2 - 9}{x - 3}$

(b) $\lim_{x \to 0} \dfrac{\sin(5x)}{3x}$

(c) $\lim_{x \to \infty} \dfrac{3x^2 + 2x - 1}{x^2 + 5}$

Hint for (a): Factor the numerator. Hint for (b): Use $\lim_{u \to 0} \frac{\sin u}{u} = 1$. Hint for (c): Divide numerator and denominator by $x^2$.
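As a cross-check before diving into the numerics, all three limits can be verified symbolically. This sketch assumes SymPy is installed; it is not imported elsewhere in the notebook.

```python
# Symbolic cross-check of the three limits (assumes SymPy is available).
import sympy as sp

x = sp.symbols("x")
lim_a = sp.limit((x**2 - 9) / (x - 3), x, 3)                     # factor and cancel -> 6
lim_b = sp.limit(sp.sin(5 * x) / (3 * x), x, 0)                  # fundamental sine limit -> 5/3
lim_c = sp.limit((3 * x**2 + 2 * x - 1) / (x**2 + 5), x, sp.oo)  # leading coefficients -> 3
print(lim_a, lim_b, lim_c)
```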
Code cell 5
# Your Solution
# Exercise 1 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 1.")
Code cell 6
# Solution
# Exercise 1 - reference solution
import numpy as np
# (a) lim_{x->3} (x^2-9)/(x-3) = lim (x-3)(x+3)/(x-3) = lim (x+3) = 6
limit_a = 6.0
# Numerical verification
f_a = lambda x: (x**2 - 9) / (x - 3)
h_vals = [1e-1, 1e-3, 1e-6, 1e-9]
a_numerical = np.mean([(f_a(3+h) + f_a(3-h))/2 for h in h_vals])
# (b) lim_{x->0} sin(5x)/(3x) = (5/3)*lim sin(u)/u = 5/3
limit_b = 5/3
f_b = lambda x: np.sin(5*x)/(3*x) if abs(x) > 1e-15 else 5/3
b_numerical = np.mean([(f_b(h) + f_b(-h))/2 for h in [1e-2, 1e-4, 1e-6]])
# (c) lim_{x->inf} (3x^2+2x-1)/(x^2+5)
# = lim (3 + 2/x - 1/x^2)/(1 + 5/x^2) = 3/1 = 3
limit_c = 3.0
f_c = lambda x: (3*x**2 + 2*x - 1)/(x**2 + 5)
c_numerical = f_c(1e8)
header('Exercise 1: Basic Limit Computation')
print(f'(a) lim (x^2-9)/(x-3) as x->3 = {limit_a}')
check_close('(a) analytic = 6', a_numerical, limit_a, tol=1e-6)
print(f'(b) lim sin(5x)/(3x) as x->0 = {limit_b:.6f}')
check_close('(b) analytic = 5/3', b_numerical, limit_b, tol=1e-6)
print(f'(c) lim (3x^2+2x-1)/(x^2+5) as x->inf = {limit_c}')
check_close('(c) analytic = 3', c_numerical, limit_c, tol=1e-4)
print('\nTakeaway: Three core techniques — factoring, fundamental sin limit, leading coefficient ratio.')
print("Exercise 1 solution complete.")
Exercise 2 (★): One-Sided Limits
Analyze the one-sided limits of the following functions and determine whether the two-sided limit exists.
(a) $f(x) = \dfrac{|x - 2|}{x - 2}$ at $x = 2$

(b) $g(x) = \begin{cases} x^2 + 1 & x < 1 \\ 3x - 1 & x \ge 1 \end{cases}$ at $x = 1$
For each: compute left and right limits, state whether the two-sided limit exists, and classify any discontinuity.
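SymPy's `dir` argument computes one-sided limits directly, which makes a handy independent check (assumes SymPy is available):

```python
# One-sided limits via SymPy's dir argument (assumes SymPy is available).
import sympy as sp

x = sp.symbols("x")
f = sp.Abs(x - 2) / (x - 2)
left = sp.limit(f, x, 2, dir="-")   # -> -1
right = sp.limit(f, x, 2, dir="+")  # -> +1
print(left, right)                  # they differ, so the two-sided limit does not exist
```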
Code cell 8
# Your Solution
# Exercise 2 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 2.")
Code cell 9
# Solution
# Exercise 2 - reference solution
import numpy as np
# (a) f(x) = |x-2|/(x-2)
# For x < 2: |x-2| = -(x-2), so f(x) = -1 => left limit = -1
# For x > 2: |x-2| = (x-2), so f(x) = +1 => right limit = +1
left_a = -1.0
right_a = +1.0
# Numerical
f = lambda x: np.abs(x - 2) / (x - 2)
h = 1e-8
left_a_num = f(2 - h)
right_a_num = f(2 + h)
# (b) Piecewise: g(x) = x^2+1 for x<1, 3x-1 for x>=1
# Left limit (x->1^-): 1^2+1 = 2
# Right limit (x->1^+): 3(1)-1 = 2
# g(1) = 3(1)-1 = 2
left_b = 2.0
right_b = 2.0
g = lambda x: x**2 + 1 if x < 1 else 3*x - 1
left_b_num = g(1 - h)
right_b_num = g(1 + h)
header('Exercise 2: One-Sided Limits')
print('(a) |x-2|/(x-2) at x=2:')
check_close('Left limit = -1', left_a_num, left_a)
check_close('Right limit = +1', right_a_num, right_a)
check_true('Two-sided limit DNE (left ≠ right)', left_a != right_a)
print(' Discontinuity type: JUMP (left=-1, right=+1)')
print()
print('(b) Piecewise function at x=1:')
check_close('Left limit = 2', left_b_num, left_b, tol=1e-6)
check_close('Right limit = 2', right_b_num, right_b, tol=1e-6)
check_true('Two-sided limit exists (left = right = 2)', abs(left_b - right_b) < 1e-12)
check_close('g(1) = 2 (continuous!)', g(1), 2.0)
print(' g is continuous at x=1 — both one-sided limits equal g(1)')
print('\nTakeaway: Two-sided limit exists iff both one-sided limits agree.')
print("Exercise 2 solution complete.")
Exercise 3 (★): L'Hôpital's Rule
Apply L'Hôpital's Rule to resolve the following indeterminate forms. Identify the form before applying the rule.
(a) $\lim_{x \to 0} \dfrac{e^x - 1 - x}{x^2}$  [$\frac{0}{0}$]

(b) $\lim_{x \to \infty} \dfrac{\ln x}{x}$  [$\frac{\infty}{\infty}$]

(c) $\lim_{x \to 0^+} x \ln x$  [$0 \cdot (-\infty)$, convert first]

For (c): Rewrite as $\dfrac{\ln x}{1/x}$ to get the $\frac{\infty}{\infty}$ form.
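Each indeterminate form can also be resolved symbolically; SymPy applies series expansions internally, so it handles all three without manual rewriting (assumes SymPy is available):

```python
# SymPy resolves each indeterminate form directly (assumes SymPy is available).
import sympy as sp

x = sp.symbols("x", positive=True)
print(sp.limit((sp.exp(x) - 1 - x) / x**2, x, 0))  # 1/2
print(sp.limit(sp.log(x) / x, x, sp.oo))           # 0
print(sp.limit(x * sp.log(x), x, 0, dir="+"))      # 0
```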
Code cell 11
# Your Solution
# Exercise 3 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 3.")
Code cell 12
# Solution
# Exercise 3 - reference solution
import numpy as np
# (a) lim_{x->0} (e^x - 1 - x)/x^2
# First application (0/0): (e^x - 1)/(2x) [still 0/0]
# Second application: e^x/2 -> 1/2
limit_a = 0.5
# Verify numerically via Taylor: e^x = 1 + x + x^2/2 + ...
# (e^x-1-x)/x^2 = (x^2/2 + x^3/6 + ...)/x^2 = 1/2 + x/6 + ... -> 1/2
f_a = lambda x: (np.expm1(x) - x) / x**2 # use expm1 for stability
a_vals = [f_a(h) for h in [1e-1, 1e-2, 1e-4, 1e-6]]
# (b) lim_{x->inf} ln(x)/x
# L'Hopital: (1/x)/1 = 1/x -> 0
limit_b = 0.0
b_vals = [np.log(x)/x for x in [1e2, 1e4, 1e6, 1e8]]
# (c) lim_{x->0+} x*ln(x)
# Rewrite: ln(x)/(1/x), form -inf/+inf
# L'Hopital: (1/x)/(-1/x^2) = -x -> 0
limit_c = 0.0
c_vals = [x * np.log(x) for x in [1e-1, 1e-2, 1e-4, 1e-8]]
header("Exercise 3: L'Hopital's Rule")
print('(a) lim (e^x-1-x)/x^2 as x->0:')
print(f' Numerical: {a_vals}')
check_close('Converges to 1/2', a_vals[-1], limit_a, tol=1e-5)
print('(b) lim ln(x)/x as x->inf:')
print(f' Numerical: {b_vals}')
check_close('Converges to 0', b_vals[-1], limit_b, tol=1e-5)
print('(c) lim x*ln(x) as x->0+:')
print(f' Numerical: {c_vals}')
check_close('Converges to 0', c_vals[-1], limit_c, tol=1e-5)
print("\nTakeaway: L'Hopital requires 0/0 or inf/inf form. Convert 0*(-inf) by rewriting.")
print("Exercise 3 solution complete.")
Exercise 4 (★): Continuity Analysis
For each function, determine all points of discontinuity and classify each as removable, jump, or essential (infinite). Where applicable, fix removable discontinuities.
(a) $f(x) = \dfrac{x^2 - 1}{x - 1}$

(b) $g(x) = \dfrac{1}{x^2 - 4}$

(c) $h(x) = \begin{cases} x + 1 & x < 0 \\ 0 & x = 0 \\ x^2 - 1 & x > 0 \end{cases}$
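Candidate discontinuity points of the rational functions can be located symbolically before classifying them; `sympy.singularities` returns the points where the expression is undefined (assumes SymPy is available):

```python
# Locate and classify discontinuities symbolically (assumes SymPy is available).
import sympy as sp

x = sp.symbols("x")
print(sp.singularities(1 / (x**2 - 4), x))   # {-2, 2}: candidates for asymptotes
print(sp.limit((x**2 - 1) / (x - 1), x, 1))  # 2: finite limit => removable at x=1
```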
Code cell 14
# Your Solution
# Exercise 4 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 4.")
Code cell 15
# Solution
# Exercise 4 - reference solution
import numpy as np
h_step = 1e-9
# (a) f(x) = (x^2-1)/(x-1) = (x-1)(x+1)/(x-1)
# Undefined at x=1; limit = lim (x+1) = 2
# Removable discontinuity: fix by setting f(1) = 2
f_a = lambda x: (x**2 - 1) / (x - 1)
lim_at_1 = (f_a(1 + h_step) + f_a(1 - h_step)) / 2
# (b) g(x) = 1/(x^2-4) = 1/((x-2)(x+2))
# Undefined at x=2 and x=-2
# As x->2: denominator->0, numerator->1 => g->+/-inf (essential)
# As x->-2: same
g = lambda x: 1 / (x**2 - 4)
g_right_2 = g(2 + h_step)
g_left_2 = g(2 - h_step)
# (c) h(x): at x=0
# Left limit (x->0^-): x+1 -> 0+1 = 1
# Right limit (x->0^+): x^2-1 -> 0-1 = -1
# h(0) = 0 (defined)
# Left != Right => jump discontinuity
left_c = 1.0
right_c = -1.0
h_at_0 = 0.0
header('Exercise 4: Continuity Analysis')
print('(a) f(x) = (x^2-1)/(x-1):')
check_close('Limit at x=1 = 2 (removable)', lim_at_1, 2.0, tol=1e-6)
print(' Type: REMOVABLE. Fix: set f(1) = 2')
print()
print('(b) g(x) = 1/(x^2-4):')
check_true('g(2+) > 0 and large (positive side)', g_right_2 > 1e6)
check_true('g(2-) < 0 and large (negative side)', g_left_2 < -1e6)
print(' Type at x=±2: ESSENTIAL (infinite). Vertical asymptotes.')
print()
print('(c) Piecewise h(x) at x=0:')
check_close('Left limit = 1', left_c, 1.0)
check_close('Right limit = -1', right_c, -1.0)
check_true('Left ≠ Right => JUMP discontinuity', abs(left_c - right_c) > 1)
print(f' h(0) = {h_at_0} (defined but neither limit equals h(0))')
print('\nTakeaway: Continuity requires all three: f(a) defined, limit exists, they agree.')
print("Exercise 4 solution complete.")
Exercise 5 (★★): Squeeze Theorem and IVT
(a) Prove using the Squeeze Theorem that $\lim_{x \to 0} x^2 \cos(1/x) = 0$.
Verify numerically and plot the squeeze.
(b) Use the IVT to show that $f(x) = x^3 + 2x - 5$ has a root in $[1, 2]$. Find the root numerically using bisection (tolerance $10^{-12}$).
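After writing your own bisection, you can cross-check the root against SciPy's bracketing root finder; `brentq` needs exactly the sign-change bracket that the IVT supplies:

```python
# Cross-check the bisection result with SciPy's bracketing root finder.
# The bracket [1, 2] is valid because f(1) < 0 < f(2) (the IVT sign change).
from scipy.optimize import brentq

f = lambda x: x**3 + 2 * x - 5
root = brentq(f, 1.0, 2.0, xtol=1e-12)
print(root, f(root))  # residual should be ~0
```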
Code cell 17
# Your Solution
# Exercise 5 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 5.")
Code cell 18
# Solution
# Exercise 5 - reference solution
import numpy as np
import matplotlib.pyplot as plt
COLORS = {'primary': '#0077BB', 'secondary': '#EE7733', 'tertiary': '#009988'}
# (a) Squeeze Theorem
# -1 <= cos(1/x) <= 1 for all x != 0
# => -x^2 <= x^2*cos(1/x) <= x^2
# Both -x^2 and x^2 -> 0 as x -> 0
# By Squeeze: x^2*cos(1/x) -> 0
x_vals = np.linspace(-0.5, 0.5, 5000)
x_vals = x_vals[np.abs(x_vals) > 1e-10]
f_squeeze = x_vals**2 * np.cos(1/x_vals)
upper = x_vals**2
lower = -x_vals**2
# Verify squeeze
check_true('Lower bound holds: -x^2 <= x^2*cos(1/x)', np.all(lower <= f_squeeze + 1e-15))
check_true('Upper bound holds: x^2*cos(1/x) <= x^2', np.all(f_squeeze <= upper + 1e-15))
x_small = np.array([1e-1, 1e-2, 1e-4, 1e-6, 1e-8])
f_small = x_small**2 * np.cos(1/x_small)
print(f'x^2*cos(1/x) at small x: {f_small}')
check_close('lim_{x->0} x^2*cos(1/x) = 0', f_small[-1], 0.0, tol=1e-14)
if HAS_MPL:
fig, ax = plt.subplots(figsize=(10, 5))
x_plot = np.linspace(-0.4, 0.4, 2000)
x_plot = x_plot[np.abs(x_plot) > 1e-10]
ax.fill_between(x_plot, -x_plot**2, x_plot**2, alpha=0.2, color=COLORS['secondary'], label='Squeeze bounds')
ax.plot(x_plot, x_plot**2*np.cos(1/x_plot), color=COLORS['primary'], lw=1.5, label=r'$x^2\cos(1/x)$')
ax.plot(x_plot, x_plot**2, color=COLORS['tertiary'], lw=1.5, ls='--', label=r'$x^2$')
ax.plot(x_plot, -x_plot**2, color=COLORS['tertiary'], lw=1.5, ls='--', label=r'$-x^2$')
ax.set_xlabel('x'); ax.set_ylabel('f(x)')
ax.set_title(r'Squeeze: $-x^2 \leq x^2\cos(1/x) \leq x^2$, all $\to 0$')
ax.legend(); ax.set_ylim(-0.2, 0.2)
fig.tight_layout(); plt.show()
# (b) IVT and bisection
def f_b(x): return x**3 + 2*x - 5
print(f'f(1) = {f_b(1)}, f(2) = {f_b(2)}')
check_true('IVT applies: f(1) < 0 < f(2)', f_b(1) < 0 < f_b(2))
a, b = 1.0, 2.0
for _ in range(60):
m = (a + b) / 2
if f_b(a) * f_b(m) < 0:
b = m
else:
a = m
if b - a < 1e-12: break
root = (a + b) / 2
check_close('Root of x^3+2x-5 in [1,2]', f_b(root), 0.0, tol=1e-8)
print(f'Root found: {root:.10f}')
print('\nTakeaway: Squeeze proves limits by bounding; IVT proves root existence by sign change.')
print("Exercise 5 solution complete.")
Exercise 6 (★★): ε-δ Proof
Give a formal ε-δ proof that $\lim_{x \to 2} (3x - 1) = 5$.
(a) Find an explicit expression for $\delta$ in terms of $\varepsilon$.
(b) Verify: for $\varepsilon = 0.3$, compute $\delta$ and confirm that $|x - 2| < \delta$ implies $|(3x - 1) - 5| < \varepsilon$.
(c) Generalize: prove $\lim_{x \to a} (mx + b) = ma + b$ for any $m \neq 0$. What is the explicit $\delta(\varepsilon)$?
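A quick numeric sweep makes the δ = ε/3 claim concrete before the formal proof: sample the open window around $a = 2$ and confirm every sample satisfies the ε bound. The helper name `delta_ok` is hypothetical, not one of the notebook's helpers.

```python
# Sanity sweep: delta = eps/3 keeps |f(x) - 5| below eps on the open window.
# delta_ok is a hypothetical helper, not part of the notebook's helper setup.
import numpy as np

def delta_ok(eps, delta, a=2.0, n=2001):
    # sample the open interval (a - delta, a + delta), excluding endpoints
    x = np.linspace(a - delta, a + delta, n)[1:-1]
    return bool(np.all(np.abs((3 * x - 1) - 5) < eps))

print([delta_ok(eps, eps / 3) for eps in [1.0, 0.1, 0.01]])  # all True
```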
Code cell 20
# Your Solution
# Exercise 6 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 6.")
Code cell 21
# Solution
# Exercise 6 - reference solution
import numpy as np
# (a) |(3x-1) - 5| = |3(x-2)| = 3|x-2|
# Need 3|x-2| < eps => |x-2| < eps/3
# So delta(eps) = eps/3
def delta_for_epsilon(eps):
return eps / 3
def general_delta(eps, m):
if abs(m) < 1e-15:
return float('inf') # constant function: any delta works
return eps / abs(m)
header('Exercise 6: epsilon-delta Proof')
# (a) Show delta formula
print('(a) Proof:')
print(' |(3x-1) - 5| = |3x - 6| = 3|x - 2|')
print(' For 3|x-2| < eps: choose delta = eps/3')
print(' Then |x-2| < delta => 3|x-2| < 3*(eps/3) = eps. QED.')
# (b) Verify numerically
eps = 0.3
delta = delta_for_epsilon(eps)
x_test = np.linspace(2 - delta + 1e-12, 2 + delta - 1e-12, 10000)
f_vals = 3*x_test - 1
errors = np.abs(f_vals - 5)
print(f'\n(b) eps={eps}, delta=eps/3={delta:.6f}')
check_close('delta formula', delta, eps/3)
check_true(f'All |f(x)-5| < eps={eps}', np.all(errors < eps))
print(f'Max error in delta-window: {errors.max():.6f} < {eps}')
# (c) General linear limit
print('\n(c) General: lim_{x->a}(mx+b) = ma+b')
print(' |(mx+b) - (ma+b)| = |m||x-a|')
print(' Choose delta = eps/|m| (for m != 0)')
for m_val in [1, 2, 5, 10]:
d = general_delta(0.1, m_val)
check_close(f'delta(eps=0.1, m={m_val}) = 0.1/{m_val}', d, 0.1/m_val)
print('\nTakeaway: epsilon-delta proofs follow a pattern — bound |f(x)-L| by C|x-a|, then set delta=eps/C.')
print("Exercise 6 solution complete.")
Exercise 7 (★★★): Gradient as a Limit
The derivative is defined as a limit of difference quotients:

$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$$

(a) Implement finite_diff_1(f, a, h) (one-sided) and finite_diff_c(f, a, h) (centered). Compare their errors on $f(x) = \sin x$ at $a = \pi/4$ for $h \in [10^{-14}, 10^{-1}]$.
(b) Show that centered differences have error $O(h^2)$ while one-sided have error $O(h)$. Verify by fitting $\log|\text{error}| \approx \alpha \log h + c$ and checking $\alpha \approx 2$ vs $\alpha \approx 1$.
(c) Implement grad_check(f, theta, grad_fn, h=1e-5) that computes the relative error between an analytic gradient and the centered finite difference. Test on $f(\theta) = \frac{1}{2}\|W\theta\|^2$ with $W = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$.
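SciPy ships its own gradient checker, `scipy.optimize.check_grad`, which returns the 2-norm of the difference between an analytic gradient and a forward-difference estimate. It is a useful second opinion on part (c), though being forward-difference it is only $O(h)$ accurate:

```python
# Independent gradient check on the quadratic loss from part (c).
import numpy as np
from scipy.optimize import check_grad

W = np.array([[1.0, 2.0], [3.0, 4.0]])
loss = lambda t: 0.5 * float((W @ t) @ (W @ t))  # f(theta) = 0.5 ||W theta||^2
grad = lambda t: W.T @ (W @ t)                   # analytic gradient: W^T W theta
err = check_grad(loss, grad, np.array([1.5, -0.7]))
print(err)  # small, but larger than a centered-difference check would give
```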
Code cell 23
# Your Solution
# Exercise 7 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 7.")
Code cell 24
# Solution
# Exercise 7 - reference solution
import numpy as np
import matplotlib.pyplot as plt
COLORS = {'primary': '#0077BB', 'secondary': '#EE7733', 'tertiary': '#009988'}
def finite_diff_1(f, a, h):
return (f(a + h) - f(a)) / h
def finite_diff_c(f, a, h):
return (f(a + h) - f(a - h)) / (2 * h)
def grad_check(f, theta, grad_fn, h=1e-5):
grad_an = grad_fn(theta)
grad_fd = np.zeros_like(theta)
for i in range(len(theta)):
tp = theta.copy(); tp[i] += h
tm = theta.copy(); tm[i] -= h
grad_fd[i] = (f(tp) - f(tm)) / (2*h)
rel_err = np.linalg.norm(grad_an - grad_fd) / (np.linalg.norm(grad_an) + np.linalg.norm(grad_fd) + 1e-12)
return rel_err, grad_an, grad_fd
f = np.sin
a = np.pi / 4
true_deriv = np.cos(a)
header('Exercise 7: Gradient as a Limit')
# (a) Error comparison
h_vals = np.logspace(-1, -14, 50)
err_1 = np.abs([finite_diff_1(f, a, h) - true_deriv for h in h_vals])
err_c = np.abs([finite_diff_c(f, a, h) - true_deriv for h in h_vals])
opt_idx_1 = np.argmin(err_1)
opt_idx_c = np.argmin(err_c)
print(f'Best one-sided: h={h_vals[opt_idx_1]:.2e}, error={err_1[opt_idx_1]:.2e}')
print(f'Best centered: h={h_vals[opt_idx_c]:.2e}, error={err_c[opt_idx_c]:.2e}')
check_true('Centered is more accurate than one-sided', err_c[opt_idx_c] < err_1[opt_idx_1])
# (b) Convergence rate
# Use intermediate h where FP errors haven't dominated
mask = h_vals > 1e-8
h_fit = h_vals[mask]
e1_fit = err_1[mask]; ec_fit = err_c[mask]
alpha_1 = np.polyfit(np.log10(h_fit), np.log10(np.maximum(e1_fit, 1e-16)), 1)[0]
alpha_c = np.polyfit(np.log10(h_fit), np.log10(np.maximum(ec_fit, 1e-16)), 1)[0]
print(f'\nConvergence rate (one-sided): {alpha_1:.2f} (expected ~1.0)')
print(f'Convergence rate (centered): {alpha_c:.2f} (expected ~2.0)')
check_true('One-sided rate ~1', abs(alpha_1 - 1.0) < 0.3)
check_true('Centered rate ~2', abs(alpha_c - 2.0) < 0.4)
if HAS_MPL:
fig, ax = plt.subplots(figsize=(10, 6))
ax.loglog(h_vals, err_1, color=COLORS['primary'], lw=2, label=r'One-sided: $O(h)$')
ax.loglog(h_vals, err_c, color=COLORS['secondary'], lw=2, label=r'Centered: $O(h^2)$')
h_ref = np.logspace(-1, -8, 20)
ax.loglog(h_ref, 0.3*h_ref, color=COLORS['primary'], ls=':', lw=1)
ax.loglog(h_ref, 0.05*h_ref**2, color=COLORS['secondary'], ls=':', lw=1)
ax.set_xlabel('h'); ax.set_ylabel('|error|')
ax.set_title('Finite Difference Accuracy: Centered vs One-Sided')
ax.legend(); fig.tight_layout(); plt.show()
# (c) Gradient check on matrix loss
W = np.array([[1., 2.], [3., 4.]])
f_loss = lambda t: 0.5 * np.dot(W @ t, W @ t)
grad_fn = lambda t: W.T @ (W @ t) # analytic gradient: W^T W t
theta = np.array([1.5, -0.7])
rel_err, grad_an, grad_fd = grad_check(f_loss, theta, grad_fn)
print(f'\n(c) Gradient check:')
print(f' Analytic: {grad_an}')
print(f' FD: {grad_fd}')
print(f' Relative error: {rel_err:.2e}')
check_true('Gradient check passed (rel_err < 1e-5)', rel_err < 1e-5)
print('\nTakeaway: Centered FD approximates gradient with O(h^2) error; use h~1e-5 for gradient checking.')
print("Exercise 7 solution complete.")
Exercise 8 (★★★): Cross-Entropy Limit and Entropy Continuity
The Shannon entropy of a Bernoulli distribution is:

$$H(p) = -p \ln p - (1 - p) \ln(1 - p)$$

with the convention $0 \ln 0 = 0$ (justified by the limit $\lim_{p \to 0^+} p \ln p = 0$).
(a) Prove analytically that $\lim_{p \to 0^+} p \ln p = 0$ using L'Hôpital's Rule.
(b) Implement $H(p)$ using the stable formula (using log1p where appropriate) and verify it is continuous at $p = 0$ and $p = 1$ to full floating-point precision.
(c) The cross-entropy between true distribution $p$ and predicted $q$ is:

$$\mathrm{CE}(p, q) = -p \ln q - (1 - p) \ln(1 - q)$$

Show that $\lim_{q \to 0^+} \mathrm{CE}(1, q) = +\infty$ (infinite loss when predicting 0 for a certain event). Verify numerically and explain the ML implication.
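The $0 \ln 0 = 0$ convention is built into `scipy.special.xlogy`, which gives a compact branch-free entropy implementation you can compare against your own:

```python
# Branch-free binary entropy using scipy.special.xlogy (xlogy(0, 0) = 0 by definition).
import numpy as np
from scipy.special import xlogy

def entropy_xlogy(p):
    return -(xlogy(p, p) + xlogy(1.0 - p, 1.0 - p))

print(entropy_xlogy(0.0), entropy_xlogy(1.0), entropy_xlogy(0.5))  # 0, 0, ln 2
```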
Code cell 26
# Your Solution
# Exercise 8 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 8.")
Code cell 27
# Solution
# Exercise 8 - reference solution
import numpy as np
import matplotlib.pyplot as plt
COLORS = {'primary': '#0077BB', 'secondary': '#EE7733', 'tertiary': '#009988', 'error': '#CC3311'}
# (a) lim_{p->0+} p*ln(p)
# Rewrite: p*ln(p) = ln(p) / (1/p) -- form -inf / +inf
# L'Hopital: (1/p) / (-1/p^2) = -p -> 0 as p->0+
# Therefore lim p*ln(p) = 0
p_vals = np.array([1e-1, 1e-2, 1e-4, 1e-8, 1e-12, 1e-15])
xlnx_vals = p_vals * np.log(p_vals)
header('Exercise 8: Cross-Entropy Limit')
print('(a) p*ln(p) -> 0 as p->0+')
print(f'Rewrite: p*ln(p) = ln(p)/(1/p). L Hopital: (1/p)/(-1/p^2) = -p -> 0')
for p, v in zip(p_vals, xlnx_vals):
print(f' p={p:.0e}: p*ln(p) = {v:.6e}')
check_close('lim p*ln(p) = 0 at p=1e-15', xlnx_vals[-1], 0.0, tol=1e-12)
# (b) Stable binary entropy
def entropy(p):
p = float(p)
if p <= 0 or p >= 1:
return 0.0
# For p near 0: -p*ln(p) uses xlnx
# For p near 1: -(1-p)*ln(1-p) uses log1p
term1 = -p * np.log(p) # stable for p not too small
term2 = -(1-p) * np.log1p(-p) # stable for p near 1
return term1 + term2
print()
p_arr = np.array([0.0, 1e-10, 0.1, 0.3, 0.5, 0.7, 0.9, 1-1e-10, 1.0])
H_vals = np.array([entropy(p) for p in p_arr])
print('(b) Binary entropy H(p):')
for p, h in zip(p_arr, H_vals):
print(f' H({p:.1e}) = {h:.8f}')
check_close('H(0) = 0 (convention)', entropy(0.0), 0.0)
check_close('H(1) = 0 (convention)', entropy(1.0), 0.0)
check_close('H(0.5) = ln(2) (maximum)', entropy(0.5), np.log(2), tol=1e-10)
check_true('H is continuous: H(1e-10) close to H(0)', abs(entropy(1e-10) - entropy(0.0)) < 1e-8)
# (c) Cross-entropy limit
def ce(p, q, eps=1e-300):
q = max(q, eps) # numerical floor
q1 = max(1-q, eps)
if p == 1:
return -np.log(q)
elif p == 0:
return -np.log(q1)
return -p*np.log(q) - (1-p)*np.log(q1)
q_small = np.array([1e-1, 1e-2, 1e-4, 1e-8, 1e-15])
ce_vals = [ce(1.0, q) for q in q_small]
print()
print('(c) CE(p=1, q) as q->0+:')
for q, cv in zip(q_small, ce_vals):
print(f' CE(1, {q:.0e}) = {cv:.4f} = -ln({q:.0e}) = {-np.log(q):.4f}')
check_true('CE(1, q) -> inf as q->0+ (grows without bound)', ce_vals[-1] > ce_vals[0])
check_true('CE(1, q) = -ln(q) diverges', ce_vals[-1] > 30)
if HAS_MPL:
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
p_plot = np.linspace(0, 1, 1000)
H_plot = np.array([entropy(p) for p in p_plot])
axes[0].plot(p_plot, H_plot, color=COLORS['primary'], lw=2.5)
axes[0].set_xlabel('p'); axes[0].set_ylabel('H(p)')
axes[0].set_title('Binary Entropy H(p) = $-p\\ln p - (1-p)\\ln(1-p)$')
axes[0].annotate('H(0)=0', (0, 0), fontsize=11, ha='left')
axes[0].annotate('H(0.5)=ln2', (0.5, np.log(2)+0.02), fontsize=11, ha='center')
q_plot = np.linspace(1e-3, 0.999, 1000)
ce_plot = [-np.log(q) for q in q_plot]
axes[1].plot(q_plot, ce_plot, color=COLORS['error'], lw=2.5)
axes[1].set_xlabel('q (predicted probability)'); axes[1].set_ylabel('CE(p=1, q)')
axes[1].set_title('CE(p=1,q) = -ln(q): diverges as q->0')
axes[1].set_ylim(0, 10)
fig.tight_layout(); plt.show()
print('\nTakeaway: p*ln(p)->0 by L Hopital; entropy is continuous everywhere;')
print('CE(1,q)->inf as q->0 -- penalizes confidently wrong predictions infinitely.')
print("Exercise 8 solution complete.")
Exercise 9 (★★★): Temperature Limits of Softmax
Let $z \in \mathbb{R}^n$ and define the temperature-scaled softmax

$$\mathrm{softmax}_T(z)_i = \frac{e^{z_i / T}}{\sum_j e^{z_j / T}}, \quad T > 0.$$

- Show that as $T \to 0^+$ the distribution concentrates on the largest logit when the maximum is unique.
- Show that as $T \to \infty$ the distribution approaches the uniform distribution.
- Implement a stable temperature softmax and verify both limits numerically.
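As a cross-check on your implementation, temperature scaling is just a plain softmax applied to $z / T$, so `scipy.special.softmax` (available in SciPy ≥ 1.2) reproduces both limits:

```python
# Cross-check: temperature softmax = scipy.special.softmax applied to z / T.
import numpy as np
from scipy.special import softmax

z = np.array([1.0, 2.5, -0.5, 0.0])
cold = softmax(z / 0.02)  # T -> 0: concentrates on argmax (index 1)
hot = softmax(z / 1e6)    # T -> inf: approaches uniform
print(cold.round(6), hot.round(6))
```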
Code cell 29
# Your Solution
# Exercise 9 - learner workspace
# Implement the stable temperature softmax and test the two limits here.
print("Learner workspace ready for Exercise 9.")
Code cell 30
# Solution
# Exercise 9 - temperature softmax limits
header("Exercise 9: temperature softmax limits")
logits = np.array([1.0, 2.5, -0.5, 0.0])
def softmax_temperature(z, T):
z = np.asarray(z, dtype=float) / T
z = z - np.max(z)
e = np.exp(z)
return e / e.sum()
low_T = softmax_temperature(logits, 0.02)
high_T = softmax_temperature(logits, 1e6)
argmax_dist = np.eye(len(logits))[np.argmax(logits)]
uniform = np.ones_like(logits) / len(logits)
print("low temperature:", low_T)
print("high temperature:", high_T)
check_close("T -> 0 concentrates on argmax", low_T, argmax_dist, tol=1e-8)
check_close("T -> infinity approaches uniform", high_T, uniform, tol=1e-6)
print("Takeaway: generation temperature is a continuity/limit control on categorical uncertainty.")
Exercise 10 (★★★): Huber Loss Continuity and Smoothness
The Huber loss with threshold $\delta > 0$ is

$$L_\delta(r) = \begin{cases} \frac{1}{2} r^2 & |r| \le \delta \\ \delta \left( |r| - \frac{1}{2}\delta \right) & |r| > \delta \end{cases}$$

- Prove the two branches meet continuously at $|r| = \delta$.
- Prove the first derivative also matches at $|r| = \delta$.
- Verify this numerically for $\delta = 1.5$.
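SciPy provides the same loss as `scipy.special.huber(delta, r)`. Note it expects the residual magnitude, so pass $|r|$ when comparing it against the piecewise definition above:

```python
# Compare the piecewise definition against scipy.special.huber.
# scipy.special.huber takes the residual magnitude, hence abs(r) below.
import numpy as np
from scipy.special import huber as sp_huber

delta = 1.5
for r in [-3.0, -1.5, 0.5, 1.5, 3.0]:
    expected = 0.5 * r**2 if abs(r) <= delta else delta * (abs(r) - 0.5 * delta)
    print(r, sp_huber(delta, abs(r)), expected)
```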
Code cell 32
# Your Solution
# Exercise 10 - learner workspace
# Check the values and slopes of both branches at +/- delta.
print("Learner workspace ready for Exercise 10.")
Code cell 33
# Solution
# Exercise 10 - Huber continuity
header("Exercise 10: Huber loss continuity")
def huber(r, delta=1.5):
r = np.asarray(r, dtype=float)
return np.where(np.abs(r) <= delta, 0.5 * r**2, delta * (np.abs(r) - 0.5 * delta))
def huber_grad(r, delta=1.5):
r = np.asarray(r, dtype=float)
return np.where(np.abs(r) <= delta, r, delta * np.sign(r))
delta = 1.5
for point in [-delta, delta]:
left_val = huber(point - 1e-8, delta)
right_val = huber(point + 1e-8, delta)
left_grad = huber_grad(point - 1e-8, delta)
right_grad = huber_grad(point + 1e-8, delta)
print(f"r={point:+.1f}: values {left_val:.10f}, {right_val:.10f}; slopes {left_grad:.10f}, {right_grad:.10f}")
check_close(f"continuous at {point:+.1f}", left_val, right_val, tol=1e-6)
check_close(f"C1 at {point:+.1f}", left_grad, right_grad, tol=1e-6)
print("Takeaway: Huber loss is robust to outliers while keeping a continuous gradient for optimization.")
What to Review After Finishing
- Exercise 1: Can you compute all three limits without looking at the solution? Practice until the algebraic manipulation feels automatic.
- Exercise 2: Do you understand why left ≠ right implies the two-sided limit doesn't exist?
- Exercise 3: Can you identify the indeterminate form first, then apply L'Hôpital correctly?
- Exercise 4: Do you know all three discontinuity types and when each applies?
- Exercise 5: Can you state the Squeeze Theorem and apply it? Did you understand the bisection proof?
- Exercise 6: Can you write a complete ε-δ proof from scratch?
- Exercise 7: Do you understand why centered differences are more accurate? Can you implement gradient checking?
- Exercise 8: Do you see why $0 \cdot \log 0 = 0$ is the right convention for entropy?
- Exercise 9: Can you derive both temperature limits of softmax and explain what each means for sampling?
- Exercise 10: Can you check both value continuity and derivative continuity of a piecewise loss at its seam?