
Index

The first month of experiments focuses on small synthetic systems that expose when fixed-point online learning becomes unstable and whether dynamic closed-loop throttling prevents divergence while preserving update geometry.

Experiment Table

| ID | Name | Status | Workspace (workspace/ablations) | Main Question |
| --- | --- | --- | --- | --- |
| 000AB | One-layer global throttle sanity check | Valid | .../000_global_throttle_sanity | Can a global controller throttle the total learning rate and prevent one-layer divergence? |
| 000C | One-layer global throttle with quantization | Planned | .../000_global_throttle_sanity | Does throttling still help when the one-layer loop includes fake-fixed-point weights, updates, activations, and rails? |
| 001A | One-layer no-bias float scale drift | Planned | notes.md | Does the controller enforce the known Hessian stability boundary under pure scale drift? |
| 001B | One-layer fake fixed-point rails | Planned | TBD | Does throttling still help when instability is caused by quantization and saturation? |
| 002A | Two-layer linear network | Planned | TBD | Can global throttling stabilize inter-layer coupling without activation nonlinearities? |
| 002B | Two-layer ReLU teacher/student | Planned | notes.md | Can global throttling stabilize coupled nonlinear layer dynamics without rotating the update? |
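Several experiments above refer to fake-fixed-point weights, updates, and activations. A minimal sketch of what that simulation could mean is shown below; the function name and Q-format bit widths are illustrative assumptions, not the project's actual implementation.

```python
import numpy as np

def fake_fixed_point(x, frac_bits=8, total_bits=16):
    """Quantize-dequantize a float array as if it were stored as signed
    fixed-point with `frac_bits` fractional bits, clipping at the rails.
    Illustrative sketch; the bit widths are assumptions."""
    scale = 2.0 ** frac_bits
    max_int = 2 ** (total_bits - 1) - 1             # e.g. 32767 for 16 bits
    q = np.round(x * scale)                         # quantization step
    n_saturated = int((np.abs(q) > max_int).sum())  # rail hits on this tensor
    q = np.clip(q, -max_int, max_int)               # clamp to the representable range
    return q / scale, n_saturated
```

Counting rail hits inside such a path is what would feed the saturation-count metric listed further down.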

Shared Hypothesis

There exists a region of drift parameters (alpha, beta) where ordinary fixed-point online training fails because some combination of activations, gradients, updates, or weights saturates. A useful dynamic controller should keep training bounded in this region while preserving the global descent direction.
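As a minimal sketch of the controller behavior the hypothesis asks for, a single shared scalar can rescale the whole update so its direction is untouched while its magnitude stays inside a budget; the `update_budget` parameter and the gain rule below are assumptions, not the final design.

```python
import numpy as np

def global_throttle(grads, lr, update_budget):
    """Scale the entire update by one scalar alpha(t) so the combined update
    norm never exceeds `update_budget`. Direction is preserved because every
    parameter sees the same factor. Sketch only; the real gain rule may differ."""
    flat = np.concatenate([g.ravel() for g in grads])
    raw_norm = lr * np.linalg.norm(flat)
    alpha = min(1.0, update_budget / (raw_norm + 1e-12))  # throttle factor alpha(t)
    updates = [alpha * lr * g for g in grads]
    return updates, alpha
```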

Drift Model

The starting drift model is affine input drift:

x_drift = alpha x + beta

Two versions should be tested:

  • Unclipped: x_drift = alpha x + beta
  • Clipped: x_drift = clip(alpha x + beta, x_min, x_max)

The unclipped version isolates range expansion. The clipped version models sensor rails and information loss.
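A minimal sketch of the two variants, assuming NumPy inputs (the argument names are illustrative):

```python
import numpy as np

def affine_drift(x, alpha, beta, x_min=None, x_max=None):
    """Apply affine input drift x -> alpha * x + beta.
    With rails supplied, clip to model sensor saturation."""
    x_drift = alpha * x + beta
    if x_min is not None and x_max is not None:
        x_drift = np.clip(x_drift, x_min, x_max)  # clipped variant: rails and information loss
    return x_drift                                # unclipped variant isolates range expansion
```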

Common Metrics

Every experiment should report:

| Metric | Why it matters |
| --- | --- |
| Loss curve | Shows recovery, divergence, or training death. |
| Saturation count | Direct evidence of fixed-point failure. |
| Activation range | Shows forward rail pressure. |
| Gradient range | Shows backward rail pressure. |
| Weight norm | Shows parameter growth. |
| Gradient norm | Shows update-field magnitude. |
| Update norm | Shows effective step size. |
| Curvature proxy C(t) | Estimates local closed-loop gain. |
| Global throttle alpha(t) | Shows controller intervention. |
| Update cosine | Measures whether budgeting preserves the descent direction. |
| Hessian metrics | Validates whether C(t) tracks true curvature in toy models. |
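The two less standard entries can be made concrete with a short sketch. The update cosine compares the raw gradient step with the throttled one; the curvature proxy below is a finite-difference estimate along the applied update, which is an assumed definition of C(t), not necessarily the one these experiments will adopt.

```python
import numpy as np

def update_cosine(raw_update, throttled_update):
    """Cosine between the unthrottled and budgeted updates;
    1.0 means the controller only rescaled and never rotated."""
    a, b = raw_update.ravel(), throttled_update.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def curvature_proxy(grad_before, grad_after, update):
    """Finite-difference curvature along the step:
    (g_after - g_before) . u / ||u||^2, a cheap Hessian stand-in."""
    u = update.ravel()
    return float((grad_after - grad_before).ravel() @ u / (u @ u + 1e-12))
```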

Near-Term Roadmap

| Step | Goal | Adds |
| --- | --- | --- |
| 000 | Sanity-check instrumentation and controller | One layer, no bias, float, exploratory drift. |
| 000B | Add the first fake-fixed-point path | Quantized weights, updates, activations, saturation counters, and update underflow checks. |
| 001A | Clean curvature-only stability test | Pure scale drift, Hessian-selected learning rates. |
| 001B | Numerical hardware-style failure | Fake fixed-point quantization, rails, saturation counters. |
| 001C | More realistic affine drift | Bias and scale+shift drift. |
| 002A | Coupled linear dynamics | Two Dense layers, no activation. |
| 002B | Nonlinear hidden representation | Two Dense layers with ReLU. |
| 003 | ENABOL comparison | Loose kappa rails, legacy row/column projection if useful. |

Initial Variant Matrix

| Variant | Purpose |
| --- | --- |
| Floating reference | Establish expected behavior without fixed-point limits. |
| Fixed-point baseline | Find drift regimes where online learning fails. |
| Dynamic global throttle | Test closed-loop stabilization with a single shared update scalar. |
| Loose kappa + throttle | Test static safety rails plus dynamic update control. |
| Global static kappa scale | Test representational gain control without row/layer direction changes. |
| Legacy row/column projection | Optional comparison only; do not rebuild first. |

Documentation Rule

Each experiment workspace must contain:

  • config.yaml: settings, seed, precision, drift grid, and enabled controller mechanisms.
  • notes.md: hypothesis, procedure, status, interpretation, and links to results.
  • notebooks/analysis.ipynb: exploratory run and plots.
  • results/: exported logs, CSV files, figures, or summaries.

When an exploratory notebook becomes stable, add run.py for reproducible batch sweeps.
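As a hedged illustration of what config.yaml could contain for one of the one-layer sweeps, one possible layout is sketched below; every key name is an assumption, not a fixed schema.

```yaml
# Illustrative layout only; key names are assumptions, not a fixed schema.
seed: 0
precision: fake_fixed_point        # or float32 for the floating reference
drift:
  kind: affine                     # x_drift = alpha x + beta, optionally clipped
  alpha_grid: [0.5, 1.0, 2.0, 4.0]
  beta_grid: [0.0, 0.1, 1.0]
  clip: {enabled: true, x_min: -1.0, x_max: 1.0}
controller:
  global_throttle: true
  loose_kappa_rails: false
  legacy_row_column_projection: false
```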