Backprop Is Missing a Phase
New ideas—especially those that challenge dominant paradigms—tend to meet resistance. Not always because they’re wrong, but because they’re unfamiliar. This paper is not offered as a refutation of deep learning, nor as a manifesto against neural networks. It is offered in the spirit of exploration, with a simple invitation:
Keep your eyes open. The path you’ve been walking isn’t the only one that leads forward.
What follows is a story about a narrow trail through the woods—cleared not with compute and scale, but with insight, structure, and evolution. It is the story of L7A, and what it may teach us about the true substrate of intelligence.
Most current approaches to artificial intelligence rest on the idea that intelligence can be trained into a system. Build a large enough architecture, feed it vast amounts of data, and let backpropagation tune the weights. Generalization, we're told, will emerge.
But L7A—a forecasting system built not on backprop, but on evolved structure—offers a counterpoint. It succeeds in one of the most unforgiving domains known: short-term financial forecasting, where noise swamps signal and where most models fail.
And it succeeds without any training phase at all.
This paper asks a simple but disruptive question:
What if intelligence doesn’t arise from training—but from structure?
Backpropagation is an optimizer. It assumes that the architecture it inhabits is already capable of representing useful solutions, and then adjusts the weights to approximate those solutions. But there’s an unspoken assumption baked in:
That the structure of the network is good enough—and that intelligence is just a matter of finding the right parameters.
This assumption collapses under scrutiny.
We don’t assume that a stone block can become a violin just by carving. It must be the right material in the right shape to begin with. The same is true of intelligence: if the substrate and structure are wrong, no amount of tuning will make it generalize.
In practice, neural networks can map inputs to outputs perfectly during training and still fail utterly on novel data. Why? Because many internal weight configurations can fit the same data, but only some of them generalize. Backprop doesn’t care which one it finds.
The L7A system approaches the problem from a different angle. Instead of training a network to mimic behavior, it evolves a structure that accumulates and interprets behavior directly.
The architecture uses:
- Binary histogram surfaces, acting as spatial memory maps.
- Bayesian updates, accumulating directional outcomes (+1 / –1).
- Genetic algorithms, evolving map topology, bin sizes, and spread parameters based solely on walk-forward generalization performance.
There is no training phase. No weight tuning. Just structure, designed to survive in a noisy, adversarial environment.
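To make the mechanism concrete, here is a minimal Python sketch of one such surface: a binned map that accumulates +1 / –1 outcomes and reads out a directional forecast. The class name, bin layout, and Laplace-style prior are illustrative assumptions, not the actual L7A implementation.

```python
import numpy as np

class HistogramSurface:
    """Minimal sketch of a binary histogram surface (illustrative, not L7A source).

    Each input value is mapped to a bin; the bin accumulates directional
    outcomes (+1 / -1). A forecast is the sign of the accumulated evidence.
    """

    def __init__(self, n_bins: int, lo: float, hi: float):
        self.edges = np.linspace(lo, hi, n_bins + 1)  # bin boundaries (evolved in L7A)
        self.up = np.ones(n_bins)                     # +1 outcome counts (Laplace-style prior)
        self.down = np.ones(n_bins)                   # -1 outcome counts (Laplace-style prior)

    def _bin(self, x: float) -> int:
        # Clip to the surface range, then locate the bin index.
        idx = np.searchsorted(self.edges, np.clip(x, self.edges[0], self.edges[-1])) - 1
        return int(np.clip(idx, 0, len(self.up) - 1))

    def update(self, x: float, outcome: int) -> None:
        """Bayesian-style accumulation of a directional outcome (+1 or -1)."""
        if outcome > 0:
            self.up[self._bin(x)] += 1
        else:
            self.down[self._bin(x)] += 1

    def forecast(self, x: float) -> int:
        """Return +1 or -1 according to the accumulated probability of an up move."""
        b = self._bin(x)
        p_up = self.up[b] / (self.up[b] + self.down[b])
        return 1 if p_up >= 0.5 else -1
```

Nothing here is fitted by gradient descent; the numbers in the bins are simply counts, and the structure that decides what gets counted is what evolution selects.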
And it works. With a long-term walk-forward win/loss points ratio of ~72% and a Sharpe ratio exceeding 3.0, L7A outperforms deep learning methods in a domain where generalization—not memorization—is the currency of success.
Why? Because the structure was evolved to be robust to noise, not just accurate in hindsight.
If intelligence depends on structure, then backprop is incomplete.
It tunes parameters, but assumes the architecture is fixed. That’s like tuning a radio dial without first building an antenna. Backpropagation is good at polishing—but terrible at inventing shape.
Backprop is missing its evolutionary prelude—a phase that searches for the structures capable of generalizing before any tuning begins.
This is what L7A provides. Its surface maps are not optimized to reduce loss—they are evolved under direct pressure to generalize across unseen data. Fitness is not defined by training accuracy, but by walk-forward survivability.
This is what backprop lacks: a structural filter that prunes architectures incapable of generalizing before they are ever trained.
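As an illustration of that filter, the sketch below (building on the surface sketch above) scores a candidate structure only on data lying beyond the window used to fill its bins, so a GA using it as fitness never rewards in-sample accuracy. The window lengths and the win-rate fitness are assumptions for the example, not L7A’s actual scoring.

```python
def walk_forward_fitness(surface_params, series, outcomes, fill_len=500, test_len=100):
    """Illustrative walk-forward fitness: score a candidate structure only on unseen data.

    surface_params: (n_bins, lo, hi) - the structure a GA would evolve.
    series:   sequence of input values (e.g. a derived price feature).
    outcomes: sequence of +1 / -1 directional outcomes aligned with `series`.
    """
    wins = losses = 0
    start = 0
    while start + fill_len + test_len <= len(series):
        surface = HistogramSurface(*surface_params)
        # Accumulate outcomes over the in-sample window...
        for x, y in zip(series[start:start + fill_len],
                        outcomes[start:start + fill_len]):
            surface.update(x, y)
        # ...then score only the out-of-sample window that follows it.
        for x, y in zip(series[start + fill_len:start + fill_len + test_len],
                        outcomes[start + fill_len:start + fill_len + test_len]):
            if surface.forecast(x) == y:
                wins += 1
            else:
                losses += 1
        start += test_len  # roll the window forward
    return wins / max(wins + losses, 1)  # fitness = out-of-sample win rate
```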
Biological intelligence doesn’t suffer the same overfitting failure modes as neural networks—because its structure evolved under constant pressure to survive, not just to learn. Brains are not blank slates.
Mechanisms include:
- Developmental pruning – eliminating fragile synapses early
- Neuromodulators – gating when and how learning occurs
- Sleep – consolidating robust memories and discarding noise
- Sparse firing – limiting activation and avoiding gradient delusion
- Region specialization – encoding inductive biases into physical form
These are not learned. They are evolved constraints, built into the very architecture of cognition.
L7A mimics this, not by replicating biology, but by adopting its core principle: structure first, adaptation later.
L7A proves that evolved structure can produce generalization without training. But it also raises a challenge: how do we broaden this method beyond one domain?
Ways forward include:
- Alternate Statistical Surfaces – move beyond binary histograms to ternary, quantile, or kernel-based surfaces.
- Layered Architectures – stack maps across spatial or temporal dimensions, forming evolved hierarchies.
- Genetic Topology Search – let the GA explore not just parameters, but entire topological layouts: structures that weren’t human-designed at all (a toy selection loop is sketched after this list).
- Hybrid Models – use backprop within evolved structures for local refinement, but only inside constraints proven to generalize.
- Cross-Domain Transfer – apply L7A-style evolution to noisy domains beyond finance: threat detection, navigation, multi-agent coordination.
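As a rough illustration of the selection loop behind several of these directions, here is a toy genetic search over surface structure, built on the fitness function sketched earlier. It evolves only a bin count and value range, where a full topology search would also evolve layouts and spread parameters; the population size, mutation scheme, and truncation selection are assumptions for the sketch, not L7A’s actual GA.

```python
import random

def evolve_structure(series, outcomes, pop_size=30, generations=50):
    """Toy genetic search over surface structure (bin count and value range).

    Candidates survive on walk-forward fitness, never on in-sample accuracy.
    """
    def random_genome():
        return (random.randint(8, 256),       # n_bins
                random.uniform(-5.0, 0.0),    # lo
                random.uniform(0.0, 5.0))     # hi

    def mutate(g):
        n_bins, lo, hi = g
        n_bins = max(4, n_bins + random.randint(-8, 8))
        lo = lo + random.gauss(0, 0.2)
        hi = max(lo + 1e-3, hi + random.gauss(0, 0.2))  # keep the range valid
        return (n_bins, lo, hi)

    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population,
                        key=lambda g: walk_forward_fitness(g, series, outcomes),
                        reverse=True)
        parents = scored[:pop_size // 4]  # truncation selection: keep the top quarter
        population = parents + [mutate(random.choice(parents))
                                for _ in range(pop_size - len(parents))]
    return max(population, key=lambda g: walk_forward_fitness(g, series, outcomes))
```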
“I’ve shown a light at the end of the tunnel. Now all we need to do is open the tunnel.”
All intelligence is shaped by the environment that selects for it.
Humans evolved in a world of gravity, light, scarcity, predators, and cooperation. Our intelligence reflects that world. An intelligence evolved elsewhere—on a different planet, or in a digital landscape—would not resemble us. It might not be verbal, social, or even symbolic.
What we call intelligence is a reflection—not the mirror.
This has implications for AGI. If we want to build systems that generalize in any environment, we must evolve them within that environment—not merely train them with past data.
There may be a deeper substrate beneath all intelligences—regardless of their environment or expression.
That substrate is not language. It is not backpropagation. It is not architecture.
It is evolution, entropy, and energy.
- Evolution guides structure toward survival
- Entropy forces information to compress meaningfully
- Energy constrains what’s possible in any system
Any intelligence that survives must contend with these forces. L7A succeeds not because it emulates human cognition, but because it embodies the logic of survival under structure.
L7A didn’t arise from scale. It arose from constraint.
It works—not because it was trained better, but because it was structured right from the start. That’s the real lesson. Backprop can tune, but only evolution can shape. Until we embrace this missing phase, we will continue to build ever-larger models chasing generalization they were never designed to support.
The path forward is clear. Not easy—but clear.
Intelligence is not what you train. It’s what survives.
Two quotes from The Who that echo the heart of this project:
1. “We won’t get fooled again.” – The call to resist overfitted optimism and look deeper.
2. “Who are you? Who, who, who, who?” – The timeless question of self-aware systems… and their makers.