PathNet: Evolution Channels Gradient Descent in Super Neural Networks (arxiv.org)
In short, this architecture freezes the parameters along pathways used for previously learned tasks, and learns new parameters along new pathways for new tasks. Each new task is learned faster than the last because it can reuse all the previously learned parameters and pathways (more efficient transfer learning).
It's a general neural net architecture.
Very cool.
"During learning, a tournament selection genetic algorithm is used to select pathways through the neural network for replication and mutation."
Trying to think of another tournament-like process that would work on a massive distributed network where each node already has a decent GPU, on which something like this could be run successfully. Maybe someone could help me out here...
I assume you're being sarcastic; they do point out in the intro and at the end that a deep RL agent could be trained to do the topology selections, but that would be more work to get going than some simple evolutionary operators, and is left to future work. Don't worry, I'm sure it'll be A3C all the way down eventually...
Well yes, you could use a neural net for the tournament selection, but I was thinking of a much dumber competition that involves a whole lot more distributed GPU power.
I think you might have to actually explain what you're thinking of
The winner of a loss-function tournament replacing the hash winner in something like Bitcoin mining.
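Something like this toy sketch, purely hypothetical (mining_round and evaluate_loss are made-up names, and it ignores all the hard consensus and verification problems a real protocol would have): each miner trains a candidate locally on its own GPU, and the submission with the lowest held-out loss "wins the block" the way the lowest hash does in Bitcoin.

    import random

    def evaluate_loss(model, validation_set):
        # Hypothetical stand-in: mean squared error on held-out data.
        return sum((model(x) - y) ** 2 for x, y in validation_set) / len(validation_set)

    def mining_round(submissions, validation_set):
        # submissions: {miner_id: candidate model}, each trained locally.
        # The lowest validation loss wins the round, playing the role the
        # lowest hash plays in Bitcoin mining.
        return min(submissions,
                   key=lambda m: evaluate_loss(submissions[m], validation_set))

    # Toy usage: three linear "models" competing on noisy data for y = 2x.
    data = [(x, 2 * x + random.gauss(0, 0.1)) for x in range(10)]
    miners = {f"miner{i}": (lambda x, w=w: w * x)
              for i, w in enumerate([1.5, 2.0, 2.5])}
    print(mining_round(miners, data))  # almost certainly "miner1" (w = 2.0)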