PathNet: Evolution Channels Gradient Descent in Super Neural Networks (arxiv.org)
In short, this architecture freezes the parameters along pathways used for previously learned tasks, and learns new parameters along new pathways for new tasks. Each new task is learned faster than the last because it can reuse all the previously learned parameters and pathways (more efficient transfer learning).
It's a general neural net architecture.
Very cool.
"During learning, a tournament selection genetic algorithm is used to select pathways through the neural network for replication and mutation."
Trying to think of another tournament-like process that would work on a massive distributed network where each node already has a decent GPU, on which something like this could be run successfully. Maybe someone could help me out here...
I assume you're being sarcastic; they do point out in the intro and at the end that a deep RL agent could be trained to do the topology selections, but that would be more work to get going than some simple evolutionary operators, and is left to future work. Don't worry, I'm sure it'll be A3C all the way down eventually...
Well yes, you could use a neural net for the tournament selection, but I was thinking of a much dumber competition that involves a whole lot more distributed GPU power.
I think you might have to actually explain what you're thinking of
The winner of a loss-function tournament replacing the hash winner in something like Bitcoin mining.
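Something like this toy sketch, purely hypothetical (mining_round and evaluate_loss are made-up names, and it ignores all the hard consensus and verification problems a real protocol would have): each miner trains a candidate locally on its own GPU, and the submission with the lowest held-out loss "wins the block" the way the lowest hash does in Bitcoin.

    import random

    def evaluate_loss(model, validation_set):
        # Hypothetical stand-in: mean squared error on held-out data.
        return sum((model(x) - y) ** 2 for x, y in validation_set) / len(validation_set)

    def mining_round(submissions, validation_set):
        # submissions: {miner_id: candidate model}, each trained locally.
        # The lowest validation loss wins the round, playing the role the
        # lowest hash plays in Bitcoin mining.
        return min(submissions,
                   key=lambda m: evaluate_loss(submissions[m], validation_set))

    # Toy usage: three linear "models" competing on noisy data for y = 2x.
    data = [(x, 2 * x + random.gauss(0, 0.1)) for x in range(10)]
    miners = {f"miner{i}": (lambda x, w=w: w * x)
              for i, w in enumerate([1.5, 2.0, 2.5])}
    print(mining_round(miners, data))  # almost certainly "miner1" (w = 2.0)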