ES-MAML
Paper: "ES-MAML: Simple Hessian-Free Meta Learning"
This code was also used in "Rapidly Adaptable Legged Robots via Evolutionary Meta-Learning", which has an associated Google AI Blog post.
Overview (Abstract)
We introduce ES-MAML, a new framework for solving the model agnostic meta learning (MAML) problem based on Evolution Strategies (ES). Existing algorithms for MAML are based on policy gradients, and incur significant difficulties when attempting to estimate second derivatives using backpropagation on stochastic policies. We show how ES can be applied to MAML to obtain an algorithm which avoids the problem of estimating second derivatives, and is also conceptually simple and easy to implement. Moreover, ES-MAML can handle new types of non-smooth adaptation operators, and other techniques for improving performance and estimation of ES methods become applicable. We show empirically that ES-MAML is competitive with existing methods and often yields better adaptation with fewer queries.
Citation
@inproceedings{es_maml,
author = {Xingyou Song and
Wenbo Gao and
Yuxiang Yang and
Krzysztof Choromanski and
Aldo Pacchiano and
Yunhao Tang},
title = {{ES-MAML:} Simple Hessian-Free Meta Learning},
booktitle = {8th International Conference on Learning Representations, {ICLR} 2020,
Addis Ababa, Ethiopia, April 26-30, 2020},
year = {2020},
url = {https://openreview.net/forum?id=S1exA2NtDB},
}
@inproceedings{rapidly,
author = {Xingyou Song and
Yuxiang Yang and
Krzysztof Choromanski and
Ken Caluwaerts and
Wenbo Gao and
Chelsea Finn and
Jie Tan},
title = {Rapidly Adaptable Legged Robots via Evolutionary Meta-Learning},
booktitle = {International Conference on Intelligent Robots and Systems, {IROS} 2020},
year = {2020},
url = {https://arxiv.org/abs/2003.01239},
}
Usage
In order to run the algorithm, you must launch the es_maml_client binary
(which acts as the central 'aggregator') as well as multiple instances of es_maml_server
(which act as the 'workers').
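For example, a purely local multi-process launch might look like the following sketch. This is not the repository's launcher; the module paths (es_maml.es_maml_server, es_maml.es_maml_client) and the flag-free invocation are assumptions for illustration, since the actual entry points and required flags depend on your deployment and on config.py.

# Illustrative local launch (sketch only; module paths are assumptions).
# Real deployments typically run the workers on separate machines.
import subprocess

NUM_WORKERS = 4

# Start several workers first...
workers = [
    subprocess.Popen(["python", "-m", "es_maml.es_maml_server"])
    for _ in range(NUM_WORKERS)
]

# ...then start the central aggregator and wait for it to finish.
client = subprocess.Popen(["python", "-m", "es_maml.es_maml_client"])
client.wait()

# Clean up the workers once the aggregator exits.
for worker in workers:
    worker.terminate()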
How the client and workers communicate depends on your particular distributed infrastructure; by default we use gRPC. In order to use the default gRPC method of client-server communication, you must first generate the pb2.py and pb2_grpc.py modules from the .proto files for both zero_order and first_order. This can be done via the following commands (see discussion):
$ pip install protobuf
$ pip install grpcio-tools==1.32
$ pip install googleapis-common-protos
$ python -m grpc_tools.protoc --proto_path=. --python_out=. --grpc_python_out=. first_order.proto
$ python -m grpc_tools.protoc --proto_path=. --python_out=. --grpc_python_out=. zero_order.proto
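After running these commands, the generated modules should be importable from the same directory. A quick sanity check (a sketch, run from that directory) is:

# Verify that the protobuf/gRPC stubs were generated and are importable.
import first_order_pb2
import first_order_pb2_grpc
import zero_order_pb2
import zero_order_pb2_grpc

print("Generated protobuf/gRPC modules imported successfully.")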
Algorithms
The hyperparameters are all contained in config.py.
There are two algorithms:
- Zero Order
- First Order
Zero Order:
- Uses custom adaptation operators, built using blackbox algorithms such as MCBlackboxOptimizer, DPP sampling, and Hill-Climbing (a minimal hill-climbing sketch follows this list).
- Collects state normalization data from all workers.
First Order:
- Uses local-worker state normalization.
- Allows Hessian computation.
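To make the "non-smooth adaptation operator" idea concrete, below is a minimal hill-climbing adaptation sketch of the kind the Zero Order algorithm can use. It is not the repository's implementation; evaluate_task is a hypothetical callback that returns the (noisy) reward of a parameter vector on one sampled task.

# Minimal hill-climbing adaptation sketch (assumes an `evaluate_task` callback).
import numpy as np

def hill_climb_adapt(theta, evaluate_task, num_queries=20, sigma=0.1, rng=None):
    """Adapts meta-parameters `theta` to one task with a fixed query budget.

    Each query perturbs the current best parameters with Gaussian noise and
    keeps the perturbation only if it improves the task reward. The operator
    is non-smooth, which is acceptable for ES-MAML since no derivatives of the
    adaptation step are ever needed.
    """
    rng = np.random.default_rng() if rng is None else rng
    best_theta = np.asarray(theta, dtype=np.float64)
    best_reward = evaluate_task(best_theta)
    for _ in range(num_queries):
        candidate = best_theta + sigma * rng.standard_normal(best_theta.shape)
        reward = evaluate_task(candidate)
        if reward > best_reward:
            best_theta, best_reward = candidate, reward
    return best_theta, best_reward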