Metadata-Version: 2.1
Name: optax_shampoo
Version: 0.0.5
Summary: Shampoo (Second order Optimizer for Deep Learning) Optax Optimizer
Home-page: https://github.com/google-research/google-research/tree/master/scalable_shampoo
Author: rohananil
Author-email: rohan.anil@gmail.com
License: Apache Software License
Keywords: optax shampoo
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 3 - Alpha
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE

Optimization in machine learning, both theoretical and applied, is presently dominated by first-order gradient methods such as stochastic gradient descent. Second-order optimization methods, that involve second derivatives and/or second order statistics of the data, are far less prevalent despite strong theoretical properties, due to their prohibitive computation, memory and communication costs.

Here we present a scalable implementation of a second-order preconditioning method (concretely, a variant of full-matrix Adagrad) that provides significant convergence and wall-clock time improvements compared to conventional first-order methods on state-of-the-art deep models.

Paper preprints: https://arxiv.org/abs/2002.09018


