I realize I can write my own module/optimizer to do this, but can the existing MXNet modules be told to optimize only a subset of variables?
Along those same lines, how does a module currently determine which symbols to optimize? For example, unlike in TensorFlow, in MXNet both the data and the variables to be optimized are just "Variable" symbols, yet MXNet somehow updates only the NDArrays for the actual variables and not the data NDArrays. How does it check? Is there a naming convention it uses? If so, what is that convention? (For example, is any symbol whose name contains 'data' excluded from optimization?)
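
To make the guess in the parenthetical concrete, here is a plain-Python sketch of the rule I am imagining. It is purely hypothetical: `guess_trainable` and the argument names are made up for illustration and are not MXNet code or MXNet's actual behavior.

```python
# Hypothetical sketch of the naming rule guessed at above; this is NOT MXNet code.
def guess_trainable(arg_names, marker="data"):
    """Return the names that would be optimized if every name
    containing `marker` were treated as an input instead."""
    return [name for name in arg_names if marker not in name]


# Made-up argument names of the kind a small network might list.
arg_names = ["data", "fc1_weight", "fc1_bias", "softmax_label"]

trainable = guess_trainable(arg_names)
print(trainable)
```

Under this guess a label input such as `softmax_label` would also end up in the optimized set, which is part of why I suspect the real rule is something else.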