using System.Collections.Generic;
double m_dfDetachedWeightDecayRate = 0.0f;
public AdamWSolver(CudaDnn<T> cuda, Log log, SolverParameter p, CancelEvent evtCancel, AutoResetEvent evtForceSnapshot, AutoResetEvent evtForceTest, IXDatabaseBase db, IXPersist<T> persist, int nSolverCount = 1, int nSolverRank = 0, Net<T> shareNet = null, onGetWorkspace getws = null, onSetWorkspace setws = null)
    : base(cuda, log, p, evtCancel, evtForceSnapshot, evtForceTest, db, persist, nSolverCount, nSolverRank, shareNet, getws, setws)
for (int i = 0; i < colNetParams.Count; i++)
List<int> rgShape = colNetParams[i].shape();
public override void ComputeUpdateValue(int param_id, double dfRate, int nIterationOverride = -1)
if (!colNetParams[param_id].DiffExists)
if (nIterationOverride == -1)
List<double?> net_params_lr = m_net.params_lr;
double dfLocalRate = dfRate * net_params_lr[param_id].GetValueOrDefault(0);
List<double?> net_params_decay = net.params_weight_decay;
double dfLocalDecay = m_dfDetachedWeightDecayRate * net_params_decay[param_id].GetValueOrDefault(0);
T fBeta1 = Utility.ConvertVal<T>(dfBeta1);
T fBeta2 = Utility.ConvertVal<T>(dfBeta2);
int nUpdateHistoryOffset = colNetParams.Count;
int nT = nIterationOverride + 1;
int nN = colNetParams[param_id].count();
colNetParams[param_id].mutable_gpu_diff,
Utility.ConvertVal<T>(dfEpsHat),
Utility.ConvertVal<T>(dfLocalRate),
Utility.ConvertVal<T>(dfLocalDecay),
colNetParams[param_id].mutable_gpu_data,
The CancelEvent provides an extension to the manual cancel event that allows for overriding the manua...
The Log class provides general output in text form.
The Utility class provides general utility functions.
The BlobCollection contains a list of Blobs.
int Count
Returns the number of items in the collection.
The Blob is the main holder of data that moves through the Layers of the Net.
long mutable_gpu_data
Returns the data GPU handle used by the CudaDnn connection.
The CudaDnn object is the main interface to the Low-Level Cuda C++ DLL.
Connects Layers together into a directed acyclic graph (DAG) specified by a NetParameter.
The SolverParameter is a parameter for the solver, specifying the train and test networks.
double delta
Numerical stability for RMSProp, AdaGrad, AdaDelta, Adam and AdamW solvers (default = 1e-08).
double momentum2
An additional momentum property for the Adam and AdamW solvers (default = 0.999).
double adamw_decay
Specifies the 'AdamW' detached weight decay value used by the 'AdamW' solver (default = 0....
double momentum
Specifies the momentum value - used by all solvers EXCEPT the 'AdaGrad' and 'RMSProp' solvers....
The AdamW solver uses gradient-based optimization like Adam, with decoupled weight decay.
AdamWSolver(CudaDnn< T > cuda, Log log, SolverParameter p, CancelEvent evtCancel, AutoResetEvent evtForceSnapshot, AutoResetEvent evtForceTest, IXDatabaseBase db, IXPersist< T > persist, int nSolverCount=1, int nSolverRank=0, Net< T > shareNet=null, onGetWorkspace getws=null, onSetWorkspace setws=null)
The AdamWSolver constructor.
override void ComputeUpdateValue(int param_id, double dfRate, int nIterationOverride=-1)
Compute the AdamWSolver update value that will be applied to the learnable blobs in the training Net.
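For orientation, the update that ComputeUpdateValue hands to the GPU can be sketched with the commonly used AdamW rule. This is an assumption based on the standard formulation, not a transcription of the CUDA kernel, whose arithmetic is not shown in this listing. The symbol mapping is also assumed: \beta_1 and \beta_2 stand for fBeta1 and fBeta2, \epsilon for the stability term passed as dfEpsHat, \alpha for dfLocalRate, \lambda for dfLocalDecay, t for nT, g_t for the parameter diff, and \theta for the parameter data.

```latex
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1-\beta_2)\, g_t^{2} \\
\hat{m}_t &= \frac{m_t}{1-\beta_1^{t}}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^{t}} \\
\theta_t &= \theta_{t-1} - \alpha \left( \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} + \lambda\, \theta_{t-1} \right)
\end{aligned}
```

The decay term \lambda \theta_{t-1} is applied to the weights directly rather than being added to the gradient, which is what "decoupled" refers to. Implementations differ on whether the decay is scaled by the learning rate, so the last line may not match the kernel exactly.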
virtual void AdamPreSolve()
Runs the AdamSolver pre-solve, which prepares the Solver to start Solving.
Stochastic Gradient Descent solver with momentum updates weights by a linear combination of the negat...
BlobCollection< T > m_colHistory
History maintains the historical momentum data.
SolverParameter m_param
Specifies the SolverParameter that defines how the Solver operates.
CudaDnn< T > m_cuda
Specifies the instance of CudaDnn used by the Solver that provides a connection to Cuda.
Net< T > net
Returns the main training Net.
int m_nIter
Specifies the current iteration.
Net< T > m_net
Specifies the training Net.
Log m_log
Specifies the Log for output.
The IXDatabaseBase interface defines the general interface to the in-memory database.
The IXPersist interface is used by the CaffeControl to load and save weights.
The MyCaffe.basecode contains all generic types used throughout MyCaffe.
The MyCaffe.common namespace contains common MyCaffe classes.
The MyCaffe.db.image namespace contains all image database related classes.
The MyCaffe.param namespace contains parameters used to create models.
The MyCaffe.solvers namespace contains all solver classes, including the base Solver.
The MyCaffe namespace contains the main body of MyCaffe code that closely tracks the C++ Caffe open-...