InputData is an abstract class used to get training data and tokenize input data.
| Type | Member | Description |
| | InputData (int? nRandomSeed=null) | The constructor. |
| abstract bool | GetDataAvailabilityAt (int nIdx, bool bIncludeSrc, bool bIncludeTrg) | Returns true if data is available at the given index. |
| abstract Tuple< float[], float[]> | GetData (int nBatchSize, int nBlockSize, InputData trgData, out int[] rgnIdx) | Gets a set of randomly selected source/target data, where the target may be null. |
| abstract Tuple< float[], float[]> | GetDataAt (int nBatchSize, int nBlockSize, int[] rgnIdx) | Gets a set of source/target data from a specific index. |
| abstract List< int > | Tokenize (string str, bool bAddBos, bool bAddEos) | Tokenizes an input string using the internal vocabulary. |
| abstract string | Detokenize (int nTokIdx, bool bIgnoreBos, bool bIgnoreEos) | Detokenizes a single token. |
| abstract string | Detokenize (float[] rgf, int nStartIdx, int nCount, bool bIgnoreBos, bool bIgnoreEos) | Detokenizes an array into a string. |
| Random | m_random | Specifies the random object made available to the derived classes. |
| abstract List< string > | RawData [get] | Returns the raw data. |
| abstract uint | TokenSize [get] | Returns the size of a single token (e.g., 1 for character data). |
| abstract uint | VocabularySize [get] | Returns the size of the vocabulary. |
| abstract char | BOS [get] | Returns the special beginning-of-sequence (BOS) character. |
| abstract char | EOS [get] | Returns the special end-of-sequence (EOS) character. |
Definition at line 112 of file Interfaces.cs.
◆ InputData()
MyCaffe.layers.gpt.InputData.InputData (int? nRandomSeed = null)
The constructor.
Parameters
nRandomSeed | Optionally, specifies the seed to use for testing. |
Definition at line 123 of file Interfaces.cs.
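As a rough illustration of why an optional seed helps with testing, the sketch below mirrors the nullable-seed pattern in a standalone class. SeededSampler and NextIndex are hypothetical names that do not exist in MyCaffe, and it is only an assumption that the real constructor creates m_random in this way.

```csharp
using System;

// Hypothetical stand-in for a class that accepts an optional seed.
// Assumption: InputData initializes its Random object in a similar way.
public class SeededSampler
{
    private readonly Random m_random;

    public SeededSampler(int? nRandomSeed = null)
    {
        // With a seed the sequence is repeatable; without one it varies per run.
        m_random = nRandomSeed.HasValue ? new Random(nRandomSeed.Value) : new Random();
    }

    // Returns a random index in the range [0, nCount).
    public int NextIndex(int nCount)
    {
        return m_random.Next(nCount);
    }
}
```

Two SeededSampler instances built with the same seed produce the same sequence of indexes, which is what makes a seeded run reproducible in a unit test.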
◆ Detokenize() [1/2]
abstract string MyCaffe.layers.gpt.InputData.Detokenize (float[] rgf, int nStartIdx, int nCount, bool bIgnoreBos, bool bIgnoreEos)
pure virtual
◆ Detokenize() [2/2]
abstract string MyCaffe.layers.gpt.InputData.Detokenize (int nTokIdx, bool bIgnoreBos, bool bIgnoreEos)
pure virtual
◆ GetData()
abstract Tuple< float[], float[]> MyCaffe.layers.gpt.InputData.GetData (int nBatchSize, int nBlockSize, InputData trgData, out int[] rgnIdx)
pure virtual
Gets a set of randomly selected source/target data, where the target may be null.
Parameters
nBatchSize | Specifies the number of blocks in the batch. |
nBlockSize | Specifies the size of each block. |
trgData | Specifies the target data, used to check whether data is available at a given index. |
rgnIdx | Returns an array of the indexes of the data returned. |
Returns
A tuple containing the data and target is returned.
Implemented in MyCaffe.layers.gpt.TextInputData, MyCaffe.layers.gpt.TextListData, and MyCaffe.layers.gpt.CustomListData.
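To make the batch/block terminology concrete, here is a minimal standalone sketch of random block sampling over an already-tokenized array. It is not taken from any of the implementing classes: BlockSampler, rgTokens, and the extra random parameter are hypothetical, the trgData argument is left out, and building the target as the source shifted by one token is an assumption based on common GPT-style training setups.

```csharp
using System;

// Hypothetical helper, not part of MyCaffe.
public static class BlockSampler
{
    // Picks nBatchSize random start offsets and copies nBlockSize tokens from each.
    // Assumption: the target is the source block shifted forward by one token.
    public static Tuple<float[], float[]> GetData(float[] rgTokens, Random random,
        int nBatchSize, int nBlockSize, out int[] rgnIdx)
    {
        if (rgTokens.Length < nBlockSize + 1)
            throw new ArgumentException("Not enough tokens for one block plus its shifted target.");

        float[] rgSrc = new float[nBatchSize * nBlockSize];
        float[] rgTrg = new float[nBatchSize * nBlockSize];
        rgnIdx = new int[nBatchSize];

        for (int i = 0; i < nBatchSize; i++)
        {
            // Leave room for the one-token shift at the end of the block.
            int nStart = random.Next(rgTokens.Length - nBlockSize);
            rgnIdx[i] = nStart;
            Array.Copy(rgTokens, nStart, rgSrc, i * nBlockSize, nBlockSize);
            Array.Copy(rgTokens, nStart + 1, rgTrg, i * nBlockSize, nBlockSize);
        }

        return new Tuple<float[], float[]>(rgSrc, rgTrg);
    }
}
```

In this sketch both returned arrays hold nBatchSize * nBlockSize values laid out block after block, and rgnIdx records the start offset chosen for each block so the same selection could be replayed later.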
◆ GetDataAt()
abstract Tuple< float[], float[]> MyCaffe.layers.gpt.InputData.GetDataAt (int nBatchSize, int nBlockSize, int[] rgnIdx)
pure virtual
◆ GetDataAvailabilityAt()
abstract bool MyCaffe.layers.gpt.InputData.GetDataAvailabilityAt (int nIdx, bool bIncludeSrc, bool bIncludeTrg)
pure virtual
◆ Tokenize()
abstract List< int > MyCaffe.layers.gpt.InputData.Tokenize (string str, bool bAddBos, bool bAddEos)
pure virtual
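The following character-level sketch shows how Tokenize and the two Detokenize overloads might fit together with the BOS and EOS properties. CharTokenizer is a hypothetical standalone class that only mirrors the signatures listed here; it does not derive from InputData, and the vocabulary layout and the BOS/EOS character values are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical character-level tokenizer, not part of MyCaffe.
public class CharTokenizer
{
    private readonly List<char> m_rgVocab;
    private readonly Dictionary<char, int> m_rgIndex = new Dictionary<char, int>();

    // Assumed special characters; the real implementations may use others.
    public char BOS { get { return (char)1; } }
    public char EOS { get { return (char)2; } }
    public uint VocabularySize { get { return (uint)m_rgVocab.Count; } }

    public CharTokenizer(string strCorpus)
    {
        // BOS and EOS take indexes 0 and 1, followed by the sorted distinct characters.
        m_rgVocab = new List<char> { BOS, EOS };
        m_rgVocab.AddRange(strCorpus.Distinct().Where(ch => ch != BOS && ch != EOS).OrderBy(ch => ch));
        for (int i = 0; i < m_rgVocab.Count; i++)
            m_rgIndex[m_rgVocab[i]] = i;
    }

    public List<int> Tokenize(string str, bool bAddBos, bool bAddEos)
    {
        List<int> rgTokens = new List<int>();
        if (bAddBos) rgTokens.Add(m_rgIndex[BOS]);
        // Throws KeyNotFoundException for characters outside the vocabulary.
        foreach (char ch in str) rgTokens.Add(m_rgIndex[ch]);
        if (bAddEos) rgTokens.Add(m_rgIndex[EOS]);
        return rgTokens;
    }

    public string Detokenize(int nTokIdx, bool bIgnoreBos, bool bIgnoreEos)
    {
        char ch = m_rgVocab[nTokIdx];
        if ((bIgnoreBos && ch == BOS) || (bIgnoreEos && ch == EOS)) return "";
        return ch.ToString();
    }

    public string Detokenize(float[] rgf, int nStartIdx, int nCount, bool bIgnoreBos, bool bIgnoreEos)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = nStartIdx; i < nStartIdx + nCount; i++)
            sb.Append(Detokenize((int)rgf[i], bIgnoreBos, bIgnoreEos));
        return sb.ToString();
    }
}
```

With this sketch, tokenizing a string with both flags set and then detokenizing the result with both ignore flags set returns the original string, since the BOS and EOS markers are dropped on the way back.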
◆ m_random
Random MyCaffe.layers.gpt.InputData.m_random [protected]
Specifies the random object made available to the derived classes.
Definition at line 117 of file Interfaces.cs.
◆ BOS
abstract char MyCaffe.layers.gpt.InputData.BOS [get]
Returns the special beginning-of-sequence (BOS) character.
Definition at line 197 of file Interfaces.cs.
◆ EOS
abstract char MyCaffe.layers.gpt.InputData.EOS [get]
Returns the special end-of-sequence (EOS) character.
Definition at line 201 of file Interfaces.cs.
◆ RawData
abstract List<string> MyCaffe.layers.gpt.InputData.RawData [get]
◆ TokenSize
abstract uint MyCaffe.layers.gpt.InputData.TokenSize [get]
Returns the size of a single token (e.g., 1 for character data).
Definition at line 138 of file Interfaces.cs.
◆ VocabularySize
abstract uint MyCaffe.layers.gpt.InputData.VocabularySize [get]
Returns the size of the vocabulary.
Definition at line 142 of file Interfaces.cs.
The documentation for this class was generated from the following file:
- C:/Data/Data/SS_Projects/Intelligence/GitHub/MyCaffe/MyCaffe.layers.gpt/layers.gpt/Interfaces.cs