The InputData is an abstract class used to get training data and tokenize input data.
Public Member Functions
| | InputData (int? nRandomSeed=null) | The constructor. |
| abstract bool | GetDataAvailabilityAt (int nIdx, bool bIncludeSrc, bool bIncludeTrg) | Returns true if data is available at the given index. |
| abstract Tuple<float[], float[]> | GetData (int nBatchSize, int nBlockSize, InputData trgData, out int[] rgnIdx) | Gets a set of randomly selected source/target data, where the target may be null. |
| abstract Tuple<float[], float[]> | GetDataAt (int nBatchSize, int nBlockSize, int[] rgnIdx) | Gets a set of source/target data from a specific index. |
| abstract List<int> | Tokenize (string str, bool bAddBos, bool bAddEos) | Tokenizes an input string using the internal vocabulary. |
| abstract string | Detokenize (int nTokIdx, bool bIgnoreBos, bool bIgnoreEos) | Detokenizes a single token. |
| abstract string | Detokenize (float[] rgf, int nStartIdx, int nCount, bool bIgnoreBos, bool bIgnoreEos) | Detokenizes an array into a string. |
Protected Attributes
| Random | m_random | Specifies the random object made available to the derived classes. |
Properties
| abstract List<string> | RawData [get] | Returns the raw data. |
| abstract uint | TokenSize [get] | Returns the size of a single token (e.g., 1 for character data). |
| abstract uint | VocabularySize [get] | Returns the size of the vocabulary. |
| abstract char | BOS [get] | Returns the special beginning-of-sequence (BOS) character. |
| abstract char | EOS [get] | Returns the special end-of-sequence (EOS) character. |
Detailed Description
The InputData is an abstract class used to get training data and tokenize input data.
Definition at line 112 of file Interfaces.cs.
◆ InputData()
MyCaffe.layers.gpt.InputData.InputData (int? nRandomSeed = null)
The constructor.
- Parameters
| nRandomSeed | Optionally, specifies the seed to use for testing. |
Definition at line 123 of file Interfaces.cs.
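The effect of the optional seed can be illustrated with a small stand-alone sketch. The classes `InputDataSketch` and `SeededSketch` below are hypothetical stand-ins, not the MyCaffe types, and the way the constructor initializes `m_random` is an assumption; the sketch only shows why a fixed seed makes random selection repeatable in tests.

```csharp
using System;

// Hypothetical stand-in for the documented base class. Only the members
// needed for this sketch are reproduced; the constructor body is an assumption.
public abstract class InputDataSketch
{
    // Mirrors the documented protected m_random field.
    protected Random m_random;

    protected InputDataSketch(int? nRandomSeed = null)
    {
        // Assumption: a seed gives a deterministic Random, no seed a default one.
        m_random = nRandomSeed.HasValue ? new Random(nRandomSeed.Value) : new Random();
    }
}

// Hypothetical derived class that exposes a single random draw for demonstration.
public class SeededSketch : InputDataSketch
{
    public SeededSketch(int? nRandomSeed = null) : base(nRandomSeed) { }

    // Returns a random index in the range [0, nMax).
    public int NextIndex(int nMax)
    {
        return m_random.Next(nMax);
    }
}
```

Two `SeededSketch` instances constructed with the same seed return the same sequence from `NextIndex`, which is the property a test would typically rely on.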
◆ Detokenize() [1/2]
abstract string MyCaffe.layers.gpt.InputData.Detokenize (float[] rgf, int nStartIdx, int nCount, bool bIgnoreBos, bool bIgnoreEos) [pure virtual]
◆ Detokenize() [2/2]
abstract string MyCaffe.layers.gpt.InputData.Detokenize (int nTokIdx, bool bIgnoreBos, bool bIgnoreEos) [pure virtual]
◆ GetData()
abstract Tuple<float[], float[]> MyCaffe.layers.gpt.InputData.GetData (int nBatchSize, int nBlockSize, InputData trgData, out int[] rgnIdx) [pure virtual]
Gets a set of randomly selected source/target data, where the target may be null.
- Parameters
| nBatchSize | Specifies the number of blocks in the batch. |
| nBlockSize | Specifies the size of each block. |
| trgData | Specifies the target data, used to check whether data is available at a given index. |
| rgnIdx | Returns an array of the indexes of the data returned. |
- Returns
- A tuple containing the source data and the target data.
Implemented in MyCaffe.layers.gpt.TextInputData, MyCaffe.layers.gpt.TextListData, and MyCaffe.layers.gpt.CustomListData.
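As a rough illustration of the batch and block parameters, the following sketch samples random blocks from a flat token array. `BlockSamplerSketch` is a hypothetical class, not one of the listed implementations; it omits the `trgData` argument, and it assumes the target is the source shifted by one token, which is a common convention for next-token training but is not stated on this page.

```csharp
using System;

// Hypothetical sampler illustrating the shape of a GetData-style call.
// It is not one of the MyCaffe implementations.
public class BlockSamplerSketch
{
    private readonly float[] m_rgTokens;
    private readonly Random m_random;

    public BlockSamplerSketch(float[] rgTokens, int? nRandomSeed = null)
    {
        m_rgTokens = rgTokens;
        m_random = nRandomSeed.HasValue ? new Random(nRandomSeed.Value) : new Random();
    }

    // Assumption: the target block is the source block shifted by one token.
    public Tuple<float[], float[]> GetData(int nBatchSize, int nBlockSize, out int[] rgnIdx)
    {
        // Each block needs nBlockSize source tokens plus one extra token for the target.
        int nMaxStart = m_rgTokens.Length - nBlockSize - 1;
        if (nMaxStart < 0)
            throw new ArgumentException("Not enough tokens for the requested block size.");

        // Blocks are stored back to back, so each array holds nBatchSize * nBlockSize values.
        float[] rgSrc = new float[nBatchSize * nBlockSize];
        float[] rgTrg = new float[nBatchSize * nBlockSize];
        rgnIdx = new int[nBatchSize];

        for (int i = 0; i < nBatchSize; i++)
        {
            // Random.Next(n) returns a value in [0, n), so nMaxStart itself is reachable.
            int nStart = m_random.Next(nMaxStart + 1);
            rgnIdx[i] = nStart;
            Array.Copy(m_rgTokens, nStart, rgSrc, i * nBlockSize, nBlockSize);
            Array.Copy(m_rgTokens, nStart + 1, rgTrg, i * nBlockSize, nBlockSize);
        }

        return new Tuple<float[], float[]>(rgSrc, rgTrg);
    }
}
```

The returned `rgnIdx` array records the start index of each block, so the same blocks could later be requested again by index, in the spirit of `GetDataAt`.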
◆ GetDataAt()
abstract Tuple<float[], float[]> MyCaffe.layers.gpt.InputData.GetDataAt (int nBatchSize, int nBlockSize, int[] rgnIdx) [pure virtual]
◆ GetDataAvailabilityAt()
abstract bool MyCaffe.layers.gpt.InputData.GetDataAvailabilityAt (int nIdx, bool bIncludeSrc, bool bIncludeTrg) [pure virtual]
◆ Tokenize()
abstract List<int> MyCaffe.layers.gpt.InputData.Tokenize (string str, bool bAddBos, bool bAddEos) [pure virtual]
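To make the Tokenize/Detokenize round trip concrete, here is a minimal character-level sketch that mirrors the documented method signatures. `CharTokenizerSketch` is hypothetical and does not derive from the real `InputData` class; the choice of control characters for `BOS` and `EOS`, and the vocabulary layout, are assumptions made only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical character-level tokenizer mirroring the documented Tokenize and
// Detokenize signatures. It does not derive from the real InputData class.
public class CharTokenizerSketch
{
    // Assumption: control characters serve as the BOS/EOS markers.
    public char BOS { get { return (char)1; } }
    public char EOS { get { return (char)2; } }

    private readonly List<char> m_rgVocab;
    private readonly Dictionary<char, int> m_rgIndex = new Dictionary<char, int>();

    public CharTokenizerSketch(string strCorpus)
    {
        // Vocabulary layout: BOS, EOS, then the sorted distinct characters of the corpus.
        m_rgVocab = new List<char> { BOS, EOS };
        m_rgVocab.AddRange(strCorpus.Distinct()
                                    .Where(ch => ch != BOS && ch != EOS)
                                    .OrderBy(ch => ch));
        for (int i = 0; i < m_rgVocab.Count; i++)
            m_rgIndex[m_rgVocab[i]] = i;
    }

    public uint VocabularySize { get { return (uint)m_rgVocab.Count; } }

    public List<int> Tokenize(string str, bool bAddBos, bool bAddEos)
    {
        List<int> rgTokens = new List<int>();
        if (bAddBos) rgTokens.Add(m_rgIndex[BOS]);
        foreach (char ch in str)
            rgTokens.Add(m_rgIndex[ch]); // throws KeyNotFoundException for characters outside the vocabulary
        if (bAddEos) rgTokens.Add(m_rgIndex[EOS]);
        return rgTokens;
    }

    public string Detokenize(int nTokIdx, bool bIgnoreBos, bool bIgnoreEos)
    {
        char ch = m_rgVocab[nTokIdx];
        if ((bIgnoreBos && ch == BOS) || (bIgnoreEos && ch == EOS))
            return "";
        return ch.ToString();
    }

    public string Detokenize(float[] rgf, int nStartIdx, int nCount, bool bIgnoreBos, bool bIgnoreEos)
    {
        // Token indexes arrive as floats, matching the float[] data arrays used elsewhere.
        StringBuilder sb = new StringBuilder();
        for (int i = nStartIdx; i < nStartIdx + nCount; i++)
            sb.Append(Detokenize((int)rgf[i], bIgnoreBos, bIgnoreEos));
        return sb.ToString();
    }
}
```

With this layout, tokenizing a string with both flags set adds two tokens to the result, and detokenizing with both ignore flags set recovers the original string.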
◆ m_random
Random MyCaffe.layers.gpt.InputData.m_random [protected]
Specifies the random object made available to the derived classes.
Definition at line 117 of file Interfaces.cs.
◆ BOS
abstract char MyCaffe.layers.gpt.InputData.BOS [get]
Returns the special beginning-of-sequence (BOS) character.
Definition at line 197 of file Interfaces.cs.
◆ EOS
abstract char MyCaffe.layers.gpt.InputData.EOS [get]
Returns the special end-of-sequence (EOS) character.
Definition at line 201 of file Interfaces.cs.
◆ RawData
abstract List<string> MyCaffe.layers.gpt.InputData.RawData [get]
◆ TokenSize
abstract uint MyCaffe.layers.gpt.InputData.TokenSize [get]
Returns the size of a single token (e.g., 1 for character data).
Definition at line 138 of file Interfaces.cs.
◆ VocabularySize
abstract uint MyCaffe.layers.gpt.InputData.VocabularySize [get]
Returns the size of the vocabulary.
Definition at line 142 of file Interfaces.cs.
The documentation for this class was generated from the following file:
- C:/Data/Data/SS_Projects/Intelligence/GitHub/MyCaffe/MyCaffe.layers.gpt/layers.gpt/Interfaces.cs