ChatGPT architecture now supported with Encoder/Decoder Transformer Models using CUDA 11.8 and cuDNN 8.8!

In our latest release, version 1.12.0.60, we now support ChatGPT-type architectures with Encoder/Decoder Transformer Models based on the open-source transformer-translator-pytorch GitHub project by Song [1]. ChatGPT uses encoder/decoder transformer models to learn the context of the input query, the context of the likely responses, and a mapping between the two via attention layers. The …

Data Flow when Training an Encoder/Decoder Model for Language Translation

When training any deep learning AI model, knowing your data is critical. This is especially important when training a transformer-based encoder/decoder model, where data ordering matters. In this post, we analyze the open-source Python language translation encoder/decoder transformer model by Jaewoo (Kyle) Song [1], which is based on the ‘Attention Is All You Need’ …

Converting a GPT Model into a full Encoder/Decoder Transformer Model

GPT is a great transformer model used to solve many natural language problems; however, GPT only implements the encoder side of a full encoder/decoder transformer model as described by Vaswani et al. [1]. Only a few changes are needed to implement a full encoder/decoder transformer model, as shown below (GPT portion inspired by the minGPT …

GPT now supported with Transformer Models using CUDA 11.8 and cuDNN 8.6!

In our latest release, version 1.11.8.27, we now support GPT and Transformer Models based on the open-source minGPT GitHub project by Karpathy [1]. GPT uses transformer models to learn the context of the input data via attention layers. Stacking up a set of transformer blocks tends to learn context at several different levels from …

New Release with New Samples

In our latest release, version 1.11.7.7, we showcase several new loss samples that demonstrate binary classification, multi-class classification, multi-label classification and regression with the new MSE and MAE layers – all using the latest NVIDIA CUDA 11.7.1 / cuDNN 8.4.1 release. Binary Classification: The binary classification sample solves a simple 2-class classification problem, where the …

Using MyCaffe AI Platform in real-time inferencing

The SignalPop Trading Studio is a Windows Store app that provides short-term option traders with real-time analytics geared to help the trader better understand what the market is doing during each intra-day trading session. Part of the analytics provided by the SignalPop Trading Studio includes real-time, AI-driven price directional predictions which, when taken together, give …