In our latest release, version 1.12.0.60, we now support ChatGPT-type architectures with Encoder/Decoder Transformer Models based on the open-source transformer-translator-pytorch GitHub project by Song [1]. ChatGPT uses encoder/decoder transformer models to learn the context of the input query, the context of the likely responses, and a mapping between the two via attention layers. The …
Visually Walking Through a Transformer Model
With GPT and ChatGPT, transformer models have proven to be very powerful AI models. But how do they work on the inside? In this post, we use the SignalPop AI Designer to visually walk through the forward pass of a transformer model used for language translation. Before showing a visual walk-through, we wanted to …
Debugging Difficult AI Models
While completing the MyCaffe implementation of the transformer encoder/decoder model for language translation, we ran into a bug that was very difficult to fix. In fact, it was the kind of bug feared most when developing a model, because everything appeared to work as expected when training. Yet, the model would train up …
Data Flow when Training an Encoder/Decoder Model for Language Translation
When training any deep learning AI model, knowing your data is critical. This is especially true when training a transformer-based encoder/decoder model, where data ordering matters. In this post, we analyze the Python open-source language translation encoder/decoder transformer model by Jaewoo (Kyle) Song [1], which is based on the ‘Attention Is All You Need’ …
Converting a GPT Model into a full Encoder/Decoder Transformer Model
GPT is a great transformer model used to solve many natural language problems; however, GPT only implements the encoder side of a full encoder/decoder transformer model as described by Vaswani et al. [1]. Only a few changes are needed to implement a full encoder/decoder transformer model as shown below (GPT portion inspired by the minGPT …
GPT now supported with Transformer Models using CUDA 11.8 and cuDNN 8.6!
In our latest release, version 1.11.8.27, we now support GPT and Transformer Models based on the open-source minGPT GitHub project by Karpathy [1]. GPT uses transformer models to learn the context of the input data via attention layers. Stacking up a set of transformer blocks tends to learn context at several different levels from …
New Release with New Samples
In our latest release, version 1.11.7.7, we showcase several new loss samples that demonstrate binary classification, multi-class classification, multi-label classification and regression with the new MSE and MAE layers, all using the latest NVIDIA CUDA 11.7.1 / cuDNN 8.4.1 release. Binary Classification: The binary classification sample solves a simple 2-class classification problem, where the …
Using MyCaffe AI Platform in real-time inferencing
The SignalPop Trading Studio is a Windows Store app that provides short-term option traders with real-time analytics geared to help the trader better understand what the market is doing during each intra-day trading session. The analytics provided by the SignalPop Trading Studio include real-time, AI-driven price directional predictions which, when taken together, give …
minGPT – How It Works
minGPT, created by Andrej Karpathy, is a simplified implementation of the original OpenAI GPT-2 open-source project. GPT has proven very useful in solving many Natural Language Processing (NLP) problems and, as shown by Karpathy and others, can also be used to solve tasks outside of the NLP domain such as generative image processing and classification. One of …
Three Big Version 1.0 Releases!
The MyCaffe AI Platform, SignalPop AI Designer and new SignalPop Trading Studio are all now released as 1.+ versions! All of our products use the MyCaffe AI Platform to provide fast AI inferencing solutions on low-cost NVIDIA GPUs. Some of these GPUs can be purchased for under $250 yet still run AI inferencing loads very quickly! For …