Here, I share everything from projects I've worked on, things I've learned, my personal interests, and even notes I've jotted down along the way.
Feel free to reach out and share any feedback; I'm always open to it!
LMs are trained to predict the next word based on the context of the previous words. To make accurate predictions, however, LMs need to understand the relationships between the words in a sentence. This is the objective of the attention mechanism: it helps the LM focus on the words most relevant to the current context when making predictions. In this post, we'll implement scaled dot-product attention in a simple way. Back in the day, RNNs were the standard for sequence-to-sequence tasks, but everything changed when attention mechanisms came into the picture. Then, the groundbreaking paper "Attention Is All You Need" shook things up even more, showing that RNNs weren't necessary at all; attention alone could handle it. Since then, attention has become the backbone of modern architectures like Transformers.
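As a rough preview of what the post builds toward, here is a minimal sketch of scaled dot-product attention in PyTorch; the function name, tensor shapes, and example values are my own illustration, not the post's exact code:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value):
    # query, key, value: (batch, seq_len, d_k)
    d_k = query.size(-1)
    # Similarity between every query and every key, scaled by sqrt(d_k)
    scores = query @ key.transpose(-2, -1) / d_k ** 0.5
    # Normalize scores into attention weights over the sequence
    weights = F.softmax(scores, dim=-1)
    # Weighted sum of values: each position attends to the most relevant tokens
    return weights @ value, weights

# Example: a batch of 2 sequences, 4 tokens each, embedding size 8
x = torch.randn(2, 4, 8)
out, attn = scaled_dot_product_attention(x, x, x)  # self-attention: q = k = v
print(out.shape, attn.shape)  # torch.Size([2, 4, 8]) torch.Size([2, 4, 4])
```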
Implementing Multi-Head Attention from Scratch with PyTorch
In our previous article, we built Self-Attention from scratch using PyTorch. If you haven't checked that out yet, I highly recommend giving it a read before this one! Now, let's take things a step further and implement Multi-Head Attention from scratch. This post focuses more on the implementation than the theory, so I assume you're already familiar with how self-attention works. Let's get started!
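To give a flavour of the implementation, here is a minimal sketch of a multi-head attention module in PyTorch; the class and parameter names are assumptions for illustration and may differ from the post's actual code:

```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model, num_heads):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        # One projection each for queries, keys, values, plus an output projection
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, x):
        b, t, _ = x.shape
        # Project, then split the model dimension into (num_heads, d_head)
        def split(proj):
            return proj.view(b, t, self.num_heads, self.d_head).transpose(1, 2)
        q, k, v = split(self.w_q(x)), split(self.w_k(x)), split(self.w_v(x))
        # Scaled dot-product attention applied independently per head
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5
        weights = scores.softmax(dim=-1)
        out = weights @ v
        # Merge the heads back together and apply the output projection
        out = out.transpose(1, 2).contiguous().view(b, t, -1)
        return self.w_o(out)

x = torch.randn(2, 4, 16)
mha = MultiHeadAttention(d_model=16, num_heads=4)
print(mha(x).shape)  # torch.Size([2, 4, 16])
```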
LSTM from Scratch
In this post, we will implement a simple next-word predictor LSTM from scratch using PyTorch.
A Gentle Introduction to LSTM
Long Short-Term Memory networks, usually just called "LSTMs", are a special kind of RNN capable of learning long-term dependencies. They were introduced by Hochreiter & Schmidhuber (1997). As LSTMs are also a type of recurrent neural network, they have a hidden state, but they carry an additional memory component as well, called the cell state.
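As a hedged sketch of the idea, a single LSTM step could look like the following in PyTorch; the class name and gate layout here are my own illustration, not necessarily the post's implementation:

```python
import torch
import torch.nn as nn

class LSTMCell(nn.Module):
    """One LSTM time step: updates both the hidden state and the cell state."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # A single linear layer computes all four gates from [x_t, h_{t-1}]
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)

    def forward(self, x_t, h_prev, c_prev):
        z = self.gates(torch.cat([x_t, h_prev], dim=-1))
        i, f, g, o = z.chunk(4, dim=-1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)  # input, forget, output gates
        g = torch.tanh(g)                      # candidate cell update
        c_t = f * c_prev + i * g               # cell state: long-term memory
        h_t = o * torch.tanh(c_t)              # hidden state: short-term output
        return h_t, c_t

# Example: one time step with batch size 2, input size 8, hidden size 16
cell = LSTMCell(8, 16)
x_t = torch.randn(2, 8)
h, c = torch.zeros(2, 16), torch.zeros(2, 16)
h, c = cell(x_t, h, c)
print(h.shape, c.shape)  # torch.Size([2, 16]) torch.Size([2, 16])
```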
Jenkins
What is Jenkins?
Jenkins is an open-source automation server used primarily for continuous integration (CI) and continuous deployment (CD). Continuous integration is an integral part of DevOps, and Jenkins is the most widely used CI tool. In this article, I will focus on Jenkins architecture, then walk you through writing a Jenkins pipeline to automate CI/CD for a project.
In this blog, I'll cover:
- Jenkins architecture
- Writing a Jenkins pipeline for automating CI/CD
- Setting up a webhook to trigger deployments automatically
Jenkins Architecture