LLMs for Generative AI exploration and studies
A reflection on my experience studying IBM's online Generative AI Engineering with LLMs specialization, highlighting the most interesting parts and outcomes.
Online education and deep learning
Online education is one of the most exciting products of technology, open science and open source software. It makes learning about a topic more accessible, and it is usually self-paced or offered on a flexible schedule. There are also many kinds of study programs to choose from, such as a project (short course), a course, a specialisation, a bachelor's degree and a master's degree, along with other types of programs found on websites such as Coursera, edX, DeepLearning.ai, LinkedIn Learning and FutureLearn. When exploring the study programs on one of these websites, it is good to know that each of them is built from a group of smaller programs. For example, a master's degree consists of a group of specialisations, and a single specialisation is a group of courses.
I wanted to build on my previous education in AI, machine learning and deep learning, where the most advanced certificate I had was the Deep Learning Specialisation from DeepLearning.ai. After the Coursera LinkedIn page suggested that I study “Generative AI Engineering with LLMs” by IBM, I decided to start this program to learn about LLMs, transformer-based models and techniques more advanced and recent than the “Attention Is All You Need” [1] research paper.
Specialization overview
Starting the specialisation was exciting, since it fits my previous education well and works as a continuation of it. It consists of 6 courses, each split into two modules focused on a single topic in LLM architecture and engineering.
In the following few sections, I will describe the modules and the main outcomes I had from auditing them. I will also provide some example research papers and supplementary materials that are recent and most relevant to each topic.
Note: The section names represent the topic of the notes they contain, NOT the names of the corresponding Coursera courses.
Generative AI and LLMs: Architecture and data preparation
This course introduces the basic concepts of architecture and data preparation for LLMs. The two main application areas for GenAI are computer vision and NLP, while audio is less popular. The architectures used for NLP evolved through the following stages:
Rule-based systems, e.g. using if-then statements.
Statistical machine learning approaches, e.g. the KNN and SVM algorithms.
Deep learning architectures, e.g. CNNs and RNNs.
Transformers, e.g. GPT and LLaMA.
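To make the first two stages more concrete, here is a minimal Python sketch of my own (it is not taken from the course). It contrasts a rule-based sentiment classifier built from if-then statements with a tiny k-nearest-neighbours (KNN) classifier over bag-of-words counts. The word lists, the training sentences and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Stage 1: rule-based system. The rules are hand-written if-then statements.
# These word lists are hypothetical and would need manual upkeep in practice.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}

def rule_based_sentiment(text):
    words = set(text.lower().split())
    if words & POSITIVE and not words & NEGATIVE:
        return "positive"
    if words & NEGATIVE and not words & POSITIVE:
        return "negative"
    return "unknown"  # no rule matched, or the rules conflict

# Stage 2: statistical machine learning. KNN learns from labelled examples
# instead of hand-written rules. This toy training set is made up.
TRAIN = [
    ("i love this great film", "positive"),
    ("what a good and fun story", "positive"),
    ("really enjoyable and fun", "positive"),
    ("i hate this awful film", "negative"),
    ("what a bad and boring story", "negative"),
    ("really dull and boring", "negative"),
]

def bag_of_words(text):
    # Represent a sentence as word counts, ignoring word order.
    return Counter(text.lower().split())

def distance(a, b):
    # Squared Euclidean distance between two word-count vectors.
    return sum((a[w] - b[w]) ** 2 for w in set(a) | set(b))

def knn_sentiment(text, k=3):
    query = bag_of_words(text)
    # Take the k training sentences closest to the query, then vote.
    nearest = sorted(TRAIN, key=lambda ex: distance(query, bag_of_words(ex[0])))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(rule_based_sentiment("a boring story"))  # no rule mentions "boring"
print(knn_sentiment("a boring story"))         # decided by the nearest examples
```

The rule-based version cannot say anything about "a boring story" because nobody wrote a rule for the word "boring", while the KNN version picks up the label from similar training sentences. Deep learning architectures and transformers take this idea further by also learning the text representation itself, instead of relying on fixed word counts.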


