A curated collection of quick-reference Stanford guides covering core topics in Artificial Intelligence, Machine Learning, and Deep Learning. These AI/ML cheatsheets are designed to help students, developers, and researchers quickly recall important concepts and formulas.

 

1. INTRODUCTION TO AI AND ML

1.1) What is Artificial Intelligence (AI)?

Artificial Intelligence (AI) is the ability of machines or computer programs to perform tasks that typically require human intelligence. These tasks can include understanding language, recognizing patterns, solving problems, and making decisions.
Simple explanation: AI is when machines are made smart enough to think and act like humans.
Examples:

  • Voice assistants like Alexa
  • Image recognition systems
  • Chatbots
  • Self-driving cars

1.2) Types of AI: Narrow AI vs. General AI

Narrow AI (Weak AI):
  • Designed to perform one specific task
  • Cannot do anything beyond its programming
  • Examples: Email spam filters, facial recognition, recommendation systems
General AI (Strong AI):
  • Still under research and development
  • Can learn and perform any intellectual task a human can do
  • Would have reasoning, memory, and decision-making abilities similar to a human

1.3) What is Machine Learning (ML)?

Machine Learning is a subfield of AI that allows machines to learn from data and improve their performance over time without being explicitly programmed.
Simple explanation: Instead of writing rules for everything, we give the machine data, and it figures out the rules on its own.
Example: A machine learns to identify spam emails by studying thousands of examples.

1.4) Supervised vs. Unsupervised vs. Reinforcement Learning

Supervised Learning:
  • The training data includes both the input and the correct output (labels)
  • The model learns by comparing its output with the correct output
  • Example: Predicting house prices based on features like size, location, etc.
Unsupervised Learning:
  • The data has no labels
  • The model tries to find patterns or groupings in the data
  • Example: Grouping customers based on purchasing behavior
Reinforcement Learning:
  • The model learns through trial and error
  • It receives rewards or penalties based on its actions
  • Example: A robot learning to walk or a program learning to play chess
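As a minimal sketch of the supervised case (with made-up toy data), the house-price example above can look like this: the model sees inputs paired with correct outputs and learns the rule connecting them.

```python
import numpy as np

# Labeled training data: inputs (house size in sq. meters) paired
# with the correct outputs (price in thousands). Toy numbers only.
sizes = np.array([50.0, 80.0, 100.0, 120.0, 150.0])
prices = np.array([150.0, 240.0, 300.0, 360.0, 450.0])

# Fit a line price = w * size + b by least squares:
# the model "learns" w and b by minimizing the error against the labels.
X = np.column_stack([sizes, np.ones_like(sizes)])
w, b = np.linalg.lstsq(X, prices, rcond=None)[0]

# Use the learned rule on an unseen 90 sq. meter house.
prediction = w * 90.0 + b
print(round(prediction, 1))
```

Unsupervised learning would receive only `sizes`-style inputs with no `prices`, and reinforcement learning would receive rewards instead of labeled examples.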

1.5) Key Terminologies

Model: A program or function that makes predictions or decisions based on data.
Feature: An input variable used in making predictions (e.g., age, income, temperature).
Target: The value the model is trying to predict (e.g., house price, spam or not).
Algorithm: A step-by-step method or set of rules used to train the model.
Training: The process of teaching the model using a dataset.
Testing: Evaluating the trained model on new, unseen data to measure its accuracy.

1.6) Applications of AI and ML

  • Recommendation systems (YouTube, Amazon, Netflix)
  • Fraud detection in banking
  • Language translation
  • Healthcare diagnosis
  • Self-driving vehicles
  • Stock market prediction
  • Chatbots and customer support
  • Social media content moderation

 

2. MATHEMATICS FOR ML/AI

Mathematics is the foundation of AI and Machine Learning. It helps us understand how algorithms work under the hood and how to fine-tune models for better performance.

2.1) Linear Algebra

Linear Algebra deals with numbers organized in arrays and how these arrays interact. It is used in almost every ML algorithm.

2.1.1) Vectors, Matrices, and Tensors
  • Vector: A 1D array of numbers. Example: [3, 5, 7]. Used to represent features such as height, weight, and age.
  • Matrix: A 2D array (rows and columns). Used to store datasets or model weights.
  • Tensor: A generalization of vectors and matrices to three or more dimensions. Example: an image is a 3D tensor (height, width, color channels).
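In NumPy, the three structures differ only in their number of dimensions, as this small sketch shows:

```python
import numpy as np

# Vector: a 1D array (e.g., one sample's features: height, weight, age).
v = np.array([3, 5, 7])

# Matrix: a 2D array (rows = samples, columns = features).
M = np.array([[1, 2, 3],
              [4, 5, 6]])

# Tensor: 3D or higher; here, a tiny 2x2 RGB image (height, width, channels).
T = np.zeros((2, 2, 3))

print(v.ndim, M.ndim, T.ndim)  # number of dimensions: 1, 2, 3
```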
2.1.2) Matrix Operations

Addition/Subtraction: Add or subtract corresponding elements of two matrices.
Multiplication: Used to combine weights and inputs in ML models.
Transpose: Flip a matrix over its diagonal.
Dot Product: Fundamental in calculating output in neural networks.
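The four operations above can be sketched in NumPy with small hand-checkable matrices:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

added = A + B        # element-wise addition
subtracted = A - B   # element-wise subtraction
product = A @ B      # matrix multiplication (e.g., weights times inputs)
transposed = A.T     # transpose: flip the matrix over its diagonal

# Dot product: the core of one neuron's output (weights . inputs).
weights = np.array([0.5, -0.5])
inputs = np.array([1.0, 2.0])
dot = np.dot(weights, inputs)
print(dot)
```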

2.1.3) Eigenvalues and Eigenvectors

Eigenvector: A direction that a transformation only stretches or shrinks, without rotating it.
Eigenvalue: The factor by which the corresponding eigenvector is stretched or shrunk.
These are used in algorithms like Principal Component Analysis (PCA) for dimensionality reduction.
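As a small sketch, NumPy can compute both for a symmetric matrix (the kind PCA works with, e.g., a covariance matrix); the toy matrix below is made up for illustration:

```python
import numpy as np

# A symmetric 2x2 matrix, standing in for a covariance matrix in PCA.
C = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is NumPy's routine for symmetric matrices; eigenvalues come back
# in ascending order, eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(C)

# Defining property: C @ v = eigenvalue * v (the direction is unchanged,
# only stretched by the eigenvalue).
v = eigenvectors[:, 0]
assert np.allclose(C @ v, eigenvalues[0] * v)

print(eigenvalues)
```

In PCA, the eigenvectors with the largest eigenvalues are the directions along which the data varies most.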
 

2.2) Probability and Statistics

Probability helps machines make decisions under uncertainty, and statistics helps us understand data and model performance.

2.2.1) Mean, Variance, Standard Deviation

Mean: The average value.
Variance: How spread out the values are from the mean.
Standard Deviation: The square root of variance. It measures spread in the same units as the original data.
Used to understand the distribution and behavior of features in datasets.
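All three statistics are one-liners in NumPy; the small dataset below is chosen so the results are easy to verify by hand:

```python
import numpy as np

values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = values.mean()      # average value: 40 / 8 = 5
variance = values.var()   # average squared distance from the mean: 32 / 8 = 4
std = values.std()        # square root of variance: 2

print(mean, variance, std)
```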

2.2.2) Bayes Theorem

A formula that relates the conditional probability of A given B to the reverse conditional probability of B given A:

P(A|B) = P(B|A) × P(A) / P(B)

Used in Naive Bayes classifiers for spam detection, document classification, etc.
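As a sketch with made-up numbers, Bayes' theorem answers the spam-filter question "given that an email contains the word 'free', how likely is it spam?":

```python
# Illustrative (made-up) probabilities for a toy spam filter.
p_spam = 0.2                  # P(spam): 20% of all email is spam
p_word_given_spam = 0.6       # P("free" | spam)
p_word_given_ham = 0.05       # P("free" | not spam)

# Total probability of seeing the word "free" in any email:
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Bayes' theorem: P(spam | "free") = P("free" | spam) * P(spam) / P("free")
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(p_spam_given_word)
```

With these numbers the word "free" raises the spam probability from 20% to 75%, which is exactly how a Naive Bayes classifier accumulates evidence from words.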

2.2.3) Conditional Probability

The probability of one event occurring given that another event has already occurred. Example: Probability that a user clicks an ad given that they are between 20-30 years old.

2.2.4) Probability Distributions

Normal Distribution: Bell-shaped curve. Common in real-world data, like height, exam scores.
Binomial Distribution: Used for yes/no type outcomes. Example: Flipping a coin 10 times.
Poisson Distribution: For events happening over a time period. Example: Number of customer calls per hour.
These distributions help in modeling randomness in data.
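The three distributions can be sampled directly with NumPy's random generator; the parameters below (mean height, coin bias, call rate) are made-up illustrations:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Normal: bell-shaped, e.g., heights with mean 170 cm and std 10 cm.
heights = rng.normal(loc=170, scale=10, size=10_000)

# Binomial: number of heads in 10 fair coin flips, repeated many times.
heads = rng.binomial(n=10, p=0.5, size=10_000)

# Poisson: number of customer calls per hour, averaging 4 per hour.
calls = rng.poisson(lam=4, size=10_000)

# Sample means land close to the distribution parameters.
print(heights.mean(), heads.mean(), calls.mean())
```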

2.3) Calculus for Optimization

Calculus helps in training models by optimizing them to reduce errors.

2.3.1) Derivatives and Gradients

Derivative: Measures how a function changes as its input changes.
Gradient: A vector of partial derivatives (one per input) that gives the slope of a function in multiple dimensions.
Used to find the direction in which the model should adjust its weights.
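A derivative can be sketched numerically with the central-difference approximation; here the function f(x) = x² is used because its exact derivative, 2x, is easy to check:

```python
def f(x):
    # Example function: f(x) = x^2, whose exact derivative is 2x.
    return x ** 2

def numerical_derivative(f, x, h=1e-6):
    # Central difference: slope of f between x - h and x + h.
    return (f(x + h) - f(x - h)) / (2 * h)

slope = numerical_derivative(f, 3.0)  # exact answer: 2 * 3 = 6
print(round(slope, 3))
```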

2.3.2) Gradient Descent

An optimization algorithm used to minimize the loss (error) function.
How it works:

  • Start with random values
  • Calculate the gradient (slope)
  • Move slightly in the opposite direction of the gradient
  • Repeat until the loss is minimized

Gradient Descent is the core of many training algorithms in ML and DL.
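The four steps above can be sketched on a toy one-parameter loss, f(w) = (w − 4)², whose minimum is known to be at w = 4:

```python
# Minimize the loss f(w) = (w - 4)^2 with gradient descent.
# Its gradient is f'(w) = 2 * (w - 4), and the minimum is at w = 4.

w = 0.0               # start with an arbitrary value
learning_rate = 0.1   # how far to move on each step

for step in range(100):
    gradient = 2 * (w - 4)            # slope of the loss at the current w
    w = w - learning_rate * gradient  # move opposite to the gradient

print(round(w, 4))  # converges to the minimum at w = 4
```

In a real model, w would be a whole vector of weights and the gradient would come from backpropagation, but the update rule is the same.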
