Mathematics is beautiful and essential for disciplined problem solving. Important concepts to be comfortable with : Linear Algebra, Matrix Decomposition, Analytic Geometry, Calculus, Probability and Statistics.
| Practical Mathematics ( c++ / py ) [ code ] | Optimizations [ code ] |
|---|---|
book : Mathematics for ML | series : Weights & Biases, StatQuest, 3Blue1Brown
> LINEAR ALGEBRA
: M4ML - Linear Algebra, 3b1b - Essence of Linear Algebra, MIT 18.06 Linear Algebra - Gilbert Strang
{ linear_algebra_cs229.pdf, linear_algebra_review.pdf, vip_referesher_linear_algebra.pdf } 📖
Linear algebra forms the backbone of many machine learning algorithms, particularly in handling high-dimensional data. Key concepts include vectors, matrices, matrix operations (addition, multiplication), and linear transformations. Examples:
- Representation of datasets as matrices.
- Matrix multiplication for linear transformations in neural networks.
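For instance, a minimal sketch (weights and input are made-up illustrative values) of a dense layer as a matrix multiplication plus a bias:
import numpy as np
# A "dense layer": 3 input features -> 2 outputs, as a linear transformation
W = np.array([[0.2, -0.5, 0.1], [0.7, 0.3, -0.2]])  # weight matrix (2x3)
b = np.array([0.1, -0.1])                            # bias vector
x = np.array([1.0, 2.0, 3.0])                        # one input sample
output = W @ x + b                                    # linear transformation of x
print("Layer output:", output)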
Eigenvalues and eigenvectors are used in algorithms like Principal Component Analysis (PCA), which reduces the dimensionality of the data.
# Eigenvalues and Eigenvectors
import numpy as np

A = np.array([[3, 1], [1, 3]])
# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)
print("Eigenvalues:", eigenvalues)
print("Eigenvectors:\n", eigenvectors)
In machine learning, the gradient of a scalar loss is the vector of its partial derivatives with respect to the parameters. In linear regression, gradients are used to update the weights during training.
import numpy as np

# Example data (features and labels)
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]]) # Features
y = np.dot(X, np.array([1, 2])) + 3 # Labels
# Initial weights
w = np.array([0.1, 0.2])
# Predicted values
y_pred = np.dot(X, w)
# Gradient of the (half) mean squared error loss: X^T (y_pred - y) / n
gradient = np.dot(X.T, (y_pred - y)) / len(y)
print("Gradient:", gradient)
In Robotics [SLAM] : Lie Group
: A Lie group is a group that is also a differentiable manifold, meaning it has a smooth structure that allows for calculus to be performed. The group operations (multiplication and inversion) are smooth maps. Lie groups are used to study continuous symmetries.
Lie Algebra
: Associated with every Lie group is a Lie algebra, which is a vector space equipped with a binary operation called the Lie bracket. The Lie algebra captures the infinitesimal structure of the Lie group and is crucial in understanding its properties.
More on Lie Group and Lie Algebra from my notes @slam/lie
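As a small numerical illustration (my own sketch, not from the notes above), the exponential map takes an element of the Lie algebra so(3) to a rotation in the Lie group SO(3); the code assumes plain NumPy and uses Rodrigues' formula:
import numpy as np

def hat(w):
    # Map a 3-vector in so(3) to its skew-symmetric matrix
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0]])

def exp_so3(w):
    # Exponential map so(3) -> SO(3) via Rodrigues' formula
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    K = hat(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

R = exp_so3(np.array([0.0, 0.0, np.pi / 2]))  # 90-degree rotation about z
print(np.round(R, 3))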
Matrix Decomposition: MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and ML
> ANALYTIC GEOMETRY
: MIT Graph and Geometry Reading Group [ Discrete Differential Geometry - CMU 15-458/858 ]
{ Geometric foundations of deep learning, @github/awesome-neural-geometry, AMMI 2022 Course "Geometric Deep Learning" , Geometric Deep Learning, mathematical_methods_computervision_robotics_graphics.pdf }
> CALCULUS
: Calculus I (Limits, Derivatives, Integrals), Calculus II (Integration Methods, Series, Parametric/Polar, Vectors), Calculus III: Multivariable Calculus (Vectors, Curves, Partial Derivatives, Integration), Mathematics for Machine Learning - Multivariate Calculus; Vector Calculus: Calculus IV: Vector Calculus.
Calculus provides tools for optimization, which is fundamental in training machine learning models. Concepts such as derivatives, gradients, and optimization techniques like gradient descent are indispensable. Examples:
- Calculating gradients for updating model parameters in gradient descent.
- Using second derivatives for optimization methods like Newton's method (a one-step Newton sketch follows the Hessian example below).
In optimization and machine learning, the Jacobian matrix collects the first-order partial derivatives of a vector-valued function (for a scalar function it reduces to the gradient), and the Hessian matrix contains the second-order partial derivatives of a scalar function. Let's compute the Jacobian and Hessian of f(x, y) = x^2 + y^2:
import autograd.numpy as np
from autograd import jacobian, hessian
# Define a multivariable function f(x, y) = x^2 + y^2
def f_xy(params):
    x, y = params
    return x**2 + y**2
# Compute the Jacobian and Hessian of f
jacobian_f = jacobian(f_xy)
hessian_f = hessian(f_xy)
# Evaluate at (x, y) = (1, 2)
params = np.array([1.0, 2.0])
print("Jacobian at (1, 2):", jacobian_f(params))
print("Hessian at (1, 2):\n", hessian_f(params))
> PROBABILITY AND DISTRIBUTION
: Random Variable and Probability Distribution, Discrete Probability
{ prob_theory_review_for_ml.pdf, review_probability_theory.pdf, vip_refresher_probability_statistics.pdf } 📖
Probability theory deals with uncertainty and randomness. Concepts include probability distributions, conditional probability, and Bayes' theorem. Examples:
- Gaussian distribution in Gaussian Naive Bayes classifier.
- Conditional probability in Hidden Markov Models.
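A small worked example of Bayes' theorem (the test probabilities are illustrative, not from any source above):
# Bayes' theorem on an illustrative diagnostic test
p_disease = 0.01             # prior P(D)
p_pos_given_disease = 0.95   # likelihood P(+|D)
p_pos_given_healthy = 0.05   # false positive rate P(+|not D)
# Total probability of a positive test
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)
# Posterior P(D|+)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print("P(disease | positive test):", round(p_disease_given_pos, 4))  # about 0.161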
> CONTINUOUS OPTIMIZATION
: Optimization Methods for Machine Learning and Engineering, CS769 - Optimization in Machine Learning
Optimization techniques are crucial for training machine learning models by finding the optimal parameters that minimize a cost function. Concepts include convex optimization, constrained optimization, and stochastic optimization. Examples:
- Gradient descent for optimizing neural network weights.
- L-BFGS (Limited-memory Broyden–Fletcher–Goldfarb–Shanno) for large-scale smooth optimization (its L-BFGS-B variant handles bound constraints).
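A minimal sketch of running L-BFGS-B through SciPy (scipy.optimize.minimize); the Rosenbrock function stands in for a model's loss:
import numpy as np
from scipy.optimize import minimize

# Minimize the Rosenbrock function as a stand-in for a training loss
def rosenbrock(x):
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

result = minimize(rosenbrock, x0=np.array([-1.0, 1.0]), method="L-BFGS-B")
print("Minimum found at:", result.x)  # expected near [1, 1]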
> INFORMATION THEORY
Information theory provides insights into the representation and transmission of information. Concepts include entropy, mutual information, and the Kullback-Leibler divergence. Examples:
- Cross-entropy loss function in classification tasks.
- Mutual information in feature selection.
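A short numerical check of entropy, cross-entropy, and KL divergence for two illustrative discrete distributions:
import numpy as np

p = np.array([0.5, 0.3, 0.2])   # "true" distribution
q = np.array([0.4, 0.4, 0.2])   # model distribution

entropy = -np.sum(p * np.log2(p))            # H(p)
cross_entropy = -np.sum(p * np.log2(q))      # H(p, q)
kl_divergence = np.sum(p * np.log2(p / q))   # D_KL(p || q) = H(p, q) - H(p)

print("Entropy H(p):", round(entropy, 4))
print("Cross-entropy H(p, q):", round(cross_entropy, 4))
print("KL divergence D_KL(p || q):", round(kl_divergence, 4))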
I took UG math classes @AEC : Advanced Mathematics and Numerical Analysis (MA 401), Discrete Mathematics (MA 477), Graph Theory (CS 7751EL), Calculus, Probability, Statistics and Linear Algebra. I completed "Mathematics for ML - ICL" and in 2017 wrote articles on phyllotaxis, among others. I participated in the IMO in grade 7 (2009) and love being around logic & patterns.
Class : EPFL - Optimization: principles and algorithms : Linear optimization, Unconstrained nonlinear optimization, Network and discrete optimization; Python engineering animations: Bring math & data to life, Quaternions and 3d rotation, explained interactively, Visualizing quaternions (4d numbers) with stereographic projection, 18.657 | Mathematics Of Machine Learning.
Books: #Concrete Mathematics [2], #The Princeton Companion to Mathematics [3], #The Princeton Companion to Applied Mathematics [4], #The Joy Of X: A Guided Tour of Math, from One to Infinity [5], #Prime Obsession [6], #Infinite Powers: How Calculus Reveals the Secrets of the Universe [7], #Mathematics for Human Flourishing [8]. Journals : MDPI - Mathematics, arXiv, NATURE, Springer.
Youtube - @3blue1brown, @Tibees, @ArvinAsh, @DennisDavisEdu, @Mathologer, @FlammableMaths, @Numberphile @Veritasium, @StatQuest, @MindYourDecisions, @mathemaniac, @EigenSteve, sound of mandelbrot, Creating The Never-Ending Bloom, The math major (part 2), The map of mathematics, Higher Level Math Classes, Math's Fundamental Flaw, The Mathematics of our Universe, The things you'll find in higher dimensions.
Resources: [NOTES (.pdf)], @github/mathematics-for-ml, @github/awesome-math; Stanford Mathematics courses : mathematics.stanford; LaTeX in markdown, fields medal; European/ American Mathematical Society : euro math soc ; popmath, ama, SIAM ; Math societies / institutes DE: ddgtc, mis.mpg.de, DMV, A beautiful mind [1] ; International Mathematical Olympiad - IMO; IOI; manim, manim.community, mml-book-solution, EGMO; IYMC ; Grigori Perelman - Poincaré Conjecture /story ; claymath.org/millennium-problems ; Project Euler : projecteuler; project LENIA, SINDy-py; github: penrose, texme, sagemath, root, pen-paper, igraph, pycm, manim, casadi, mathjs ; programs : TUM topmath, math.mit, maths.cam, cms.caltech, pma.caltech, seas.harvard, maths.ox, math.ethz, crypto.stanford ; Mathematics of Machine Learning Summer School, @github/mathematics-for-ml, MathHistory: A course in the History of Mathematics, Royal Institute - Mathematics, geometricdeeplearning, MIT 18.657 | Fall 2015 | Mathematics Of Machine Learning, deeplearning.AI - Mathematics for Machine Learning, University of Cambridge - Mathematics for ML, Stanford CS229T/STAT231: Statistical Learning Theory, The Complete Mathematics of Neural Networks and Deep Learning, Watching Neural Networks Learn, The Most Important Algorithm in Machine Learning.