How AI and the brain work (BCS.40041/DS.50014)
[Important notice - Jan 7th, 2026]
Due to classroom capacity constraints and a limited number of TAs, enrollment will be capped at 45 students (BCS + DS combined).
This course is the first part of the Brain x AI lecture series:
How AI and the brain work (undergraduate course; previously BiS 429):
focuses on inverse problems. It covers linear methods, deep learning, and the neuroscience of deep learning.
Brain-inspired artificial intelligence (graduate course; previously BCE 772):
focuses on control problems. It covers temporal credit assignment problems, error backpropagation through time, reinforcement learning theory, algorithms, and neuroscience.
I no longer teach BiS 429 and BCE 772.
Summary. A blend of machine learning and brain science
Questions. This course focuses on the six fundamental questions in AI and brain science: How does the machine/brain
translate an infinite amount of experience into a finite set of representations? (Forward-backward computation)
predict the future from current events? (Structural-functional complexity)
implement sensitive and invariant perception? (Specificity-invariance dilemma)
learn objectively from subjective experience? (Encoding-decoding problem)
encode temporal information in spatial networks? (Episodic memory problem)
backpropagate information through time? (Error backpropagation through time)
Goal. The course consists of four modules: linear models, shallow networks in vector and Hilbert spaces, key elements of deep learning, and the neuroscience of deep learning. Students in biology or brain science will develop the ability to explore big questions in science and relate them to machine learning. Students interested in AI and computational neuroscience will gain biological insight into various engineering problems.
Expectation. Students are expected to understand the commonalities and differences between artificial and biological neural networks (how they work). In the long run, they should gain better insight into theoretical issues in AI (why they work).
Disclaimer. This course emphasizes fundamental concepts and theories in machine learning (ML) and neural information processing, rather than covering recent trends in ML (apologies!). If you're interested in exploring the latest ML techniques or participating in hands-on coding sessions, I highly recommend enrolling in other courses offered by the CS, DS, or AI graduate programs.
Instructor: Sang Wan Lee (sangwan@kaist.ac.kr)
Web page: https://aibrain.kaist.ac.kr/class-aibrain
Credit: 3 units (3:0:0)
Lecture Room: E2-2, #1501 (updated on Feb 18th!)
Time: Monday and Wednesday 10:30-12:00
Prerequisite: Linear algebra and probability (or equivalent). Enrollment is not recommended for freshmen.
Assessment: Attendance (30%), Mid-term exam (30%), Coding session (10%), Final exam (30%)
Textbook: Lecture materials (70%) + a few chapters of the following book (30%):
I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, MIT Press (Chapters 1-10)
1. Linear models
1.1. Matrix algebra basics
1.1.1. Data matrix
1.1.2. Singular value decomposition
1.1.3. Quadratic forms and maximization lemma
1.2. Linear methods in a vector space
1.2.1. Least squares estimator
1.2.2. Component analysis
1.2.3. Manifold learning
2. Shallow neural networks
2.1. Forward-backward computation
2.1.1. Forward computation
2.1.2. Error backpropagation
2.2. Generalization
2.2.1. Structural risk minimization
2.2.2. Regularization
2.2.3. Support vector machine
2.3. Linear methods in Hilbert space
2.3.1. Reproducing kernel Hilbert space
2.3.2. Kernel methods
2.3.3. Supplementary: theoretical issues
3. Deep learning
3.1. Convolutional neural networks
3.1.1. Specificity-invariance dilemma
3.1.2. Convolution and subsampling
3.1.3. Training issues
3.1.4. Information bottleneck principle
3.2. Generative models
3.2.1. Hopfield network and Boltzmann machine
3.2.2. Autoencoder
3.2.3. Generative adversarial network
3.2.4. Diffusion models: introduction
3.3. Episodic memory networks
3.3.1. Recurrent neural networks
3.3.2. Long short-term memory
3.3.3. Attention control
3.3.4. Self-attention
4. Neuroscience of deep learning
4.1. Neural computation
4.1.1. Hodgkin–Huxley model
4.1.2. Bilinear differential equation approach
4.1.3. Cortical information processing
4.1.4. Cortex vs. deep learning models
4.2. Generalization
4.2.1. Structural complexity
4.2.2. Functional complexity
4.2.3. Sparse coding
4.2.4. Dendritic normalization
4.3. Brain-like AI
4.3.1. Weight transport
4.3.2. Predictive coding
4.3.3. Dendritic computation