SML 354
Artificial Intelligence: A hands-on introduction from basics to ChatGPT
Fall 2024
Lecture: Tuesday/Thursday, 11:00-11:50am
Sarah-Jane Leslie
sjleslie@princeton.edu
sarahjaneleslie.org
We are in the midst of an AI revolution, but gaining a solid understanding of this technology can be challenging for students without strong mathematical and computational backgrounds. This course offers an introduction to deep learning, the core technology behind most modern AI applications (including ChatGPT), aimed at students with little to no coding experience and no mathematical background beyond basic differential calculus. Emphasis will be placed on gaining a conceptual understanding of deep learning models and on practicing the basic coding skills required to use them in simple contexts. By the end of the course, students will be able to understand, code, and train a variety of basic deep learning models, including basic neural nets, image recognition models, and natural language processing models. As a capstone, students will complete a structured project in which they build their own (tiny) GPT-style text generator. Throughout the semester, we will build a rich and nuanced understanding of how AI models operate, what their strengths and limitations are, and correspondingly how they can be most effectively deployed across a range of contexts.
Readings: The primary text for the course is François Chollet (2021), Deep Learning with Python, 2nd edition. Manning. We will work through the book in detail. In the second half of the course, once we have covered the basics, supplemental readings will be assigned as appropriate.
Evaluation and assignments: Students are required to complete weekly coding assignments and a final structured project (coding and training a tiny GPT-style text generator). In addition, there are in-class midterm and final exams that test conceptual understanding of course content.
Final grades are determined as follows: 25% homeworks, 25% midterm exam, 25% final exam, 15% final project, 10% class/precept participation.
Prerequisites
Please note: Students who satisfy the math and computer science requirements (MAT 103, 202 & COS 226) for COS 324: Introduction to Machine Learning should take that course instead. This course is specifically aimed at students who do not meet those requirements and is correspondingly less rigorous in its treatment of the subject. No credit will be given to students who have already completed COS 324 or equivalent. Please check with me if you have any questions about your eligibility.
Math prerequisites: MAT 100: Calculus Foundations or an equivalent high school course. In particular, students should be familiar with: the concept of a derivative; logarithms and exponents; basic trigonometry, especially cosines; and graphs of functions. I do my very best to explain the mathematical concepts needed to understand deep learning as we go, but I will assume basic knowledge of the concepts listed above.
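For calibration, students meeting the math prerequisite should be comfortable reading statements like the following. These are illustrative examples of mine, not problems drawn from the course materials:

    d/dx e^(2x) = 2e^(2x)                           (derivatives and the chain rule)
    log(xy) = log(x) + log(y);  log(x^n) = n log(x)  (logarithm rules)
    cos(0) = 1,  cos(90°) = 0                        (cosines; for instance, the cosine of the angle between two vectors is 1 when they point the same way and 0 when they are perpendicular)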
Coding prerequisites: Prior coding experience is not required for this class, but students without Python experience will need to complete a crash course in the language. In particular, I have made video lectures covering Python basics, which are available via Canvas to anyone with a Princeton netid. To view the videos, please self-enroll here.
The first several homeworks will assume that you have watched certain portions of the crash course, so you can work through it concurrently with the start of classes. However, if you are new to coding, I strongly recommend getting started before the semester begins so you can work at your own pace and do additional practice as needed; if you fall behind once the semester starts, it may be prohibitively difficult to catch up. The videos have accompanying code notebooks with exercises. If you can complete the exercises in the notebooks, you are on track. Students with prior Python experience should test their understanding by making sure they can easily complete those exercises.
Please note that the video tutorial does not cover all aspects of Python; rather, the focus is on the key aspects of the language required for the course.
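To give a sense of the level involved, here is the kind of short exercise the crash-course notebooks target. This is an illustrative example of mine, not one of the actual notebook exercises; if you can comfortably read, write, and predict the output of code like this, you are well prepared:

    # Count how many words in a list have at least min_length characters.
    def count_long_words(words, min_length=5):
        count = 0
        for word in words:
            if len(word) >= min_length:
                count += 1
        return count

    print(count_long_words(["neural", "network", "loss", "gradient"]))  # prints 3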
Additional resources: A repository of good tutorials can be found here. For the W3Schools tutorial, I recommend working through it up to Classes/Objects. You might also take a look at Princeton Research Computing's offerings, as they feature mini-courses on Python at various times throughout the year.
Schedule of Topics (subject to minor timing changes as appropriate)
Week 1: Overview of machine learning; Introduction to neural networks
Reading: Chollet, chapter 1
Week 2: Fundamentals of neural networks I: The forward pass
Reading: Chollet, chapter 2
Week 3: Fundamentals of neural networks II: Backpropagation
Reading: Chollet, chapter 2 cont.
Week 4: Classification and regression
Reading: Softmax Tutorial; Tasks, Activations, and Losses (both available on Canvas)
Week 5: Introduction to Keras; Training, validation, and test sets
Reading: Chollet, sections 3.2-3.3, 3.6, & 4.1-4.2; section 7.2.2
Week 6: Finish any outstanding topics; Review; Begin computer vision, time permitting
Midterm exam held in class on Thursday, Oct. 10
Reading: Chollet, sections 5.2-5.4; chapter 6 (optional)
Fall break
Week 7: Computer vision I: Introduction to convolutional neural networks
Reading: Chollet, sections 8.1-8.2
Week 8: Computer vision II: Classification; visualizing convnets
Reading: Chollet, sections 8.3 & 9.3
Week 9: Introduction to natural language processing: Representing words as numbers
Reading: Jay Alammar, The Illustrated Word2vec
Optional: Chollet sections 11.1-11.3
Supplemental optional reading: Charlesworth, T.E.S., Caliskan, A., & Banaji, M.R. (2022). Historical representations of social groups across 200 years of word embeddings from Google Books. Proceedings of the National Academy of Sciences, 119(28).
Week 10: Transformers I: Introduction; Masked language models
Reading: Jay Alammar, The Illustrated Transformer; Chollet, section 11.4
Supplemental optional reading:
Manning, C.D., Clark, K., Hewitt, J., Khandelwal, U., & Levy, O. (2020). Emergent linguistic structure in artificial neural networks trained by self-supervision. Proceedings of the National Academy of Sciences, 117(48).
Week 11: Transformers II: Masked language models cont.; Generative models
Reading: Chollet, sections 11.5.3 & 12.1
Supplemental optional reading:
Hutson, M. (2021). Robo-writers: the rise and risks of language-generating AI. Nature, 591, 22-25.
Stokel-Walker, C., & Van Noorden, R. (2023). What ChatGPT and generative AI mean for science. Nature, 614, 214-216.
van Dis, E.A.M., Bollen, J., Zuidema, W., van Rooij, R., & Bockting, C.L. (2023). ChatGPT: Five priorities for research. Nature, 614, 224-226.
Generative transformer project distributed
Week 12: Transformers III: Generative models cont.; Putting it all together: Review and loose ends
Supplemental optional reading:
Weiser, B. (2023). Here's what happens when your lawyer uses ChatGPT. New York Times.
Generative transformer project due on Dean's Date
Final exam given in class during finals period