CIS 7000: Large Language Models (Fall 2024)

General Information
  • Instructor: Mayur Naik
  • Credit Units: 1 CU
  • Class Schedule: Lectures 1:45-3:15 pm on Mon/Wed in WLNT 401B; occasional recitations on Fridays
  • Prerequisites:
    • CIS 5200 or equivalent: Mathematical foundations of machine learning
    • CIS 5450 or equivalent: Experience with building, training, and debugging machine learning models
  • Text/Required Materials: Selected papers from machine learning conferences (e.g., NeurIPS, ICML, ICLR)
Course Description

This course offers an in-depth exploration of large language models (LLMs), focusing on their design, training, and usage. It begins with the attention mechanism and transformer architectures, moves through practical aspects of pre-training and efficient deployment, and concludes with advanced techniques such as prompting and neurosymbolic learning. The course aims to equip students with the skills to critically analyze LLM research and apply these concepts in real-world scenarios. A solid foundation in machine learning, programming, and deep learning is recommended.
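
To give a flavor of the starting point of the course, the sketch below shows a minimal, single-head version of scaled dot-product attention, the core operation inside the transformer architecture covered in Weeks 2-3. It is an illustrative NumPy-only sketch; the function name, dimensions, and toy data are assumptions for this example and do not correspond to any course assignment or required framework.

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        """Compute softmax(Q K^T / sqrt(d_k)) V for a single attention head."""
        d_k = K.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)        # (num_queries, num_keys) similarity scores
        # Row-wise softmax (subtract the max for numerical stability)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights = weights / weights.sum(axis=-1, keepdims=True)
        return weights @ V                     # weighted sum of value vectors

    # Toy example: 4 query/key/value vectors of dimension 8
    rng = np.random.default_rng(0)
    Q = rng.normal(size=(4, 8))
    K = rng.normal(size=(4, 8))
    V = rng.normal(size=(4, 8))
    print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)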

Topics Covered
  • History of LLMs
  • Transformer Architectures
  • Model Training Techniques
  • Prompt Engineering
  • Ethical Considerations and Safety Measures
  • Advanced Integration Techniques (e.g., RAG, Agents, Neurosymbolic Learning)
Weekly Schedule
  • Week 1: Course Introduction
  • Weeks 2-3: The Transformer Architecture
  • Weeks 4-5: Pre-training (Data Preparation, Parallelism, Scaling Laws, Instruction Tuning, Alignment, Evaluation)
  • Week 6: Adaptation (Parameter-Efficient Fine-Tuning Techniques and Design Spaces)
  • Week 7: Prompting Techniques
  • Week 8: Fast and Efficient Inference (Quantization, vLLM Framework, Flash Attention, Sparsification/Distillation)
  • Week 9: RAG and Vector DBs
  • Week 10: Agent Frameworks
  • Week 11: Neurosymbolic Architectures
  • Week 12: Course Wrap-Up
  • Weeks 13-14: Project - Conception, Design, Implementation, Evaluation
  • Weeks 15-16: Project - Report and Presentation
Course Objectives

By the end of this course, students will be able to:

  • Analyze design decisions in modern and upcoming transformer architectures.
  • Determine the hardware, software, and data requirements for pre-training or fine-tuning an LLM for new tasks.
  • Understand where LLMs should and should not be used based on their capabilities and reliability.
  • Leverage a deep understanding of LLM theory and software to design prompts and applications around them.
Grading Details

The course comprises three major activities:

  • Homeworks: Programming assignments involving implementation of different LLM techniques.
  • Lectures: Covering LLM concepts supplemented with readings and guest lectures. Attendance is mandatory!
  • Project: A deep dive into implementing and analyzing an LLM technique in teams of 2-3 students.

Grading Rubric:

  • 55% Homeworks
  • 25% Project
  • 15% Final Exam
  • 5% Class Participation
Homework Collaboration Policy

Working with your peers to complete the homeworks is encouraged. Here is how you can do so in an acceptable manner:

  • You should work on your own solutions and submit individually.
  • You should never directly copy solutions from other students or resources.
  • Do not share your code with others, and do not look at others' code. There is one exception: if you have already finished a section of the homework, you can look at another student's code to verbally help them debug.
  • Cite any resources that you used to develop your solutions, including articles, academic papers, and/or textbooks.
  • List the names of the classmates that you collaborated with to complete the homework.
  • Viewing solutions from, or uploading solutions to, sites like Chegg and CourseHero is strictly prohibited.

If you have any questions about this policy, or need clarification on whether a particular resource or collaboration is acceptable, we encourage you to reach out to the course staff on Ed.