CIS 7000: Large Language Models (Fall 2024)

General Information
  • Instructor: Mayur Naik
  • Credit Units: 1 CU
  • Class Schedule: Lectures on Mon/Wed, 1:45-3:15 pm, in WLNT 401B; occasional recitations on Fridays
  • Prerequisites:
    • CIS 5200 or equivalent: Mathematical foundations of machine learning
    • CIS 5450 or equivalent: Experience with building, training, and debugging machine learning models
  • Text/Required Materials: Selected papers from machine learning conferences (e.g., NeurIPS, ICML, ICLR)
Course Description

This course offers an in-depth exploration of large language models (LLMs) focusing on their design, training, and usage. It begins with the attention mechanism and transformer architectures, moves through practical aspects of pre-training and efficient deployment, and concludes with advanced techniques like prompting and neurosymbolic learning. The course aims to equip students with the skills to critically analyze LLM research and apply these concepts in real-world scenarios. A solid foundation in machine learning, programming, and deep learning is recommended.

Topics Covered
  • History of LLMs
  • Transformer Architectures
  • Model Training Techniques
  • Prompt Engineering
  • Ethical Considerations and Safety Measures
  • Advanced Integration Techniques (e.g., RAG, Agents, Neurosymbolic Learning)
Weekly Schedule
  • Week 1: Course Introduction
  • Weeks 2-3: The Transformer Architecture
  • Weeks 4-5: Pre-training (Data Preparation, Parallelism, Scaling Laws, Instruction Tuning, Alignment, Evaluation)
  • Week 6: Adaptation (Parameter-Efficient Fine-Tuning Techniques and Design Spaces)
  • Week 7: Prompting Techniques
  • Week 8: Fast and Efficient Inference (Quantization, vLLM Framework, Flash Attention, Sparsification/Distillation)
  • Week 9: RAG and Vector DBs
  • Week 10: Agent Frameworks
  • Week 11: Neurosymbolic Architectures
  • Week 12: Course Wrap-Up
  • Weeks 13-14: Project - Conception, Design, Implementation, Evaluation
  • Weeks 15-16: Project - Report and Presentation
Course Objectives

By the end of this course, students will be able to:

  • Analyze design decisions in modern and upcoming transformer architectures.
  • Determine the hardware, software, and data requirements for pre-training or fine-tuning an LLM for new tasks.
  • Understand where LLMs should and should not be used based on their capability and reliability.
  • Leverage a deep understanding of LLM theory and software to design prompts and build applications around LLMs.
Grading Details

The course comprises three major activities:

  • Homeworks: Programming assignments involving implementation of different LLM techniques.
  • Lectures: Covering LLM concepts supplemented with readings and guest lectures. Attendance is mandatory!
  • Project: Deep-dive into implementing and analyzing an LLM technique in teams of 2-3 students.

Grading Rubric:

  • 65% Homeworks
  • 30% Project
  • 5% Class Participation
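
As a rough illustration of how the rubric weights combine, the sketch below computes a weighted course score. It assumes each component is scored on a 0-100 scale, which the syllabus does not state; the function name `final_score` and the example scores are hypothetical, and any mapping from the score to a letter grade is left out because it is not specified here.

```python
# Rubric weights in percent, taken from the grading rubric above.
WEIGHTS = {"homeworks": 65, "project": 30, "participation": 5}


def final_score(scores: dict[str, float]) -> float:
    """Return the weighted course score on a 0-100 scale.

    Assumption (not stated in the syllabus): each component score
    is itself on a 0-100 scale.
    """
    if set(scores) != set(WEIGHTS):
        raise ValueError(f"expected scores for exactly: {sorted(WEIGHTS)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical component scores, for illustration only.
example = final_score({"homeworks": 90.0, "project": 80.0, "participation": 100.0})
print(example)  # 87.5
```

With these weights, a 10-point change in the homework score moves the final score by 6.5 points, while the same change in participation moves it by 0.5 points.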
Homework Collaboration Policy

Working with your peers to complete the homeworks is encouraged. Here is how you can do so in an acceptable manner:

  • You should work on your own solutions and submit individually.
  • You should never directly copy solutions from other students or resources.
  • Do not share your code with others, and do not look at others' code. There is one exception: if you have already finished a section of the homework, you can look at another student's code to verbally help them debug.
  • Cite any resources that you used to develop your solutions, including articles, academic papers, and/or textbooks.
  • List the names of the classmates that you collaborated with to complete the homework.
  • Viewing solutions on, or uploading solutions to, sites like Chegg and CourseHero is strictly prohibited.

If you have questions about this policy, or want clarification on whether a particular resource or collaboration is acceptable, please reach out to course staff on Ed.