# First LLM Classifier UMD

Learn how journalists use large-language models to organize and analyze massive datasets

## What you will learn

This class will give you hands-on experience creating a machine-learning model that can read and categorize the text recorded in newsworthy datasets.

It will teach you how to:

* Submit large-language model prompts with the Python programming language
* Write structured prompts that can classify text into predefined categories
* Submit dozens of prompts at once as part of an automated routine
* Evaluate results using a rigorous, scientific approach
* Improve results by training the model with rules and examples

By the end, you will understand how LLM classifiers can outperform traditional machine-learning methods with significantly less code. And you will be ready to write a classifier on your own.

## Who can take it

This course is free. Anyone who has dabbled with code and AI is qualified to work through the materials. A curious mind and good attitude are all that’s required, but a familiarity with Python will certainly come in handy.

## Table of contents

```{toctree}
:maxdepth: 1
:name: mastertoc
:numbered:

our-mission
llm-wtf
groq
prompting-with-python
structured-responses
bulk-prompts
evaluate
improve
about
```

## About this class

[Ben Welsh](https://palewi.re/who-is-ben-welsh/) and [Derek Willis](https://thescoop.org/about/) prepared this guide for [a training session](https://schedules.ire.org/nicar-2025/index.html#2045) at the National Institute for Computer-Assisted Reporting’s 2025 conference in Minneapolis. Some of the copy was written with the assistance of GitHub's Copilot, an AI-powered text generator.

The materials are available as free and [open source on GitHub](https://github.com/NewsAppsUMD/first-llm-classifier-umd).

The project has been adapted to [run on Hugging Face](https://huggingface.co/spaces/JournalistsonHF/first-llm-classifier) by [Florent Daudens](https://www.linkedin.com/in/fdaudens/).