{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "EDe7DsPWmEBV"
},
"source": [
"<h1>Chapter 1 - Introduction to Language Models</h1>\n",
"<i>Exploring the exciting field of Language AI</i>\n",
"\n",
"\n",
"<a href=\"https://www.amazon.com/Hands-Large-Language-Models-Understanding/dp/1098150961\"><img src=\"https://img.shields.io/badge/Buy%20the%20Book!-grey?logo=amazon\"></a>\n",
"<a href=\"https://www.oreilly.com/library/view/hands-on-large-language/9781098150952/\"><img src=\"https://img.shields.io/badge/O'Reilly-white.svg?logo=data:image/svg%2bxml;base64,PHN2ZyB3aWR0aD0iMzQiIGhlaWdodD0iMjciIHZpZXdCb3g9IjAgMCAzNCAyNyIgZmlsbD0ibm9uZSIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj4KPGNpcmNsZSBjeD0iMTMiIGN5PSIxNCIgcj0iMTEiIHN0cm9rZT0iI0Q0MDEwMSIgc3Ryb2tlLXdpZHRoPSI0Ii8+CjxjaXJjbGUgY3g9IjMwLjUiIGN5PSIzLjUiIHI9IjMuNSIgZmlsbD0iI0Q0MDEwMSIvPgo8L3N2Zz4K\"></a>\n",
"<a href=\"https://github.com/HandsOnLLM/Hands-On-Large-Language-Models\"><img src=\"https://img.shields.io/badge/GitHub%20Repository-black?logo=github\"></a>\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/HandsOnLLM/Hands-On-Large-Language-Models/blob/main/chapter01/Chapter%201%20-%20Introduction%20to%20Language%20Models.ipynb)\n",
"\n",
"---\n",
"\n",
"This notebook is for Chapter 1 of the [Hands-On Large Language Models](https://www.amazon.com/Hands-Large-Language-Models-Understanding/dp/1098150961) book by [Jay Alammar](https://www.linkedin.com/in/jalammar) and [Maarten Grootendorst](https://www.linkedin.com/in/mgrootendorst/).\n",
"\n",
"---\n",
"\n",
"<a href=\"https://www.amazon.com/Hands-Large-Language-Models-Understanding/dp/1098150961\">\n",
"<img src=\"https://raw.githubusercontent.com/HandsOnLLM/Hands-On-Large-Language-Models/main/images/book_cover.png\" width=\"350\"/></a>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### [OPTIONAL] - Installing Packages on <img src=\"https://colab.google/static/images/icons/colab.png\" width=100>\n",
"\n",
"If you are viewing this notebook on Google Colab (or any other cloud vendor), you need to **uncomment and run** the following codeblock to install the dependencies for this chapter:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"💡 **NOTE**: We will want to use a GPU to run the examples in this notebook. In Google Colab, go to\n",
"**Runtime > Change runtime type > Hardware accelerator > GPU > GPU type > T4**.\n",
"\n",
"---"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# %%capture\n",
"# !pip install transformers>=4.40.1 accelerate>=0.27.2"
]
},
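{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before loading the model, it can help to confirm that PyTorch actually sees a GPU. This quick sanity check is not part of the book's code; it just prints whether CUDA is available and, if so, which device will be used:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import torch\n",
"\n",
"# True if PyTorch can see a CUDA-capable GPU\n",
"print(\"CUDA available:\", torch.cuda.is_available())\n",
"print(\"Device:\", torch.cuda.get_device_name(0) if torch.cuda.is_available() else \"CPU only\")"
]
},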
{
"cell_type": "markdown",
"metadata": {
"id": "hXp09JFsFBXi"
},
"source": [
"# Phi-3\n",
"\n",
"The first step is to load our model onto the GPU for faster inference. Note that we load the model and tokenizer separately (although that isn't always necessary)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "RSNalRXZyTTk"
},
"outputs": [],
"source": [
"from transformers import AutoModelForCausalLM, AutoTokenizer\n",
"\n",
"# Load model and tokenizer\n",
"model = AutoModelForCausalLM.from_pretrained(\n",
" \"microsoft/Phi-3-mini-4k-instruct\",\n",
" device_map=\"cuda\",\n",
" torch_dtype=\"auto\",\n",
" trust_remote_code=False,\n",
")\n",
"tokenizer = AutoTokenizer.from_pretrained(\"microsoft/Phi-3-mini-4k-instruct\")"
]
},
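{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick check (not part of the book's code), we can confirm that the model landed on the GPU and which precision `torch_dtype=\"auto\"` resolved to:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Where the weights live and in which precision they were loaded\n",
"print(model.device)\n",
"print(model.dtype)"
]
},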
{
"cell_type": "markdown",
"metadata": {
"id": "qdyYYS0E5fEU"
},
"source": [
"Although we can now use the model and tokenizer directly, it's much easier to wrap it in a `pipeline` object:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "DiUi4Wu1FCyN"
},
"outputs": [],
"source": [
"from transformers import pipeline\n",
"\n",
"# Create a pipeline\n",
"generator = pipeline(\n",
" \"text-generation\",\n",
" model=model,\n",
" tokenizer=tokenizer,\n",
" return_full_text=False,\n",
" max_new_tokens=500,\n",
" do_sample=False\n",
")"
]
},
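{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, here is a minimal sketch of what the `pipeline` does under the hood, using the model and tokenizer directly. This is an illustration, not the book's code: it applies Phi-3's chat template to the messages, generates greedily, and decodes only the newly generated tokens."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A sketch of the manual route (assumes `model` and `tokenizer` from above)\n",
"messages = [\n",
" {\"role\": \"user\", \"content\": \"Create a funny joke about chickens.\"}\n",
"]\n",
"\n",
"# Apply Phi-3's chat template and tokenize in one step\n",
"input_ids = tokenizer.apply_chat_template(\n",
" messages, add_generation_prompt=True, return_tensors=\"pt\"\n",
").to(model.device)\n",
"\n",
"# Generate greedily and decode only the tokens that come after the prompt\n",
"generated = model.generate(input_ids, max_new_tokens=500, do_sample=False)\n",
"print(tokenizer.decode(generated[0][input_ids.shape[1]:], skip_special_tokens=True))"
]
},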
{
"cell_type": "markdown",
"metadata": {
"id": "mD49kysT5mMY"
},
"source": [
"Finally, we create our prompt as a user and give it to the model:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "hkR7LBmiyXmY"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Why did the chicken join the band? Because it had the drumsticks!\n"
]
}
],
"source": [
"# The prompt (user input / query)\n",
"messages = [\n",
" {\"role\": \"user\", \"content\": \"Create a funny joke about chickens.\"}\n",
"]\n",
"\n",
"# Generate output\n",
"output = generator(messages)\n",
"print(output[0][\"generated_text\"])"
]
},
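{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an optional experiment (not part of the book's text), you can override the pipeline's greedy decoding at call time. The call below assumes the `generator` and `messages` defined above; with sampling enabled, repeated runs can produce different jokes:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# do_sample=True samples from the token distribution instead of always\n",
"# taking the most probable token; temperature controls the randomness\n",
"output = generator(messages, do_sample=True, temperature=0.7)\n",
"print(output[0][\"generated_text\"])"
]
}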
],
"metadata": {
"accelerator": "GPU",
"colab": {
"authorship_tag": "ABX9TyPCWg08aO4e8NWQuYCK5ppF",
"gpuType": "T4",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}