{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "EDe7DsPWmEBV"
},
"source": [
"<h1>Chapter 1 - Introduction to Language Models</h1>\n",
"<i>Exploring the exciting field of Language AI</i>\n",
"\n",
"\n",
"<a href=\"https://www.amazon.com/Hands-Large-Language-Models-Understanding/dp/1098150961\"><img src=\"https://img.shields.io/badge/Buy%20the%20Book!-grey?logo=amazon\"></a>\n",
"<a href=\"https://www.oreilly.com/library/view/hands-on-large-language/9781098150952/\"><img src=\"https://img.shields.io/badge/O'Reilly-white.svg?logo=\"></a>\n",
"<a href=\"https://github.com/HandsOnLLM/Hands-On-Large-Language-Models\"><img src=\"https://img.shields.io/badge/GitHub%20Repository-black?logo=github\"></a>\n",
"[Open in Colab](https://colab.research.google.com/github/HandsOnLLM/Hands-On-Large-Language-Models/blob/main/chapter01/Chapter%201%20-%20Introduction%20to%20Language%20Models.ipynb)\n",
"\n",
"---\n",
"\n",
"This notebook is for Chapter 1 of the [Hands-On Large Language Models](https://www.amazon.com/Hands-Large-Language-Models-Understanding/dp/1098150961) book by [Jay Alammar](https://www.linkedin.com/in/jalammar) and [Maarten Grootendorst](https://www.linkedin.com/in/mgrootendorst/).\n",
"\n",
"---\n",
"\n",
"<a href=\"https://www.amazon.com/Hands-Large-Language-Models-Understanding/dp/1098150961\">\n",
"<img src=\"https://raw.githubusercontent.com/HandsOnLLM/Hands-On-Large-Language-Models/main/images/book_cover.png\" width=\"350\"/></a>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### [OPTIONAL] - Installing Packages on <img src=\"https://colab.google/static/images/icons/colab.png\" width=100>\n",
"\n",
"If you are viewing this notebook on Google Colab (or any other cloud platform), **uncomment and run** the code cell below to install the dependencies for this chapter:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"💡 **NOTE**: Use a GPU to run the examples in this notebook. In Google Colab, go to\n",
"**Runtime > Change runtime type > Hardware accelerator > GPU > GPU type > T4**.\n",
"\n",
"---"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# %%capture\n",
"# !pip install \"transformers>=4.40.1\" \"accelerate>=0.27.2\""
]
},
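{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check before loading the model, we can confirm that a GPU is visible. This is a minimal sketch that assumes PyTorch is already installed (as it is by default on Colab); it is not part of the original chapter code."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import torch\n",
"\n",
"# Check whether PyTorch can see a CUDA-capable GPU\n",
"if torch.cuda.is_available():\n",
"    print(f\"GPU detected: {torch.cuda.get_device_name(0)}\")\n",
"else:\n",
"    print(\"No GPU detected; loading the model onto 'cuda' below will fail.\")"
]
},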
{
"cell_type": "markdown",
"metadata": {
"id": "hXp09JFsFBXi"
},
"source": [
"# Phi-3\n",
"\n",
"The first step is to load our model onto the GPU for faster inference. Note that we load the model and tokenizer separately, although that isn't always necessary: a `pipeline` can also load both from a model name."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "RSNalRXZyTTk"
},
"outputs": [],
"source": [
"from transformers import AutoModelForCausalLM, AutoTokenizer\n",
"\n",
"# Load model and tokenizer\n",
"model = AutoModelForCausalLM.from_pretrained(\n",
"    \"microsoft/Phi-3-mini-4k-instruct\",\n",
"    device_map=\"cuda\",\n",
"    torch_dtype=\"auto\",\n",
"    trust_remote_code=False,\n",
")\n",
"tokenizer = AutoTokenizer.from_pretrained(\"microsoft/Phi-3-mini-4k-instruct\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qdyYYS0E5fEU"
},
"source": [
"Although we can now use the model and tokenizer directly, it's much easier to wrap them in a `pipeline` object:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "DiUi4Wu1FCyN"
},
"outputs": [],
"source": [
"from transformers import pipeline\n",
"\n",
"# Create a pipeline\n",
"generator = pipeline(\n",
"    \"text-generation\",\n",
"    model=model,\n",
"    tokenizer=tokenizer,\n",
"    return_full_text=False,\n",
"    max_new_tokens=500,\n",
"    do_sample=False\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mD49kysT5mMY"
},
"source": [
"Finally, we write our prompt as a user message and pass it to the pipeline:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "hkR7LBmiyXmY"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Why did the chicken join the band? Because it had the drumsticks!\n"
]
}
],
"source": [
"# The prompt (user input / query)\n",
"messages = [\n",
"    {\"role\": \"user\", \"content\": \"Create a funny joke about chickens.\"}\n",
"]\n",
"\n",
"# Generate output\n",
"output = generator(messages)\n",
"print(output[0][\"generated_text\"])"
]
}
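,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For comparison, here is a rough sketch of what the `pipeline` handles for us, using the model and tokenizer directly. It reuses `model`, `tokenizer`, and `messages` from the cells above; the names `inputs`, `output_ids`, and `new_tokens` are our own, and this is an illustration rather than the pipeline's exact internals."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Assumes `model`, `tokenizer`, and `messages` were defined in the cells above\n",
"\n",
"# Apply the model's chat template to the messages and tokenize the result\n",
"inputs = tokenizer.apply_chat_template(\n",
"    messages,\n",
"    add_generation_prompt=True,\n",
"    return_dict=True,\n",
"    return_tensors=\"pt\",\n",
").to(model.device)\n",
"\n",
"# Generate greedily, mirroring the pipeline settings used earlier\n",
"output_ids = model.generate(**inputs, max_new_tokens=500, do_sample=False)\n",
"\n",
"# Keep only the newly generated tokens (drop the prompt), then decode them\n",
"new_tokens = output_ids[0, inputs[\"input_ids\"].shape[-1]:]\n",
"print(tokenizer.decode(new_tokens, skip_special_tokens=True))"
]
}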
],
"metadata": {
"accelerator": "GPU",
"colab": {
"authorship_tag": "ABX9TyPCWg08aO4e8NWQuYCK5ppF",
"gpuType": "T4",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.14"
}
},
"nbformat": 4,
"nbformat_minor": 4
}