{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "k4kpdDCUGe_i"
},
"source": [
"# Введение в анализ данных\n",
"\n",
"\n",
"## Обработка естественного языка. Генерация текста с помощью модели LLAMA."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "H_h6MAj1SrtP"
},
"source": [
"В предыдущем [ноутбуке](https://miptstats.github.io/courses/ad_fivt/nlp_sem.html) мы научимся строить рекуррентные нейронные сети. В этом ноутбуке мы применим большую языковую модель LLAMA-2, используя GPU.\n",
"Llama 2 — это семейство современных больших языковых моделей с открытым доступом. Почитать оригинальную статью 2023 года можно здесь.\n",
"\n",
"Модель может принимать на вход некоторый текст и продолжать его. Заметьте, что по умолчанию языковые модели не являются *conversational*, то есть их использование отличается от моделей типа Chat-GPT, которые предназначены для интерактивного взаимодействия с пользователем. Часто одну модель выкладывают в нескольких разных конфигурациях — и с обычным, и с conversational интерфейсом."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "7xeRF_hSKgzs"
},
"outputs": [],
"source": [
"%pip install --quiet bitsandbytes==0.41.1 transformers==4.34.1 accelerate==0.24.0 sentencepiece==0.1.99 optimum==1.13.2 auto-gptq==0.4.2\n",
"import torch\n",
"import transformers\n",
"\n",
"assert torch.cuda.is_available(), \"you need cuda for this part\"\n",
"device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xfYEmczvjmsk"
},
"source": [
"Загрузим модель `TheBloke/Llama-2-13B-GPTQ` из Hugging Face."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 365,
"referenced_widgets": [
"0a57aa3fb8af4b6d904b879ace771551",
"475a554352744fed8da0ba9c92410898",
"0a200547d48d488f8c44c77d2f73483b",
"accd345ccfc74afaa2b4147ab7dcef9a",
"3a36131f141e4da7afc3e00a12d69a86",
"bf482a3f2e0c4305afbe918d3ede3ee0",
"845cb6c473154f8eb553aa66f9b3fa7d",
"9abb03a508d64b3ca373527ad73eee43",
"29cf52c50a1a4e82957cdca500d00f23",
"93b833aeda394f298cbff99f2b8e04e9",
"bae1737063ee40dd8ef5e6553659cbf7",
"00696440baba428490dc95ebc72e788e",
"67874a98790c41c1adc8cda98c4ca950",
"cab1d4211ec044229324c80a3b14c84c",
"4ed6854cfdf54a8ea3c8114b8b5c6b31",
"df2871c9f3ca4189b9cb7fcb18faf55c",
"d7db1cb1374644708a751e8987c7f3b6",
"1767702750324c6bb94c9d54350e5322",
"d27c5606c07c48039f32803b1bc67dc6",
"0f8939827c5a4b9b9daae69900fab24d",
"280de6ba38614e1aba1dfc73d1b6fc4b",
"e343c08ee7174cf19b4712950fd6f75d",
"ab6941b786604960a6184de5f83d6c6f",
"48fd878592b44c75b8e89012b38a7c98",
"17119b5128b24cfe89b5d6e62f1773cb",
"a6818b39c89b4cbd9d5ce85103cee07a",
"b6186aadad97458e9974cbb5c6127e00",
"393ea146c52c4a1aa96d377e4de9f801",
"13d71cd481b248398131519584911180",
"1a9a79adc5fb470793e06d9c322b6926",
"7c75f36d1f144fe09dd02351a2f74668",
"898bcad4a82642d688464550fc8025c2",
"fee0b5097c2e45a89bd9d71b56472589",
"3e400844696a47edb97fb85404390425",
"369eecf94279469980fe1449f903dc7d",
"64e707df1c954ed39f68dbb37dd5a7c6",
"23f97903fdc94f9088b157c82f05688a",
"e685dee405874ffe8986629bdb06bfa8",
"105c86d260214e68ad024ae5f7c6257e",
"f3489a41493f453fb6de30abb8c55134",
"4d1df5c6848f4cffb64763619206e629",
"ce8335014fb248d29ae14ea5d6146b3e",
"8170e01b38c04d3b988b87c0cd0b33a9",
"d8f24f11cdea4a4c9ecf3c414630dd69",
"9eebf1f4fe884b67a0b04eecc99bf6b5",
"07299e026d794c3ba5ddd41ba2a7bd2a",
"626185f4dd974afab772dea7c58964bd",
"202947386e9b4468af6b5321a1bdfb73",
"84d4624212b8410aa13bbca392dfd694",
"e51ef80b5d4241ae85789d31670ac9de",
"a16c4bbdb48241f8a88433d32ad25823",
"db91a529aa2b4eb0b5764c451a3fbb9f",
"eae35522244f4e778548775a37c9cb73",
"22d5ccba29934652a1e96888ec3741d9",
"7e1e56111053477a8086a397e57f221d",
"feff1892c44a42c787bd9467eabc1712",
"c461603036b0450490912f6741c16235",
"9c5ce1657e7444f7a542831d06fe533c",
"13f0725bffd64a98bbb6a555d4cc4464",
"a631fee570394506ada1cdcabf719fb8",
"b4118e45f3b945d9b95f942dccd815cb",
"326a5fb332be4e41bb75e026d25c9fdd",
"c93f78e236234d3a927c9ab696915f69",
"cd1400d1251d43e7be7496c3ffd67b90",
"256923e9ed844d66b7b4a490a52aefcb",
"8c72760ae196449dbbdb5f5ac81e10a2",
"df504ac591534a5bb4184e93fefafecd",
"67ec9a06b70a49d9b965e74d1fd601f7",
"0dd9103b87f94f4096c1acc76ae2b24e",
"c1dfc401974d4a20b553fbd46ec73cf1",
"e81111dd5105409ba5a71ba192d0ce79",
"bde80d3a6ac14cfcac2fd9dd7dcac1ae",
"365aec1ab7ad4d8984526f9d55d4b9d6",
"1e3c8879103146979caccb8a9265c1a7",
"f8382986155d41c29acdf217d237561f",
"0cdcd08e951844e48545d9c9b7ae2e1b",
"770fdabc09b74f50ac64473ade4548a6"
]
},
"id": "VMzFwx29Kgzu",
"outputId": "12d0cf33-a940-422b-81fe-a8d3117b729b"
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "0a57aa3fb8af4b6d904b879ace771551",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading tokenizer_config.json: 0%| | 0.00/727 [00:00, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "00696440baba428490dc95ebc72e788e",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading tokenizer.model: 0%| | 0.00/500k [00:00, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "ab6941b786604960a6184de5f83d6c6f",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading (…)cial_tokens_map.json: 0%| | 0.00/411 [00:00, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "3e400844696a47edb97fb85404390425",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading tokenizer.json: 0%| | 0.00/1.84M [00:00, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"You are using the default legacy behaviour of the . This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "9eebf1f4fe884b67a0b04eecc99bf6b5",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading config.json: 0%| | 0.00/913 [00:00, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.\n",
" torch.utils._pytree._register_pytree_node(\n",
"/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.\n",
" torch.utils._pytree._register_pytree_node(\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "feff1892c44a42c787bd9467eabc1712",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading model.safetensors: 0%| | 0.00/7.26G [00:00, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:auto_gptq.nn_modules.qlinear.qlinear_cuda_old:CUDA extension not installed.\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "df504ac591534a5bb4184e93fefafecd",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading generation_config.json: 0%| | 0.00/132 [00:00, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"model_name = \"TheBloke/Llama-2-13B-GPTQ\"\n",
"\n",
"# Загружаем Llama токенизатор\n",
"tokenizer = transformers.LlamaTokenizer.from_pretrained(\n",
" model_name, device_map=device\n",
")\n",
"tokenizer.pad_token_id = tokenizer.eos_token_id\n",
"\n",
"# И саму модель Llama\n",
"model = transformers.AutoModelForCausalLM.from_pretrained(\n",
" model_name,\n",
" device_map=\"auto\",\n",
" torch_dtype=torch.float16,\n",
" low_cpu_mem_usage=True,\n",
" offload_state_dict=True,\n",
")"
]
},
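{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check (an illustrative sketch, not part of the original pipeline, with an arbitrary sample string), we can encode a short string with the loaded tokenizer, look at the resulting token ids and pieces, and decode them back."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative tokenizer round-trip; the sample string is arbitrary\n",
"sample = \"Hello, Llama!\"\n",
"ids = tokenizer(sample)[\"input_ids\"]\n",
"print(\"Token ids:\", ids)\n",
"print(\"Tokens:\", tokenizer.convert_ids_to_tokens(ids))\n",
"print(\"Decoded:\", tokenizer.decode(ids))"
]
},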
{
"cell_type": "markdown",
"metadata": {
"id": "OL25ObCDjtCh"
},
"source": [
"Зададим промпт и выведем сгенерированное моделью продолжение."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "gGfyeM-vdq5o",
"outputId": "2b9b31ba-6824-473e-bdc4-bc68130bb37e",
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Input batch (encoded): {'input_ids': tensor([[ 1, 450, 937, 10943, 14436, 713, 2834, 689, 3430, 763]],\n",
" device='cuda:0'), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], device='cuda:0')}\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1421: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use and modify the model generation configuration (see https://huggingface.co/docs/transformers/generation_strategies#default-text-generation-configuration )\n",
" warnings.warn(\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Output: The first discovered martian lifeform looks like a \"fancy spheroid\"\n",
"It’s not an alien, but it’s close.\n",
"A methane-rich meteorite from Mars is the planet’s first known lifeform.\n",
"NASA / JSC / SCIENCE PHOTO LIBR\n"
]
}
],
"source": [
"prompt = \"The first discovered martian lifeform looks like\"\n",
"batch = tokenizer(prompt, return_tensors=\"pt\", return_token_type_ids=False).to(\n",
" device\n",
")\n",
"print(\"Input batch (encoded):\", batch)\n",
"\n",
"output_tokens = model.generate(\n",
" **batch, max_new_tokens=64, do_sample=True, temperature=0.8\n",
")\n",
"# greedy inference: do_sample=False)\n",
"# beam search for highest probability: num_beams=4)\n",
"\n",
"print(\"\\nOutput:\", tokenizer.decode(output_tokens[0].cpu()))"
]
},
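{
"cell_type": "markdown",
"metadata": {},
"source": [
"The comments in the cell above mention two deterministic alternatives to sampling. As a minimal sketch (outputs not shown, results depend on the model), the same `model.generate` call can be run with greedy decoding (`do_sample=False`) or with beam search (`num_beams=4`)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Greedy decoding: always take the most likely next token (deterministic)\n",
"greedy_tokens = model.generate(**batch, max_new_tokens=64, do_sample=False)\n",
"print(\"Greedy:\", tokenizer.decode(greedy_tokens[0].cpu()))\n",
"\n",
"# Beam search: keep several candidate continuations and return the most probable one\n",
"beam_tokens = model.generate(\n",
"    **batch, max_new_tokens=64, do_sample=False, num_beams=4\n",
")\n",
"print(\"\\nBeam search:\", tokenizer.decode(beam_tokens[0].cpu()))"
]
},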
{
"cell_type": "markdown",
"metadata": {
"id": "6SJ-c-lTj27E"
},
"source": [
"**Вывод:** В этом ноутбуке мы посмотрели, как можно генерировать текст с помощью предобученной языковой модели LLAMA."
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"gpuType": "T4",
"provenance": []
},
"hide_input": false,
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
}