Commit 77a8b152 authored by Pfister, Martin

Add unsloth LEONARDO results

parent 2ec4b371
@@ -36,7 +36,7 @@ Finetune and evaluate [Mistral 7B Instruct v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)
 | - | - | - | - | - |
 | VSC5 (Nvidia A40) | samples/s | samples/s | GB | GB |
 | VSC5 (Nvidia A100) | 13.6 samples/s (+72%) | 18.4 samples/s (-7%) | 6.9 GB (-36%) | 13.7 GB (-51%) |
-| Leonardo (Nvidia A100) | samples/s | samples/s | GB | GB |
+| Leonardo (Nvidia A100) | 14.9 samples/s (+71%) | 21.5 samples/s (-5%) | 7.0 GB (-40%) | 13.7 GB (-51%) |
 ### [mistral7b-bnb](mistral7b-bnb) multi GPU training with DDP
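The commit also adds the Leonardo job log, reproduced below. For orientation, here is a minimal sketch of what a training script like `mistral7b_train.py` could look like, pieced together from that log (4-bit `unsloth/mistral-7b-instruct-v0.3-bnb-4bit` base model, the medmcqa dataset, batch size 8, 100 steps, 41,943,040 trainable LoRA parameters). The LoRA rank, learning rate, sequence length, and prompt formatting are assumptions, not the repository's actual code.

```python
# Hypothetical reconstruction of mistral7b_train.py, inferred from the job
# log below. Values marked "assumed" are not confirmed by the commit.
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-instruct-v0.3-bnb-4bit",  # from the log
    max_seq_length=2048,  # assumed
    load_in_4bit=True,
)

# r=16 on all seven projection modules yields exactly the 41,943,040
# trainable parameters the log reports (see the arithmetic check below).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,   # assumed
    lora_dropout=0,
    bias="none",
)

dataset = load_dataset("medmcqa", split="train")  # 182,822 examples, as in the log
# ... followed by a map() step that formats each question into a "text"
# prompt column (assumed; the log shows two Map passes over the dataset)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumed column name
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=8,  # "Batch size per device = 8"
        gradient_accumulation_steps=1,
        max_steps=100,                  # "Total steps = 100"
        learning_rate=1e-4,             # assumed; log shows LR ramping toward ~1e-4
        bf16=True,
        logging_steps=50,
        output_dir="outputs",
    ),
)
trainer.train()
```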
Unloading profile/base
ERROR: Module evaluation aborted
+ date
Tue Sep 10 17:33:56 CEST 2024
+ hostname
lrdn2201.leonardo.local
+ nvidia-smi
Tue Sep 10 17:33:56 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100-SXM-64GB On | 00000000:1D:00.0 Off | 0 |
| N/A 43C P0 63W / 474W| 0MiB / 65536MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
+ conda run -n finetuning --no-capture-output python mistral7b_train.py
Unsloth: Will load unsloth/mistral-7b-instruct-v0.3-bnb-4bit as a legacy tokenizer.
Using the latest cached version of the dataset since medmcqa couldn't be found on the Hugging Face Hub
Found the latest cached dataset configuration 'default' at /leonardo/home/userexternal/mpfister/.cache/huggingface/datasets/medmcqa/default/0.0.0/91c6572c454088bf71b679ad90aa8dffcd0d5868 (last modified on Thu Aug 29 19:38:14 2024).
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
==((====))== Unsloth 2024.8: Fast Mistral patching. Transformers = 4.43.4.
\\ /| GPU: NVIDIA A100-SXM-64GB. Max memory: 63.423 GB. Platform = Linux.
O^O/ \_/ \ Pytorch: 2.2.0+cu121. CUDA = 8.0. CUDA Toolkit = 12.1.
\ / Bfloat16 = TRUE. FA [Xformers = 0.0.24. FA2 = True]
"-____-" Free Apache license: http://github.com/unslothai/unsloth
Map: 100%|██████████| 182822/182822 [00:38<00:00, 4688.50 examples/s]
Unsloth 2024.8 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
Map: 100%|██████████| 182822/182822 [01:53<00:00, 1615.62 examples/s]
Detected kernel version 4.18.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
max_steps is given, it will override any value given in num_train_epochs
==((====))== Unsloth - 2x faster free finetuning | Num GPUs = 1
\\ /| Num examples = 182,822 | Num Epochs = 1
O^O/ \_/ \ Batch size per device = 8 | Gradient Accumulation steps = 1
\ / Total batch size = 8 | Total steps = 100
"-____-" Number of trainable parameters = 41,943,040
trainable params: 41,943,040 || all params: 7,289,966,592 || trainable%: 0.5754
100%|██████████| 100/100 [00:53<00:00, 1.86it/s]
/leonardo/home/userexternal/mpfister/.conda/envs/finetuning/lib/python3.11/site-packages/peft/utils/other.py:619: UserWarning: Unable to fetch remote file due to the following error (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /unsloth/mistral-7b-instruct-v0.3-bnb-4bit/resolve/main/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x150ea3ee3d10>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: f7b42e8c-3e90-4492-a1cc-1fbf976543d7)') - silently ignoring the lookup for the file config.json in unsloth/mistral-7b-instruct-v0.3-bnb-4bit.
warnings.warn(
/leonardo/home/userexternal/mpfister/.conda/envs/finetuning/lib/python3.11/site-packages/peft/utils/save_and_load.py:218: UserWarning: Could not find a config file in unsloth/mistral-7b-instruct-v0.3-bnb-4bit - will assume that the vocabulary was not modified.
warnings.warn(
{'loss': 1.6021, 'grad_norm': 1.7103830575942993, 'learning_rate': 4.7e-05, 'epoch': 0.0}
{'loss': 0.9107, 'grad_norm': 1.489850640296936, 'learning_rate': 9.7e-05, 'epoch': 0.0}
{'train_runtime': 53.6592, 'train_samples_per_second': 14.909, 'train_steps_per_second': 1.864, 'train_loss': 1.2563672256469727, 'epoch': 0.0}
Run time: 53.66 seconds
1 GPUs used.
Training speed: 14.9 samples/s (=14.9 samples/s/GPU)
Memory occupied on GPUs: 7.0 GB.
real 6m9.210s
user 3m22.330s
sys 0m20.955s
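One detail worth sanity-checking in the training output above: the "trainable params: 41,943,040" line is exactly what LoRA rank r=16 adapters on the seven patched projection modules of Mistral 7B would produce. The rank is an inference from the numbers, not stated in the commit:

```python
# Cross-check of "trainable params: 41,943,040", assuming LoRA rank r=16 on
# all modules Unsloth reports patching (32 QKV, 32 O, 32 MLP layers).
r = 16
hidden, kv_dim, mlp_dim, layers = 4096, 1024, 14336, 32  # Mistral 7B shapes

per_layer = (
    r * (hidden + hidden)     # q_proj: A (4096 x r) + B (r x 4096)
    + r * (hidden + kv_dim)   # k_proj (grouped-query attention: 1024 out)
    + r * (hidden + kv_dim)   # v_proj
    + r * (hidden + hidden)   # o_proj
    + r * (hidden + mlp_dim)  # gate_proj
    + r * (hidden + mlp_dim)  # up_proj
    + r * (mlp_dim + hidden)  # down_proj
)
assert layers * per_layer == 41_943_040  # matches the log exactly
print(layers * per_layer / 7_289_966_592)  # ~0.5754%, as reported
```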
+ conda run -n finetuning --no-capture-output python mistral7b_test.py
Unsloth: Will load unsloth/mistral-7b-instruct-v0.3-bnb-4bit as a legacy tokenizer.
Unsloth 2024.8 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
Using the latest cached version of the dataset since medmcqa couldn't be found on the Hugging Face Hub
Found the latest cached dataset configuration 'default' at /leonardo/home/userexternal/mpfister/.cache/huggingface/datasets/medmcqa/default/0.0.0/91c6572c454088bf71b679ad90aa8dffcd0d5868 (last modified on Tue Sep 10 17:39:08 2024).
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
==((====))== Unsloth 2024.8: Fast Mistral patching. Transformers = 4.43.4.
\\ /| GPU: NVIDIA A100-SXM-64GB. Max memory: 63.423 GB. Platform = Linux.
O^O/ \_/ \ Pytorch: 2.2.0+cu121. CUDA = 8.0. CUDA Toolkit = 12.1.
\ / Bfloat16 = TRUE. FA [Xformers = 0.0.24. FA2 = True]
"-____-" Free Apache license: http://github.com/unslothai/unsloth
Map: 100%|██████████| 4183/4183 [00:00<00:00, 4712.54 examples/s]
100%|██████████| 66/66 [03:14<00:00, 2.95s/it]
45.16% (1889 out of 4183) answers correct.
Run time: 194.71 seconds
Samples/second: 21.5
Memory occupied on GPUs: 13.7 GB.
real 5m25.199s
user 2m52.396s
sys 0m41.669s
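The second run evaluates the finetuned model on the medmcqa validation split: 4,183 questions in 66 batches, i.e. a batch size of 64. A minimal sketch of what `mistral7b_test.py` might do; the checkpoint path, prompt format, and answer parsing here are illustrative assumptions:

```python
# Hypothetical sketch of the evaluation loop behind the log above.
from unsloth import FastLanguageModel
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="mistral7b-finetuned",  # hypothetical path to the saved adapter
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference path
tokenizer.padding_side = "left"         # left-pad for batched decoder-only generation

dataset = load_dataset("medmcqa", split="validation")  # 4,183 questions
batch_size = 64  # 66 batches over 4,183 samples, as in the log
correct = 0

for start in range(0, len(dataset), batch_size):
    batch = dataset.select(range(start, min(start + batch_size, len(dataset))))
    prompts = [
        f"{q['question']}\nA) {q['opa']}\nB) {q['opb']}\n"
        f"C) {q['opc']}\nD) {q['opd']}\nAnswer:"
        for q in batch
    ]
    inputs = tokenizer(prompts, return_tensors="pt", padding=True).to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=4, do_sample=False)
    new_tokens = outputs[:, inputs["input_ids"].shape[1]:]  # strip the prompt
    for q, reply_ids in zip(batch, new_tokens):
        reply = tokenizer.decode(reply_ids, skip_special_tokens=True).strip()
        if reply[:1].upper() == "ABCD"[q["cop"]]:  # "cop" is medmcqa's answer index
            correct += 1

print(f"{100 * correct / len(dataset):.2f}% "
      f"({correct} out of {len(dataset)}) answers correct.")
```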