### [mistral7b-bnb](mistral7b-bnb) multi GPU training with FSDP
### [llama3.1-70b-bnb](llama3.1-70b-bnb) multi GPU training with FSDP
Finetune and evaluate [Llama 3.1 70B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct) on the [MedMCQA](https://medmcqa.github.io) dataset, using 4-bit [bitsandbytes quantisation](https://huggingface.co/docs/bitsandbytes/index) and the [fully sharded data parallel (FSDP)](https://pytorch.org/tutorials/intermediate/FSDP_tutorial.html) approach to train across multiple GPUs on a single node.
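
Combining FSDP with 4-bit bitsandbytes weights typically means launching the training script through 🤗 Accelerate with an FSDP configuration. The fragment below is a minimal sketch of such an `accelerate` config, not the exact one used by this example; field names and defaults can vary between Accelerate versions, and `num_processes` (assumed to be 4 here) should match your GPU count.

```yaml
# Hypothetical accelerate config for single-node, multi-GPU FSDP training.
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
mixed_precision: bf16
num_machines: 1
num_processes: 4            # one process per GPU; adjust to your node
fsdp_config:
  fsdp_sharding_strategy: FULL_SHARD          # shard params, grads, and optimizer state
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_state_dict_type: SHARDED_STATE_DICT    # keep checkpoints sharded to save memory
  fsdp_cpu_ram_efficient_loading: true        # load weights on rank 0 only, then shard
  fsdp_offload_params: false
```

A script would then be launched with something like `accelerate launch --config_file fsdp_config.yaml train.py` (script name assumed for illustration).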