Llama (ChatGPT @Home)

An on-prem LLM in just two files. - Intro to Large Language Models

Llama vs ChatGPT

Llama.cpp

The main goal of llama.cpp is to run the LLaMA model using 4-bit integer quantization on a MacBook.

$ git clone https://github.com/ggerganov/llama.cpp
$ cd llama.cpp
# https://github.com/ggerganov/llama.cpp#hipblas
$ make LLAMA_HIPBLAS=1 #  BLAS acceleration on HIP-supported AMD GPUs (ROCm) 
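
The 4-bit quantization mentioned above can also be done locally with the tools built by make, instead of downloading a pre-quantized file. A rough sketch, assuming the original fp16 LLaMA-2 weights already sit under ./models/7B/ (script and binary names as in late-2023 llama.cpp; check the repo README for your version):

$ python3 -m pip install -r requirements.txt   # dependencies for convert.py
$ python3 convert.py ./models/7B/              # writes ./models/7B/ggml-model-f16.gguf
$ ./quantize ./models/7B/ggml-model-f16.gguf \
    ./models/7B/llama-2-7b.Q4_K_M.gguf Q4_K_M  # 4-bit (Q4_K_M) quantization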

Download Model

Download an already quantized/converted model in GGUF format (e.g. llama-2-7b.Q4_K_M.gguf) and place it under ./models/7B/.
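
One way to fetch such a model is the Hugging Face CLI; a sketch assuming TheBloke/Llama-2-7B-GGUF as the source repository (any other GGUF provider works the same way):

$ pip install -U huggingface_hub
$ huggingface-cli download TheBloke/Llama-2-7B-GGUF llama-2-7b.Q4_K_M.gguf --local-dir ./models/7B/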

Running Chat

$ ./main -m ./models/7B/llama-2-7b.Q4_K_M.gguf  --repeat_penalty 1.0 -ngl 100 --color -i -r "User:" -f prompts/chat-with-bob.txt

User: Where is Bratislava?
Bob: Bratislava is the capital of Slovakia, which is a country in the European Union.
User: When was it founded?
Bob: Bratislava was founded in 1536, by King Ferdinand I. # Beware of LLM answers: this founding date is a hallucination.
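
Beyond interactive chat, the same binary handles one-shot completions; a minimal sketch (flag names as in late-2023 llama.cpp):

$ ./main -m ./models/7B/llama-2-7b.Q4_K_M.gguf -ngl 100 -n 128 \
    -p "Bratislava is the capital of"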

see

UI?

  • gpt-llama.cpp - replaces OpenAI's GPT APIs with llama.cpp's supported models locally (llama.cpp's own server example, sketched after this list, exposes a similar HTTP endpoint)
  • LlamaGPT - a self-hosted, offline, ChatGPT-like chatbot powered by Llama 2. 100% private, with no data leaving your device.
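
llama.cpp also ships its own HTTP server example, which is what such projects wrap; a minimal sketch of starting it and querying the completion endpoint (default port 8080, endpoint and flag names as in late-2023 llama.cpp):

$ ./server -m ./models/7B/llama-2-7b.Q4_K_M.gguf -ngl 100
$ curl http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Where is Bratislava?", "n_predict": 64}'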

see also

Written on December 3, 2023, Last update on December 9, 2023