Ollama

You can easily run AI models on your local machine using Ollama. Installation is easy: use the install script or grab the tarball directly from the Releases page (ollama-linux-amd64.tgz).
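For example, the official install script is a one-liner, and if you grabbed the tarball instead, extracting it into /usr is what the docs suggest:

$ # install via the official script
$ curl -fsSL https://ollama.com/install.sh | sh

$ # or install from the downloaded tarball
$ sudo tar -C /usr -xzf ollama-linux-amd64.tgz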

The way normal people use this is to just find the model on the website and simply:

$ ollama pull <model>

The problem is that the download gets interrupted every time due to internet speed, and it starts over from scratch. Basically you can't download the model this way, so we have to download it manually and tell Ollama to use it.

There are many models out there. I recommend DeepSeek-R1, but you can use whatever you want, like Llama or Mistral or anything else.

> DeepSeek-R1

Hugging Face

This is a famous website for AI models, the same way GitHub is for Git projects.

https://huggingface.co

For example, search for DeepSeek-R1 and find the official page:

https://huggingface.co/deepseek-ai/DeepSeek-R1

Companies like DeepSeek release several variants of their models, like one with 7B parameters.


Parameters

You should choose the number of parameters, like 7 billion (7B). More parameters mean better answers, but also more resources. Most personal computers with a decent GPU can run 7B models. Remember that the answers will not be as good as ChatGPT or other websites; those use 70B or higher models.
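If you are not sure what your machine can handle, check how much VRAM your GPU has first (assuming an NVIDIA card; other GPUs have their own vendor tools):

$ # print total VRAM on the card
$ nvidia-smi --query-gpu=memory.total --format=csv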

The DeepSeek page I showed you earlier is the 685B-parameter model, but you want the 7B one:

https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B


> 7B parameters

Quantization

Quantization makes models use fewer resources, at the cost of less accurate answers. There is a trade-off here, and Q4 is what you want. You will also see K_M, _0, K_S and other types; you want K_M.
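As a rough rule of thumb (an approximation that ignores overhead), the file size is the parameter count times the bits per weight: a 7B model at Q4 is about 7B × 4 bits ≈ 3.5 GB, while the original 16-bit weights are about 7B × 16 bits ≈ 14 GB. That is why a quantized 7B model fits on a normal GPU.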

There are many people and groups that quantize models, like this one:


> Q4_K_M

GGUF vs Safetensors

These are two file formats for storing a model's data. Models are officially released in Safetensors, but those files are way larger than GGUF and use more resources.

> GGUF format

Click Files and versions and download the desired file, in this case the Q4_K_M GGUF one.
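Since the whole point here is surviving an unstable connection, use a downloader that can resume. For example with wget (the URL below is the usual Hugging Face pattern; replace the user, repo and file placeholders with the ones from the page):

$ # -c resumes a partially downloaded file instead of starting over
$ wget -c https://huggingface.co/<user>/<repo>/resolve/main/<file>.gguf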

Modelfile

Now create a file named Modelfile with this content:

FROM /path/to/model.gguf

Also add more data, like the template and the system prompt:

PARAMETER temperature 1
TEMPLATE """{{- if .System }}{{ .System }}{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "user" }}<|User|>{{ .Content }}
{{- else if eq .Role "assistant" }}<|Assistant|>{{ .Content }}{{- if not $last }}<|end▁of▁sentence|>{{- end }}
{{- end }}
{{- if and $last (ne .Role "assistant") }}<|Assistant|>{{- end }}
{{- end }}"""
PARAMETER stop "<|begin▁of▁sentence|>",
PARAMETER stop "<|end▁of▁sentence|>",
PARAMETER stop "<|User|>",
PARAMETER stop "<|Assistant|>"
SYSTEM """
You are DeepSeek. You are a helpful assistant.
"""

You can get this data from the model's page on the Ollama website.

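If you do manage to pull an official model some day, you can also dump its Modelfile locally and copy the template and parameters from there (the deepseek-r1:7b tag here is just an example):

$ ollama show deepseek-r1:7b --modelfile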

Run!

Run the daemon:

$ ollama serve
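Note: if you used the install script, Ollama is usually already running as a systemd service on Linux, so instead of starting it by hand you can just check it:

$ systemctl status ollama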

Create the model:

$ ollama create deepseek -f Modelfile
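You can verify it was created by listing your local models:

$ ollama list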

Run:

$ ollama run deepseek
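This drops you into an interactive chat. Ollama also listens on port 11434, so you can query the same model from scripts through its HTTP API:

$ curl http://localhost:11434/api/generate -d '{
    "model": "deepseek",
    "prompt": "Why is the sky blue?"
  }'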

I hope you found this article useful.