# Gemma-3-TAIDE-12b-Chat-2602 GGUF

GGUF quantizations of taide/Gemma-3-TAIDE-12b-Chat-2602, a 12.4B-parameter model (gemma3 architecture) fine-tuned for Traditional Chinese and Taiwan-centric applications.

Quantized with llama.cpp b8110.

## Available Quantizations

| File | Size | Description |
|------|------|-------------|
| Gemma-3-TAIDE-12b-Chat-2602-F16.gguf | 25 GB | Full precision (F16) |
| Gemma-3-TAIDE-12b-Chat-2602-Q8_0.gguf | 13 GB | 8-bit, highest quality |
| Gemma-3-TAIDE-12b-Chat-2602-Q6_K.gguf | 10 GB | 6-bit |
| Gemma-3-TAIDE-12b-Chat-2602-Q5_K_M.gguf | 8.8 GB | 5-bit medium |
| Gemma-3-TAIDE-12b-Chat-2602-Q4_K_M.gguf | 7.6 GB | 4-bit medium (recommended) |
| Gemma-3-TAIDE-12b-Chat-2602-Q3_K_L.gguf | 6.7 GB | 3-bit large |
| Gemma-3-TAIDE-12b-Chat-2602-Q3_K_M.gguf | 6.3 GB | 3-bit medium |
| Gemma-3-TAIDE-12b-Chat-2602-Q3_K_S.gguf | 5.7 GB | 3-bit small |
| Gemma-3-TAIDE-12b-Chat-2602-Q2_K.gguf | 5.0 GB | 2-bit, smallest |
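As a rough rule of thumb, the on-disk size of each quant follows from the parameter count and the effective bits per weight (e.g. F16 at 16 bits/weight gives 12.4e9 × 16 / 8 ≈ 24.8 GB, matching the 25 GB file). A minimal sketch; the function name and the bits-per-weight figures are my own approximations, not exact llama.cpp values:

```python
def estimate_gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough on-disk size of a quantized model in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

# Approximate effective bits per weight for common llama.cpp quant
# types (assumed ballpark figures for illustration):
BPW = {"Q8_0": 8.5, "Q6_K": 6.56, "Q5_K_M": 5.67, "Q4_K_M": 4.85, "Q2_K": 3.35}

for quant, bpw in BPW.items():
    print(f"{quant}: ~{estimate_gguf_size_gb(12.4e9, bpw):.1f} GB")
```

These estimates land close to the table above (e.g. ~7.5 GB for Q4_K_M vs. the listed 7.6 GB); the small gap comes from per-tensor quant mixing and metadata overhead.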

## Usage

```shell
llama-cli -m Gemma-3-TAIDE-12b-Chat-2602-Q4_K_M.gguf \
  -p "你是一個來自台灣的AI助理,你的名字是 TAIDE" \
  -cnv
```

The `-p` prompt translates to "You are an AI assistant from Taiwan; your name is TAIDE", and `-cnv` starts an interactive conversation.
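After downloading a quant, a quick sanity check is to read the file header: every GGUF file begins with the 4-byte magic `GGUF` followed by a little-endian uint32 format version. A minimal sketch, assuming the helper name below (which is mine, not part of any library):

```python
import struct

def read_gguf_header(path: str) -> int:
    """Return the GGUF format version, raising ValueError if the
    file does not start with the GGUF magic bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        (version,) = struct.unpack("<I", f.read(4))
        return version

# Example: read_gguf_header("Gemma-3-TAIDE-12b-Chat-2602-Q4_K_M.gguf")
```

A truncated or corrupted download typically fails this check immediately, which is cheaper than waiting for `llama-cli` to error out at load time.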
