# Gemma-3-TAIDE-12b-Chat-2602 GGUF

GGUF quantizations of taide/Gemma-3-TAIDE-12b-Chat-2602, a 12.4B-parameter model (gemma3 architecture) fine-tuned for Traditional Chinese and Taiwan-centric applications.

Quantized with llama.cpp b8110.

## Available Quantizations

| File | Size | Description |
|------|------|-------------|
| Gemma-3-TAIDE-12b-Chat-2602-F16.gguf | 25 GB | Full precision (F16) |
| Gemma-3-TAIDE-12b-Chat-2602-Q8_0.gguf | 13 GB | 8-bit, highest quality |
| Gemma-3-TAIDE-12b-Chat-2602-Q6_K.gguf | 10 GB | 6-bit |
| Gemma-3-TAIDE-12b-Chat-2602-Q5_K_M.gguf | 8.8 GB | 5-bit medium |
| Gemma-3-TAIDE-12b-Chat-2602-Q4_K_M.gguf | 7.6 GB | 4-bit medium (recommended) |
| Gemma-3-TAIDE-12b-Chat-2602-Q3_K_L.gguf | 6.7 GB | 3-bit large |
| Gemma-3-TAIDE-12b-Chat-2602-Q3_K_M.gguf | 6.3 GB | 3-bit medium |
| Gemma-3-TAIDE-12b-Chat-2602-Q3_K_S.gguf | 5.7 GB | 3-bit small |
| Gemma-3-TAIDE-12b-Chat-2602-Q2_K.gguf | 5.0 GB | 2-bit, smallest |
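As a rough rule of thumb, the on-disk size of each quant follows from the parameter count and the effective bits per weight (e.g. F16 at 16 bits/weight gives 12.4e9 × 16 / 8 ≈ 24.8 GB, matching the 25 GB file). A minimal sketch; the function name and the bits-per-weight figures are my own approximations, not exact llama.cpp values:

```python
def estimate_gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough on-disk size of a quantized model in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

# Approximate effective bits per weight for common llama.cpp quant
# types (assumed ballpark figures for illustration):
BPW = {"Q8_0": 8.5, "Q6_K": 6.56, "Q5_K_M": 5.67, "Q4_K_M": 4.85, "Q2_K": 3.35}

for quant, bpw in BPW.items():
    print(f"{quant}: ~{estimate_gguf_size_gb(12.4e9, bpw):.1f} GB")
```

These estimates land close to the table above (e.g. ~7.5 GB for Q4_K_M vs. the listed 7.6 GB); the small gap comes from per-tensor quant mixing and metadata overhead.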

## Usage

```shell
llama-cli -m Gemma-3-TAIDE-12b-Chat-2602-Q4_K_M.gguf \
  -p "你是一個來自台灣的AI助理,你的名字是 TAIDE" \
  -cnv
```

The `-p` prompt translates to "You are an AI assistant from Taiwan; your name is TAIDE", and `-cnv` starts an interactive conversation.
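After downloading a quant, a quick sanity check is to read the file header: every GGUF file begins with the 4-byte magic `GGUF` followed by a little-endian uint32 format version. A minimal sketch, assuming the helper name below (which is mine, not part of any library):

```python
import struct

def read_gguf_header(path: str) -> int:
    """Return the GGUF format version, raising ValueError if the
    file does not start with the GGUF magic bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        (version,) = struct.unpack("<I", f.read(4))
        return version

# Example: read_gguf_header("Gemma-3-TAIDE-12b-Chat-2602-Q4_K_M.gguf")
```

A truncated or corrupted download typically fails this check immediately, which is cheaper than waiting for `llama-cli` to error out at load time.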
