Qwen2.5-Coder-14B-Instruct is an instruction-tuned model developed by the Qwen team and optimized for code tasks. It is suited to code generation, code reasoning, debugging, and similar application scenarios.
Model architecture
The model contains 48 Transformer layers and uses rotary position embedding (RoPE), the SwiGLU activation function, RMSNorm normalization, and an attention mechanism with QKV bias.
It uses Grouped Query Attention (GQA) with 40 query heads and 8 key-value heads, designed for efficient code processing.
Parameter count
The total parameter count is 14.7 billion, of which 13.1 billion are non-embedding parameters.
Context length
The model supports context lengths of up to 131,072 tokens; handling of large codebases and long documents beyond the native window is enabled through YaRN.
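The Qwen2.5 model cards describe enabling YaRN by adding a rope_scaling entry to the model configuration. The sketch below is one assumed way to do this in Python by editing the loaded config before instantiating the model, rather than editing config.json directly; the factor of 4.0 corresponds to extending the 32,768-token native window to roughly 131,072 tokens, and the exact key names may differ across transformers versions.

from transformers import AutoConfig, AutoModelForCausalLM

model_name = "Qwen/Qwen2.5-Coder-14B-Instruct"

# Assumed approach: attach the YaRN rope_scaling settings recommended in the
# Qwen2.5 model card to the config before loading the weights.
config = AutoConfig.from_pretrained(model_name)
config.rope_scaling = {
    "type": "yarn",
    "factor": 4.0,  # about 4x the 32,768-token native window
    "original_max_position_embeddings": 32768,
}

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)

YaRN scaling can reduce quality on short inputs, so it is generally only worth enabling when long contexts are actually required.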
Performance
It delivers strong performance in code generation, code reasoning, and code repair, along with solid results in mathematical calculation and general-purpose tasks.
The base models are available in a range of sizes, including 0.5B, 1.5B, 3B, 7B, 14B, and 32B, and are suited to code completion and basic tasks.
The instruction-tuned models are optimized for interactive tasks such as code generation and debugging; the 14B-Instruct variant is well suited to chat-based application scenarios.
Python version: 3.9 or higher.
Transformers library: version 4.37.0 or higher, which adds support for Qwen2-series models.
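As a quick sanity check of the environment (an illustrative sketch; loading a Qwen2-series checkpoint on older transformers releases typically fails with an error such as KeyError: 'qwen2'):

from packaging import version
import transformers

# Qwen2-family architectures require transformers >= 4.37.0; older releases
# cannot resolve the "qwen2" model type when loading the config.
assert version.parse(transformers.__version__) >= version.parse("4.37.0"), \
    "Please upgrade transformers to 4.37.0 or later"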
Sample code for loading the model with Hugging Face's transformers library is as follows:
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-Coder-14B-Instruct"

# Load the model weights, letting transformers pick the dtype and place the
# layers across available devices automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# Load the matching tokenizer, which also provides the chat template.
tokenizer = AutoTokenizer.from_pretrained(model_name)
Once loaded, the model can efficiently handle tasks such as code generation and debugging.
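Building on the loading snippet above (which defines model and tokenizer), a typical generation call renders the conversation with the tokenizer's chat template; the prompt below is purely illustrative:

# Illustrative prompt; any coding task can be substituted here.
prompt = "Write a quick sort algorithm in Python."
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": prompt},
]

# Render the conversation with the model's chat template and tokenize it.
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate, then strip the prompt tokens from the output before decoding.
generated_ids = model.generate(**model_inputs, max_new_tokens=512)
generated_ids = [
    output[len(inputs):]
    for inputs, output in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)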