dragon.ai.inference.config.ModelConfig
- class ModelConfig
Bases: object
LLM model configuration.
- __init__(model_name: str, hf_token: str, tp_size: int, dtype: str = 'bfloat16', max_tokens: int = 100, max_model_len: int = 8192, padding_side: str = 'left', truncation_side: str = 'left', top_k: int = 50, top_p: float = 0.95, system_prompt: List[str] = <factory>, vllm_log_level: str = 'error') -> None
Methods
__init__(model_name, hf_token, tp_size, ...)
validate(gpus_per_node) – Validate model configuration.
- validate(gpus_per_node: int) -> None
Validate model configuration.
- Parameters:
gpus_per_node (int) – Number of GPUs available per node.
- Raises:
ValueError – If any configuration parameter is invalid.
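The construct-then-validate pattern described above can be sketched as follows. This is a minimal, self-contained illustration, not the library's implementation: `ModelConfigSketch` is a hypothetical stand-in that copies the documented field names and defaults, and the specific checks inside `validate` (tensor-parallel size bounded by `gpus_per_node`, `top_p` in (0, 1], `max_tokens` no larger than `max_model_len`) are assumptions, since the reference only states that a ValueError is raised for invalid parameters. The model name and token are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModelConfigSketch:
    """Hypothetical stand-in mirroring the documented ModelConfig fields."""

    model_name: str
    hf_token: str
    tp_size: int
    dtype: str = "bfloat16"
    max_tokens: int = 100
    max_model_len: int = 8192
    padding_side: str = "left"
    truncation_side: str = "left"
    top_k: int = 50
    top_p: float = 0.95
    # The reference shows `<factory>` here; an empty list is an assumed default.
    system_prompt: List[str] = field(default_factory=list)
    vllm_log_level: str = "error"

    def validate(self, gpus_per_node: int) -> None:
        """Raise ValueError on invalid settings. All rules below are assumptions."""
        if not 1 <= self.tp_size <= gpus_per_node:
            raise ValueError(
                f"tp_size must be between 1 and {gpus_per_node}, got {self.tp_size}"
            )
        if not 0.0 < self.top_p <= 1.0:
            raise ValueError(f"top_p must be in (0, 1], got {self.top_p}")
        if self.max_tokens > self.max_model_len:
            raise ValueError("max_tokens must not exceed max_model_len")


# Placeholder values; only the three required arguments are supplied.
config = ModelConfigSketch(
    model_name="example-org/example-model",
    hf_token="hf_placeholder",
    tp_size=4,
)
config.validate(gpus_per_node=4)  # passes silently under the assumed rules

# A tensor-parallel size larger than the node's GPU count is rejected.
error_message: Optional[str] = None
try:
    ModelConfigSketch(
        model_name="example-org/example-model",
        hf_token="hf_placeholder",
        tp_size=8,
    ).validate(gpus_per_node=4)
except ValueError as err:
    error_message = str(err)
```

Calling `validate` right after construction, before any model is loaded, surfaces configuration mistakes early; with the real class, the same call order would presumably apply, with `gpus_per_node` taken from the actual cluster layout.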