
Add support for tensors/heads not divisible by GPUs #8

Triggered via pull request: February 4, 2025 15:27
Status: Failure
Total duration: 4m 19s

pre-commit.yml

on: pull_request

Annotations: 10 errors
vllm/config.py:706:9: F841 Local variable `total_num_attention_heads` is assigned to but never used
vllm/config.py:708:9: F841 Local variable `tensor_parallel_size` is assigned to but never used
vllm/model_executor/layers/fused_moe/layer.py:455:9: F841 Local variable `tp_rank` is assigned to but never used
vllm/model_executor/layers/linear.py:292:81: E501 Line too long (93 > 80)
vllm/model_executor/layers/linear.py:379:13: F841 Local variable `shard_size` is assigned to but never used
vllm/model_executor/layers/linear.py:680:9: F841 Local variable `tp_size` is assigned to but never used
vllm/model_executor/layers/linear.py:1129:9: F841 Local variable `tp_rank` is assigned to but never used
vllm/model_executor/layers/linear.py:1153:13: F841 Local variable `shard_size` is assigned to but never used
vllm/model_executor/layers/quantization/base_config.py:45:81: E501 Line too long (83 > 80)
vllm/model_executor/layers/quantization/fp8.py:174:9: F841 Local variable `tp_chunk` is assigned to but never used
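
All ten annotations reduce to two Ruff rules: F841 (a local variable is assigned but never used) and E501 (a line exceeds the 80-character limit). Below is a minimal sketch of both failure modes and their usual fixes; the function and variable names are invented for illustration and only mirror those in the annotations, they are not the flagged vllm code.

```python
# Hypothetical sketch, not the actual vllm code flagged above.


def heads_per_rank(total_num_attention_heads: int, tp_size: int) -> int:
    """Ceiling-divide attention heads across tensor-parallel ranks."""
    # F841: a local that is assigned but never read is flagged, e.g.
    #     tp_rank = 0  # F841 Local variable `tp_rank` is assigned to but never used
    # The fix is to delete the dead assignment (or actually use the value).
    #
    # E501: any line over 80 characters is flagged. The fix is to wrap it,
    # for example turning
    #     num_padded_heads = total_num_attention_heads + (-total_num_attention_heads % tp_size)
    # into
    #     num_padded_heads = total_num_attention_heads + (
    #         -total_num_attention_heads % tp_size)
    return (total_num_attention_heads + tp_size - 1) // tp_size


assert heads_per_rank(40, 6) == 7  # 40 heads across 6 GPUs -> 7 per rank
```

As of recent Ruff versions, `ruff check --fix` can delete many of the dead F841 assignments (the removal is classed as an unsafe fix, so it may require `--unsafe-fixes`), while E501 has no autofix and the long lines have to be re-wrapped by hand or by a formatter.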