-
Notifications
You must be signed in to change notification settings - Fork 789
[P/D][Bugfix] Use config to get kv_num_head. #6405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[P/D][Bugfix] Use config to get kv_num_head. #6405
Conversation
Signed-off-by: wangxiaoteng <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request aims to correctly configure num_kv_head for Vision-Language (VL) models. However, there is a critical bug in the implementation where self.num_kv_head is unconditionally set to 1, which negates the intended logic. My review includes a comment with a suggested fix for this issue.
| try: | ||
| if is_vl_model: | ||
| self.num_kv_head = self.vllm_config.model_config.hf_config.text_config.num_key_value_heads | ||
| else: | ||
| self.num_kv_head = self.vllm_config.model_config.hf_config.num_key_value_heads | ||
| except Exception as e: | ||
| logger.error( | ||
| f"MooncakeLayerwise Connector can not get num_key_value_heads from model config, set num_key_value_heads to 1" | ||
| ) | ||
| self.num_kv_head = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a critical logic error in how self.num_kv_head is being set. The line self.num_kv_head = 1 is outside the try...except block, which means it will always execute and overwrite any value successfully retrieved from the model configuration. This effectively hardcodes self.num_kv_head to 1 for all cases, defeating the purpose of this change.
The assignment to the default value of 1 should only happen within the except block when fetching the configuration fails. I've also updated the error log to include the exception for better debugging.
| try: | |
| if is_vl_model: | |
| self.num_kv_head = self.vllm_config.model_config.hf_config.text_config.num_key_value_heads | |
| else: | |
| self.num_kv_head = self.vllm_config.model_config.hf_config.num_key_value_heads | |
| except Exception as e: | |
| logger.error( | |
| f"MooncakeLayerwise Connector can not get num_key_value_heads from model config, set num_key_value_heads to 1" | |
| ) | |
| self.num_kv_head = 1 | |
| try: | |
| if is_vl_model: | |
| self.num_kv_head = self.vllm_config.model_config.hf_config.text_config.num_key_value_heads | |
| else: | |
| self.num_kv_head = self.vllm_config.model_config.hf_config.num_key_value_heads | |
| except Exception as e: | |
| logger.error( | |
| f"MooncakeLayerwise Connector can not get num_key_value_heads from model config, set num_key_value_heads to 1. Error: {e}" | |
| ) | |
| self.num_kv_head = 1 |
Signed-off-by: wangxiaoteng <[email protected]>
| self.use_mla = self.vllm_config.model_config.use_mla | ||
| try: | ||
| if is_vl_model: | ||
| self.num_kv_head = self.vllm_config.model_config.hf_config.text_config.num_key_value_heads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the hf_config.text_config.num_key_value_heads access method regardless of whether it's a vl model or not?
| logger.error( | ||
| f"MooncakeLayerwise Connector can not get num_key_value_heads from model config, set num_key_value_heads to 1" | ||
| ) | ||
| self.num_kv_head = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this line of code do?
What this PR does / why we need it?
This pull request aims to correctly configure num_kv_head for Vision-Language (VL) models.
Does this PR introduce any user-facing change?
No
How was this patch tested?
By ci