[BUG] Running openbmb/MiniCPM-o-2_6-int4 with vLLM #765
Comments
Getting this error: [rank0]: ValueError: There is no module or parameter named 'resampler.kv_proj.weight' in MiniCPMV2_6
Has this been resolved? I get the same error when running offline inference with vLLM on openbmb/MiniCPM-O-2_6-int4: ValueError: There is no module or parameter named 'resampler.kv_proj.weight' in MiniCPMV2_6. My environment is as follows:
MiniCPM-o-2_6 was not fully supported by vLLM before, so we maintained the support in a fork. Our code has since been merged into the official repository: you can pull the official repo and build from source, or wait for the next official vLLM wheel release.
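If it helps, a quick way to confirm which vLLM build you are actually running before retrying (a minimal sketch; assumes a standard pip or from-source install):

import vllm
print(vllm.__version__)  # should report a version/commit that postdates the merge of MiniCPM-o support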
Hello, I just downloaded and installed the latest vLLM wheel that should support MiniCPM-o-2_6, but with openbmb/MiniCPM-o-2_6-int4 I still receive the error: There is no module or parameter named 'resampler.kv_proj.weight' in MiniCPMV2_6. Is the quantized model supported by vLLM?
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
当前行为 | Current Behavior
I am running openbmb/MiniCPM-o-2_6-int4 with vLLM. The code follows the official demo (only the model name was changed) and the GPU is an RTX 2080 Ti. Why does it fail? The code snippet and error output are below:
from transformers import AutoTokenizer
from PIL import Image
from vllm import LLM, SamplingParams

MODEL_NAME = "openbmb/MiniCPM-o-2_6-int4"
# MODEL_NAME = "openbmb/MiniCPM-O-2_6"
# Also available for previous models
# MODEL_NAME = "openbmb/MiniCPM-Llama3-V-2_5"
# MODEL_NAME = "HwwwH/MiniCPM-V-2"

image = Image.open("xxx.png").convert("RGB")

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
llm = LLM(
    model=MODEL_NAME,
    trust_remote_code=True,
    gpu_memory_utilization=1,
    max_model_len=2048,
    dtype='half'
)

messages = [{
    "role": "user",
    "content":
        # Number of images
        "(<image>./</image>)" +
        "\nWhat is the content of this image?"
}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Single Inference
inputs = {
    "prompt": prompt,
    "multi_modal_data": {
        "image": image
        # Multi images, the number of images should be equal to that of `(<image>./</image>)`
        # "image": [image, image]
    },
}
# Batch Inference
# inputs = [{ ...  (rest unchanged)
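For context, the official demo typically ends with a generate call along these lines (the sampling parameters here are illustrative, not from the original post; in this report the failure happens earlier, while constructing LLM):

sampling_params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(inputs, sampling_params=sampling_params)
print(outputs[0].outputs[0].text)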
Error output:
Loading safetensors checkpoint shards: 0% Completed | 0/1 [00:00<?, ?it/s]
[rank0]: File "/home/openbmb-vllm/./tests.py", line 13, in
[rank0]: llm = LLM(
[rank0]: File "/home/openbmb-vllm/vllm/utils.py", line 1038, in inner
[rank0]: return fn(*args, **kwargs)
[rank0]: File "/home/openbmb-vllm/vllm/entrypoints/llm.py", line 228, in init
[rank0]: self.llm_engine = self.engine_class.from_engine_args(
[rank0]: File "/home/openbmb-vllm/vllm/engine/llm_engine.py", line 477, in from_engine_args
[rank0]: engine = cls(
[rank0]: File "/home/openbmb-vllm/vllm/engine/llm_engine.py", line 271, in init
[rank0]: self.model_executor = executor_class(vllm_config=vllm_config, )
[rank0]: File "/home/openbmb-vllm/vllm/executor/executor_base.py", line 42, in init
[rank0]: self._init_executor()
[rank0]: File "/home/openbmb-vllm/vllm/executor/uniproc_executor.py", line 34, in _init_executor
[rank0]: self.collective_rpc("load_model")
[rank0]: File "/home/openbmb-vllm/vllm/executor/uniproc_executor.py", line 48, in collective_rpc
[rank0]: answer = func(*args, **kwargs)
[rank0]: File "/home/openbmb-vllm/vllm/worker/worker.py", line 155, in load_model
[rank0]: self.model_runner.load_model()
[rank0]: File "/home/openbmb-vllm/vllm/worker/model_runner.py", line 1099, in load_model
[rank0]: self.model = get_model(vllm_config=self.vllm_config)
[rank0]: File "/home/openbmb-vllm/vllm/model_executor/model_loader/init.py", line 12, in get_model
[rank0]: return loader.load_model(vllm_config=vllm_config)
[rank0]: File "/home/openbmb-vllm/vllm/model_executor/model_loader/loader.py", line 368, in load_model
[rank0]: loaded_weights = model.load_weights(
[rank0]: File "/home/openbmb-vllm/vllm/model_executor/models/minicpmv.py", line 597, in load_weights
[rank0]: return loader.load_weights(weights)
[rank0]: File "/home/openbmb-vllm/vllm/model_executor/models/utils.py", line 233, in load_weights
[rank0]: autoloaded_weights = set(self._load_module("", self.module, weights))
[rank0]: File "/home/openbmb-vllm/vllm/model_executor/models/utils.py", line 194, in _load_module
[rank0]: yield from self._load_module(prefix,
[rank0]: File "/home/openbmb-vllm/vllm/model_executor/models/utils.py", line 194, in _load_module
[rank0]: yield from self._load_module(prefix,
[rank0]: File "/home/openbmb-vllm/vllm/model_executor/models/utils.py", line 222, in _load_module
[rank0]: raise ValueError(msg)
[rank0]: ValueError: There is no module or parameter named 'resampler.kv_proj.weight' in MiniCPMV2_6
Loading safetensors checkpoint shards: 0% Completed | 0/1 [00:33<?, ?it/s]
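The error means the weight loader found a checkpoint tensor named 'resampler.kv_proj.weight' for which the constructed MiniCPMV2_6 model has no matching module or parameter. To see which resampler tensors the int4 checkpoint actually ships, a minimal inspection sketch using safetensors (the file name model.safetensors is illustrative; point it at a shard from your local Hugging Face snapshot of the model):

from safetensors import safe_open

# Illustrative path; substitute a shard of the downloaded checkpoint.
with safe_open("model.safetensors", framework="pt") as f:
    for name in f.keys():
        if "resampler" in name:
            print(name)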
期望行为 | Expected Behavior
No response
复现方法 | Steps To Reproduce
No response
运行环境 | Environment
备注 | Anything else?
No response