
[Usage]: [V1] Misleading Error Messages #13510


Open
1 task done
robertgshaw2-redhat opened this issue Feb 19, 2025 · 7 comments · May be fixed by #17938
Labels
good first issue (Good for newcomers) · help wanted (Extra attention is needed) · usage (How to use vllm)

Comments

@robertgshaw2-redhat
Collaborator

robertgshaw2-redhat commented Feb 19, 2025

Looking for help to improve error messages during startup!

Running a model that does not exist (e.g. MODEL=neuralmagic/Meta-Llama-3-8B-Instruct-FP8-dynamic, which does not exist) gives the following stack trace:

(venv-nm-vllm-abi3) rshaw@beaker:~$ VLLM_USE_V1=1 vllm serve $MODEL --disable-log-requests --no-enable-prefix-caching
INFO 02-19 03:45:16 __init__.py:190] Automatically detected platform cuda.
INFO 02-19 03:45:18 api_server.py:840] vLLM API server version 0.7.2.0
INFO 02-19 03:45:18 api_server.py:841] args: Namespace(subparser='serve', model_tag='neuralmagic/Meta-Llama-3-8B-Instruct-FP8-dynamic', config='', host=None, port=8000, uvicorn_log_level='info', allow_credentials=False, allowed_origins=['*'], allowed_methods=['*'], allowed_headers=['*'], api_key=None, lora_modules=None, prompt_adapters=None, chat_template=None, chat_template_content_format='auto', response_role='assistant', ssl_keyfile=None, ssl_certfile=None, ssl_ca_certs=None, ssl_cert_reqs=0, root_path=None, middleware=[], return_tokens_as_token_ids=False, disable_frontend_multiprocessing=False, enable_request_id_headers=False, enable_auto_tool_choice=False, enable_reasoning=False, reasoning_parser=None, tool_call_parser=None, tool_parser_plugin='', model='neuralmagic/Meta-Llama-3-8B-Instruct-FP8-dynamic', task='auto', tokenizer=None, skip_tokenizer_init=False, revision=None, code_revision=None, tokenizer_revision=None, tokenizer_mode='auto', trust_remote_code=False, allowed_local_media_path=None, download_dir=None, load_format='auto', config_format=<ConfigFormat.AUTO: 'auto'>, dtype='auto', kv_cache_dtype='auto', max_model_len=None, guided_decoding_backend='xgrammar', logits_processor_pattern=None, model_impl='auto', distributed_executor_backend=None, pipeline_parallel_size=1, tensor_parallel_size=1, max_parallel_loading_workers=None, ray_workers_use_nsight=False, block_size=None, enable_prefix_caching=False, disable_sliding_window=False, use_v2_block_manager=True, num_lookahead_slots=0, seed=0, swap_space=4, cpu_offload_gb=0, gpu_memory_utilization=0.9, num_gpu_blocks_override=None, max_num_batched_tokens=None, max_num_seqs=None, max_logprobs=20, disable_log_stats=False, quantization=None, rope_scaling=None, rope_theta=None, hf_overrides=None, enforce_eager=False, max_seq_len_to_capture=8192, disable_custom_all_reduce=False, tokenizer_pool_size=0, tokenizer_pool_type='ray', tokenizer_pool_extra_config=None, limit_mm_per_prompt=None, mm_processor_kwargs=None, disable_mm_preprocessor_cache=False, enable_lora=False, enable_lora_bias=False, max_loras=1, max_lora_rank=16, lora_extra_vocab_size=256, lora_dtype='auto', long_lora_scaling_factors=None, max_cpu_loras=None, fully_sharded_loras=False, enable_prompt_adapter=False, max_prompt_adapters=1, max_prompt_adapter_token=0, device='auto', num_scheduler_steps=1, multi_step_stream_outputs=True, scheduler_delay_factor=0.0, enable_chunked_prefill=None, speculative_model=None, speculative_model_quantization=None, num_speculative_tokens=None, speculative_disable_mqa_scorer=False, speculative_draft_tensor_parallel_size=None, speculative_max_model_len=None, speculative_disable_by_batch_size=None, ngram_prompt_lookup_max=None, ngram_prompt_lookup_min=None, spec_decoding_acceptance_method='rejection_sampler', typical_acceptance_sampler_posterior_threshold=None, typical_acceptance_sampler_posterior_alpha=None, disable_logprobs_during_spec_decoding=None, model_loader_extra_config=None, ignore_patterns=[], preemption_mode=None, served_model_name=None, qlora_adapter_name_or_path=None, otlp_traces_endpoint=None, collect_detailed_traces=None, disable_async_output_proc=False, scheduling_policy='fcfs', override_neuron_config=None, override_pooler_config=None, compilation_config=None, kv_transfer_config=None, worker_cls='auto', generation_config=None, override_generation_config=None, enable_sleep_mode=False, calculate_kv_scales=False, disable_log_requests=True, max_log_len=None, disable_fastapi_docs=False, enable_prompt_tokens_details=False, 
dispatch_function=<function serve at 0x74d76eb99990>)
WARNING 02-19 03:45:18 arg_utils.py:1326] Setting max_num_batched_tokens to 8192 for OPENAI_API_SERVER usage context.
Traceback (most recent call last):
  File "/home/rshaw/venv-nm-vllm-abi3/bin/vllm", line 8, in <module>
    sys.exit(main())
  File "/home/rshaw/venv-nm-vllm-abi3/lib/python3.10/site-packages/vllm/scripts.py", line 204, in main
    args.dispatch_function(args)
  File "/home/rshaw/venv-nm-vllm-abi3/lib/python3.10/site-packages/vllm/scripts.py", line 44, in serve
    uvloop.run(run_server(args))
  File "/home/rshaw/venv-nm-vllm-abi3/lib/python3.10/site-packages/uvloop/__init__.py", line 82, in run
    return loop.run_until_complete(wrapper())
  File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
  File "/home/rshaw/venv-nm-vllm-abi3/lib/python3.10/site-packages/uvloop/__init__.py", line 61, in wrapper
    return await main
  File "/home/rshaw/venv-nm-vllm-abi3/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 875, in run_server
    async with build_async_engine_client(args) as engine_client:
  File "/home/rshaw/.pyenv/versions/3.10.14/lib/python3.10/contextlib.py", line 199, in __aenter__
    return await anext(self.gen)
  File "/home/rshaw/venv-nm-vllm-abi3/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 136, in build_async_engine_client
    async with build_async_engine_client_from_engine_args(
  File "/home/rshaw/.pyenv/versions/3.10.14/lib/python3.10/contextlib.py", line 199, in __aenter__
    return await anext(self.gen)
  File "/home/rshaw/venv-nm-vllm-abi3/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 160, in build_async_engine_client_from_engine_args
    engine_client = AsyncLLMEngine.from_engine_args(
  File "/home/rshaw/venv-nm-vllm-abi3/lib/python3.10/site-packages/vllm/v1/engine/async_llm.py", line 104, in from_engine_args
    vllm_config = engine_args.create_engine_config(usage_context)
  File "/home/rshaw/venv-nm-vllm-abi3/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 1075, in create_engine_config
    model_config = self.create_model_config()
  File "/home/rshaw/venv-nm-vllm-abi3/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 998, in create_model_config
    return ModelConfig(
  File "/home/rshaw/venv-nm-vllm-abi3/lib/python3.10/site-packages/vllm/config.py", line 302, in __init__
    hf_config = get_config(self.model, trust_remote_code, revision,
  File "/home/rshaw/venv-nm-vllm-abi3/lib/python3.10/site-packages/vllm/transformers_utils/config.py", line 201, in get_config
    raise ValueError(f"No supported config format found in {model}")
ValueError: No supported config format found in neuralmagic/Meta-Llama-3-8B-Instruct-FP8-dynamic

This is confusing: the real problem is that the model repository does not exist, but the error only says that no supported config format was found.
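For illustration only, here is a minimal sketch of how the config path could fail fast with a clearer message before falling through to "No supported config format found". This is not vLLM's actual code; the function name, the messages, and the exact huggingface_hub calls used are assumptions.

    # Illustrative sketch only -- not vLLM's implementation.
    import os

    from huggingface_hub import HfApi
    from huggingface_hub.utils import GatedRepoError


    def check_model_exists(model: str, token: str | None = None) -> None:
        """Fail fast with a clear message if `model` is neither a local path nor a Hub repo."""
        if os.path.isdir(model):
            return  # Local directory: the config format is validated later.
        try:
            exists = HfApi(token=token).repo_exists(model)
        except GatedRepoError as e:
            raise ValueError(
                f"Model '{model}' is a gated repository. Pass a Hugging Face "
                "token that has been granted access.") from e
        if not exists:
            raise ValueError(
                f"Model '{model}' is not a local directory and was not found "
                "on the Hugging Face Hub. Check the model name for typos.")

Run before the config is parsed, a check along these lines would turn the stack trace above into a single actionable error.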

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
robertgshaw2-redhat added the usage label Feb 19, 2025
robertgshaw2-redhat changed the title from [Usage]: Misleading Error Message for V1 to [Usage]: [V1[ Misleading Error Messages Feb 19, 2025
robertgshaw2-redhat changed the title from [Usage]: [V1[ Misleading Error Messages to [Usage]: [V1] Misleading Error Messages Feb 19, 2025
robertgshaw2-redhat added the good first issue and help wanted labels Feb 19, 2025
@simo-hsieh

@robertgshaw2-redhat I'd like to work on this.

@simo-hsieh

@robertgshaw2-redhat I'd like to work on this.

I couldn't reproduce the same error using the same command.
Instead, I received customized error messages from Hugging Face.
I'll leave this to others who can reproduce the issue.

@davidxia
Contributor

Same here, I got:

$ VLLM_USE_V1=1 vllm serve neuralmagic/Meta-Llama-3-8B-Instruct-FP8-dynamic --disable-log-requests --no-enable-prefix-caching
...

ValueError: Invalid repository ID or local directory specified: 'neuralmagic/Meta-Llama-3-8B-Instruct-FP8-dynamic'.
Please verify the following requirements:
1. Provide a valid Hugging Face repository ID.
2. Specify a local directory that contains a recognized configuration file.
   - For Hugging Face models: ensure the presence of a 'config.json'.
   - For Mistral models: ensure the presence of a 'params.json'.
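For illustration, the local-directory half of that check could look something like the sketch below. This is not the code from #13724; the function name and return values are assumptions.

    # Illustrative sketch only -- not the code from #13724.
    from pathlib import Path


    def detect_config_format(model_dir: str) -> str:
        """Return which config flavor a local model directory contains."""
        path = Path(model_dir)
        if (path / "config.json").is_file():
            return "hf"       # Standard Hugging Face config.
        if (path / "params.json").is_file():
            return "mistral"  # Mistral-format checkpoint.
        raise ValueError(
            f"Local directory '{model_dir}' contains neither 'config.json' "
            "(Hugging Face format) nor 'params.json' (Mistral format); no "
            "supported config format was found.")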

@davidxia
Contributor

I think this issue is fixed by #13724? If so, this can be closed.

@ctdavi

ctdavi commented May 8, 2025

Agreed. This is fixed, or at least the behavior is entirely different now, after #13724.

I get the same post-#13724 message that davidxia got.

@mengbingrock

mengbingrock commented May 10, 2025

Hi @robertgshaw2-redhat, I appreciate the feedback. Even though the original error is not reproducible on some setups, I ran into similar issues, and I've made further modifications on top of #13724 to handle other cases such as internet connection failures.

I'm working on a PR here: #17938
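For illustration, here is a rough sketch of how a connectivity failure could be reported separately from a missing repository. This is not the code in #17938; the helper name and the exact exception types handled are assumptions.

    # Illustrative sketch only -- not the code from #17938.
    import requests
    from huggingface_hub.utils import OfflineModeIsEnabled


    def classify_hub_error(model: str, err: Exception) -> ValueError:
        """Map low-level Hub/HTTP errors to actionable messages."""
        if isinstance(err, (requests.exceptions.ConnectionError,
                            requests.exceptions.Timeout)):
            return ValueError(
                f"Could not reach the Hugging Face Hub while resolving "
                f"'{model}'. Check your internet connection, or point to a "
                "local copy of the model.")
        if isinstance(err, OfflineModeIsEnabled):
            return ValueError(
                f"Offline mode is enabled but '{model}' was not found in the "
                "local cache. Download the model first or unset HF_HUB_OFFLINE.")
        return ValueError(f"Failed to load the config for '{model}': {err}")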

mengbingrock linked a pull request (#17938) May 10, 2025 that will close this issue