Conversation
Summary of ChangesHello @ErvinXie, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request refines the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request significantly enhances the kt-kernel CLI by upgrading the run command to use click instead of typer, allowing it to accept and pass through arbitrary, unknown command-line arguments directly to the underlying SGLang server. This refactoring involved creating a new internal _run_impl function and special handling in main.py to manage click's argument parsing. Additionally, the doctor command is updated to provide detailed kt-kernel installation diagnostics, including CPU variant detection (AMX, AVX512, AVX2), version, and installation paths, and removes ktransformers from its package checks. The model path discovery logic in both run.py and model_registry.py was modified to support recursive searching up to a specified max_depth for local models. Review comments highlighted critical issues with the initial implementation of CPU variant parsing in doctor.py, noting its fragility and suggesting a more robust splitting method. The recursive model search logic in both _find_model_path and find_local_models was also identified as flawed, inefficiently using rglob and ignoring the max_depth parameter, with a more efficient glob-pattern-based approach suggested for correction.
| if "_kt_kernel_ext_" in basename: | ||
| # Extract variant from _kt_kernel_ext_amx.cpython-311-x86_64-linux-gnu.so | ||
| parts = basename.split("_") | ||
| if len(parts) >= 4: | ||
| variant = parts[3] # "amx" from "_kt_kernel_ext_amx..." | ||
| if variant.startswith("avx"): | ||
| # Normalize avx variants | ||
| if variant in ["avx512", "avx512_bf16", "avx512_vbmi", "avx512_vnni", "avx512_base"]: | ||
| variants.add("avx512") | ||
| else: | ||
| variants.add(variant) | ||
| else: | ||
| variants.add(variant) |
There was a problem hiding this comment.
The logic for parsing the CPU variant from the .so filename is incorrect. basename.split("_") is fragile, and parts[3] will not correctly extract the variant. For a filename like _kt_kernel_ext_amx.cpython-311-x86_64-linux-gnu.so, parts[3] would be 'ext', not 'amx'. This will cause incorrect variant detection and lead to wrong recommendations in the doctor command. A more robust method, such as splitting on the known prefix _kt_kernel_ext_, should be used.
| if "_kt_kernel_ext_" in basename: | |
| # Extract variant from _kt_kernel_ext_amx.cpython-311-x86_64-linux-gnu.so | |
| parts = basename.split("_") | |
| if len(parts) >= 4: | |
| variant = parts[3] # "amx" from "_kt_kernel_ext_amx..." | |
| if variant.startswith("avx"): | |
| # Normalize avx variants | |
| if variant in ["avx512", "avx512_bf16", "avx512_vbmi", "avx512_vnni", "avx512_base"]: | |
| variants.add("avx512") | |
| else: | |
| variants.add(variant) | |
| else: | |
| variants.add(variant) | |
| if "_kt_kernel_ext_" in basename: | |
| # Extract variant from _kt_kernel_ext_amx.cpython-311-x86_64-linux-gnu.so | |
| try: | |
| variant_part = basename.split("_kt_kernel_ext_")[1] | |
| variant = variant_part.split(".")[0] | |
| if variant.startswith("avx") and variant in {"avx512", "avx512_bf16", "avx512_vbmi", "avx512_vnni", "avx512_base"}: | |
| variants.add("avx512") | |
| else: | |
| variants.add(variant) | |
| except IndexError: | |
| # Could not parse variant, skip this file. | |
| pass |
| # Search recursively up to max_depth | ||
| for depth in range(max_depth): | ||
| for name in possible_names: | ||
| if depth == 0: | ||
| # Direct children: models_dir / name | ||
| search_paths = [models_dir / name] | ||
| else: | ||
| # Nested: use rglob to find directories matching the name | ||
| search_paths = list(models_dir.rglob(name)) | ||
|
|
||
| for path in possible_paths: | ||
| if path.exists() and (path / "config.json").exists(): | ||
| return path | ||
| for path in search_paths: | ||
| if path.exists() and (path / "config.json").exists(): | ||
| return path |
There was a problem hiding this comment.
The recursive search logic in _find_model_path is flawed. The else block uses models_dir.rglob(name), which searches at all depths, ignoring the max_depth parameter from the docstring. Additionally, the for depth in range(max_depth) loop causes rglob to be called multiple times redundantly if a model is not found at depth=1. This is inefficient and does not respect the max_depth constraint. A correct implementation should build a glob pattern for each depth.
| # Search recursively up to max_depth | |
| for depth in range(max_depth): | |
| for name in possible_names: | |
| if depth == 0: | |
| # Direct children: models_dir / name | |
| search_paths = [models_dir / name] | |
| else: | |
| # Nested: use rglob to find directories matching the name | |
| search_paths = list(models_dir.rglob(name)) | |
| for path in possible_paths: | |
| if path.exists() and (path / "config.json").exists(): | |
| return path | |
| for path in search_paths: | |
| if path.exists() and (path / "config.json").exists(): | |
| return path | |
| # Search recursively up to max_depth | |
| for name in possible_names: | |
| for depth in range(max_depth): | |
| pattern = "*/" * depth + name | |
| for path in models_dir.glob(pattern): | |
| if path.is_dir() and (path / "config.json").exists(): | |
| return path |
| # Search recursively up to max_depth | ||
| for depth in range(max_depth): | ||
| # Build glob pattern for current depth | ||
| # depth=0: direct children, depth=1: grandchildren, etc. | ||
| glob_pattern = "*" if depth > 0 else "" | ||
| for _ in range(depth): | ||
| glob_pattern = "*/" + glob_pattern if glob_pattern else "*" | ||
|
|
||
| for name in possible_names: | ||
| if depth == 0: | ||
| # Direct children: models_dir / name | ||
| search_paths = [models_dir / name] | ||
| else: | ||
| # Nested: use rglob to find directories matching the name | ||
| search_paths = list(models_dir.rglob(name)) | ||
|
|
||
| for path in search_paths: | ||
| if path.exists() and (path / "config.json").exists(): | ||
| results.append((model, path)) | ||
| found = True | ||
| break | ||
|
|
||
| if found: | ||
| break | ||
|
|
||
| if found: | ||
| break |
There was a problem hiding this comment.
This function has the same flawed recursive search logic as _find_model_path in run.py. The else block uses models_dir.rglob(name), which searches at all depths, ignoring the max_depth parameter. The for depth in range(max_depth) loop also leads to redundant rglob calls. This is inefficient and does not respect the max_depth constraint. The unused glob_pattern logic from lines 313-317 is also confusing and should be removed.
# Search recursively up to max_depth
for name in possible_names:
for depth in range(max_depth):
pattern = "*/" * depth + name
for path in models_dir.glob(pattern):
if path.is_dir() and (path / "config.json").exists():
results.append((model, path))
found = True
break
if found:
break* fix pypi cuda install (#1763) * Update release-pypi.yml (#1764) * fix cuda wheel build (#1766) * Cli (#1765) * [feat]: add custom option for kt run * [feat]: depth 3 * [docs]: add kt-cli doc and update corresponding website (#1768) * Remove kt-kernel-cuda, kt-kernel uses the version with cuda (#1769) * Update release-pypi.yml (#1770) * bump to 0.5.0.post1 (#1771) * [ci]: Patch ci (#1772) * [docs]: add kt-cli doc and update corresponding website * [feat]: update issue template * [fix]: fix moe hpp bug. (#1780) fix moe hpp init bug. * Fix moe bug. (#1783) * [fix]: fix moe.hpp load from file bug. * [fix]: fix all moe hpp init bug. * [fix]: fix moe & awq-moe ug. * [feat](kt-sft-refactor): load from huggingface safetensor file * [fix]: fix bugs for activation, sft forward and backward --------- Co-authored-by: Jianwei Dong <dongjw24@mails.tsinghua.edu.cn> Co-authored-by: ErvinXie <ervinxie@qq.com> Co-authored-by: ZiWei Yuan <yzwliam@126.com> Co-authored-by: Oql <1692110604@qq.com> Co-authored-by: mrhaoxx <mr.haoxx@gmail.com>
Fix Cli