Releases: numz/ComfyUI-SeedVR2_VideoUpscaler

v2.5.23: Merge pull request #438 from AInVFX/main

24 Dec 02:16
5a4bf42

  • 🔒 Security: Prevent code execution in model loading - Added protection against malicious .pth files by restricting deserialization to tensors only
  • 🎥 Fix: FFmpeg video writer reliability - Resolved ffmpeg process hanging issues by redirecting stderr and adding buffer flush, with improved error messages for debugging (thanks @thehhmdb)
  • ⚡ Fix: GGUF VAE model support - Enabled automatic weight dequantization for convolution operations, making GGUF-quantized VAE models fully functional (thanks @naxci1)
  • 🛡️ Fix: VAE slicing edge cases - Protected against division by zero crashes when using small split sizes with high temporal downsampling (thanks @naxci1)
  • 🎨 Fix: LAB color transfer precision - Resolved dtype mismatch errors during video upscaling by ensuring consistent float types before matrix operations
  • 🔧 Fix: PyTorch 2.9+ compatibility - Extended Conv3d memory workaround to all PyTorch 2.9+ versions, fixing 3x VRAM usage on newer PyTorch releases
  • 📦 Fix: Bitsandbytes compatibility - Added ValueError exception handling for Intel Gaudi version detection failures on non-Gaudi systems
  • 🍎 MPS: Memory optimization - Reduced memory usage during encode/decode operations on Apple Silicon (thanks @s-cerevisiae)
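The model-loading hardening corresponds to PyTorch's restricted unpickling mode. A minimal sketch of the idea (the function name and structure are illustrative, not the repo's actual code):

```python
import torch

def load_checkpoint_safely(path: str) -> dict:
    # weights_only=True restricts unpickling to tensors and a small
    # allow-list of safe types, so a malicious .pth file cannot run
    # arbitrary code through pickle reduce hooks during loading.
    return torch.load(path, map_location="cpu", weights_only=True)
```

`weights_only=True` has been available since PyTorch 1.13 and became the default in PyTorch 2.6; passing it explicitly keeps the protection on older versions as well.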

v2.5.22: Merge pull request #412 from AInVFX/main

13 Dec 05:38
d69b65f

  • 🎬 CLI: FFmpeg video backend with 10-bit support - New --video_backend ffmpeg and --10bit flags enable x265 encoding with 10-bit color depth, reducing banding artifacts in gradients compared to 8-bit OpenCV output (based on PR by @thehhmdb - thank you!)
  • 🍎 Fix: MPS bicubic upscaling compatibility - Added CPU fallback for bicubic+antialias interpolation on PyTorch versions before 2.8.0, resolving RGBA alpha upscaling errors on Apple Silicon
  • ⚡ Fix: Cross-platform histogram matching - Replaced scatter_ operation with argsort+index_select for improved reliability across CUDA, ROCm, and MPS backends
  • 🧹 MPS: Remove sync overhead - Reverted the unnecessary torch.mps.synchronize() calls introduced in v2.5.21, restoring behavior consistent with the CUDA pipeline
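The argsort-based histogram match replaces an in-place scatter with a deterministic rank mapping. A pure-Python sketch of the idea (the actual code operates on GPU tensors via torch.argsort and index_select):

```python
def match_histogram(source, reference):
    # Rank-based matching: the k-th smallest source value is replaced by
    # the k-th smallest reference value. sorted() plays the role of
    # argsort, and writing through `order` plays the role of
    # index_select, avoiding the scatter_ op that behaved
    # inconsistently across CUDA, ROCm, and MPS.
    assert len(source) == len(reference)
    order = sorted(range(len(source)), key=lambda i: source[i])
    ref_sorted = sorted(reference)
    out = [0.0] * len(source)
    for rank, idx in enumerate(order):
        out[idx] = ref_sorted[rank]
    return out
```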

v2.5.21: Merge pull request #407 from AInVFX/main

12 Dec 16:28
32f9900

  • 🛠️ Fix: GGUF dequantization error on MPS - Resolved shape mismatch error introduced in 2.5.20 by skipping GGUF quantized buffers in precision conversion - these must remain in packed format for on-the-fly dequantization during inference
  • 🍎 MPS: Eliminate CPU sync overhead - Skip unnecessary CPU tensor offload on Apple Silicon unified memory architecture, preventing sync stalls that caused slowdowns. Input images and output video now stay on MPS device throughout the pipeline
  • ⚡ MPS: Preload text embeddings - Load text embeddings before Phase 1 encoding to avoid sync stall at Phase 2 start, improving timing accuracy and throughput
  • 🧹 MPS: Optimized model cleanup - Skip redundant CPU movement before model deletion on unified memory

v2.5.20: Merge pull request #402 from AInVFX/main

12 Dec 05:48
a1486a3

  • ⚡ Expanded attention backends - Full support for Flash Attention 2 (Ampere+), Flash Attention 3 (Hopper+), SageAttention 2, and SageAttention 3 (Blackwell/RTX 50xx), with automatic fallback chains to PyTorch SDPA when unavailable (based on PR by @naxci1 - thank you!)
  • 🍎 macOS/Apple Silicon compatibility - Replaced MPS autocast with explicit dtype conversion throughout VAE and DiT pipelines, resolving hangs and crashes on M-series Macs. BlockSwap now auto-disables with warning (unified memory makes it meaningless)
  • 🛡️ Flash Attention graceful fallback - Added compatibility shims for corrupted or partially installed flash_attn/xformers DLLs, preventing startup crashes
  • 🛡️ AMD ROCm: bitsandbytes conflict fix - Prevent kernel registration errors when diffusers attempts to re-import broken bitsandbytes installations
  • 📦 ComfyUI Manager: macOS classifier fix - Removed NVIDIA CUDA classifier causing false "GPU not supported" warnings on macOS
  • 📚 Documentation updates - Updated README with attention backend details, BlockSwap macOS notes, and clarified model caching descriptions
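The fallback chain can be sketched as a simple preference list. The exact ordering below is an illustrative assumption (names mirror the bullet above), with PyTorch SDPA as the universal last resort:

```python
def pick_attention_backend(available: dict) -> str:
    # Try the most capable backend first; `available` maps a backend
    # name to whether its import and hardware check succeeded. SDPA
    # ships with PyTorch itself, so it always works as the final
    # fallback.
    preference = ["sage_attn_3", "flash_attn_3", "sage_attn_2", "flash_attn_2"]
    for name in preference:
        if available.get(name, False):
            return name
    return "sdpa"
```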

v2.5.19: Merge pull request #390 from AInVFX/main

10 Dec 06:56
2006fa3

  • 🎨 New header logo design - Refreshed ASCII art banner (thanks @naxci1)
  • 🧹 Remove dead flash attention wrapper - Removed legacy code from FP8CompatibleDiT; FlashAttentionVarlen already handles backend switching via its attention_mode attribute
  • 🛡️ Fix graceful fallback from flash-attn - Add compatibility shims for corrupted flash_attn/xformers DLLs, preventing startup crashes when CUDA extensions are broken
  • 📊 Improved VRAM tracking - Separate allocated vs reserved memory tracking, Windows-only overflow detection (WDDM paging behavior)
  • ♻️ Centralize backend detection - Unified is_mps_available(), is_cuda_available(), get_gpu_backend() helpers across codebase
  • 🔄 Revert 2.5.14 VRAM limit enforcement - Removed the set_per_process_memory_fraction call; overflow detection and warnings remain
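The centralized helpers presumably wrap PyTorch's standard availability checks; a sketch of what such helpers look like (the repo's actual implementations may differ):

```python
import torch

def is_cuda_available() -> bool:
    return torch.cuda.is_available()

def is_mps_available() -> bool:
    mps = getattr(torch.backends, "mps", None)
    return mps is not None and mps.is_available()

def get_gpu_backend() -> str:
    # Single source of truth for backend branching across the codebase,
    # instead of ad-hoc device checks scattered through modules.
    if is_cuda_available():
        return "cuda"
    if is_mps_available():
        return "mps"
    return "cpu"
```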

v2.5.18: Merge pull request #384 from AInVFX/main

09 Dec 06:05
a06afb5

  • 🚀 CLI: Streaming mode for long videos - New --chunk_size flag processes videos in memory-bounded chunks, enabling arbitrarily long videos without running into RAM limits. Works with model caching (--cache_dit/--cache_vae) for chunk-to-chunk reuse (inspired by disk02 PR contribution)
  • ⚡ CLI: Multi-GPU streaming - Each GPU now streams its segment internally with independent model caching, improving memory efficiency and enabling --temporal_overlap blending at GPU boundaries
  • 🔧 CLI: Fix large video MemoryError - Shared memory transfer replaces numpy pickling, preventing crashes on high-resolution/long video outputs (inspired by FurkanGozukara PR contribution)
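Streaming with --chunk_size and --temporal_overlap boils down to iterating overlapping frame windows. A minimal sketch of the chunking arithmetic only (the real pipeline also blends the overlap region and feeds each window through the model):

```python
def iter_chunks(num_frames: int, chunk_size: int, overlap: int = 0):
    # Yield (start, end) frame ranges. Consecutive chunks share
    # `overlap` frames so results can be cross-faded at the boundary,
    # keeping peak memory bounded by chunk_size regardless of video
    # length.
    step = chunk_size - overlap
    assert step > 0, "overlap must be smaller than chunk_size"
    start = 0
    while True:
        end = min(start + chunk_size, num_frames)
        yield start, end
        if end >= num_frames:
            break
        start += step
```

For example, a 10-frame video with chunk_size=4 and overlap=1 yields the windows (0, 4), (3, 7), (6, 10).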

v2.5.17: Merge pull request #373 from AInVFX/main

06 Dec 02:07
58bc9e8

  • 🔧 Fix: Older GPU compatibility (GTX 970, etc.) - Runtime bf16 CUBLAS probe replaces compute capability heuristics, correctly detecting unsupported GPUs without affecting RTX 20XX
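A runtime probe sidesteps unreliable compute-capability tables: attempt a tiny bf16 matmul and see whether CUBLAS accepts it. A hedged sketch of the approach (function name is illustrative):

```python
import torch

def gpu_supports_bf16_matmul() -> bool:
    # Probe empirically rather than inferring from compute capability:
    # capability heuristics misclassified cards like RTX 20xx, whereas
    # a real attempt gives the definitive answer per GPU.
    if not torch.cuda.is_available():
        return False
    try:
        a = torch.ones(4, 4, dtype=torch.bfloat16, device="cuda")
        (a @ a).sum().item()  # force kernel launch and synchronization
        return True
    except RuntimeError:
        return False
```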

v2.5.16: Merge pull request #371 from AInVFX/main

05 Dec 20:53
0a66006

  • 🔧 Fix: Older GPU compatibility (GTX 970, etc.) - Automatic fallback for GPUs without bfloat16 support
  • 🐛 Fix: Quality regression - Reverted bfloat16 detection that was causing artifact issues
  • 📋 Debug: Environment info display - Shows system info in debug mode to help with issue reporting
  • 📚 Docs: Simplified contribution workflow - Streamlined to main branch only

v2.5.15: Merge pull request #358 from AInVFX/main

03 Dec 18:17
f68fe92

  • 🍎 Fix: MPS compatibility - Disable antialias for MPS tensors and fix bfloat16 arange issues
  • ⚡ Fix: Autocast device type - Use proper device type attribute to prevent autocast errors
  • 📊 Memory: Accurate VRAM tracking - Use max_memory_reserved for more precise peak reporting
  • 🔧 Fix: Triton compatibility - Add shim for bitsandbytes 0.45+ / triton 3.0+ (fixes PyTorch 2.7 installation errors)

v2.5.14: Merge pull request #344 from AInVFX/main

01 Dec 05:34
d4dd5e7

  • 🍎 Fix: MPS device comparison - Normalize device strings to prevent unnecessary tensor movements
  • 📊 Memory: VRAM swap detection - Peak stats now show GPU+swap breakdown when overflow occurs, with warning when swap detected
  • 🛡️ Memory: Enforce physical VRAM limit - PyTorch now OOMs instead of silently swapping to shared memory (prevents extreme slowdowns on Windows)
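Device-comparison bugs of this kind typically come from "mps" and "mps:0" comparing unequal as raw strings. A small sketch of the normalization (helper name is illustrative):

```python
def same_device(a, b) -> bool:
    # "mps" and "mps:0" name the same device, but naive string
    # comparison treats them as different and triggers a needless
    # tensor copy. Normalize to (type, index), with the index
    # defaulting to 0 when omitted.
    def parse(d):
        base, _, idx = str(d).partition(":")
        return base, int(idx) if idx else 0
    return parse(a) == parse(b)
```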