Releases · numz/ComfyUI-SeedVR2_VideoUpscaler
v2.5.23: Merge pull request #438 from AInVFX/main
- 🔒 Security: Prevent code execution in model loading - Added protection against malicious .pth files by restricting deserialization to tensors only
- 🎥 Fix: FFmpeg video writer reliability - Resolved ffmpeg process hanging issues by redirecting stderr and adding buffer flush, with improved error messages for debugging (thanks @thehhmdb)
- ⚡ Fix: GGUF VAE model support - Enabled automatic weight dequantization for convolution operations, making GGUF-quantized VAE models fully functional (thanks @naxci1)
- 🛡️ Fix: VAE slicing edge cases - Protected against division by zero crashes when using small split sizes with high temporal downsampling (thanks @naxci1)
- 🎨 Fix: LAB color transfer precision - Resolved dtype mismatch errors during video upscaling by ensuring consistent float types before matrix operations
- 🔧 Fix: PyTorch 2.9+ compatibility - Extended Conv3d memory workaround to all PyTorch 2.9+ versions, fixing 3x VRAM usage on newer PyTorch releases
- 📦 Fix: Bitsandbytes compatibility - Added ValueError exception handling for Intel Gaudi version detection failures on non-Gaudi systems
- 🍎 MPS: Memory optimization - Reduced memory usage during encode/decode operations on Apple Silicon (thanks @s-cerevisiae)
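The deserialization hardening above follows the same principle as Python's restricted-unpickler recipe: refuse to reconstruct any pickled object whose class is not on an allowlist, so a malicious `.pth` payload is rejected before any of its code runs. A minimal stdlib sketch of that principle (not the node's actual code; the allowlist here is hypothetical):

```python
import io
import pickle

class TensorOnlyUnpickler(pickle.Unpickler):
    """Refuse to reconstruct arbitrary classes from untrusted pickle data."""

    # Hypothetical allowlist; a real checkpoint loader would permit only
    # tensor/storage types and plain containers.
    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked: {module}.{name}")

def safe_load(data: bytes):
    # Plain containers of numbers deserialize fine; anything that tries to
    # resolve an arbitrary callable (the classic __reduce__ attack) is blocked.
    return TensorOnlyUnpickler(io.BytesIO(data)).load()
```

In PyTorch itself, the equivalent switch is `torch.load(path, weights_only=True)`, which restricts unpickling to tensors and plain containers in the same spirit.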
v2.5.22: Merge pull request #412 from AInVFX/main
- 🎬 CLI: FFmpeg video backend with 10-bit support - New --video_backend ffmpeg and --10bit flags enable x265 encoding with 10-bit color depth, reducing banding artifacts in gradients compared to 8-bit OpenCV output (based on PR by @thehhmdb - thank you!)
- 🍎 Fix: MPS bicubic upscaling compatibility - Added CPU fallback for bicubic+antialias interpolation on PyTorch versions before 2.8.0, resolving RGBA alpha upscaling errors on Apple Silicon
- ⚡ Fix: Cross-platform histogram matching - Replaced scatter_ operation with argsort+index_select for improved reliability across CUDA, ROCm, and MPS backends
- 🧹 MPS: Remove sync overhead - Reverted unnecessary torch.mps.synchronize() calls introduced in v2.5.21 for consistent behavior with CUDA pipeline
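The scatter-free histogram matching in v2.5.22 can be sketched in NumPy: compute each source value's rank with a double `argsort`, then gather the reference's sorted values at those ranks. This is a pure gather (the analogue of `index_select`) with no scatter-style writes. A hypothetical single-channel sketch, assuming equal-length inputs (a real implementation would interpolate between CDFs):

```python
import numpy as np

def match_histogram_1d(source: np.ndarray, reference: np.ndarray) -> np.ndarray:
    # rank[i] = position of source[i] in the sorted source
    rank = np.argsort(np.argsort(source))
    # Gather the reference's sorted values at those ranks; no scatter writes,
    # which sidesteps backend-specific scatter_ quirks on CUDA/ROCm/MPS.
    return np.sort(reference)[rank]
```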
v2.5.21: Merge pull request #407 from AInVFX/main
- 🛠️ Fix: GGUF dequantization error on MPS - Resolved shape mismatch error introduced in 2.5.20 by skipping GGUF quantized buffers in precision conversion - these must remain in packed format for on-the-fly dequantization during inference
- 🍎 MPS: Eliminate CPU sync overhead - Skip unnecessary CPU tensor offload on Apple Silicon unified memory architecture, preventing sync stalls that caused slowdowns. Input images and output video now stay on MPS device throughout the pipeline
- ⚡ MPS: Preload text embeddings - Load text embeddings before Phase 1 encoding to avoid sync stall at Phase 2 start, improving timing accuracy and throughput
- 🧹 MPS: Optimized model cleanup - Skip redundant CPU movement before model deletion on unified memory
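The v2.5.21 GGUF fix boils down to a filter inside the precision-conversion pass: quantized buffers keep their packed layout (they are dequantized on the fly at inference), while ordinary float buffers are converted. A hypothetical pure-Python sketch with invented names; the real code inspects GGUF tensor types:

```python
def convert_precision(buffers, target_dtype="float16"):
    # buffers: name -> (payload, dtype_tag). Tags like "gguf_q4_k" mark
    # packed quantized blocks that must not be cast -- casting them is what
    # produced the shape mismatch the release fixes.
    converted = {}
    for name, (payload, dtype_tag) in buffers.items():
        if dtype_tag.startswith("gguf_"):
            converted[name] = (payload, dtype_tag)   # keep packed format
        else:
            converted[name] = (payload, target_dtype)
    return converted
```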
v2.5.20: Merge pull request #402 from AInVFX/main
- ⚡ Expanded attention backends - Full support for Flash Attention 2 (Ampere+), Flash Attention 3 (Hopper+), SageAttention 2, and SageAttention 3 (Blackwell/RTX 50xx), with automatic fallback chains to PyTorch SDPA when unavailable (based on PR by @naxci1 - thank you!)
- 🍎 macOS/Apple Silicon compatibility - Replaced MPS autocast with explicit dtype conversion throughout VAE and DiT pipelines, resolving hangs and crashes on M-series Macs. BlockSwap now auto-disables with warning (unified memory makes it meaningless)
- 🛡️ Flash Attention graceful fallback - Added compatibility shims for corrupted or partially installed flash_attn/xformers DLLs, preventing startup crashes
- 🛡️ AMD ROCm: bitsandbytes conflict fix - Prevent kernel registration errors when diffusers attempts to re-import broken bitsandbytes installations
- 📦 ComfyUI Manager: macOS classifier fix - Removed NVIDIA CUDA classifier causing false "GPU not supported" warnings on macOS
- 📚 Documentation updates - Updated README with attention backend details, BlockSwap macOS notes, and clarified model caching descriptions
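The fallback chain described above can be sketched as an ordered import probe: each optional backend is tried in turn, and any failure, including a corrupted DLL that raises something other than `ImportError` at import time, drops through to PyTorch's built-in SDPA. A simplified sketch; the real selector also checks GPU architecture (Ampere/Hopper/Blackwell) before choosing:

```python
import importlib

def pick_attention_backend(candidates=("flash_attn", "sageattention")):
    # Catch broad Exception, not just ImportError: a partially installed
    # extension can raise OSError or RuntimeError at import time and must
    # also fall back gracefully instead of crashing startup.
    for module_name in candidates:
        try:
            importlib.import_module(module_name)
            return module_name
        except Exception:
            continue
    return "sdpa"  # PyTorch scaled_dot_product_attention, always available
```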
v2.5.19: Merge pull request #390 from AInVFX/main
- 🎨 New header logo design - Refreshed ASCII art banner (thanks @naxci1)
- 🧹 Remove dead flash attention wrapper - Removed legacy code from FP8CompatibleDiT; FlashAttentionVarlen already handles backend switching via its attention_mode attribute
- 🛡️ Fix graceful fallback from flash-attn - Add compatibility shims for corrupted flash_attn/xformers DLLs, preventing startup crashes when CUDA extensions are broken
- 📊 Improved VRAM tracking - Separate allocated vs reserved memory tracking, Windows-only overflow detection (WDDM paging behavior)
- ♻️ Centralize backend detection - Unified is_mps_available(), is_cuda_available(), get_gpu_backend() helpers across codebase
- 🔄 Revert 2.5.14 VRAM limit enforcement - Removed set_per_process_memory_fraction call; overflow detection and warnings remain
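Centralized helpers like the ones above typically reduce to one probe function that everything else wraps. A hedged sketch, not the repo's exact signatures:

```python
def get_gpu_backend() -> str:
    # Single source of truth for device detection; is_cuda_available() /
    # is_mps_available() style helpers can be thin wrappers over this,
    # instead of each module re-probing torch on its own.
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```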
v2.5.18: Merge pull request #384 from AInVFX/main
- 🚀 CLI: Streaming mode for long videos - New --chunk_size flag processes videos in memory-bounded chunks, enabling arbitrarily long videos without RAM limits. Works with model caching (--cache_dit/--cache_vae) for chunk-to-chunk reuse (inspired by disk02 PR contribution)
- ⚡ CLI: Multi-GPU streaming - Each GPU now streams its segment internally with independent model caching, improving memory efficiency and enabling --temporal_overlap blending at GPU boundaries
- 🔧 CLI: Fix large video MemoryError - Shared memory transfer replaces numpy pickling, preventing crashes on high-resolution/long video outputs (inspired by FurkanGozukara PR contribution)
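The streaming mode above amounts to walking the frame range in fixed-size windows that overlap by a few frames so chunk seams can be blended. A hypothetical sketch of the windowing (the `--chunk_size`/`--temporal_overlap` names are the CLI's; the function itself is invented):

```python
def chunk_windows(num_frames: int, chunk_size: int, temporal_overlap: int = 0):
    # Yield (start, end) frame ranges; consecutive windows share
    # `temporal_overlap` frames so the seam can be cross-faded.
    step = chunk_size - temporal_overlap
    assert step > 0, "overlap must be smaller than chunk size"
    start = 0
    while start < num_frames:
        end = min(start + chunk_size, num_frames)
        yield start, end
        if end == num_frames:
            break
        start += step
```

Only one window of frames is resident at a time, which is what bounds memory regardless of total video length.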
v2.5.17: Merge pull request #373 from AInVFX/main
- 🔧 Fix: Older GPU compatibility (GTX 970, etc.) - Runtime bf16 CUBLAS probe replaces compute capability heuristics, correctly detecting unsupported GPUs without affecting RTX 20XX
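The probe pattern replaces "guess from compute capability" with "try the operation and see". A generic sketch of that pattern; in the repo the probe would be a tiny `torch.bfloat16` matmul on the target GPU, where a missing CUBLAS kernel surfaces as a `RuntimeError`:

```python
def op_supported(probe) -> bool:
    # probe: zero-argument callable running a tiny candidate operation.
    # Success means the installed driver/CUBLAS stack actually supports it;
    # a RuntimeError means fall back (e.g. to float16/float32) -- unlike a
    # compute-capability heuristic, this cannot misclassify edge cases.
    try:
        probe()
        return True
    except RuntimeError:
        return False
```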
v2.5.16: Merge pull request #371 from AInVFX/main
- 🔧 Fix: Older GPU compatibility (GTX 970, etc.) - Automatic fallback for GPUs without bfloat16 support
- 🐛 Fix: Quality regression - Reverted bfloat16 detection that was causing artifact issues
- 📋 Debug: Environment info display - Shows system info in debug mode to help with issue reporting
- 📚 Docs: Simplified contribution workflow - Streamlined to main branch only
v2.5.15: Merge pull request #358 from AInVFX/main
- 🍎 Fix: MPS compatibility - Disable antialias for MPS tensors and fix bfloat16 arange issues
- ⚡ Fix: Autocast device type - Use proper device type attribute to prevent autocast errors
- 📊 Memory: Accurate VRAM tracking - Use max_memory_reserved for more precise peak reporting
- 🔧 Fix: Triton compatibility - Add shim for bitsandbytes 0.45+ / triton 3.0+ (fixes PyTorch 2.7 installation errors)
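Compatibility shims like the triton/bitsandbytes one generally work by registering a stub module in `sys.modules` before the broken import is attempted, so downstream `from mod import x` statements succeed with harmless stand-ins. A minimal, generic sketch (module and attribute names are placeholders, not the repo's):

```python
import sys
import types

def install_stub_module(name: str, attrs: dict):
    # Pre-register a stub so later `import name` statements resolve to it
    # even when the real package is missing or raises at import time.
    if name in sys.modules:
        return sys.modules[name]
    stub = types.ModuleType(name)
    for attr, value in attrs.items():
        setattr(stub, attr, value)
    sys.modules[name] = stub
    return stub
```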
v2.5.14: Merge pull request #344 from AInVFX/main
- 🍎 Fix: MPS device comparison - Normalize device strings to prevent unnecessary tensor movements
- 📊 Memory: VRAM swap detection - Peak stats now show GPU+swap breakdown when overflow occurs, with warning when swap detected
- 🛡️ Memory: Enforce physical VRAM limit - PyTorch now OOMs instead of silently swapping to shared memory (prevents extreme slowdowns on Windows)
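The device-comparison fix above is essentially string normalization: `"mps"` and `"mps:0"` name the same device, so comparing the raw strings triggered needless `.to()` copies. A hypothetical sketch of the normalization:

```python
def same_device(a, b) -> bool:
    # Append a default index so "mps" == "mps:0" and "cuda" == "cuda:0";
    # a tensor move is only worth issuing when this returns False.
    def normalize(device) -> str:
        s = str(device)
        return s if ":" in s else f"{s}:0"
    return normalize(a) == normalize(b)
```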