vladmandic · vladmandic · Feb 20, 2025 · Feb 5, 2025 · Feb 6, 2025 · Feb 6, 2025
diff --git a/.gitignore b/.gitignore
@@ -51,7 +51,7 @@ build/
 dist/
 
 # dynamically generated
-/repositories/ip-instruct/
+/repositories/deepseek-vl2/
 
 # all dynamic stuff
 /extensions/**/*

diff --git a/.gitmodules b/.gitmodules
@@ -23,5 +23,5 @@
   url = https://github.com/ArtVentureX/sd-webui-agent-scheduler
   ignore = dirty
 [submodule "extensions-builtin/sdnext-modernui"]
-	path = extensions-builtin/sdnext-modernui
-	url = https://github.com/BinaryQuantumSoul/sdnext-modernui
+  path = extensions-builtin/sdnext-modernui
+  url = https://github.com/BinaryQuantumSoul/sdnext-modernui
diff --git a/.pylintrc b/.pylintrc
@@ -23,12 +23,14 @@ ignore-paths=/usr/lib/.*$,
              modules/k-diffusion,
              modules/ldsr,
              modules/meissonic,
+             modules/mod,
              modules/omnigen,
              modules/onnx_impl,
              modules/pag,
              modules/pixelsmith,
              modules/prompt_parser_xhinker.py,
              modules/pulid/eva_clip,
+             modules/ras,
              modules/rife,
              modules/schedulers,
              modules/taesd,

diff --git a/.ruff.toml b/.ruff.toml
@@ -17,12 +17,14 @@ exclude = [
     "modules/k-diffusion",
     "modules/ldsr",
     "modules/meissonic",
+    "modules/mod",
     "modules/omnigen",
     "modules/pag",
     "modules/pixelsmith",
     "modules/postprocess/aurasr_arch.py",
     "modules/prompt_parser_xhinker.py",
     "modules/pulid/eva_clip",
+    "modules/ras",
     "modules/rife",
     "modules/schedulers",
     "modules/segmoe",

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,146 @@
 # Change Log for SD.Next
 
+## Update for 2025-02-20
+
+### Highlight for 2025-02-20
+
+We're back with another update with nearly 100 commits!  
+- Starting with massive UI update with full [localization](https://vladmandic.github.io/sdnext-docs/Locale/) for 8 languages  
+  and 100+ new [hints](https://vladmandic.github.io/sdnext-docs/Hints/)  
+- Big update to [Docker](https://vladmandic.github.io/sdnext-docs/Docker/) containers  
+  with support for all major compute platforms  
+- A lot of [outpainting](https://vladmandic.github.io/sdnext-docs/Outpaint/) goodies  
+- Support for new models: [AlphaVLLM Lumina 2](https://github.com/Alpha-VLLM/Lumina-Image-2.0) and [Ostris Flex.1-Alpha](https://huggingface.co/ostris/Flex.1-alpha)  
+- And new **Mixture-of-Diffusers** regional prompting & tiling pipeline  
+- Follow-up to last week's **interrogate/captioning** rewrite  
+  now with redesigned captioning UI, batch support, and much more  
+  plus **JoyTag**, **JoyCaption**, **PaliGemma**, **ToriiGate**, **Ovis2** added to list of supported models  
+- Some changes to **prompt parsing** to allow more control as well as  
+  more flexibility when mounting the SD.Next server at a custom URL  
+- Of course, cumulative fixes...  
+
+*...and more* - see [changelog](https://github.com/vladmandic/sdnext/blob/dev/CHANGELOG.md) for full details!  
+
+### Details for 2025-02-20
+
+- **User Interface**  
+  - **Hints**  
+    - added/updated 100+ ui hints!  
+    - [hints](https://vladmandic.github.io/sdnext-docs/Hints/) documentation and contribution guide  
+  - **Localization**  
+    - full ui localization!  
+      *english, croatian, spanish, french, italian, portuguese, chinese, japanese, korean, russian*  
+    - set in *settings -> user interface -> language*  
+    - [localization](https://vladmandic.github.io/sdnext-docs/Locale/) documentation  
+  - **UI**  
+    - force browser cache-invalidate on page load  
+    - configurable request timeout  
+    - modernui improve gallery styling  
+    - modernui improve networks styling  
+    - modernui support variable card size  
+- **Docs**  
+  - New [Outpaint](https://vladmandic.github.io/sdnext-docs/Outpaint/) step-by-step guide  
+  - Updated [Docker](https://github.com/vladmandic/sdnext/wiki/Docker) guide  
+    includes build and publish and both local and cloud examples  
+- **Models**  
+  - [AlphaVLLM Lumina 2](https://github.com/Alpha-VLLM/Lumina-Image-2.0)  
+    new foundation model for image generation based on Gemma-2-2B text encoder and a flow-based diffusion transformer  
+    fully supports offloading and on-the-fly quantization  
+    simply select from *networks -> models -> reference*  
+  - [Ostris Flex.1-Alpha](https://huggingface.co/ostris/Flex.1-alpha)  
+    originally based on *Flux.1-Schnell*, but retrained and with different architecture  
+    result is model smaller than *Flux.1-Dev*, but with similar capabilities  
+    fully supports offloading and on-the-fly quantization  
+    simply select from *networks -> models -> reference*  
+- **Functions**  
+  - [Mixture-of-Diffusers](https://huggingface.co/posts/elismasilva/251775641926329)  
+    Regional prompting and tiling solution for SDXL models  
+    select from *scripts -> mixture of diffusers*  
+  - [Automatic Color Inpaint]  
+    Automatically creates mask based on selected color and triggers inpaint  
+    simply select in *scripts -> automatic color inpaint* when in img2img mode  
+  - [RAS: Region-Adaptive Sampling](https://github.com/microsoft/RAS) *experimental*  
+    Speeds up SD3.5 models by sampling only regions of interest  
+    Enable in *settings -> pipeline modifiers -> ras*  
+- **Interrogate/Captioning**  
+  - Redesigned captioning UI  
+    split from Process tab into separate tab  
+    split `clip` vs `vlm` models processing  
+    direct *send-to* buttons on all tabs: txt/img/ctrl->process/caption, process/caption->txt/img/ctrl  
+  - Advanced params:
+    VLM: *max-tokens, num-beams, temperature, top-k, top-p, do-sample*  
+    CLiP: *min-length, max-length, chunk-size, min-flavors, max-flavors, flavor-count, num-beams*  
+    params are auto-saved in `config.json` and used when using quick interrogate  
+    params that are set to 0 mean use model defaults  
+  - Batch processing: VLM and CLiP  
+    for example, can be used to caption your training dataset in one go  
+    add option to append to captions file, can be used to run multiple captioning models in sequence  
+    add option to run recursively on all subfolders  
+    add progress bar  
+  - Add additional VLM models:  
+    [JoyTag](https://huggingface.co/fancyfeast/joytag)  
+    [JoyCaption 2](https://huggingface.co/fancyfeast/llama-joycaption-alpha-two-hf-llava)  
+    [Google PaliGemma 2](https://huggingface.co/google/paligemma2-3b-pt-224) 3B  
+    [ToriiGate 0.4](https://huggingface.co/Minthy/ToriiGate-v0.4-7B) 7B  
+    [AIDC Ovis2](https://huggingface.co/AIDC-AI/Ovis2-1B) 1B/2B/4B  
+  - *Note* some models require `flash-attn` to be installed  
+    due to binary/build dependencies, it should not be done automatically,  
+    see [flash-attn](https://github.com/Dao-AILab/flash-attention) for installation instructions  
+- **Docker**  
+  - updated **CUDA** recipe to `torch==2.6.0` with `cuda==12.6` and added prebuilt image  
+  - added **ROCm** recipe and prebuilt image  
+  - added **IPEX** recipe and prebuilt image  
+  - added **OpenVINO** recipe and prebuilt image  
+- **System**  
+  - improve **python==3.12** compatibility  
+  - **Torch**  
+    - for **zluda** set default to `torch==2.6.0+cu118`  
+    - for **openvino** set default to `torch==2.6.0+cpu`  
+  - **OpenVINO**  
+    - update to `openvino==2025.0.0`  
+    - improve upscaler compatibility  
+    - enable upscaler compile by default  
+    - fix shape mismatch errors on too many resolution changes  
+  - **ZLUDA**  
+    - update to `zluda==3.8.8`  
+- **Other**  
+  - **Asymmetric tiling**  
+    allows for configurable image tiling for x/y axis separately  
+    enable in *scripts -> asymmetric tiling*  
+    *note*: traditional symmetric tiling is achieved by setting circular mode for both x and y  
+  - **Styles**  
+    ability to save and/or restore prompts before or after parsing of wildcards  
+    set in *settings -> networks -> styles*  
+  - **Access tokens**  
+    persist *models -> huggingface -> token*  
+    persist *models -> civitai -> token*  
+  - global switch to lanczos method for all internal resize ops and bicubic for interpolation ops  
+  - **Text encoder**  
+    add advanced per-model options for text encoder  
+    set in *settings -> text encoder -> Optional*  
+  - **Subpath**  
+    allow setting additional mount subpath over which server url will be accessible  
+    set in *settings -> user interface*  
+  - **Prompt parsing**  
+    better handling of prompt parsing when using masking char `\`  
+- **Fixes**  
+  - update torch nightly urls  
+  - docs/wiki always use relative links  
+  - ui use correct timezone for log display  
+  - ui improve settings search behavior  
+  - ui log scroll to bottom  
+  - ui fix send to inpaint/sketch  
+  - modernui add control init image toggle  
+  - modernui fix sampler advanced options  
+  - outpaint fixes  
+  - validate output before hires/refine  
+  - scheduler fix sigma index out of bounds  
+  - force pydantic version reinstall/reload  
+  - multi-unit when using controlnet-union  
+  - pulid with hidiffusion  
+  - api: stricter access control  
+  - api: universal handle mount subpaths  
+
 ## Update for 2025-02-05
 
 - refresh dev/master branches
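
The captioning notes in the changelog above say that advanced params set to 0 mean "use model defaults". As a rough illustration of that rule only, here is a minimal sketch; the function name `active_params` and the sample settings are hypothetical and not taken from the SD.Next code base, and keeping boolean toggles even when they are `False` is an assumption made here because `False == 0` in Python.

```python
def active_params(params: dict) -> dict:
    """Drop options left at 0 so the model's own defaults apply (hypothetical helper)."""
    # Assumption: booleans are kept even when False, since False == 0 in Python
    # and a toggle such as do-sample is an explicit choice rather than "unset".
    return {k: v for k, v in params.items() if isinstance(v, bool) or v != 0}


# Hypothetical settings using the VLM option names listed in the changelog.
settings = {
    'max_tokens': 0,
    'num_beams': 1,
    'temperature': 0.0,
    'top_k': 0,
    'top_p': 0.9,
    'do_sample': False,
}
print(active_params(settings))
```

Only the options that survive the filter would then be passed on to the model, so anything left at 0 falls back to whatever the model defines.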

diff --git a/README.md b/README.md
@@ -24,6 +24,8 @@
 ## SD.Next Features
 
 All individual features are not listed here, instead check [ChangeLog](CHANGELOG.md) for full list of changes
+- Fully localized:
+  ▹ **English | Chinese | Russian | Spanish | German | French | Italian | Portuguese | Japanese | Korean**  
 - Multiple UIs!  
   ▹ **Standard | Modern**  
 - Multiple [diffusion models](https://vladmandic.github.io/sdnext-docs/Model-Support/)!  
@@ -34,6 +36,7 @@ All individual features are not listed here, instead check [ChangeLog](CHANGELOG
 - Optimized processing with latest `torch` developments with built-in support for model compile, quantize and compress  
   Compile backends: *Triton | StableFast | DeepCache | OneDiff | TeaCache | etc.*  
   Quantization and compression methods: *BitsAndBytes | TorchAO | Optimum-Quanto | NNCF*  
+- **Interrogate/Captioning** with 150+ **OpenCLiP** models and 20+ built-in **VLMs**  
 - Built-in queue management  
 - Built in installer with automatic updates and dependency management  
 - Mobile compatible  
@@ -68,6 +71,8 @@ SD.Next supports broad range of models: [supported models](https://vladmandic.gi
 - *ONNX/Olive*  
 - *AMD* GPUs on Windows using **ZLUDA** libraries  
 
+Plus Docker container recipes for: [CUDA, ROCm, Intel IPEX and OpenVINO](https://vladmandic.github.io/sdnext-docs/Docker/)
+
 ## Getting started
 
 - Get started with **SD.Next** by following the [installation instructions](https://vladmandic.github.io/sdnext-docs/Installation/)  

diff --git a/TODO.md b/TODO.md
@@ -2,29 +2,26 @@
 
 Main ToDo list can be found at [GitHub projects](https://github.com/users/vladmandic/projects)
 
-## Pending
-
-- LoRA direct with caching
-- Previewer issues
-- Redesign postprocessing
-
 ## Future Candidates
 
-- Flux NF4 loader: <https://github.com/huggingface/diffusers/issues/9996>
-- IPAdapter negative: <https://github.com/huggingface/diffusers/discussions/7167>
-- Control API enhance scripts compatibility
-- PixelSmith: <https://github.com/Thanos-DB/Pixelsmith>
+- Redesign postprocessing  
+- Flux NF4 loader: <https://github.com/huggingface/diffusers/issues/9996>  
+- IPAdapter negative: <https://github.com/huggingface/diffusers/discussions/7167>  
+- Control API enhance scripts compatibility  
+- CogView4  
 
 ## Code TODO
 
-- TODO install: enable ROCm for windows when available
-- TODO resize image: enable full VAE mode for resize-latent
-- TODO processing: remove duplicate mask params
-- TODO flux: fix loader for civitai nf4 models
-- TODO model loader: implement model in-memory caching
-- TODO hypertile: vae breaks when using non-standard sizes
-- TODO model load: force-reloading entire model as loading transformers only leads to massive memory usage
-- TODO lora load: direct with bnb
-- TODO lora make: support quantized flux
-- TODO control: support scripts via api
-- TODO modernui: monkey-patch for missing tabs.select event
+- flux: loader for civitai nf4 models (fixme)
+- hypertile: vae breaks when using non-standard sizes (fixme)
+- install: enable ROCm for windows when available (fixme)
+- lora make: support quantized flux (fixme)
+- lora: add other quantization types (fixme)
+- model load: force-reloading entire model as loading transformers only leads to massive memory usage (fixme)
+- model loader: implement model in-memory caching (fixme)
+- modernui: monkey-patch for missing tabs.select event (fixme)
+- processing: remove duplicate mask params (fixme)
+- resize image: enable full VAE mode for resize-latent (fixme)
+- sana: fails when quantized (fixme)
+- support scripts via api (fixme)
+- transformer from-single-file with quant (fixme)
diff --git a/cli/image-interrogate.py → cli/api-interrogate.py b/cli/image-interrogate.py → cli/api-interrogate.py
diff --git a/cli/hf-convert.py b/cli/hf-convert.py
diff --git a/cli/image-encode.py b/cli/image-encode.py
@@ -29,4 +29,3 @@ def encode(file: str):
     print('=== BEGIN ===')
     print(f'{b64}')
     print('=== END ===')
-
diff --git a/cli/locale-sanitize-override.py b/cli/locale-sanitize-override.py
@@ -0,0 +1,29 @@
+#!/usr/bin/env python
+
+# Remove the entries that no longer exist in locale from override.
+
+import sys
+import json
+from rich import print # pylint: disable=redefined-builtin
+
+if __name__ == "__main__":
+    sys.argv.pop(0)
+    if len(sys.argv) == 0:
+        print('Invalid parameters.')
+        sys.exit(1)
+    filename = sys.argv[0]
+    labels = set()  # labels present in the reference English locale
+    override = None
+    try:
+        with open('html/locale_en.json', 'r', encoding="utf-8") as f:
+            locale = json.load(f)
+        for v in locale.values():  # each value is a list of entries
+            for item in v:
+                labels.add(item['label'])
+        with open(filename, 'r', encoding="utf-8") as f:
+            override = json.load(f)
+    except Exception:
+        print('Invalid file format.')
+        sys.exit(1)
+    with open(filename, 'w', encoding="utf-8") as f:  # rewrite override in place, keeping only known labels
+        json.dump([item for item in override if item['label'] in labels], f, ensure_ascii=False)
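
The filtering step of `cli/locale-sanitize-override.py` can be sketched as a pure function that is easy to try on in-memory data. This is a minimal sketch and not part of the PR: the function name `sanitize_override`, the `localized` key and the sample data are hypothetical, and the data layout (locale as a dict of lists of entries with a `label` key, override as a flat list of such entries) is inferred from how the script reads its inputs.

```python
import json


def sanitize_override(locale: dict, override: list) -> list:
    """Return only the override entries whose label still exists in the locale."""
    # Collect every label from the reference locale into a set for fast lookups.
    known = {item['label'] for entries in locale.values() for item in entries}
    return [item for item in override if item['label'] in known]


# Hypothetical sample data mirroring the layout the script appears to expect.
locale = {'ui': [{'label': 'Generate'}, {'label': 'Seed'}]}
override = [
    {'label': 'Seed', 'localized': 'Semilla'},
    {'label': 'Removed option', 'localized': 'Opción eliminada'},
]
result = sanitize_override(locale, override)
print(json.dumps(result, ensure_ascii=False))
```

Separating the filter from the file handling keeps the logic testable without touching `html/locale_en.json` or overwriting a real override file.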