ComfyUI-Lumina-Next-SFT-DiffusersWrapper
Lumina Diffusers Node for ComfyUI
This custom node seamlessly integrates the Lumina-Next-SFT model into ComfyUI, enabling high-quality image generation using the advanced Lumina text-to-image pipeline. While still under active development, it offers a robust and functional implementation with advanced features.
Features
- Harnesses the power of the Lumina-Next-SFT model for state-of-the-art image generation
- Offers a wide range of generation parameters for fine-tuned control
- Implements Lumina-specific features including scaling watershed and proportional attention
- Supports input latents and strength parameter for image-to-image capabilities
- Automatic model downloading for seamless setup
- Outputs generated latent representations
Installation
Now in ComfyUI Manager!
For manual installation:
- Ensure you have ComfyUI installed and properly set up.
- Clone this repository into your ComfyUI custom nodes directory:
  git clone https://github.com/Excidos/ComfyUI-Lumina-Diffusers.git
- The required dependencies will be installed automatically.
NOTE: This installation includes a development branch of diffusers, which may conflict with some existing nodes.
Usage
Use with the standard SDXL_VAE or SDXL_Fixed_FP16-VAE
- Launch ComfyUI.
- Locate the "Lumina-Next-SFT Diffusers" node in the node selection menu.
- Add the node to your workflow.
- Connect the necessary inputs and outputs.
- Configure the node parameters as desired.
- Execute your workflow to generate images.
Parameters
- model_path: Path to the Lumina model (default: "Alpha-VLLM/Lumina-Next-SFT-diffusers")
- prompt: Text prompt for image generation
- negative_prompt: Negative text prompt
- num_inference_steps: Number of denoising steps (default: 30)
- guidance_scale: Classifier-free guidance scale (default: 4.0)
- seed: Random seed for generation (-1 for random)
- batch_size: Number of images to generate in one batch (default: 1)
- scaling_watershed: Scaling watershed parameter (default: 0.3)
- proportional_attn: Enable proportional attention (default: True)
- clean_caption: Clean input captions (default: True)
- max_sequence_length: Maximum sequence length for text input (default: 256)
- use_time_shift: Enable time shift feature (default: False)
- t_shift: Time shift factor (default: 4)
- strength: Strength for image-to-image generation (default: 1.0, range: 0.0 to 1.0)
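To build intuition for the strength parameter, diffusers-style image-to-image pipelines typically map strength to the number of denoising steps actually run on the input latents. The sketch below illustrates that common mapping; it is an assumption for illustration, not this node's exact code.

```python
def timesteps_for_strength(num_inference_steps: int, strength: float):
    """Map an image-to-image strength in [0, 1] to (first step index,
    number of steps run) in the diffusers style. Illustrative sketch."""
    # Higher strength -> more noise added to the input latents -> more
    # denoising steps run, so the output drifts further from the input.
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return t_start, num_inference_steps - t_start

# With 30 steps: strength=1.0 runs all 30 steps (behaves like text-to-image),
# while strength=0.5 skips the first 15, preserving more of the input image.
```

This is why strength 1.0 (the default) effectively ignores the input latents' content, while small values make only subtle changes.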
Inputs
- latents (optional): Input latents for image-to-image generation
Outputs
- LATENT: Latent representation of the generated image(s)
Known Features and Limitations
- Supports input latents for image-to-image generation
- Implements strength parameter for controlling the influence of input latents
- Time shift feature for advanced control over the generation process
- Output is currently limited to latent representations; use a VAE decode node to obtain images
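The time shift feature mentioned above can be pictured with the timestep-shifting formula commonly used in flow-matching pipelines; the node's exact implementation may differ, so treat this as a hedged sketch of the general technique, with t_shift playing the role of the shift factor.

```python
def shift_time(t: float, shift: float = 4.0) -> float:
    """Common flow-matching time shift (illustrative sketch; the node's
    exact formula may differ). t is a normalized timestep in [0, 1]."""
    # The endpoints t=0 and t=1 are fixed points; intermediate timesteps
    # are pushed toward the high-noise region, which tends to help
    # sampling quality at higher resolutions.
    return shift * t / (1.0 + (shift - 1.0) * t)

# Example: with shift=4, a midpoint timestep of 0.5 is remapped to 0.8.
```

Larger t_shift values bend the schedule more aggressively, spending a greater share of the denoising steps at high noise levels.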
Example Outputs
Troubleshooting
If you encounter any issues, please check the console output for error messages. Common issues include:
- Insufficient GPU memory
- Missing dependencies
- Incorrect model path
For further assistance, please open an issue on the GitHub repository.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Acknowledgements
- Lumina-Next-SFT-Diffusers for the Lumina-Next-SFT model
- The ComfyUI community for their continuous support and inspiration