ComfyUI_OmniParser

Try OmniParser in ComfyUI. OmniParser is a simple screen parsing tool towards a pure vision based GUI agent.


1.Installation

In the ./ComfyUI/custom_nodes directory, run the following:

git clone https://github.com/smthemex/ComfyUI_OmniParser.git

2.Requirements

pip install -r requirements.txt
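
Steps 1 and 2 can also be run together. The script below is a minimal sketch, not part of the repository: `COMFYUI_DIR`, `DRY_RUN`, `run`, and `install_omniparser` are hypothetical names introduced for this example, and it assumes that `git` and `pip` are on your PATH and that `pip` belongs to the Python environment ComfyUI uses.

```shell
#!/usr/bin/env bash
# Sketch: clone the node into an existing ComfyUI checkout and install its requirements.
# COMFYUI_DIR and DRY_RUN are assumptions made for this example, not options of the node.
COMFYUI_DIR="${COMFYUI_DIR:-./ComfyUI}"
NODE_DIR="${COMFYUI_DIR}/custom_nodes/ComfyUI_OmniParser"

run() {
  # Print the command, then execute it unless DRY_RUN=1.
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

install_omniparser() {
  # Step 1: clone the repository. Step 2: install its Python requirements.
  run git clone https://github.com/smthemex/ComfyUI_OmniParser.git "$NODE_DIR" &&
    run pip install -r "$NODE_DIR/requirements.txt"
}
```

After sourcing the script, run `DRY_RUN=1 install_omniparser` to preview the two commands, or `install_omniparser` to execute them. Restart ComfyUI afterwards so the new node is loaded.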


3.Checkpoints

Download the OmniParser checkpoints from Hugging Face (huggingface-OmniParser).


4.Example


5.Citation

microsoft/OmniParser

@misc{lu2024omniparserpurevisionbased,
      title={OmniParser for Pure Vision Based GUI Agent}, 
      author={Yadong Lu and Jianwei Yang and Yelong Shen and Ahmed Awadallah},
      year={2024},
      eprint={2408.00203},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2408.00203}, 
}

Some code from @aliencaocao