Stable Diffusion - SD v1.6+ breaks BLIP Interrogate CLIP (CLIP prompt interrogation) with a RuntimeError


Follow me on CSDN: https://spike.blog.csdn.net/
Original article: https://spike.blog.csdn.net/article/details/132994678

The image was generated with the 麦橘写实_MajicMIX_Realistic_v6 model.

After upgrading to SD v1.6, the CLIP interrogation feature no longer works.

Reference: Image interrogation (Interrogate) prompt algorithms (BLIP and DeepBooru)

Error log:

# ...
  File "stable_diffusion_webui/repositories/BLIP/models/med.py", line 277, in forward
    self_outputs = self.self(
  File "stable_diffusion_webui/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "stable_diffusion_webui/repositories/BLIP/models/med.py", line 178, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: The size of tensor a (2) must match the size of tensor b (4) at non-singleton dimension 0
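The message itself is a broadcasting failure: torch.matmul requires the leading (batch) dimensions of its two operands to be equal, or one of them to be 1. The Dreambooth issue in the references ties the error to a number of beams greater than 1, which would be consistent with one operand being expanded per beam while the other is not, though that is an inference rather than something the log proves. Below is a minimal sketch of the shape rule in plain Python, with no torch required. The function name `broadcast_batch_dims` and every dimension except the leading 2 and 4 are illustrative assumptions, not values taken from BLIP.

```python
def broadcast_batch_dims(a_shape, b_shape):
    """Return the broadcast batch shape for a matmul of a @ b, or raise.

    Only the leading (batch) dimensions are checked; the last two
    dimensions are the matrix dimensions and are not validated here.
    """
    a_batch, b_batch = list(a_shape[:-2]), list(b_shape[:-2])
    # Left-pad the shorter batch shape with 1s so both have the same rank.
    rank = max(len(a_batch), len(b_batch))
    a_batch = [1] * (rank - len(a_batch)) + a_batch
    b_batch = [1] * (rank - len(b_batch)) + b_batch

    out = []
    for dim, (x, y) in enumerate(zip(a_batch, b_batch)):
        # Sizes are compatible only if they match or one of them is 1.
        if x != y and x != 1 and y != 1:
            raise RuntimeError(
                f"The size of tensor a ({x}) must match the size of tensor b ({y}) "
                f"at non-singleton dimension {dim}"
            )
        out.append(max(x, y))
    return tuple(out)


# Query built for a batch of 2, key built for a batch of 4 (made-up inner sizes).
query_shape = (2, 12, 5, 64)     # (batch, heads, query_len, head_dim)
key_t_shape = (4, 12, 64, 577)   # shape of key_layer.transpose(-1, -2)
try:
    broadcast_batch_dims(query_shape, key_t_shape)
except RuntimeError as err:
    print(err)
```

A batch of 1 against a batch of 4 would broadcast fine; it is the 2 versus 4 mismatch that cannot be reconciled.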

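Solution: the CLIP interrogation feature in SD calls GitHub - salesforce/BLIP. That project was last updated in September 2022 and is built on a fairly old Transformers stack, so it currently works only with version 4.26.1: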

pip install transformers==4.26.1
pip install tokenizers==0.11.1
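Before and after running pip, it helps to confirm which versions the WebUI's virtual environment actually has. Here is a small sketch using the standard-library importlib.metadata; run it with the Python interpreter from the WebUI venv. The helper name `find_mismatches` is my own, and the expected versions are simply the ones listed above.

```python
from importlib import metadata

# Versions this article reports as working with BLIP.
EXPECTED = {"transformers": "4.26.1", "tokenizers": "0.11.1"}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: installed version or None} for each package that differs."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_mismatches(EXPECTED).items():
        print(f"{name}: installed {installed}, expected {EXPECTED[name]}")
```

No output means both packages already match the pinned versions.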

However, SD v1.6 recommends updating transformers to 4.30.2, which conflicts with that pin; see requirements.txt and requirements_versions.txt:

transformers==4.30.2 

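So change the pin back to transformers==4.26.1 and interrogation works again. BLIP is currently unmaintained, so the Transformers version has to follow whatever BLIP supports.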
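Editing the pin by hand is easy enough, but if you re-apply it after every WebUI update a tiny script saves time. This is a sketch that rewrites a `package==version` line in the text of a requirements file; `pin_requirement` is a hypothetical helper, and the sample text is made up for the demonstration rather than copied from the real requirements_versions.txt.

```python
import re


def pin_requirement(text, package, version):
    """Replace the `package==...` line in the body of a requirements file."""
    # Anchor at line start so "transformers" does not match "sentence-transformers".
    pattern = re.compile(rf"^{re.escape(package)}==.*$", flags=re.MULTILINE)
    new_text, count = pattern.subn(f"{package}=={version}", text)
    if count == 0:
        raise ValueError(f"no '{package}==' pin found")
    return new_text


# Made-up sample content; the real file has many more lines.
sample = "some_other_package==1.0\ntransformers==4.30.2\n"
print(pin_requirement(sample, "transformers", "4.26.1"))
```

To apply it to a real file, read the file into a string, pass it through `pin_requirement`, and write the result back; keeping a backup copy first is a sensible precaution.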

References:

Github - Dreambooth extension causes BLIP interrogation to give error (if number of beams is changed to anything greater then 1)
GitHub - Bug: Interrogate CLIP

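In addition, editing the stable-diffusion-webui/modules/launch_utils.py script to route GitHub downloads through the proxy https://ghproxy.com/ speeds up the preparation step when launching the WebUI. If a dependency needs updating, update it from its corresponding repository URL. Note that:

The BLIP project is located at stable-diffusion-webui/stable_diffusion_webui/repositories/BLIP
The BLIP model is located at stable_diffusion_webui/models/BLIP/model_base_capfilt_large.pth

That is: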

def prepare_environment():
    # ...
    clip_package = os.environ.get('CLIP_PACKAGE', "https://ghproxy.com/https://github.com/openai/CLIP/archive/d50d76daa670286dd6cacf3bcd80b5e4823fc8e1.zip")
    openclip_package = os.environ.get('OPENCLIP_PACKAGE', "https://ghproxy.com/https://github.com/mlfoundations/open_clip/archive/bb6e834e9c70d9c27d0dc3ecedeebeaeb1ffad6b.zip")

    stable_diffusion_repo = os.environ.get('STABLE_DIFFUSION_REPO', "https://ghproxy.com/https://github.com/Stability-AI/stablediffusion.git")
    stable_diffusion_xl_repo = os.environ.get('STABLE_DIFFUSION_XL_REPO', "https://ghproxy.com/https://github.com/Stability-AI/generative-models.git")
    k_diffusion_repo = os.environ.get('K_DIFFUSION_REPO', 'https://ghproxy.com/https://github.com/crowsonkb/k-diffusion.git')
    codeformer_repo = os.environ.get('CODEFORMER_REPO', 'https://ghproxy.com/https://github.com/sczhou/CodeFormer.git')
    blip_repo = os.environ.get('BLIP_REPO', 'https://ghproxy.com/https://github.com/salesforce/BLIP.git')
    # ...
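Rather than pasting the proxy prefix into every default URL, the same idea can be expressed once in a helper. This is a sketch only: `with_proxy` and `repo_url` are hypothetical names that do not exist in launch_utils.py, and the environment-variable names are the ones shown in the snippet above.

```python
import os

GITHUB_PROXY = "https://ghproxy.com/"


def with_proxy(url, proxy=GITHUB_PROXY):
    """Prefix a GitHub URL with the proxy; leave other or already-proxied URLs alone."""
    if url.startswith(proxy) or not url.startswith("https://github.com/"):
        return url
    return proxy + url


def repo_url(env_var, default_url, proxy=GITHUB_PROXY):
    """A non-empty environment variable wins; otherwise use the proxied default."""
    return os.environ.get(env_var) or with_proxy(default_url, proxy)


blip_repo = repo_url("BLIP_REPO", "https://github.com/salesforce/BLIP.git")
print(blip_repo)
```

Keeping the prefix in one constant also makes it easy to switch to a different mirror, or to set it to an empty string when the proxy is not needed.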

Note: the model published on the official site is https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_capfilt_large.pth , which is larger than the model_base_caption_capfilt_large.pth model that SD recommends: roughly 2.0 GB versus 800 MB.

 files = modelloader.load_models(
     model_path=os.path.join(paths.models_path, "BLIP"),
-    model_url='https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_caption_capfilt_large.pth',
+    model_url='https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_capfilt_large.pth',
     ext_filter=[".pth"],
-    download_name='model_base_caption_capfilt_large.pth',
+    download_name='model_base_capfilt_large.pth',
 )
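After changing the URL it is worth checking which checkpoint is actually sitting in the BLIP model folder, since both files share the .pth extension. I am assuming here (worth verifying against your WebUI version) that the loader skips the download when a matching .pth file already exists, in which case an old 800 MB file would keep being used. The sketch below just lists checkpoints with their sizes; `list_checkpoints` is a hypothetical helper and the directory path should be adjusted to your install.

```python
from pathlib import Path


def list_checkpoints(model_dir, ext=".pth"):
    """Return sorted (file name, size in bytes) pairs for checkpoints directly in model_dir."""
    root = Path(model_dir)
    if not root.is_dir():
        return []
    return sorted(
        (p.name, p.stat().st_size)
        for p in root.iterdir()
        if p.is_file() and p.suffix == ext
    )


if __name__ == "__main__":
    # Path taken from the article; change it if your layout differs.
    for name, size in list_checkpoints("stable_diffusion_webui/models/BLIP"):
        print(f"{name}: {size / 1024**3:.2f} GiB")
```

If the smaller checkpoint shows up and you want the larger one, moving the old file out of the folder before restarting is the simplest way to rule it out.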

Image description, generated by New Bing:

a picture of a person sitting on a chair in a luxurious room,
wearing a black and white zebra print dress and black high heels,
chair is a light beige color with a curved back and armrests,
room has a large window with white curtains and a gold-framed mirror on the wall,
floor is made of light-colored wood,
shows a contrast between the bold and striking pattern of the dress and the soft and elegant colors of the room,
seems to be relaxed and comfortable as they are leaning back on the chair and crossing their legs,
picture might be taken for a fashion magazine or a personal blog as it showcases the style and taste of the person,
It might also be taken for a hotel advertisement or a travel diary as it shows the beauty and luxury of the room,
The picture creates an impression of sophistication and glamour as well as curiosity and interest,


Full prompt:

(masterpiece, best quality:1.2),highly detailed,extremely detailed,real photo,
looking at viewer,body facing viewer,240D wrap hip very thick pantyhose,
a picture of a person sitting on a chair in a luxurious room,
wearing a black and white zebra print dress and black high heels,
chair is a light beige color with a curved back and armrests,
room has a large window with white curtains and a gold-framed mirror on the wall,
floor is made of light-colored wood,
shows a contrast between the bold and striking pattern of the dress and the soft and elegant colors of the room,
seems to be relaxed and comfortable as they are leaning back on the chair and crossing their legs,
The picture might be taken for a fashion magazine or a personal blog as it showcases the style and taste of the person,
It might also be taken for a hotel advertisement or a travel diary as it shows the beauty and luxury of the room,
picture creates an impression of sophistication and glamour as well as curiosity and interest,
(pair shoes,pair legs:1.2),nice hand,nice figure,
(photorealistic,realistic:1.2),
<lora:more_details:0.4>,<lora:clothing_adjuster_v2:-0.8>,
Negative prompt: (ng_deepnegative_v1_75t:1.3),(negative_hand),(badhandv4),
(negative_feet_v2:0.5),
cleavage,buttocks,
missing arm,missing leg,extra arms,extra legs,mutated legs,extra limbs,malformed limbs,floating limbs,disconnected limbs,
bad anatomy,bad proportions,disfigured,long neck,long leg,
worst quality,bad quality,jpeg artifacts,lowres,normal quality,low quality,
EasyNegative,
Steps: 30, Sampler: DPM++ 2M SDE Karras, CFG scale: 7, Seed: 2386674497, Size: 512x768, Model hash: e4a30e4607, Model: 麦橘写实_MajicMIX_Realistic_v6, Denoising strength: 0.3, ADetailer model: face_yolov8n.pt, ADetailer prompt: “asian face,beatiful face,”, ADetailer confidence: 0.3, ADetailer dilate/erode: 4, ADetailer mask blur: 4, ADetailer denoising strength: 0.4, ADetailer inpaint only masked: True, ADetailer inpaint padding: 32, ADetailer version: 23.9.2, Hires upscale: 2, Hires steps: 5, Hires upscaler: 4x-UltraSharp, Lora hashes: “more_details: 3b8aa1d351ef, clothing_adjuster_v2: f038e3a5b67b”, TI hashes: “ng_deepnegative_v1_75t: 54e7e4826d53, negative_hand: 73b524a2da12, badhandv4: 5e40d722fc3d, negative_feet_v2: df90b1ff666d, EasyNegative: 66a7279a88dd”, Version: v1.6.0

  • Author: 李琛
  • Link: https://wapzz.net/post-3312.html
  • License: unless otherwise stated, all posts on this blog are licensed under CC BY-NC-SA 4.0.