- Blog posts (161)
[Original] Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Chang
$F_{i+1} = \gamma_i(F_e \oplus F_i) \cdot F_i + \beta_i(F_e \oplus F_i)$
2024-10-22 13:57:33
777
[Original] Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
2024-10-21 22:51:01
384
[Original] Prompt-to-Prompt Image Editing with Cross Attention Control
$Q = \ell_Q(\phi(z_t))$, $K = \ell_K(\psi(P))$, $V = \ell_V(\psi(P))$; attention maps $M$, $M_{ij}$; $\mathrm{Edit}(M_t, M_t^*, t)$.
2024-10-21 22:20:03
736
[Original] Imagic: Text-Based Real Image Editing with Diffusion Models
$\bar{e} = \eta \cdot e_{tgt} + (1 - \eta) \cdot e_{opt}$
2024-10-21 21:45:55
1058
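[Original] DIFFEDIT: DIFFUSION-BASED SEMANTIC IMAGE EDITING WITH MASK GUIDANCE
https://arxiv.org/pdf/2210.11427 Problem: editing an image from an input text prompt, built on a T2I model. The method does not need a separately supplied mask to turn the task into inpainting; instead it automatically extracts the mask of the region to edit from the text prompt. Methods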
2024-10-21 21:10:00
320
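[Original] High-Resolution Image Synthesis with Latent Diffusion Models
$q_E(z \mid x) = \mathcal{N}(z; E_\mu, E_{\sigma^2})$, $z = E_\mu(x) + E_\sigma(x) \cdot \epsilon$, $\epsilon \sim \mathcal{N}(0, 1)$.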
2024-10-21 20:39:40
529
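[Original] Taming Transformers for High-Resolution Image Synthesis
$Z = \{z_k\}_{k=1}^{K} \subset \mathbb{R}^{n_z}$; $\hat{z} = E(x) \in \mathbb{R}^{h \times w \times n_z}$; $z_q = q(\hat{z}) := \left(\arg\min_{z_k \in Z} \lVert \hat{z}_{ij} - z_k \rVert\right) \in \mathbb{R}^{h \times w \times n_z}$; $\hat{x} = G(z_q) = G(q(E(x)))$; $L_{VQ}(E, G, Z) = \lVert x - \hat{x} \rVert^2 + \lVert \mathrm{sg}[E(x)] - z_q \rVert_2^2 + \dots$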
2024-10-19 18:30:30
527
[Original] MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
2024-10-19 15:53:39
761
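[Original] GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Classifier guidance: $\hat{\mu}_\theta(x_t \mid y) = \mu_\theta(x_t \mid y) + s \cdot \Sigma_\theta(x_t \mid y)\, \nabla_{x_t} \log p_\phi(y \mid x_t)$. Classifier-free guidance: $\hat{\epsilon}_\theta(x_t \mid y) = \epsilon_\theta(x_t \mid \varnothing) + s \cdot (\epsilon_\theta(x_t \mid y) - \epsilon_\theta(x_t \mid \varnothing))$, with $s \ge 1$.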
2024-10-18 22:33:15
622
[Original] SDEDIT: GUIDED IMAGE SYNTHESIS AND EDITING WITH STOCHASTIC DIFFERENTIAL EQUATIONS
$x(0) \sim p_0 = p_{data}$; for $t \in [0, 1]$, $x(t) = \alpha(t)\, x(0) + \sigma(t)\, z$ with $z \sim \mathcal{N}(0, I)$, $\sigma(t): [0, 1] \to [0, \infty)$, and $\alpha(t): [0, 1] \to [0, 1]$.
2024-10-18 20:59:53
1061
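[Original] Blended Diffusion for Text-driven Editing of Natural Images
$\hat{x}_0 = \frac{x_t}{\sqrt{\bar{\alpha}_t}} - \frac{\sqrt{1 - \bar{\alpha}_t}\, \epsilon_\theta(x_t, t)}{\sqrt{\bar{\alpha}_t}}$; $D_{CLIP}(x, d, m) = D_c(\mathrm{CLIP}_{img}(x \odot m), \mathrm{CLIP}_{txt}(d))$; $D_{bg}(x_1, x_2, m) = d(x_1 \odot (1 - m), x_2 \odot (1 - m))$.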
2024-10-18 16:03:35
622
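[Original] Null-text Inversion for Editing Real Images using Guided Diffusion Models
$\min_{\varnothing_t} \lVert z^*_{t-1} - z_{t-1}(z_t, \varnothing_t, C) \rVert_2^2$, optimized over the null-text embeddings $\{\varnothing_t\}_{t=1}^{T}$ for $t = T, \dots, 1$.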
2024-10-18 14:27:06
870
[Original] Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning
$L_a := L_a + \beta \cdot \tanh(\gamma) \cdot S([L_a, h^e])$, $h^e = \mathrm{MLP}([v, \mathrm{Fourier}(l)])$.
2024-10-18 10:44:22
827
[Original] BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Edi
https://proceedings.neurips.cc/paper_files/paper/2023/file/602e1a5de9c47df34cae39353a7f5bb1-Paper-Conference.pdf https://github.com/salesforce/LAVIS/tree/main/projects/blip-diffusion Problem: for subject-driven image generation, first train, following the BLIP-2 training procedure, a multimod…
2024-10-17 15:07:39
413
[Original] Real-World Image Variation by Aligning Diffusion Inversion Chain
2024-10-17 11:21:47
928
2
[Original] ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation
2024-10-12 20:59:53
703
[Original] Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models
2024-10-12 20:12:12
396
[Original] SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation
2024-10-12 16:33:55
832
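[Original] FastComposer: Tuning-Free Multi-subject Image Generation with Localized Attention
$P = \{w_1, w_2, \dots, w_n\}$; $S = \{s_1, s_2, \dots, s_m\}$; $I = \{i_1, i_2, \dots, i_m\}$ with $i_j \in \{1, 2, \dots, n\}$; $A \in [0, 1]^{h \times w \times n}$; $M = \{M_1, M_2, \dots, M_m\}$; $A_i \in [0, 1]^{h \times w}$; $L_{loc} = \frac{1}{m} \sum_{j=1}^{m} \big(\mathrm{mean}(A_{i_j} \dots$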
2024-10-11 17:07:14
887
[Original] MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation
$L_{router} = \lVert (1 - M) \odot (1 - R) \rVert_2^2$, $R = \frac{1}{|L|} \sum_{l \in L} R_0^l$.
2024-10-09 22:03:36
978
[Original] ORYX MLLM: ON-DEMAND SPATIAL-TEMPORAL UNDERSTANDING AT ARBITRARY RESOLUTION
2024-10-09 10:36:30
739
[Original] StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation
2024-10-08 15:13:32
1209
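[Original] flow model
A generator $G$ maps $z$ to $x = G(z)$ and defines $p_G(x)$. Given samples $\{x^1, x^2, \dots, x^m\}$ from $p_{data}(x)$, $G^* = \arg\max_G \sum_{i=1}^{m} \log p_G(x^i)$. With $x = f(z)$, $z = [z_1, z_2]$, and $x = [x_1, x_2]$, the preview goes on to the Jacobian $J_f$, the inverse $z = f^{-1}(x)$, and $J_{f^{-1}}$.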
2024-09-10 20:45:58
801
[Original] SimSwap: An Efficient Framework For High Fidelity Face Swapping
2024-08-26 14:57:51
765
[Original] InstantID: Zero-shot Identity-Preserving Generation in Seconds
https://arxiv.org/pdf/2401.07519#page=9.73 https://github.com/instantX-research/InstantID?tab=readme-ov-file https://github.com/instantX-research/InstantID/pull/89/files Problem: the goal is to generate images that match a reference image, faces in particular. Current fine-tuning-based methods such as DreamBooth, Textual Inversion, and LoRA need…
2024-08-26 11:33:21
229
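[Original] Adversarial Diffusion Distillation
Weights $\theta$, $\phi$, $\psi$; $x_s = \alpha_s x_0 + \sigma_s \epsilon$; student output $\hat{x}_\theta(x_s, s)$; $T_{student} = \{\tau_1, \dots, \tau_n\}$ with $N = 4$ and $\tau_n = 1000$.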
2024-08-24 14:32:08
773
[Original] Prompt-Free Diffusion: Taking “Text” out of Text-to-Image Diffusion Models
CVPR2024 SHI Labs https://arxiv.org/pdf/2305.16223 https://github.com/SHI-Labs/Prompt-Free-Diffusion Problem: building on the SD model, drop the text prompt and generate with a reference image guiding the semantics of the output and an optional structure image guiding its structure. SeeCoder extracts the reference image's embedding as the generation condition, and SeeCode…
2024-07-04 10:58:30
452
[Original] Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
2024-07-04 10:16:59
874
[Original] PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
$O(N^2) \to O\!\left(\frac{N^2}{R^2}\right)$
2024-07-03 22:16:38
343
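[Original] PIXART-α: FAST TRAINING OF DIFFUSION TRANSFORMER FOR PHOTOREALISTIC TEXT-TO-IMAGE SYNTHESIS
$S^{(i)} = [\beta_1^{(i)}, \beta_2^{(i)}, \gamma_1^{(i)}, \gamma_2^{(i)}, \alpha_1^{(i)}, \alpha_2^{(i)}]$; $S^{(i)} = g(\bar{S}, E^{(i)})$.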
2024-07-03 21:48:34
538
[Original] T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Mode
$512 \times 512$, $64 \times 64$, $F_c = \{F_c^1, F_c^2, F_c^3, F_c^4\}$.
2024-07-03 21:04:17
557
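[Original] Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization
$x' \in \mathbb{R}^{h * w \times c}$, $y' \in \mathbb{R}^{h * w \times c}$; $\mathrm{PACA}(Q, K, V) = \mathrm{Softmax}\!\left(\frac{Q K^\top}{\sqrt{d}}\right) \cdot V$, with $Q = \mathrm{to}_Q(x')$, $K = \mathrm{to}_K(y')$, $V = \mathrm{to}_V(y')$.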
2024-07-01 20:58:10
1033
1
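[Original] DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
Shenzhen Institute of Advanced Technology, Shanghai AI Lab, and CUHK. https://github.com/XPixelGroup/DiffBIR https://arxiv.org/pdf/2308.15070 Problem: handle image restoration tasks in one unified framework, covering blind image super-resolution (BSR), blind image denoising (BID), and blind face restoration (BFR). The framework has two stages: the first, degradation removal, strips out degradation that is unrelated to the image content; the second is a generation module for los…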
2024-07-01 19:58:20
415
[Original] SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution
$L_{DAPE} = L_r(f^y_{rep}, f^x_{rep}) + \lambda L_l(f^y_{logits}, f^x_{logits})$
2024-07-01 17:04:38
1267
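[Original] IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
$Z' = \mathrm{Attention}(Q, K, V) = \mathrm{Softmax}\!\left(\frac{Q K^\top}{\sqrt{d}}\right) V$, with $Q = Z W_q$, $K = c_t W_k$, $V = c_t W_v$; $Z'' = \mathrm{Attention}(Q, K', V') = \mathrm{Softmax}\!\left(\frac{Q (K')^\top}{\sqrt{d}}\right) V'$, with $K' = c_i W'_k$, $V' = c_i W'_v$.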
2024-06-20 18:01:58
1050
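[Original] Hierarchical Integration Diffusion Model for Realistic Image Deblurring
$z \in \mathbb{R}^{N \times C'}$, $X_r \in \mathbb{R}^{HW \times C}$; $\mathrm{SoftMax}\!\left(\frac{Q K^\top}{\sqrt{C}}\right) \cdot V$.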
2024-06-19 20:08:23
897
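[Original] DiffIR: Efficient Diffusion Model for Image Restoration
$Z = \mathrm{CPEN}_{S1}(\mathrm{PixelUnshuffle}(\mathrm{Concat}(I_{GT}, I_{LQ})))$, $Z \in \mathbb{R}^{4C'}$; $F' = W_l^1 Z \odot \mathrm{Norm}(F) + W_l^2 Z$; $Q = W_d^Q W_c^Q F'$, $K = W_d^K W_c^K F'$, $V = W_d^V W_c^V F'$; $Q \in \mathbb{R}^{HW \times C}$, $K \in \mathbb{R}^{C \times HW}$, $V \in \mathbb{R}^{HW \times C}$.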
2024-06-19 19:23:52
1143
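[Original] Humans in 4D: Reconstructing and Tracking Humans with Transformers
$\theta \in \mathbb{R}^{24 \times 3 \times 3}$, $\beta \in \mathbb{R}^{10}$, $M \in \mathbb{R}^{3 \times N}$ with $N = 6890$; $X \in \mathbb{R}^{3 \times k}$, $X = MW$, $W \in \mathbb{R}^{N \times k}$; $\theta_b \in \mathbb{R}^{23 \times 3 \times 3}$, $\theta_g \in \mathbb{R}^{3 \times 3}$; camera $\pi$ with $R \in \mathbb{R}^{3 \times 3}$, $t \in \mathbb{R}^3$.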
2024-06-07 14:42:56
1014
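[Original] Scalable Diffusion Models with Transformers
$I \in \mathbb{R}^{H \times W \times 3}$, $z \in \mathbb{R}^{\frac{H}{8} \times \frac{W}{8} \times 4}$; token sequence of size $T \times d$; patch size $p$; modulation parameters $\gamma$, $\beta$, $\alpha$.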
2024-06-05 17:48:58
882
[Original] Human Guided Ground-truth Generation for Realistic Image Super-resolution
2024-06-05 13:29:00
567