Wanrong Zhu
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Preprint (arXiv 2305.15393)
Multimodal Procedural Planning via Dual Text-Image Prompting
Preprint (arXiv 2305.01795)
Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved With Text
Preprint (arXiv 2304.06939)
OpenFlamingo: An Open-Source Framework for Training Vision-Language Models with In-Context Learning
Stay tuned for the technical report!
Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation
Preprint (arXiv 2305.11317)
Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Learning
Preprint (arXiv 2301.11916)
CLIP also Understands Text: Prompting CLIP for Phrase Understanding
Preprint (arXiv 2210.05836)
Text Infilling
Preprint (arXiv 1901.00158)