
PromptCrafter: Crafting Text-to-Image Prompt through Mixed-Initiative Dialogue with LLM 🤖

Even as text-to-image models become increasingly powerful, people still struggle to articulate their intentions in ways that these systems can meaningfully interpret. The gap between what users imagine and what the model produces often stems less from a lack of creativity than from the opacity of how prompts shape generation. We were interested in how interaction design might reorganize this process: helping users externalize evolving ideas, reflect on their intentions, and discover the model’s expressive range in a more cognitively aligned way.

Authors
Seungho Baek
Heather Hyerin Im
Jiseung Ryu
Juhyeong Park
Takyeon Lee

My Role
system development & design, qualitative research

Published

ICML 2023 Workshop: https://doi.org/10.48550/arXiv.2307.08985

Approach

PromptCrafter explores a mixed-initiative workflow where users and an LLM collaboratively construct prompts through small, interpretable steps.

Understanding Cognitive Bottlenecks

A formative study with users of varying expertise revealed that people struggle to identify which textual elements drive unexpected outputs and often narrow their exploration prematurely.

Decomposing Intent Through Dialogue

Instead of editing a monolithic prompt, the system breaks the process into a sequence of clarifying questions generated by an LLM. Users respond, compare visual outcomes, and gradually refine their conceptual direction, mirroring how people naturally iterate on ideas.
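The question-and-answer loop described above can be illustrated with a minimal sketch. This is not PromptCrafter's actual implementation: the `PromptSession` class, the `scripted_questions` function, and the comma-joined prompt format are all hypothetical, and the LLM is replaced by a canned script so the example runs without any external service.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical interface: any function that maps the base idea and the
# dialogue so far to the next clarifying question. In a real system this
# would wrap an LLM call; here it is only a placeholder type.
QuestionFn = Callable[[str, List[Tuple[str, str]]], str]


@dataclass
class PromptSession:
    """Builds a prompt step by step from answers to clarifying questions."""
    base_idea: str
    ask: QuestionFn
    turns: List[Tuple[str, str]] = field(default_factory=list)

    def next_question(self) -> str:
        # Delegate question generation to the injected function.
        return self.ask(self.base_idea, self.turns)

    def answer(self, question: str, reply: str) -> None:
        # Record one (question, reply) step of the dialogue.
        self.turns.append((question, reply))

    def compose_prompt(self) -> str:
        # Assumed format: base idea followed by each non-empty reply.
        details = [reply.strip() for _, reply in self.turns if reply.strip()]
        return ", ".join([self.base_idea] + details)


def scripted_questions(base_idea: str, turns: List[Tuple[str, str]]) -> str:
    """Stand-in for an LLM: returns canned questions in a fixed order."""
    script = [
        "What art style do you have in mind?",
        "What mood or lighting should the scene have?",
        "Is there anything else to add?",
    ]
    return script[min(len(turns), len(script) - 1)]


session = PromptSession("a lighthouse on a cliff", scripted_questions)
q1 = session.next_question()
session.answer(q1, "watercolor")
q2 = session.next_question()
session.answer(q2, "stormy dusk")
final_prompt = session.compose_prompt()
```

Injecting the question generator as a plain function keeps the dialogue logic separate from whichever model produces the questions, so the same loop could be driven by a script, a person, or an LLM.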

Making Creative Reasoning Visible

Each interaction is stored as a visual history that users can revisit or branch. This transforms prompt engineering into a reflective workflow where the evolution of thought, interpretation, and AI response becomes part of the creative process.
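One way to picture a history that can be revisited and branched is as a tree in which each node holds a prompt and a reference to its generated image. The sketch below is an illustration under that assumption, not the system's actual data model; `HistoryNode`, `branch`, `path`, and the `image_ref` field are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class HistoryNode:
    """One step in the history: a prompt plus a reference to its image."""
    prompt: str
    image_ref: Optional[str] = None  # hypothetical handle to a generated image
    parent: Optional["HistoryNode"] = field(default=None, repr=False)
    children: List["HistoryNode"] = field(default_factory=list, repr=False)

    def branch(self, prompt: str, image_ref: Optional[str] = None) -> "HistoryNode":
        # Create a new step under this one; earlier branches are left untouched.
        child = HistoryNode(prompt, image_ref, parent=self)
        self.children.append(child)
        return child

    def path(self) -> List[str]:
        # Walk back to the root to recover how this prompt evolved.
        steps = []
        node: Optional["HistoryNode"] = self
        while node is not None:
            steps.append(node.prompt)
            node = node.parent
        return list(reversed(steps))


root = HistoryNode("a lighthouse on a cliff")
watercolor = root.branch("a lighthouse on a cliff, watercolor")
stormy = watercolor.branch("a lighthouse on a cliff, watercolor, stormy dusk")

# Revisiting the root and trying a different direction creates a sibling branch.
oil = root.branch("a lighthouse on a cliff, oil painting")
```

Because every node keeps a link to its parent, calling `path()` on any node reconstructs the chain of prompts that led to it, while alternative directions remain available as sibling branches.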

Pipeline

UI

Conclusion

PromptCrafter shows how thoughtfully designed human-AI dialogue can scaffold imagination, surface tacit preferences, and support more intentional control over generative models. Rather than treating prompts as static commands, the system frames them as sites of iterative meaning-making, demonstrating how AI can augment not only image generation but also the cognitive process of forming and refining an idea.