Interactive Story Writing with Large Language Models and Stable Diffusion

As artificial intelligence continues to reshape our world, the convergence of technology and artistic expression is opening the way to new storytelling experiences. Here I would like to showcase a project built on that idea: a Story Writer that combines large language models (LLMs) with Stable Diffusion's image generation to change the way we craft narratives. The project demonstrates how to run these models on consumer-grade GPUs with at least 12 GB of memory, such as the RTX 3060, and includes instructions for using free Kaggle or Google Colab accounts with their default accelerators. The notebook also covers running inference on a CPU with Intel OpenVINO, which makes the project accessible to a wide range of users and setups.

This project is detailed in a Jupyter Notebook hosted on GitHub, which you can explore here: Story Writing with LLMs and Stable Diffusion.

Overview: The Story Writer harnesses the power of advanced LLMs to generate narratives based on user prompts. By integrating Stable Diffusion, the tool not only crafts textual content but also complements these narratives with relevant images, making each story visually as well as textually rich. This approach not only enhances the reader’s engagement but also illustrates the potential of AI in creative domains.

Key Features:

  • Automated Story Generation: Enter the opening text of a story paragraph for a selected theme (e.g., fantasy or adventure) and watch as the model continues the text from there.
  • Integrated Image Synthesis: Each story is paired with unique images generated through Stable Diffusion and tailored to the narrative context. The LLM is used a second time to generate a Stable Diffusion prompt (editable by the user), which is then used to trigger image generation.
  • Customizable and Interactive: Users can guide the story’s direction, making this tool not just a writer but an interactive partner in creative storytelling.
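
The flow described in the features above (continue the story, ask the LLM for an image prompt, let the user edit it, then generate the image) can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the names `write_story_step`, `StoryStep`, `generate_text`, `generate_image`, and `edit_prompt`, as well as the wording of the instructions sent to the LLM, are all hypothetical. The two generator callables stand in for whatever LLM and Stable Diffusion wrappers you use.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stand-ins: in practice these would wrap an LLM call and a
# Stable Diffusion call. They are assumptions, not the notebook's API.
TextGenerator = Callable[[str], str]
ImageGenerator = Callable[[str], object]


@dataclass
class StoryStep:
    paragraph: str      # user's opening text plus the model's continuation
    image_prompt: str   # prompt that was sent to the image model
    image: object       # whatever the image generator returned


def write_story_step(
    opening: str,
    theme: str,
    generate_text: TextGenerator,
    generate_image: ImageGenerator,
    edit_prompt: Optional[Callable[[str], str]] = None,
) -> StoryStep:
    """Run one text-then-image step of the story writer."""
    # 1. Ask the LLM to continue the user's opening text in the chosen theme.
    story_instruction = (
        f"Continue this {theme} story in one paragraph.\n\n{opening}"
    )
    continuation = generate_text(story_instruction).strip()
    paragraph = f"{opening.rstrip()} {continuation}"

    # 2. Use the LLM a second time to turn the paragraph into an image prompt.
    prompt_instruction = (
        "Write a short Stable Diffusion prompt describing one scene "
        f"from this paragraph:\n\n{paragraph}"
    )
    image_prompt = generate_text(prompt_instruction).strip()

    # 3. Give the user a chance to edit the prompt before it is used.
    if edit_prompt is not None:
        image_prompt = edit_prompt(image_prompt)

    # 4. Trigger image generation with the final prompt.
    image = generate_image(image_prompt)
    return StoryStep(paragraph, image_prompt, image)
```

Passing the generators in as plain callables keeps the sketch independent of any particular model or library, so the same step works whether the models run on a GPU, on a hosted notebook, or on a CPU.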

[Figure: Story Writer UI]

Sample Output:

[Figure: Story Writer output]

Technical Insights: The notebook details the technical underpinnings, from setting up the LLMs to integrating the image generation model, providing a comprehensive guide for enthusiasts and developers interested in exploring AI-driven content creation.

Models: These are the default models; users can easily replace them with models of their choice.

  • Llama 3 is used for text generation.
  • For image generation, the options are stable-diffusion-3-medium-diffusers or stable-diffusion-v1-5.

Stable-diffusion-3-medium-diffusers requires more GPU memory but produces outstanding images. Users equipped with larger GPU capacities might prefer this option for its enhanced image quality.
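
One way to express this trade-off in code is a small helper that picks the image model from the available GPU memory. This is only a sketch: the function name `choose_image_model` is hypothetical, and the 16 GB default threshold is a placeholder assumption rather than a measured requirement, so adjust it to whatever works on your hardware.

```python
from typing import Optional

# Model names as given in this article.
SD3_MEDIUM = "stable-diffusion-3-medium-diffusers"
SD_V1_5 = "stable-diffusion-v1-5"


def choose_image_model(
    gpu_memory_gb: Optional[float],
    sd3_min_gb: float = 16.0,  # assumed threshold, not a documented requirement
) -> str:
    """Return the larger model only when enough GPU memory is reported."""
    # No GPU (None) or a smaller GPU falls back to the lighter model.
    if gpu_memory_gb is not None and gpu_memory_gb >= sd3_min_gb:
        return SD3_MEDIUM
    return SD_V1_5
```

For example, `choose_image_model(12)` falls back to stable-diffusion-v1-5 under the assumed threshold, while `choose_image_model(24)` selects stable-diffusion-3-medium-diffusers.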

Applications and Implications: This project isn’t just a technical showcase; it’s a glimpse into the future of digital storytelling where AI assists in creative expression. It’s ideal for writers seeking inspiration, educators who want to bring stories into classrooms, or anyone curious about the intersection of AI and art. Additionally, it should serve as a source of inspiration and enjoyment for those learning about Generative AI.

Explore the project on GitHub: Story Writing with LLMs and Stable Diffusion
