SceneSynth
AI-Powered Hierarchical Scene Graph Generator for 2D Level Design
Transform natural language into explorable, nested world structures
Powered by Gemini + Nano Banana (gemini-2.5-flash-image)
What is SceneSynth?
SceneSynth is a desktop application that generates semantic scene graphs from text prompts using AI. Describe a world, and watch it materialize as an interactive node graph. Each node can be expanded into its own detailed sub-graph, enabling infinite hierarchical world-building—from continents down to individual rooms.
Then bring your scenes to life with Nano Banana (gemini-2.5-flash-image)—Google's native image generation model—which renders your semantic graphs into vivid scene illustrations.
NEWS: SceneSynth WEB is in the works, an easy-to-use website version for no-coders. The URL will be posted here soon (star the project and stay tuned for January 2026).
Key Features
- Natural Language to Scene Graph — Describe your world in plain text; AI generates a structured graph
- Recursive Drill-Down — Double-click any node to expand it into a detailed child graph
- Semantic Positioning — Nodes are placed meaningfully (sky elements at top, underground at bottom)
- Graph Modification via Chat — Refine your scene with instructions like "Add a tavern near the market"
- Nano Banana Rendering — Transform your graphs into stunning visuals with gemini-2.5-flash-image
- Multiple Art Styles — Fantasy, photorealistic, pixel art, watercolor, anime, and custom styles
- Full Project Persistence — Save/load entire hierarchies including rendered images
Nano Banana: AI Scene Rendering
SceneSynth leverages Nano Banana (gemini-2.5-flash-image), Google's native multimodal image generation model via Vertex AI, to render your semantic graphs as actual scene artwork.
[Images: example semantic scene graphs paired with their Nano Banana renders]
How it works:
- Your scene graph (nodes, relationships, spatial layout) is serialized to JSON
- A screenshot of your graph canvas is captured
- Both are sent to Nano Banana with your chosen art style
- The model generates a cohesive scene illustration respecting the semantic layout
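The steps above can be sketched in a few lines. This is a minimal illustration, not SceneSynth's actual code: the `Node` and `Edge` classes, the `build_render_request` helper, the payload keys, and the prompt wording are all hypothetical, and the screenshot bytes are a placeholder rather than a real PNG. The sketch stops at building the payload and does not call the image model.

```python
import base64
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Node:
    # Hypothetical node shape: an id, a label, and canvas coordinates.
    id: str
    name: str
    x: float
    y: float


@dataclass
class Edge:
    # Hypothetical relationship between two node ids.
    source: str
    target: str
    relation: str


def build_render_request(
    nodes: List[Node], edges: List[Edge], screenshot_png: bytes, style: str
) -> dict:
    """Bundle the graph JSON, the canvas screenshot, and the art style."""
    graph_json = json.dumps(
        {"nodes": [asdict(n) for n in nodes], "edges": [asdict(e) for e in edges]},
        sort_keys=True,
    )
    prompt = (
        f"Render this scene graph as one illustration in the style: {style}. "
        "Respect the spatial layout shown in the attached screenshot.\n"
        f"Scene graph JSON:\n{graph_json}"
    )
    return {
        "prompt": prompt,
        "image_b64": base64.b64encode(screenshot_png).decode("ascii"),
        "mime_type": "image/png",
    }


nodes = [Node("n1", "Market", 0.5, 0.5), Node("n2", "Tavern", 0.6, 0.5)]
edges = [Edge("n1", "n2", "adjacent_to")]
screenshot = b"\x89PNG placeholder"  # stand-in for real screenshot bytes
request = build_render_request(nodes, edges, screenshot, "Pixel Art (16-bit)")
```

Sending both the JSON and the screenshot gives the model the exact relationships as text and the spatial arrangement as an image, which is presumably why the pipeline includes both.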
Available Styles:
- Top-down 2D Game Art
- Fantasy Illustration
- Photorealistic
- Pixel Art (16-bit)
- Watercolor
- Anime/Manga
- Dark Fantasy
- Isometric
- Custom prompts supported
Rendered images are stored with each graph and persist across save/load cycles.
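One way a rendered image could travel with its graph through a save/load cycle is to embed the image bytes as base64 inside the saved JSON. The sketch below assumes that approach; the file layout, the `save_graph` and `load_graph` helpers, and the key names are hypothetical and may differ from SceneSynth's real project format.

```python
import base64
import json
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def save_graph(path: Path, graph: dict, rendered_png: Optional[bytes]) -> None:
    """Write the graph and its optional rendered image to one JSON file."""
    encoded = None
    if rendered_png is not None:
        encoded = base64.b64encode(rendered_png).decode("ascii")
    record = {"graph": graph, "rendered_image_b64": encoded}
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")


def load_graph(path: Path) -> Tuple[dict, Optional[bytes]]:
    """Read the graph back and decode the image if one was stored."""
    record = json.loads(path.read_text(encoding="utf-8"))
    encoded = record["rendered_image_b64"]
    image = base64.b64decode(encoded) if encoded is not None else None
    return record["graph"], image


# Round trip through a temporary directory with placeholder image bytes.
with tempfile.TemporaryDirectory() as tmp:
    project_file = Path(tmp) / "kingdom.json"
    save_graph(project_file, {"nodes": ["Market", "Tavern"]}, b"fake image bytes")
    restored_graph, restored_image = load_graph(project_file)
```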
Prerequisites
Important: SceneSynth requires Google Cloud services for AI functionality.
| Requirement | Purpose |
|---|---|
| Google Cloud Account | Access to Vertex AI services |
| Vertex AI API | Must be enabled in your GCP project |
| Gemini API Key | For graph generation |
| GCP Project ID | For image rendering via Vertex AI |
| gcloud CLI | For authentication |
Authentication Setup
# Install gcloud CLI if needed, then authenticate:
gcloud auth application-default login
Installation
# Clone the repository
git clone https://github.com/yourusername/SceneSynth.git
cd SceneSynth
# Install dependencies
pip install -r requirements.txt
# Run the application
python main.py
Dependencies
- Python 3.10+
- PyQt6
- google-genai
- networkx
- pydantic
Configuration
- Launch SceneSynth
- Go to Edit → API Settings
- Enter your Gemini API Key (for graph generation)
- Enter your GCP Project ID (for image rendering)
- Click Save
Quick Start
- Generate a World — Enter a prompt like "A medieval fantasy kingdom with diverse regions" and click Generate
- Explore the Graph — Pan, zoom, and select nodes to view their details
- Drill Down — Double-click any node to generate a detailed sub-graph
- Modify with Language — Use the prompt panel to add, remove, or change elements
- Render Scenes — Click "Render Scene" to generate an AI visualization
- Save Your Work — File → Save Project to preserve your entire hierarchy
How It Works
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Text Prompt   │ ──▶ │   Gemini LLM    │ ──▶ │   Scene Graph   │
│  "A forest..."  │     │  (Structured)   │     │  (Nodes/Edges)  │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                                         │
                                                         ▼
                                                ┌─────────────────┐
                                                │   Interactive   │
                                                │  Canvas (PyQt)  │
                                                └─────────────────┘
SceneSynth uses structured output from Gemini to generate semantically meaningful graphs. Each node contains:
- Name and description
- Semantic type (location, landmark, character, etc.)
- Spatial coordinates for meaningful layout
- Expandability flag for drill-down capability
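To make the node fields concrete, here is a rough sketch of how such a node might be modelled and parsed from a structured JSON response. The field names, the 0 to 1 coordinate range, and the sample response are assumptions made for illustration. The project lists pydantic as a dependency, so its real schema is likely a pydantic model; a standard-library dataclass is used here only to keep the example self-contained.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class SceneNode:
    name: str
    description: str
    semantic_type: str  # e.g. "location", "landmark", "character"
    x: float            # assumed normalized: 0.0 = left, 1.0 = right
    y: float            # assumed normalized: 0.0 = top, 1.0 = bottom
    expandable: bool = False

    def __post_init__(self) -> None:
        # Reject coordinates outside the assumed normalized range.
        for axis, value in (("x", self.x), ("y", self.y)):
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{axis}={value} is outside the assumed 0..1 range")


def parse_nodes(raw_json: str) -> List[SceneNode]:
    """Turn a JSON document with a "nodes" array into SceneNode objects."""
    payload = json.loads(raw_json)
    return [SceneNode(**item) for item in payload["nodes"]]


# Invented sample response; a real model reply would differ.
sample_response = """
{"nodes": [
  {"name": "Crystal Cave", "description": "A glittering underground cavern",
   "semantic_type": "location", "x": 0.4, "y": 0.9, "expandable": true},
  {"name": "Floating Island", "description": "An island drifting in the sky",
   "semantic_type": "landmark", "x": 0.5, "y": 0.1, "expandable": true},
  {"name": "Old Hermit", "description": "Lives near the cave mouth",
   "semantic_type": "character", "x": 0.45, "y": 0.85, "expandable": false}
]}
"""

nodes = parse_nodes(sample_response)
# Sorting by y lists sky elements first and underground elements last.
top_to_bottom = [n.name for n in sorted(nodes, key=lambda n: n.y)]
```

With y growing downward, as it does in Qt scene coordinates, the sky landmark sorts before the underground cave, matching the semantic layout described in the feature list.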
The hierarchy manager maintains parent-child relationships between graphs, enabling seamless navigation through nested worlds.
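The parent-child bookkeeping can be illustrated with a small sketch. This is not the project's hierarchy manager: the `GraphRecord` and `HierarchyManager` classes and their method names are hypothetical, and generating the child graph's contents (the LLM call) is left out so that only the navigation logic is shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GraphRecord:
    graph_id: str
    title: str
    parent_id: Optional[str] = None
    # Maps a node name in this graph to the id of its expanded child graph.
    children: Dict[str, str] = field(default_factory=dict)


class HierarchyManager:
    def __init__(self, root_title: str) -> None:
        self._graphs: Dict[str, GraphRecord] = {}
        self._counter = 0
        self.root_id = self._new_graph(root_title, None)
        self.current_id = self.root_id

    def _new_graph(self, title: str, parent_id: Optional[str]) -> str:
        graph_id = f"g{self._counter}"
        self._counter += 1
        self._graphs[graph_id] = GraphRecord(graph_id, title, parent_id)
        return graph_id

    def drill_down(self, node_name: str) -> str:
        """Enter the child graph for a node, creating it on first visit."""
        current = self._graphs[self.current_id]
        child_id = current.children.get(node_name)
        if child_id is None:
            child_id = self._new_graph(node_name, current.graph_id)
            current.children[node_name] = child_id
        self.current_id = child_id
        return child_id

    def go_up(self) -> str:
        """Return to the parent graph; stay put when already at the root."""
        parent_id = self._graphs[self.current_id].parent_id
        if parent_id is not None:
            self.current_id = parent_id
        return self.current_id

    def breadcrumb(self) -> List[str]:
        """Titles from the root down to the current graph."""
        path: List[str] = []
        cursor: Optional[str] = self.current_id
        while cursor is not None:
            record = self._graphs[cursor]
            path.append(record.title)
            cursor = record.parent_id
        return list(reversed(path))


manager = HierarchyManager("Kingdom")
manager.drill_down("Capital City")
market_id = manager.drill_down("Market")
trail = manager.breadcrumb()
manager.go_up()
revisited_id = manager.drill_down("Market")  # reuses the existing child graph
```

Caching the child id per node means a second double-click on the same node reopens the existing sub-graph instead of generating a new one, which is one plausible way to keep navigation stable.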
Project Structure
SceneSynth/
├── main.py              # Entry point
├── config.py            # App configuration
├── core/
│   ├── graph/           # Scene graph data structures
│   ├── llm/             # AI providers and prompts
│   └── state/           # Application state management
└── ui/
    ├── main_window.py   # Main application window
    └── widgets/         # UI components
License
MIT License — Feel free to use, modify, and distribute.
Built with PyQt6 and Google Gemini








