Playgrounds
Playgrounds is your testing and experimentation environment for prompts. It provides a column-based workspace where you can compare multiple prompt versions side by side, test different models, and collaborate with your team on prompt development. It solves a critical challenge in AI development: knowing which prompt version and model combination actually works best.
Accessing Playgrounds
To access Playgrounds:
- Click on Playgrounds in the navigation
- You'll see all your previously saved playgrounds
Using Playgrounds
When you open Playgrounds, you'll see a column-based interface. Each column represents a different prompt or prompt version that you can test and compare simultaneously.
You can either create a new prompt from scratch or load an existing prompt as a starting point.

Loading Prompts
When you select a prompt to load into a playground, you'll choose which version you want to work with. This version selection makes it easy to test older versions and compare them against new iterations.
You can also have several columns with the same prompt and version in the same playground.
Changing Models and Settings
Click the model selector at the top of any column to change which model that prompt uses. This change applies only within the playground; it won't update your underlying saved prompt, so you can quickly test whether a different model responds better to your specific use case without committing to the change. You can also click the gear icon to test different model settings, such as temperature.
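To make the settings concrete, here is a minimal sketch of what changing temperature means at the model-call level. It uses the OpenAI Python SDK purely as a stand-in for whichever provider you're testing; the model name and prompt are placeholder assumptions, and the playground makes the equivalent call for you behind the UI.

```python
# Illustrative only: the playground handles this in the UI. The OpenAI SDK,
# model name, and prompt below are assumptions, not Playgrounds internals.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment
prompt = "Write a one-sentence product description for a solar lantern."

for temperature in (0.2, 1.0):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,  # lower = more deterministic, higher = more varied
    )
    print(f"temperature={temperature} -> {response.choices[0].message.content}")
```

Running the same prompt at a low and a high temperature like this is a quick way to judge how much variability your use case can tolerate.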
Why Test Different Models?
Different models have different strengths:
- Some excel at creative tasks
- Others are better at following strict instructions
- Newer models might handle complex reasoning better
- Smaller models might be faster and more cost-effective for simple tasks
By testing the same prompt across multiple models in a playground, you can find the optimal balance of quality, speed, and cost for your specific needs.
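As a rough sketch of what a side-by-side comparison is measuring, the snippet below runs one prompt against two models and prints latency and token usage. The OpenAI Python SDK and the model names are assumptions chosen for illustration, not what Playgrounds runs internally.

```python
# Conceptual sketch of a model comparison; SDK and model names are assumptions.
import time
from openai import OpenAI

client = OpenAI()
prompt = "Summarize the key risks of shipping an untested prompt change."

for model in ("gpt-4o", "gpt-4o-mini"):  # placeholder model names
    start = time.perf_counter()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    elapsed = time.perf_counter() - start
    print(f"{model}: {elapsed:.1f}s, {response.usage.total_tokens} tokens")
    print(response.choices[0].message.content[:200])
    print()
```

In the playground you get the same comparison by putting each model in its own column, with the outputs rendered next to each other instead of printed.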
Sharing and Saving
When you save a playground, it preserves your entire workspace: not just the prompt settings, but also the actual run results. This creates a complete snapshot of your testing session that you can return to later, picking up exactly where you left off.
When you share a saved playground, your team sees the actual outputs alongside the prompts, so they know exactly what you tested and what results you got, making feedback concrete and actionable.
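To make the "complete snapshot" idea concrete, here is a purely hypothetical sketch of the kind of information a saved playground preserves per column; every field name is invented for illustration and is not the product's actual schema.

```python
# Hypothetical structure of a saved playground; all field names are illustrative.
saved_playground = {
    "name": "Checkout copy experiments",
    "columns": [
        {
            "prompt": "product-description",   # which saved prompt the column loads
            "version": 3,                      # the specific version being tested
            "model": "gpt-4o",                 # playground-only model override
            "settings": {"temperature": 0.7},  # playground-only setting overrides
            "last_run_output": "...",          # the actual generated result is kept too
        },
        # one entry per column in the workspace
    ],
}
```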
When to Use Playgrounds
Testing prompt iterations against a baseline
Load your current production prompt alongside 2-3 new variations to see if your changes actually improve outputs. Having the old version right there prevents you from accidentally making things worse.
Comparing models when you're unsure which to use
Run the same prompt through GPT-4, Claude, and other models side by side. Sometimes a smaller, faster model works just as well as an expensive one for your specific use case.
Getting team approval before deployment
Instead of describing what changed, show stakeholders the actual outputs from different versions. Save the playground with results and share it for review: they can see exactly why version 3 works better than version 2.
Documenting why you reverted to an older version
When something goes wrong in production, load the problematic version alongside the stable one to document exactly what broke. This testing record helps prevent the same regression in the future.
Proposing different approaches for a new feature
Create a playground with completely different prompt strategies, run them all, and share it with your team to decide which direction to pursue. The saved results make the decision meeting much more productive.