How to Keep and Summarize A Dream Diary with the GPT-3 Machine Learning Language Model from OpenAI

UPDATE: This post was written in August of 2022, before the launch of ChatGPT and subsequent entry of LLMs into popular consciousness. The boilerplate code I wrote relies upon older GPT-3 models, but can be easily adjusted to use whatever is the most advanced OpenAI LLM at the moment.

Do-it-yourself:

Google Sheets template: https://docs.google.com/spreadsheets/d/1I951dcIb423RpkqZekGnomzsrznYi1_W5aCvWzsH7GA/

Python script: https://github.com/liamtrotzuk/gpt3-dream-diary

As an experiment, I kept a simple dream diary in Google Sheets for about 3 months, recording about 35 dreams in this period. I then used a simple Python script to read the last 10 dreams and give them to the GPT-3 machine learning language model from OpenAI, asking the model to summarize the dreams and look for symbolic trends, as a way to easily keep running impartial tabs on my general subconscious mental state. I was compelled to do this quick project by A. a newfound desire to reflect on the strange scenarios my mind conjures when it sleeps, coupled with B. a disinterest in spending time myself trying to summarize the dreams (through either qualitative or quantitative means) due to the potential for personal bias, wedded with C. an unwillingness to let another individual summarize my dreams due to the private nature of what’s in them. I’d been playing with GPT-3 at the time, and it struck me that an impartial machine that had already proven itself very skilled at writing good summaries of highly abstract works of fiction (such as book reports) would be an excellent, discreet, and time-efficient tool for summarizing the strange non-sequiturs and odd symbolism of a human being’s dreams.
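The actual script lives at the GitHub link above; as a rough sketch of the shape of the "read the last 10 dreams" step, here is one hypothetical way to do it, assuming the sheet has been exported as a one-dream-per-row CSV (the real script may read the sheet differently):

```python
# Hypothetical sketch of the "read the last 10 dreams" step, assuming the
# Google Sheet is exported as a one-dream-per-row CSV. Not the author's
# actual script, which is linked above.
import csv
import io

def last_n_dreams(csv_text: str, n: int = 10) -> str:
    """Join the last n non-empty rows of a one-column CSV, oldest first."""
    rows = [row[0] for row in csv.reader(io.StringIO(csv_text))
            if row and row[0].strip()]
    return "\n\n".join(rows[-n:])

# Example: 12 recorded dreams -> keep only the most recent 10
sheet = "\n".join(f"Dream {i}" for i in range(1, 13))
STR_dreams_last_10 = last_n_dreams(sheet)
```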


A random sample of 3 out of the 35 dreams that I recorded:

A. “You were watching planes taking off and landing from an airport in the Bronx, its runway fringed with trees. Prior to that, you’d been riding a subway line that did a loop around a park in the Bronx that felt a lot like Van Cortlandt. While you watched the planes, however, there was an aerial threat. Witches, on broomsticks, who could assault us from above with some kind of flaming bombs. It was a constant, enduring threat, but not enough to make us stop plane-watching.”

B. “A hardware store opened beneath the apartment, and you were very excited.”

C. “You were taking the subway. You were above-ground somewhere in Brooklyn, trying to get further into the borough, away from Manhattan. You entered a vast, cavernous station, a chamber that sloped down, with a complex series of openings in the top that allowed in the sun. A pool/fountain cascaded its way down the length of the station. You were alone, though you knew that the bottom led to a bigger station with more people. You wanted to add it to your ‘Top 3’ favorite subway station list.”


All 3 of these dreams are among the last 10 dreams that I’ve remembered, so they were among the 10 that were read by the script. I fed these 3 dreams plus 7 more into the Davinci GPT-3 model (text-davinci-002), the most powerful of OpenAI’s models at the time (for commercial applications, users might choose a faster model at the expense of power). I set the ‘temperature’ — a measure of the ‘risk’ the model is willing to take — to 0.9, quite high, which yielded the most interesting and non-literal results. You can read more about how OpenAI categorizes ‘riskiness’ in AI responses in their documentation.

After experimenting with various prompts, you can quickly gain a general sense of where the machine demonstrates consistency and cohesiveness in its textual analysis. I ultimately settled on asking 3 generally useful questions, from most-specific to least-specific, that would hopefully yield a useful high-level summary of how my dreams had been thematically trending:


1. openai.Completion.create(
       model="text-davinci-002",
       prompt="Is the following list of dreams predominately positive or predominately negative or predominately neutral? How many are positive? How many are negative? How many are neutral?" + STR_dreams_last_10,
       max_tokens=2000,
       temperature=0.9)

2. openai.Completion.create(
       model="text-davinci-002",
       prompt="What does the following list of dreams say about the dreamer's mental state?" + STR_dreams_last_10,
       max_tokens=2000,
       temperature=0.9)

3. openai.Completion.create(
       model="text-davinci-002",
       prompt="What is the overarching theme of the following list of dreams?" + STR_dreams_last_10,
       max_tokens=2000,
       temperature=0.9)
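As the update at the top notes, the legacy Completion endpoint these calls use has since been superseded; a hedged sketch of how the same three prompts might be ported to the newer Chat Completions interface ("gpt-4o-mini" is only a placeholder model name, not something the original script uses):

```python
# Hedged sketch of porting the calls above to the newer Chat Completions
# interface. "gpt-4o-mini" is a placeholder; substitute whatever model is
# current. The three prompts themselves are unchanged from the script.

PROMPTS = [
    ("Is the following list of dreams predominately positive or predominately "
     "negative or predominately neutral? How many are positive? How many are "
     "negative? How many are neutral?"),
    "What does the following list of dreams say about the dreamer's mental state?",
    "What is the overarching theme of the following list of dreams?",
]

def build_chat_request(prompt: str, dreams: str,
                       model: str = "gpt-4o-mini") -> dict:
    """Assemble keyword arguments for client.chat.completions.create(**kwargs)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt + "\n\n" + dreams}],
        "temperature": 0.9,  # same 'risk' setting the original calls use
        "max_tokens": 2000,
    }

# Usage (requires the `openai` package and OPENAI_API_KEY in the environment):
# from openai import OpenAI
# client = OpenAI()
# for p in PROMPTS:
#     resp = client.chat.completions.create(**build_chat_request(p, STR_dreams_last_10))
#     print(resp.choices[0].message.content)
```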


The 1st time I ran these prompts, the model returned the following for each:

1. "text": "\n\nThere are 5 positive dreams, 2 negative dreams, and 3 neutral dreams."

2. "text": "\n\nThe dreamer may be experiencing anxiety about an upcoming event."

3. "text": "\n\nThe overarching theme of the following dreams is escape."
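For reference, the generated summary sits in the first choice's "text" field of each legacy Completion response; a sketch of how the third response above might be unpacked (using a plain dict as a stand-in for the response object):

```python
# Unpacking a legacy Completion response: the summary lives in the first
# choice's "text" field. A plain dict stands in for the real response object,
# using the third response above as the example value.
response = {
    "choices": [
        {"text": "\n\nThe overarching theme of the following dreams is escape."}
    ]
}
summary = response["choices"][0]["text"].strip()
print(summary)  # → The overarching theme of the following dreams is escape.
```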


All 3 responses are interesting, and at least 1 + 2 are largely accurate. The 1st prompt — ‘predominately positive or predominately negative?’ — is just rough sentiment analysis, and the easiest for which to gauge accuracy. After tallying up each dream with my own judgments of relative positivity or negativity, I concurred with the GPT-3 count of 5 positive (one of which was the hardware store dream), 2 negative (one of which was the plane-watching, witch-attack dream), and 3 neutral (one of which was the subway station dream). That’s impressive, despite the potential for bias in my own sentiment analysis of my own dreams (a truly impartial analysis would have asked others to rank my dreams in order to test against GPT-3’s claims, but I didn’t bother doing so), and we can generally count this 1 out of the 3 prompts as useful and accurate so far.

The script’s response to the 2nd prompt — anxiety about an upcoming event — is more qualitative, but the machine’s answer feels accurate for this prompt as well. I’m fairly easy-going, and am often not particularly concerned about future events, but there are several looming events in my life that I’m definitely apprehensive about at present — much more so than usual — so GPT-3 is making a verifiable claim that generally aligns with specific realities in my life, different from the status quo. Again, an experiment hewing to more empirical methods would probably have asked a sample set of outside observers of my life to rank the accuracy of the machine’s statement given their knowledge of my actual life and mental state, but a simple self-administered sanity check should suffice for the purposes of this little experiment.

The machine’s response to the 3rd prompt is more akin to a newspaper-column horoscope designed to mean all things to all people — GPT-3 could have likely sent back almost any nebulous concept, from the actual ‘escape’ to ‘courage’ to ‘warmth’ or any generalized terms of that nature, and I’d likely have cherry-picked the necessary events and thoughts in my life to support that response. ‘Escape’ does feel like an important theme in my life, but when does it not? Doesn’t everyone seek an escape? Nonetheless, I enjoy a silly verbal Rorschach test as much as the next person, so I decided to keep that 3rd prompt in there as a fun reminder that a machine trained on human data can be as dumb and vague as humans are.

And I’ll plan to keep using this easy-to-run script to get a nifty high-level summary of my dreams in the future. While I’m still a foolish novice in the craft of experimenting with AI language models, this exercise was a useful excuse to set up programmatic access to GPT-3 and play around with it, however trivially.