# Image captions

When training Stable Diffusion models, captions can help the model better understand the concepts and styles in your training images.

## Using captions

Take the following steps to provide captions.

1. Open a text editor and create a new file, for example `captions.json`. Any name works, as long as it is a JSON file.
2. For each image file in your training dataset, write a caption in the JSON file. Each individual caption can be up to 77 tokens long (around 300 characters). Use the filename **including the file extension** but without the folder or path:

{% code title="captions.json" overflow="wrap" %}

```json
{
  "myimage0001.jpeg": "ukj soda bottle on a wooden table in warm afternoon lighting, pine trees in the background",
  "myimage0002.jpeg": "two ukj soda bottles next to a tropical waterfall",
  ...
}
```

{% endcode %}

You can also provide multiple captions per image by providing a list. In that case we will randomly sample them during training:

```json
{
  ...
  "myimage0002.jpeg": [
    "two ukj soda bottles next to a tropical waterfall",
    "two ukj soda bottles by a waterfall in the jungle"
  ],
  ...
}
```

3. Go to <https://dreamlook.ai/dreambooth> and enable "Expert mode" at the top of the page.
4. Upload the JSON file by clicking on "Image captions" under "Advanced settings".
5. Configure the other parameters for your job and start it. That's it!
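
If you would rather generate the file than write it by hand, the following is a minimal Python sketch. The helper name `build_captions` and the `dataset/` folder are assumptions made for illustration; only the resulting JSON layout comes from the steps above.

```python
import json
from pathlib import Path


def build_captions(captions_by_path):
    """Key each caption by the bare filename: extension kept, folders dropped."""
    return {Path(path).name: caption for path, caption in captions_by_path.items()}


# Hypothetical example data. A value may be one string or a list of strings.
captions = build_captions({
    "dataset/myimage0001.jpeg": "ukj soda bottle on a wooden table in warm afternoon lighting, pine trees in the background",
    "dataset/myimage0002.jpeg": [
        "two ukj soda bottles next to a tropical waterfall",
        "two ukj soda bottles by a waterfall in the jungle",
    ],
})

# json.dumps always emits syntactically valid JSON, which avoids hand-editing mistakes.
captions_json = json.dumps(captions, indent=2, ensure_ascii=False)
print(captions_json)
```

To save the result, write the string to disk, for example with `Path("captions.json").write_text(captions_json, encoding="utf-8")`.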

{% hint style="info" %}
You do not have to provide captions for *all* images. Any training image without a caption falls back to `instance_prompt`.
{% endhint %}
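
To make the caption rules concrete, here is an illustrative Python sketch of how a caption could be chosen for one image: a single string is used as-is, a list is sampled at random, and a missing entry falls back to `instance_prompt`. The function `pick_caption` is hypothetical and is not the service's actual implementation.

```python
import random


def pick_caption(captions, filename, instance_prompt, rng=random):
    """Return the caption to train on for one image (illustrative only)."""
    entry = captions.get(filename)
    if entry is None:
        return instance_prompt    # no caption provided: fall back
    if isinstance(entry, list):
        return rng.choice(entry)  # several captions: sample one (assumes a non-empty list)
    return entry                  # single caption: use it as-is


captions = {
    "myimage0001.jpeg": "ukj soda bottle on a wooden table",
    "myimage0002.jpeg": [
        "two ukj soda bottles next to a tropical waterfall",
        "two ukj soda bottles by a waterfall in the jungle",
    ],
}
print(pick_caption(captions, "myimage0003.jpeg", "photo of ukj soda bottle"))
```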

{% hint style="info" %}
When using the API, the filename is the last part of the image URL path, excluding any query parameters.

For example:

* URL: `https://myserver/images/image001.jpeg?token=x91dj1kjh41bxlj1`
* Filename to use in JSON file: `image001.jpeg`
  {% endhint %}
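
The rule above can be sketched in Python with the standard library. The helper name `filename_from_url` is an assumption, and the sketch does not decode percent-encoded characters, since this page does not say how those are treated.

```python
from posixpath import basename
from urllib.parse import urlsplit


def filename_from_url(url):
    """Return the last path segment of the URL, ignoring query string and fragment."""
    return basename(urlsplit(url).path)


print(filename_from_url("https://myserver/images/image001.jpeg?token=x91dj1kjh41bxlj1"))
# prints: image001.jpeg
```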

## Using individual `.txt` files

Some captioning tools use individual `.txt` files for captions:

<figure><img src="/files/LEkeZ7LXJBV4j1nM5p2z" alt=""><figcaption></figcaption></figure>

We provide a Colab notebook to convert captions in this format into our JSON format:

**🔗** [**https://colab.research.google.com/drive/13s9cMduESF4Wzv8tVcajQPLjrdoH5hH3**](https://colab.research.google.com/drive/13s9cMduESF4Wzv8tVcajQPLjrdoH5hH3#scrollTo=jmHafIiwa9F6)

Simply follow the instructions in the notebook to create the JSON file, then upload it to [dreamlook.ai](https://dreamlook.ai/) as described above when configuring your job.
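
If you prefer to run the conversion locally, a rough Python equivalent might look like the sketch below. It is not the code from the notebook. It assumes each caption lives in a `.txt` file next to the image with the same base name (for example `image001.jpeg` and `image001.txt`), and the set of image extensions is a guess you may need to adjust.

```python
import json
from pathlib import Path

# Assumption: extend or shrink this set to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def captions_from_txt(folder):
    """Collect the caption of every image that has a same-named .txt file beside it."""
    captions = {}
    for image in sorted(Path(folder).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        txt = image.with_suffix(".txt")
        if txt.is_file():
            captions[image.name] = txt.read_text(encoding="utf-8").strip()
    return captions


# Usage (hypothetical folder name):
# captions = captions_from_txt("my_dataset")
# Path("captions.json").write_text(json.dumps(captions, indent=2), encoding="utf-8")
```

Images without a matching `.txt` file are simply left out, so they fall back to `instance_prompt` during training.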

## Failure cases

The job may fail under the following conditions:

* The file could not be parsed as JSON.
* The file contains captions in an invalid format (see above).
* None of the provided captions could be matched to an image filename.

If this happens, the tokens used for this job are immediately returned.
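
You can catch these problems before uploading with a local pre-check. The Python sketch below mirrors the three conditions listed above; the function `check_captions` is hypothetical, and the service's own validation may differ in its details.

```python
import json


def check_captions(text, image_filenames):
    """Return a list of problems found; an empty list means the pre-check passed."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top level must be an object mapping filenames to captions"]

    problems = []
    for name, value in data.items():
        is_str = isinstance(value, str)
        is_str_list = isinstance(value, list) and all(isinstance(v, str) for v in value)
        if not (is_str or is_str_list):
            problems.append(f"{name}: caption must be a string or a list of strings")

    # At least one key has to match an image filename.
    if not set(data) & set(image_filenames):
        problems.append("no caption key matches any image filename")
    return problems


print(check_captions('{"myimage0001.jpeg": "ukj soda bottle"}', ["myimage0001.jpeg"]))
```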

## How to best write captions?

This is an active field of research, and best practices will likely evolve over time.

Since writing captions manually can be quite tedious, a common practice is to use AI models such as GPT-4V or BLIP-2 to write captions automatically.

**Don't hesitate to ask on** [**our Discord server**](https://discord.gg/yX9D9KxHMS) **if you are looking for more guidance!**

