Prompting

Prompt Overview

An Isomeric prompt is the JSON schema passed in the schema key of your request POST body. Each schema strictly follows the JSON Schema spec. A valid Isomeric POST request body looks like this:

{
  "text": [YOUR INPUT TEXT],
  "schema": {
    "type": "object",
    "properties": {
      "title": {
        "description": "The title of the main product described in the text",
        "type": "string"
      }
    }
  }
}

Prompt Design

Designing the right prompt is critical to extracting the right data. Our LLM processes your prompt from top to bottom, and each key/value is its own pass through the model, including any previously generated data. This means that generating a title before generating a price will get you a better result because the price being generated will be inferred given the context of the title. For example:

{
  "type": "object",
  "properties": {
    "price": {
      "description": "Dollar and cents representation of the price of the main product in the provided text",
      "type": "number"
    }
  }
}

This prompt requests the price. Let’s say you are getting the price of an item on Amazon. Typically there are many products (primary, related, frequently bought with) on a single page, and this prompt does not clearly indicate which price you would like to extract, so there is a high likelihood that the LLM will not get it right. However, this prompt:

{
  "type": "object",
  "properties": {
    "title": {
      "type": "string"
    },
    "price": {
      "type": "string"
    }
  }
}

will run two full passes through the model. First to generate the title and second to generate the price. When generating the price output the prompt sent to the model will actually be:

{
  "title": "LEGO Architecture London Model 2201",
  "price": "

The model will continue generating from this point; therefore, the price will be strongly tied to the title. The best approach is to play around with your prompts with multiple sources (i.e., website URLs) until you get the results you’re looking for. Another approach to take for improved results is nesting. For example, if you are looking for recipe data, it’s best to make that your top-level object. Like this:

{
  "type": "object",
  "properties": {
    "recipe": {
      "type": "object",
      "properties": {
        "title": {
          "type": "string"
        },
        "ingredients": {
          "type": "array"
          "items": {
            "type": "object"
            "properties": {
              "ingredient": {
                "type": "string",
              },
              "measure": {
                "type": "string",
              }
              "amount": {
                "type": "number",
              }
            }
          }
        },
        "steps": {
          "type": "array",
          "items": {
            "type": "string"
          }
        }
      }
    }
  }
}

Descriptions

The description value is the most powerful value to consider when designing your prompt. This informs the LLM what data specifically you’re looking for. Use plain english when describing the data definition. For example:

{
  "type": "object"
  "properties": {
    "shipping": {
      "description": "The total calculated shipping costs of all items in the user's shopping cart represented in dollars and cents."
      "type": "number"
    }
  }
}

Types

Every key in a prompt needs a valid type. This will not only constrain the model to generate that type, it will also instruct the model on how to generate responses based on the type.

String

When you want a single-string response, use string. This is good for things like titles, descriptions, summaries, etc.

{
  "first_name": {
    "type": "string"
  }
}

Will produce

{
  "first_name": "Kelley"
}

Number

The number type will constrain the model only to generate a number as a result. Valid number examples are: 2, 22, 22.01, and 22.202. The resulting JSON will be a number therefore not wrapped in quotes.

{
  "amunt": {
    "type": "number"
  }
}

Will produce

{
  "amount": 22
}

Boolean

boolean types will return only true or false. This type is great for keys used to determine something like page type. For example:

{
  "is_product_detail_page": {
    "type": "boolean"
  }
}

Will produce

{
  "is_product_detail_page": true
}

Object

Isomeric objects are powerful prompt types that contain children types. These are important in both generating the output you desire as well as in designing prompts that will yield the best results. Every object must have a properties key describing each property of the object. For example:

{
  "contact_details": {
    "type": "object",
    "properties": {
      "first_name": {
        "type": "string"
      },
      "last_name": {
        "type": "string"
      },
      "email": {
        "type": "string"
      },
      "phone": {
        "type": "string"
      }
    }
  }
}

Will produce

{
  "contact_details": {
    "first_name": "Kelley",
    "last_name": "Morris",
    "email": "kmorris92@bayside-high.edu"
    "phone": "310-555-5252"
  }
}

Array

Isomeric can produce lists of content. Every array type must have an items attribute describing each item of the array. Let’s use reviews as an example.

{
  "reviews": {
    "type": "array",
    "items": {
      "type": "object"
      "properties": {
        "date": {
          "type": "string"
        },
        "author": {
          "type": "string"
        },
        "review": {
          "type": "string"
        },
        "sentiment": {
          "type": "string"
        }
      }
    }
  }
}

Will produce

{
  "reviews": [{
    "date": "12-16-1986",
    "author": "Zack Morris",
    "reviews": "Bayside High is far out.",
    "sentiment": "positive",
  }, {
    "date": "12-15-1986",
    "author": "Kelley Kapowski",
    "reviews": "High school at Bayside was a dream come true.",
    "sentiment": "positive",
  }]
}

Max Items

You can add max_items to control how many items in the list you want generated. This can be specified like so:

{
  "similar_products": {
    "type": "array",
    "max_items": 4,
    "items": {...}
  }
}

Attributes

Stop

As mentioned, Isomeric is an LLM trained on billions of tokens of website data. For every key/value pair, the model runs a full pass from start to finish. Just like working with other LLMs, you need to tell it when to stop generating new tokens. For every type specified above, we have added a default stop sequence. You can override these by specifying your own via the stop key like so:

{
  "email_inbox": {
    "type": "string",
    "stop": "@"
  }
}

Max Tokens

You can control how many tokens are generated per key with the max_tokens attribute like so:

{
  "title": {
    "type": "string",
    "max_tokens": 10
  }
}

Pattern

Pattern is a powerful tool to use regular expressions giving you fine-grained control of the output of the model. Remember that because this is being passed in as json you need to escape your escape characters too. This is an example that will constrain the model to produce prices according to a strict format:

{
  "price": {
    "type": "string",
    "pattern": "\\d+\\.\\d{2}"
  }
}

Options

You can constrain the model to a list of options to choose from. For example, given the following prompt:

{
  "color": {
    "type": "string",
    "options": ["red", "green", "blue"]
  }
}

the model will only produce either red, green, or blue.

Nullable

strings, numbers, and booleans can be marked as nullable. This is helpful to prevent the model from hallucinating if a value does not exist on a website. In which case, a nullable type will return null instead of producing a hallucinated value because no value exists.

{
  "color": {
    "type": "string",
    "nullable": true
  }
}

API Documentation

Data Extraction

Prompt Overview

Prompt Design

Descriptions

Types

String

Number

Boolean

Object

Array

Max Items

Attributes

Stop

Max Tokens

Pattern

Options

Nullable

API Documentation

Data Extraction

​Prompt Overview

​Prompt Design

​Descriptions

​Types

​String

​Number

​Boolean

​Object

​Array

​Max Items

​Attributes

​Stop

​Max Tokens

​Pattern

​Options

​Nullable

Prompt Overview

Prompt Design

Descriptions

Types

String

Number

Boolean

Object

Array

Max Items

Attributes

Stop

Max Tokens

Pattern

Options

Nullable