llm_first Aggregate Function

The llm_first function is used to extract the first matching result that satisfies a condition defined by a model's prompt and column data. It operates across rows, typically combined with a GROUP BY clause, to return the first relevant row for each group.

1. Usage Examples

1.1. Example without `GROUP BY`

Retrieve the first relevant product feature across all rows:

SELECT llm_first(
    {'model_name': 'gpt-4'},
    {'prompt': 'What is the most relevant detail for these products, based on their names and descriptions?'},
    {'product_name': product_name, 'product_description': product_description}
) AS first_product_feature
FROM products;

Description: This query returns the first relevant feature from all product descriptions and product names, based on the provided prompt.

1.2. Example with `GROUP BY`

Retrieve the first relevant product feature for each product category:

SELECT category,
       llm_first(
           {'model_name': 'gpt-4'},
           {'prompt': 'What is the most relevant detail for these products, based on their names and descriptions?'},
           {'product_name': product_name, 'product_description': product_description}
       ) AS first_product_feature
FROM products
GROUP BY category;

Description: The query groups the products by category and returns the first relevant feature for each group.

1.3. Using a Named Prompt with `GROUP BY`

Use a reusable prompt, such as "first-relevant-detail", to extract the first relevant feature for each product category:

SELECT category,
       llm_first(
           {'model_name': 'gpt-4', 'secret_name': 'product_key'},
           {'prompt_name': 'first-relevant-detail', 'version': 1},
           {'product_name': product_name, 'product_description': product_description}
       ) AS first_product_feature
FROM products
GROUP BY category;

Description: This example leverages a named prompt (first-relevant-detail) to extract the first relevant feature for each product category. The query groups the results by category.

1.4. Advanced Example with Multiple Columns and `GROUP BY`

Retrieve the first relevant feature for products grouped by category, using both the product name and description:

WITH product_info AS (
    SELECT category, product_name, product_description
    FROM products
    WHERE category = 'Electronics'
)
SELECT category,
       llm_first(
           {'model_name': 'gpt-4'},
           {'prompt': 'What is the most relevant detail for these products, based on their names and descriptions?'},
           {'product_name': product_name, 'product_description': product_description}
       ) AS first_product_feature
FROM product_info
GROUP BY category;

Description: This query extracts the first relevant feature from both the product_name and product_description columns, grouped by product category (in this case, electronics).

2. Input Parameters

2.1 Model Configuration

Parameter: model_name and secret_name

2.1.1 Model Selection

Description: Specifies the model used for text generation.
Example:
```
{ 'model_name': 'gpt-4' }
```

2.1.2 Model Selection with Secret

Description: Specifies the model along with the secret name to be used for authentication when accessing the model.

Example:

{ 'model_name': 'gpt-4', 'secret_name': 'your_secret_name' }

2.2. Prompt Configuration

Two types of prompts can be used:

Inline Prompt

Directly provides the prompt in the query.

Example:

{'prompt': 'What is the most relevant detail for these products, based on their names and descriptions?'}

Named Prompt
- Refers to a pre-configured prompt by name.
- Example:
```
{'prompt_name': 'first-relevant-detail'}
```
Named Prompt with Version
- Refers to a specific version of a pre-configured prompt.
- Example:
```
{'prompt_name': 'first-relevant-detail', 'version': 1}
```

2.3. Column Mappings (Optional)

Key: Column mappings.
Purpose: Maps table columns to prompt variables for input.

Example:

{'product_name': product_name, 'product_description': product_description}

3. Output

Type: JSON object.
Behavior: The function returns a JSON object containing the values of the columns you provided in the input. The structure of the returned JSON will mirror the input columns' values.

Output Example:
For a query that extracts the first relevant feature, the result could look like:

Input Rows:
- product_name: "Wireless Headphones"
- product_description: "High-quality wireless headphones with noise cancellation."

Output JSON:

{
  "product_name": "Wireless Headphones",
  "product_description": "High-quality wireless headphones with noise cancellation."
}

The output contains the values for product_name and product_description from the first relevant row based on the prompt.

1. Usage Examples​

1.1. Example without GROUP BY​

1.2. Example with GROUP BY​

1.3. Using a Named Prompt with GROUP BY​

1.4. Advanced Example with Multiple Columns and GROUP BY​

2. Input Parameters​

2.1 Model Configuration​

2.1.1 Model Selection​

2.1.2 Model Selection with Secret​

2.2. Prompt Configuration​

2.3. Column Mappings (Optional)​

3. Output​

1. Usage Examples

1.1. Example without `GROUP BY`

1.2. Example with `GROUP BY`

1.3. Using a Named Prompt with `GROUP BY`

1.4. Advanced Example with Multiple Columns and `GROUP BY`

2. Input Parameters

2.1 Model Configuration

2.1.1 Model Selection

2.1.2 Model Selection with Secret

2.2. Prompt Configuration

2.3. Column Mappings (Optional)

3. Output