
Aggregate / Reduce Functions

Aggregate / Reduce functions in FlockMTL operate on groups of rows and return a single result for each group. They are typically used with the GROUP BY clause in SQL queries to summarize, rank, or reorder data. Because they are backed by language models, they support tasks such as summarization, ranking, and relevance-based filtering directly inside a query.

1. Available Aggregate / Reduce Functions

FlockMTL offers four aggregate / reduce functions:

  1. llm_reduce: Aggregates a group of rows using a language model, typically for summarization or text consolidation.

    • Example Use Cases: Summarizing documents, aggregating product descriptions.
  2. llm_first: Returns the most relevant item from a group based on a prompt.

    • Example Use Cases: Selecting the top-ranked document, finding the most relevant product.
  3. llm_last: Returns the least relevant item from a group based on a prompt.

    • Example Use Cases: Finding the least relevant document, selecting the least important product.
  4. llm_rerank: Reorders a list of rows based on relevance to a prompt using a sliding window mechanism.

    • Example Use Cases: Reranking search results, adjusting document or product rankings.
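
The sliding window idea mentioned for llm_rerank can be sketched in plain Python. This is a minimal illustration of one common form of sliding-window reranking, not FlockMTL's implementation: the function names, the window size, the step, and the keyword-overlap ranker (a stand-in for a language-model call) are all assumptions made for the example.

```python
from typing import Callable, List, Sequence


def sliding_window_rerank(
    items: Sequence[str],
    rank_window: Callable[[List[str]], List[str]],
    window_size: int = 4,
    step: int = 2,
) -> List[str]:
    """Reorder items by ranking overlapping windows from the back to the front.

    rank_window is a hypothetical callback that returns its input window
    ordered from most to least relevant (in practice, a model call).
    """
    if window_size <= 0 or step <= 0 or step > window_size:
        raise ValueError("need 0 < step <= window_size")
    ranked = list(items)
    end = len(ranked)
    while end > 0:
        start = max(0, end - window_size)
        # Reorder only the current window; the overlap with the next window
        # lets the best items keep moving toward the front.
        ranked[start:end] = rank_window(ranked[start:end])
        if start == 0:
            break
        end -= step
    return ranked


def keyword_ranker(query: str) -> Callable[[List[str]], List[str]]:
    """Stand-in ranker: scores a text by how many query words it contains."""
    terms = set(query.lower().split())

    def rank(window: List[str]) -> List[str]:
        # sorted() is stable, so ties keep their current relative order.
        return sorted(window, key=lambda text: -len(terms & set(text.lower().split())))

    return rank


docs = [
    "red apple pie",
    "car engine repair",
    "apple orchard tour",
    "fresh apple pie recipe",
    "train schedule",
    "apple pie baking tips",
]
ranked = sliding_window_rerank(docs, keyword_ranker("apple pie recipe"))
```

With a consistent ranker, a single back-to-front pass reliably places the best `window_size - step` items at the front, but it does not guarantee a full sort of the remaining items. The benefit is that each ranking call only sees `window_size` items, which keeps every prompt small.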

2. How Aggregate / Reduce Functions Work

Aggregate / Reduce functions process the groups of rows defined by a GROUP BY clause. Each function passes a group's rows to a language model and produces a single result per group. That result can be a summary, a selected row, a ranking, or another output defined by the prompt.
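
The group-then-reduce flow described above can be sketched in Python as follows. This is only an illustration of the semantics under stated assumptions: `group_reduce` and `stub_summarize` are hypothetical names, the sample rows are made up for the example, and the stub simply concatenates text where a real reducer would call a language model.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def group_reduce(
    rows: List[dict],
    key: str,
    reducer: Callable[[List[dict]], str],
) -> Dict[str, str]:
    """Group rows by `key`, then reduce each group to a single value."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    # One reducer call per group, mirroring one result per GROUP BY group.
    return {group: reducer(members) for group, members in groups.items()}


def stub_summarize(members: List[dict]) -> str:
    """Hypothetical stand-in for a language-model summarization call."""
    return " | ".join(member["review"] for member in members)


# Illustrative rows only; not taken from any real dataset.
rows = [
    {"product": "kettle", "review": "boils fast"},
    {"product": "toaster", "review": "uneven browning"},
    {"product": "kettle", "review": "lid is loose"},
]
summaries = group_reduce(rows, "product", stub_summarize)
```

Here `summaries` holds exactly one entry per product, regardless of how many reviews each product has.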

3. When to Use Aggregate / Reduce Functions

  • Summarization: Use llm_reduce to consolidate multiple rows.
  • Ranking: Use llm_first or llm_last to select the most or least relevant row in a group, and llm_rerank to reorder rows by relevance.
  • Data Aggregation: Use these functions to process and summarize grouped data, especially for text-based tasks.