Aggregate / Reduce Functions
Aggregate / Reduce functions in FlockMTL operate on groups of rows and return a single result for each group. They are typically used with the GROUP BY clause in SQL queries to summarize, rank, or reorder data. Because they call a language model, they support tasks such as summarization, ranking, and relevance-based filtering.
1. Available Aggregate / Reduce Functions
FlockMTL offers several powerful aggregate / reduce functions:
- llm_reduce: Aggregates a group of rows using a language model, typically for summarization or text consolidation.
  - Example Use Cases: Summarizing documents, aggregating product descriptions.
- llm_first: Returns the most relevant item from a group based on a prompt.
  - Example Use Cases: Selecting the top-ranked document, finding the most relevant product.
- llm_last: Returns the least relevant item from a group based on a prompt.
  - Example Use Cases: Finding the least relevant document, selecting the least important product.
- llm_rerank: Reorders a list of rows based on relevance to a prompt using a sliding window mechanism.
  - Example Use Cases: Reranking search results, adjusting document or product rankings.
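The sliding window idea behind llm_rerank can be illustrated with a small Python sketch. This is a generic back-to-front sliding window pass, not FlockMTL's actual implementation, whose window size and traversal order are not described here. The names sliding_window_rerank and keyword_score are hypothetical, and keyword_score is a word-overlap stand-in for the relevance judgment a language model would make.

```python
from typing import Callable, List


def keyword_score(prompt: str, text: str) -> int:
    """Hypothetical stand-in for model relevance: count words shared with the prompt."""
    prompt_words = set(prompt.lower().split())
    return sum(1 for word in text.lower().split() if word in prompt_words)


def sliding_window_rerank(
    items: List[str],
    prompt: str,
    window: int = 3,
    step: int = 1,
    score: Callable[[str, str], int] = keyword_score,
) -> List[str]:
    """Rerank items by sorting overlapping windows, moving from the back to the front.

    Assumption: step < window, so consecutive windows overlap and relevant
    items can travel toward the front of the list.
    """
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")
    ranked = list(items)  # work on a copy, leave the caller's list untouched
    start = max(0, len(ranked) - window)
    while True:
        chunk = ranked[start:start + window]
        # Most relevant first; list.sort is stable, so ties keep their order.
        chunk.sort(key=lambda text: score(prompt, text), reverse=True)
        ranked[start:start + window] = chunk
        if start == 0:
            break
        start = max(0, start - step)
    return ranked


docs = [
    "Return policy for shoes",
    "Wireless mouse",
    "Noise cancelling earplugs",
    "Garden hose",
    "Wireless noise cancelling headphones",
]
ranked = sliding_window_rerank(docs, "wireless noise cancelling headphones")
print(ranked[:2])
```

A single pass like this only guarantees that the best window - step items reach the front; items further down may remain out of order. The benefit of windowing is that each model call sees only a few rows at a time, which keeps the input within the model's context limit.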
2. How Aggregate / Reduce Functions Work
Aggregate / Reduce functions process groups of rows defined by a GROUP BY clause. They apply language models to the grouped data, generating a single result per group. This result can be a summary, a ranking, or another output defined by the prompt.
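The group-then-reduce behaviour described above can be sketched outside the database. The Python below is a minimal illustration under stated assumptions, not FlockMTL code: fake_model is a hypothetical stand-in for a language model call, and reduce_by_group is a hypothetical helper that mimics a GROUP BY followed by one model call per group.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def fake_model(prompt: str, rows: List[str]) -> str:
    """Hypothetical stand-in for a language model call.

    A real model would read the prompt and rows and write a summary;
    this stub only joins the rows so the example runs offline.
    """
    return f"{prompt}: " + "; ".join(rows)


def reduce_by_group(
    rows: Iterable[Tuple[str, str]],
    prompt: str,
    model: Callable[[str, List[str]], str] = fake_model,
) -> Dict[str, str]:
    """Group (key, text) rows by key, then produce one result per group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key, text in rows:
        groups[key].append(text)
    # One model call per group, mirroring one output row per GROUP BY key.
    return {key: model(prompt, texts) for key, texts in groups.items()}


reviews = [
    ("laptop", "Fast and light"),
    ("phone", "Great camera"),
    ("laptop", "Battery drains quickly"),
]
summaries = reduce_by_group(reviews, "Summarize")
print(summaries)
```

Three input rows collapse into two results, one for each distinct key, which is the same shape a GROUP BY query with an aggregate function returns.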
3. When to Use Aggregate / Reduce Functions
- Summarization: Use llm_reduce to consolidate multiple rows.
- Ranking: Use llm_first or llm_last to select the most or least relevant row, or llm_rerank to reorder rows based on relevance.
- Data Aggregation: Use these functions to process and summarize grouped data, especially for text-based tasks.