
1. Overview of Search and Information Retrieval Challenges:
- Traditional search involves users formulating text queries to find relevant items.
- Defining relevance and building quality reference data for search remain difficult.
- Users may struggle to express precise queries, leading to the need for more natural, flexible search methods.
- AI is revolutionizing search by enabling relevance detection beyond text, including images and other data types.

2. Role of Large Language Models (LLMs) and Context Limits:
- LLMs have context-size limits (e.g., 2 million tokens) that are insufficient for large corporate datasets.
- LLMs also have limited fact recall; retrieval-augmented generation (RAG) supplements queries with external context, improving accuracy.
- Vector databases have emerged as specialized search engines enabling effective retrieval for AI-enhanced search.
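A minimal sketch of the RAG step above, with a toy word-overlap retriever and a prompt builder standing in for a real vector database and LLM call (all names here are illustrative):

```python
def retrieve(query, corpus, k=2):
    """Toy retriever: rank passages by word overlap with the query.
    A real system would query a vector database instead."""
    q = set(query.lower().split())
    ranked = sorted(corpus, key=lambda p: len(q & set(p.lower().split())), reverse=True)
    return ranked[:k]

def build_rag_prompt(query, passages):
    """Supplement the user query with retrieved context before calling the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

corpus = [
    "Error E15 means water is detected in the base pan.",
    "The rinse aid compartment is next to the detergent dispenser.",
    "Run the hottest cycle monthly to clean the filter.",
]
question = "What does error E15 mean?"
prompt = build_rag_prompt(question, retrieve(question, corpus))
```

The LLM then answers from the supplied context rather than from its (limited) parametric memory.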

3. Sparse Embeddings in Search:
- Earlier search used sparse embeddings like TF-IDF and BM25 based on word frequency and importance.
- Sparse embeddings remove stop words, apply normalization, and assign values only to relevant vocabulary dimensions.
- Recent learned sparse embedding models (e.g., SPLADE) can expand terms with synonyms, easing synonym handling without manual lists.
- Sparse embeddings excel at exact matches, such as identifiers, but struggle with semantic similarity and synonyms.
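A minimal sketch of a classic sparse representation (plain TF-IDF rather than BM25, for brevity): each document becomes a dictionary with non-zero values only for terms it actually contains, which is why exact tokens like error codes match so well.

```python
import math
from collections import Counter

def tfidf(docs):
    """Build sparse TF-IDF vectors: one dimension per vocabulary term,
    non-zero only for terms that occur in the document."""
    tokenized = [d.lower().split() for d in docs]
    df = Counter(t for toks in tokenized for t in set(toks))  # document frequency
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

docs = ["error code e15 water leak", "rinse aid refill", "water filter clean"]
vecs = tfidf(docs)
```

Note how a rare, exact token like `e15` gets a higher weight than `water`, which appears in two documents.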

4. Dense Embeddings and Neural Approaches:
- Dense embeddings are fixed-size vector representations produced by neural networks, capturing semantic meaning.
- These embeddings better handle synonymy, multiple languages, and varying terminology.
- Transformer models tokenize input text, create contextual embeddings, and pool to a single vector.
- Dense embeddings require chunking long documents, posing challenges for indexing and retrieval.
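The pooling step above can be sketched with NumPy: a bi-encoder typically averages the per-token contextual embeddings (skipping padding) into one fixed-size vector.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Pool per-token contextual embeddings into a single fixed-size vector,
    ignoring padding positions (a common final step of a dense embedding model)."""
    mask = attention_mask[:, None].astype(float)      # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)    # (dim,)
    return summed / mask.sum()

# Toy example: 4 token vectors of dimension 3; the last token is padding.
tokens = np.array([[1.0, 0.0, 2.0],
                   [3.0, 2.0, 0.0],
                   [2.0, 4.0, 1.0],
                   [9.0, 9.0, 9.0]])
mask = np.array([1, 1, 1, 0])
vec = mean_pool(tokens, mask)   # averages only the three real tokens
```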

5. Multi-Vector Embeddings – Emerging Hybrid Approach:
- Representing documents with multiple vectors for different aspects improves search quality.
- Multi-vector search compares query tokens’ vectors to document token vectors, enabling finer semantic matching.
- This approach is typically used for re-ranking top candidate results due to its higher computational cost.
- The speaker proposed removing the pooling layer of a model to obtain multi-vector (per-token) representations.
- ColPali and vision-language models (VLMs) extend this concept to multi-modal inputs, including images and diagrams.
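The token-to-token comparison above is the ColBERT-style "MaxSim" late-interaction score, sketched here with NumPy: each query token vector takes its best cosine similarity against all document token vectors, and those maxima are summed.

```python
import numpy as np

def maxsim(query_vecs, doc_vecs):
    """Late-interaction score: for each query token vector, take the maximum
    cosine similarity over all document token vectors, then sum the maxima."""
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = q @ d.T                     # (n_query_tokens, n_doc_tokens)
    return sims.max(axis=1).sum()

# Toy 2-D token vectors: doc_a covers both query "concepts", doc_b only one.
q = np.array([[1.0, 0.0], [0.0, 1.0]])
doc_a = np.array([[1.0, 0.1], [0.2, 1.0]])
doc_b = np.array([[1.0, 0.0], [1.0, 0.1]])
```

Because every query token must find a good match somewhere in the document, `doc_a` outscores `doc_b` — finer-grained than comparing two pooled vectors, but proportionally more expensive, hence its use in re-ranking.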

6. Vision-Language Models and OCR Replacement:
- VLMs can process images and text together, enabling search over scanned documents without traditional OCR.
- They break images into patches, create vector representations, and relate visual elements to queries in natural language.
- This method reduces data preparation time and improves search quality for non-textual or hybrid documents.
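The patching step can be sketched as a pure array operation (patch size 16 is just a common choice, not a fixed rule): the page image is cut into non-overlapping tiles, and each tile is then embedded like a "visual token".

```python
import numpy as np

def to_patches(image, patch=16):
    """Split an H x W x C image into non-overlapping patch x patch tiles —
    the first step a vision transformer takes before embedding each patch."""
    h, w, c = image.shape
    tiles = image.reshape(h // patch, patch, w // patch, patch, c)
    return tiles.swapaxes(1, 2).reshape(-1, patch, patch, c)

page = np.zeros((224, 224, 3))   # stand-in for a scanned page
patches = to_patches(page)       # 14 x 14 = 196 patches of 16 x 16
```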

7. Real-Life Experiment Findings:
- Comparing sparse and dense embedding models on a dishwasher manual:
  - Both perform well for direct queries in English.
  - Sparse embeddings fail on multilingual queries.
  - Dense embeddings handle synonyms and multiple languages better.
  - Sparse embeddings are better for exact matches such as error codes.
- Combining multiple search methods in pipelines (fusion) and applying business rules enhances final result quality.
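One common fusion technique (also listed in the actionable items below) is reciprocal rank fusion, which needs only the rank positions from each method, not their incomparable scores:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked result lists from heterogeneous retrievers (e.g. sparse
    and dense): each document scores sum(1 / (k + rank)) over the lists."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

sparse_hits = ["d3", "d1", "d4"]   # e.g. a BM25 ranking
dense_hits = ["d1", "d2", "d3"]    # e.g. a dense-embedding ranking
fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
```

Documents ranked well by both methods (here `d1` and `d3`) rise to the top; the constant `k=60` is the value commonly used in practice to dampen the influence of top ranks.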

8. Advanced Search Pipelines and Metadata Filtering:
- LLMs enable parsing user queries to extract metadata filters (e.g., price limits, color preferences).
- Applying these filters on top of search results improves relevance and user experience.
- Combining sparse, dense, and multi-vector methods in parallel with re-ranking offers flexible, high-quality search systems.
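A sketch of the filtering step: the `parsed` dictionary below stands in for the (hypothetical) structured output of an LLM that parsed the user's free-text query, and the filters are then applied over candidate hits' metadata.

```python
# Hypothetical LLM output for the query "red running shoes under $100".
parsed = {"query": "running shoes", "max_price": 100, "color": "red"}

def apply_filters(hits, max_price=None, color=None):
    """Keep only search hits whose metadata satisfies the extracted filters."""
    kept = []
    for hit in hits:
        if max_price is not None and hit["price"] > max_price:
            continue
        if color is not None and hit["color"] != color:
            continue
        kept.append(hit)
    return kept

hits = [
    {"id": 1, "price": 80, "color": "red"},
    {"id": 2, "price": 120, "color": "red"},
    {"id": 3, "price": 70, "color": "blue"},
]
filtered = apply_filters(hits, parsed["max_price"], parsed["color"])
```

In production the same filters would usually be pushed down into the vector database query itself rather than applied after retrieval, so that the top-k results are not wasted on filtered-out items.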

---

Actionable Items / Tasks:
- Consider using retrieval augmented generation to improve LLM answers by supplementing queries with relevant context.
- Incorporate both sparse and dense embeddings in search architectures to leverage strengths of each.
- Explore multi-vector embeddings and re-ranking strategies for higher search precision.
- Evaluate vision-language models to enable search over image-based or scanned documents without OCR.
- Develop search pipelines that utilize LLM-based metadata extraction for dynamic filtering of results.
- Implement fusion techniques like reciprocal rank fusion to combine heterogeneous search method outputs.
- Conduct user and query analysis to tailor the combination of methods based on exact match vs. semantic similarity needs.
- Stay updated on emerging multi-modal and multi-vector embedding technologies to enhance information retrieval capabilities.

All the flavors of AI-powered search

10:00 - 10:30, 28th of May (Wednesday) 2025 / DEV AI & DATA STAGE

Machine Learning has revolutionized how we find relevant documents given a query based on its semantics, not keywords. In its basic form, vector search relies on single vector representations using the same model for queries and documents. Currently, we're experiencing a second wave of vector search, enabling new modalities to be searchable. Do you want to search over vast amounts of PDFs? OCR is no longer needed, as modern vector search architectures can handle that with no additional parsing.

Why should you care about search? The "R" in RAG stands for Retrieval, which is just a different name for search. The better you can find what's relevant, the higher the quality of AI responses you may expect. 

Let's review the current state of AI-powered search, including multivector representations such as ColPali.

LEVEL: Basic Advanced Expert
TRACK: AI/ML Data
TOPICS: AI FutureTrends ML/DL SoftwareEngineering