In information technology, where data can feel like a complicated puzzle, vector indexing serves as a trusted guide. Imagine a library where every book is represented by a unique set of numbers: those numbers are vectors. A vector index is what lets you find the item you want in that library. But it's not as easy as it sounds!
In this article, we'll take you through the difficulties and trade-offs in vector indexing. We'll uncover how to efficiently find important details in a vast sea of data, all while making the crucial decisions that define this technique!
What are the Challenges in Vector Indexing?
These are the common challenges you will face when adopting vector indexing:
- Speed vs Accuracy: Choosing a vector index strategy involves considering how fast you need results and how accurate they should be. Every method strikes a balance between the speed of retrieving data and the precision of results.
- Data Storage: Different algorithms may significantly increase the amount of data needed to run searches efficiently, impacting the memory used by the index.
- Building Complexity: Before executing queries, the index must be built. The complexity of this process needs careful consideration.
- Update Difficulty: It is crucial to assess how challenging it is to update the index when new vectors are added. If recomputing the entire index with each addition is necessary, the index update frequency becomes a key question.
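To make the speed-versus-accuracy balance concrete, here is a minimal pure-Python sketch on toy data (everything below is illustrative, not any vendor's implementation): an exact flat scan always finds the true nearest neighbour, while scanning only a sample of the collection is cheaper but may miss it.

```python
import random

random.seed(0)

def dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Toy collection of 1,000 four-dimensional vectors.
vectors = [[random.random() for _ in range(4)] for _ in range(1000)]
query = [0.5, 0.5, 0.5, 0.5]

# Exact (flat) search: scan every vector -- perfect recall, linear cost.
exact = min(range(len(vectors)), key=lambda i: dist(vectors[i], query))

# Approximate search: scan a random 10% sample -- roughly 10x fewer
# distance computations, but the true nearest neighbour may be missed.
sample = random.sample(range(len(vectors)), 100)
approx = min(sample, key=lambda i: dist(vectors[i], query))
```

The approximate result can never be closer than the exact one; the question every index strategy answers is how much closeness you trade for how much speed.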
Those are the challenges; now let's look at the trade-offs that help you work around them.
What are the Trade-offs in Vector Indexing?
Now, let’s talk about the choices we have to make. These are the important decisions that affect how well our data system works:
1. On-prem vs. Cloud Hosting for Vector Indexing:
Choosing between on-premises and cloud hosting for vector indexing involves trade-offs in scalability, cost, and control.
These are some of the aspects to weigh:
Client-Server vs. Embedded:
Balancing between client-server and embedded architectures is crucial: a client-server database runs as a separate service your application talks to, while an embedded database runs inside your application process. LanceDB's default embedded, serverless design and alternatives like Chroma illustrate the range of choices.
Flexibility and Privacy in Embedded Vector Databases:
For organisations managing both on-premise and cloud storage, embedded/serverless vector databases, exemplified by LanceDB, provide flexibility, especially for security-conscious setups handling sensitive data.
Cost Balance in Vector Indexing:
Cloud-native solutions bill based on stored data and query volume, which suits some teams. However, organisations with robust in-house infrastructure teams may find on-prem or embedded setups more cost-effective, particularly at substantial data volumes.
2. Purpose-Built vs. Incumbent Vendors
Traditional databases (Elasticsearch, MongoDB) may have added vector search features, but be cautious of limitations like usage restrictions or cloud-exclusive options.
Consider purpose-built vendors (Qdrant, Weaviate, LanceDB) designed for efficient vector indexing. For scalability, they optimise storage, indexing, and querying using modern languages (Go, Rust). With Python or JavaScript clients, testing becomes straightforward.
Benchmark studies consistently favour purpose-built solutions, emphasising their prowess in vector indexing.
3. Adding vs. Finding Speed in Vector Indexing
In vector indexing, there’s a choice between how fast you can add information (insertion speed) and how quickly you can find it (query speed).
Some tools, like Milvus/Zilliz, are great for situations where you need to add and find lots of information quickly, like video surveillance or tracking financial transactions.
However, for most organisations, finding information fast matters more than adding it quickly. That's where a purpose-built engine such as Qdrant comes in: it not only ingests data quickly but can also return relevant results in just a few milliseconds when searching in real time.
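The insertion-versus-query trade-off can be sketched in pure Python. The two index classes below are toy illustrations, not any vendor's actual design: the "lazy" index makes inserts cheap and queries expensive, while the "eager" index pays extra work per insert (assigning each vector to a bucket) so queries only scan a fraction of the data.

```python
def dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class LazyIndex:
    """Toy index with cheap inserts and expensive queries."""
    def __init__(self):
        self.vectors = []

    def add(self, v):
        self.vectors.append(v)  # O(1): just append

    def search(self, q):
        # Full scan of everything ever inserted.
        return min(self.vectors, key=lambda v: dist(v, q))

class EagerIndex:
    """Toy index that pays extra work per insert to speed up queries."""
    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = [[] for _ in centroids]

    def add(self, v):
        # Extra cost at insert time: find the nearest centroid.
        c = min(range(len(self.centroids)),
                key=lambda i: dist(self.centroids[i], v))
        self.buckets[c].append(v)

    def search(self, q):
        # Probe only the bucket nearest the query -- fast, approximate.
        c = min(range(len(self.centroids)),
                key=lambda i: dist(self.centroids[i], q))
        return min(self.buckets[c], key=lambda v: dist(v, q))

# A deterministic grid of 200 two-dimensional points.
data = [[i / 20, j / 10] for i in range(20) for j in range(10)]
centroids = [[0.25, 0.25], [0.25, 0.75], [0.75, 0.25], [0.75, 0.75]]

lazy, eager = LazyIndex(), EagerIndex(centroids)
for v in data:
    lazy.add(v)
    eager.add(v)

q = [0.6, 0.6]
best_exact = lazy.search(q)    # slow query, exact answer
best_approx = eager.search(q)  # fast query, possibly approximate
```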
4. Recall vs. Latency in Vector Indexing
In vector indexing, the choice between recall (result accuracy) and latency (result speed) varies among database vendors.
- Flat Index: Stores vectors directly for high accuracy but slower kNN searches.
- IVF-Flat Index: Uses an inverted file index for quicker searches, sacrificing some recall.
- IVF-PQ: Merges IVF with product quantization, reducing memory usage and speeding up search while maintaining reasonable recall.
- HNSW Index: Widely popular, balancing recall and memory efficiency. HNSW-PQ improves recall and memory efficiency compared to IVF-PQ.
- Vamana Index: A newcomer optimised for on-disk performance, potentially handling larger-than-memory vector data efficiently; adoption is still limited, as delivering good on-disk performance remains challenging.
For early-stage projects where perfect relevance is not critical, IVF-PQ might suffice. Purpose-built vendors, however, often favour the HNSW index for its quality and relevance.
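The recall-versus-latency dial can be illustrated with a toy IVF-style search in pure Python (all names and data below are illustrative): probing more clusters via `nprobe` raises recall at the cost of more distance computations.

```python
import random

random.seed(3)

def dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Bucket 400 random 2-D points under the nearest of four fixed centroids
# (real IVF indexes learn the centroids with k-means).
centroids = [[0.25, 0.25], [0.25, 0.75], [0.75, 0.25], [0.75, 0.75]]
buckets = [[] for _ in centroids]
data = [[random.random(), random.random()] for _ in range(400)]
for v in data:
    c = min(range(len(centroids)), key=lambda i: dist(centroids[i], v))
    buckets[c].append(v)

def ivf_search(query, nprobe):
    """Probe only the `nprobe` clusters closest to the query.

    Small nprobe: low latency, possible recall loss.
    nprobe == len(centroids): scans everything, exact result."""
    ranked = sorted(range(len(centroids)),
                    key=lambda i: dist(centroids[i], query))
    candidates = [v for i in ranked[:nprobe] for v in buckets[i]]
    return min(candidates, key=lambda v: dist(v, query))

q = [0.6, 0.55]
fast = ivf_search(q, nprobe=1)   # quick, may miss the true neighbour
exact = ivf_search(q, nprobe=4)  # exhaustive, always exact
```

The same dial exists in production IVF indexes: tuning the number of probed clusters is precisely the recall/latency trade-off described above.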
5. Optimizing Storage in Vector Indexing
The in-memory vs. on-disk storage problem is pivotal in the vector indexing realm. In-memory databases like Redis offer speed but face limitations with large vectors. Scaling beyond memory, solutions like Qdrant and Weaviate employ memory-mapped files for near-in-memory speeds without loading the entire dataset into RAM.
Indexing, especially with memory-intensive HNSW, poses challenges. Combining Product Quantization (PQ) with HNSW effectively reduces memory requirements. Vamana, part of the DiskANN algorithm, competes by excelling in scaling to larger-than-memory indexes, achieved purely on disk.
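Product quantization's memory saving can be sketched in a few lines of pure Python. The codebooks below are random placeholders (real systems train them with k-means); the point is that each vector collapses to one small code per subspace instead of full-precision floats.

```python
import random

random.seed(2)

def dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

DIM, SUBSPACES = 8, 2   # 8-dim vectors split into 2 subspaces of 4 dims
SUB = DIM // SUBSPACES

# Toy codebooks: 4 random codewords per subspace. Real systems train
# these with k-means; random ones are enough to show the mechanics.
codebooks = [[[random.random() for _ in range(SUB)] for _ in range(4)]
             for _ in range(SUBSPACES)]

def encode(v):
    """Compress a vector into one small code index per subspace."""
    codes = []
    for s in range(SUBSPACES):
        sub = v[s * SUB:(s + 1) * SUB]
        codes.append(min(range(4), key=lambda c: dist(codebooks[s][c], sub)))
    return codes

def decode(codes):
    """Reconstruct an approximate vector from its codes."""
    out = []
    for s, c in enumerate(codes):
        out.extend(codebooks[s][c])
    return out

v = [random.random() for _ in range(DIM)]
codes = encode(v)         # 2 tiny integers instead of 8 floats
approx_v = decode(codes)  # lossy reconstruction of the original
```

Storing a couple of code indices per vector instead of dozens of floats is what lets PQ shrink the memory footprint of graph indexes like HNSW.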
Ultimately, the storage choice significantly impacts speed, scalability, and memory efficiency in vector indexing databases.
6. Sparse vs. Dense Storage
In vector indexing, embeddings from models like sentence transformers are typically dense, consisting entirely of non-zero floats. Alternatively, sparse vectors, computed through algorithms like BM25 or SPLADE, focus on relative word frequencies per document, often with many zero values. Elasticsearch introduces its proprietary pre-trained sparse model, ELSER, designed for English with roughly 30,000 dimensions.
Sparse vectors have lower computational and storage costs thanks to their sparsity, but dense vectors, especially those from transformer models, excel at compressing language semantics. Dense vectors do, however, come with a higher indexing cost, a crucial consideration for datasets reaching the scale of 100 million vectors.
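The storage difference is easy to see in a toy example (pure Python, made-up values): a sparse vector keeps only its non-zero entries, yet similarity scores still come out the same.

```python
# A dense embedding stores every dimension, even the near-zero ones;
# the values here are made up for illustration.
dense = [0.12, -0.05, 0.33, 0.0, 0.0, 0.27, 0.0, 0.0]

# A sparse representation (BM25/SPLADE-style) keeps only the non-zero
# term weights, keyed by dimension.
sparse = {i: w for i, w in enumerate(dense) if w != 0.0}

def sparse_dot(a, b):
    """Dot product of two sparse vectors: iterate the smaller one."""
    small, big = (a, b) if len(a) <= len(b) else (b, a)
    return sum(w * big.get(i, 0.0) for i, w in small.items())

dense_dot = sum(x * y for x, y in zip(dense, dense))
# Same similarity score, but the sparse form stores 4 entries, not 8.
```

At a realistic vocabulary size of tens of thousands of dimensions, skipping the zeros is where sparse representations win on storage.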
7. Full-Text vs. Vector Hybrid
Vector search, while useful, doesn't fit every need, especially in big business applications. Colin Harman points out that it might prioritise similar results over exact matches, which can sometimes be a problem.
To tackle this, a smart solution is to blend full-text and vector search strengths. For important cases, you can prioritise exact matches using keyword-based searches. Also, improving results by combining strategies, like using both an inverted and vector index, boosts quality.
Vespa, a great search engine using an HNSW-IVF index, is an example of this smart mix, providing a balanced solution for the various challenges of optimising searches.
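One common way to blend the two result sets is reciprocal rank fusion, sketched below in pure Python with hypothetical document IDs: each list contributes a score that decays with rank, so documents ranked well by both keyword and vector search rise to the top.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked result lists into one.

    Each list contributes 1 / (k + rank) per document, so documents
    ranked well by several searches accumulate the highest scores."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical document IDs from the two search strategies.
keyword_hits = ["doc3", "doc1", "doc7"]  # exact matches (inverted index)
vector_hits = ["doc1", "doc5", "doc3"]   # semantic matches (vector index)

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
# "doc1" and "doc3" appear in both lists, so they rise to the top.
```

The appeal of rank fusion is that it needs no score calibration between the two indexes: only the ranks matter, not the raw similarity values.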
8. Optimizing Search Filters
In real-world searches, it’s not just about keywords; filtering by attributes matters. Consider clothing searches where size is crucial; the filtering strategy makes a big difference.
- Pre-filtered search: Applying the size filter before the vector search might seem natural, but naive pre-filtering can cause issues like missing relevant results.
- Post-filtered search: Returning the top-k nearest neighbours and then applying the size filter can give uneven results, especially when the filtered attribute covers only a small part of the dataset.
- Custom-filtered search: Vendors like Weaviate and Qdrant take intermediate approaches: Weaviate uses an "allow list" for effective pre-filtering, while Qdrant keeps the HNSW graph connected by forming extra edges between categories during indexing.
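The difference between the first two strategies can be sketched with a toy catalogue in pure Python (the items and sizes are made up): pre-filtering fills all `k` slots whenever enough matches exist, while post-filtering can come up short when the filtered attribute is rare.

```python
def dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Toy clothing catalogue: (embedding, size); only two items are size "S".
items = [
    ([0.1, 0.2], "M"), ([0.9, 0.8], "S"), ([0.2, 0.1], "M"),
    ([0.8, 0.9], "M"), ([0.5, 0.5], "S"), ([0.4, 0.6], "M"),
]
query, want_size, k = [0.5, 0.5], "S", 2

# Pre-filtered: keep only matching items, then rank by distance.
pre = sorted((it for it in items if it[1] == want_size),
             key=lambda it: dist(it[0], query))[:k]

# Post-filtered: take the k nearest overall, then filter. When the
# attribute is rare, this can return fewer than k results.
nearest = sorted(items, key=lambda it: dist(it[0], query))[:k]
post = [it for it in nearest if it[1] == want_size]
```

Here pre-filtering returns both size-S items, while post-filtering returns only one: the second-nearest overall neighbour is size M and gets discarded.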
So, having challenges doesn't mean you can't get the most out of your vector index; by weighing these trade-offs carefully, you can steer clear of future pitfalls.
Conclusion
We've explored a range of challenges and choices in vector indexing. Whether deciding between cloud and on-premises hosting or balancing speed against accuracy, it comes down to finding the right tuning.
We talked about databases designed just for vectors and how sometimes combining different search methods can give us better results. Remember, there’s no one-size-fits-all solution, but we can navigate vector indexing by understanding these challenges!