LanceDB

Senior Software Engineer, Vector Indexing

On-site

San Francisco, United States

Senior

Full Time

03-11-2025

Skills

Communication, Rust, Training, Architecture, Machine Learning, Databases, Apache, Benchmarking, Pandas, Data Science

Job Specifications

About LanceDB
LanceDB is a developer-friendly, open-source data lake for multimodal AI. From hyper-scalable vector search to advanced retrieval for RAG, from streaming training data to interactive exploration of large-scale AI datasets, LanceDB is the best foundation for your AI application, and powers some of the most groundbreaking applications and challenging requirements today.

About the Role
We're looking for a Software Engineer focused on Vector Indexing to help build the next generation of vector-native data infrastructure. You'll work on high-performance indexing and search systems at the core of LanceDB, enabling scalable similarity search, full-text search, and flexible indexing for the open-source and enterprise communities alike.

You'll be responsible for:
Designing, building, and maintaining core vector indexing and search components
Implementing GPU-accelerated indexing algorithms and performance optimizations
Maintaining and evolving vector index algorithms, including pruning, quantization, and graph-based methods
Developing and optimizing full-text search capabilities and integrations
Benchmarking, profiling, and tuning performance across varied workloads
Writing and maintaining documentation, benchmarks, tutorials, and blog posts to support and grow adoption
Engaging with the open-source community: reviewing contributions, triaging issues, and joining design discussions

Requirements
Strong proficiency in Rust
Experience designing or implementing vector search or indexing algorithms (e.g., HNSW, IVF, PQ, quantization, clustering)
Proficiency in C for GPU-related development
Familiarity with GPU acceleration frameworks (CUDA, ROCm, etc.)
Demonstrated ability to benchmark, profile, and optimize system performance
Excellent written communication and documentation skills
Comfortable collaborating in open-source environments

Nice to Have
Understanding of full-text search systems (Lucene, Elasticsearch, Tantivy, etc.)
Experience building or maintaining data systems, databases, or search engines
Familiarity with distributed systems and scale-out architecture
Background in web APIs, embedding serving, or real-time systems
Contributions to or maintenance of open-source projects

What We Offer
A key role shaping an open-source project with real production usage
Remote-first team with flexible hours
Competitive compensation, equity, and benefits
Generous learning budget and support for open-source contributions

About the LanceDB Team
LanceDB was created by experts with decades of experience building tools for data science and machine learning. From co-authors of pandas to Apache PMC members of HDFS, Arrow, and Delta, the LanceDB team has created open-source tools used by millions worldwide.

About the Company

LanceDB is a developer-friendly, open-source database for multimodal AI. From hyper-scalable vector search to advanced retrieval for RAG, from streaming training data to interactive exploration of large-scale AI datasets, LanceDB is the best foundation for your AI application.