GenTune-CyberDB: Workload-Generative, Cross-Family Auto-Tuning for Cybersecurity Vector Databases
Abstract
Vector databases underpin AI-driven cybersecurity tasks such as intrusion detection, anomaly detection, and threat intelligence retrieval, which process high-dimensional security data including network traffic patterns, user behavior analytics, and security event logs. However, the performance of these systems often depends on manually selecting and tuning index families (e.g., HNSW, IVF-PQ, ScaNN) and their hyperparameters, which is inefficient and impractical in dynamic security environments. In this paper, we propose GenTune-CyberDB, a workload-generative, cross-family auto-tuning framework designed specifically for cybersecurity applications. GenTune-CyberDB generates realistic attack and anomaly detection query workloads and uses them to tune the database for real-time security data processing. It performs multi-objective, multi-fidelity optimization over index families, execution plans, and hyperparameters under constraints on latency, memory, and build time, improving both detection efficiency and resource usage. GenTune-CyberDB improves recall and latency and reduces memory usage by up to 60% with a recall loss of at most 1%. The system adapts to evolving attack patterns and workloads and remains robust under shifts in data distribution. By automating the tuning process, GenTune-CyberDB delivers better recall-latency-memory trade-offs for cybersecurity deployments than traditional, manually tuned systems.
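As a rough illustration of constrained, cross-family multi-objective selection, the sketch below filters candidate index configurations by latency, memory, and build-time limits and then keeps the Pareto-optimal ones over recall, latency, and memory. It is a minimal sketch under stated assumptions, not the GenTune-CyberDB implementation: the `Candidate` type, the helper functions, the parameter names, and every measurement value are hypothetical, and the multi-fidelity and workload-generation stages are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One measured (index family, hyperparameters) configuration. Hypothetical type."""
    family: str
    params: dict
    recall: float      # higher is better
    latency_ms: float  # lower is better
    memory_gb: float   # lower is better
    build_s: float     # used only as a constraint here


def feasible(c, max_latency_ms, max_memory_gb, max_build_s):
    """True if the candidate satisfies all three resource constraints."""
    return (c.latency_ms <= max_latency_ms
            and c.memory_gb <= max_memory_gb
            and c.build_s <= max_build_s)


def dominates(a, b):
    """a dominates b: no worse on every objective, strictly better on at least one."""
    no_worse = (a.recall >= b.recall
                and a.latency_ms <= b.latency_ms
                and a.memory_gb <= b.memory_gb)
    better = (a.recall > b.recall
              or a.latency_ms < b.latency_ms
              or a.memory_gb < b.memory_gb)
    return no_worse and better


def pareto_front(cands):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


def select(cands, max_latency_ms, max_memory_gb, max_build_s):
    """Apply the constraints first, then return the feasible Pareto front."""
    ok = [c for c in cands
          if feasible(c, max_latency_ms, max_memory_gb, max_build_s)]
    return pareto_front(ok)


# Illustrative measurements only; these numbers are invented for the example.
candidates = [
    Candidate("HNSW", {"M": 32}, 0.990, 2.0, 12.0, 900),
    Candidate("HNSW", {"M": 16}, 0.970, 1.5, 8.0, 600),
    Candidate("IVF-PQ", {"nlist": 4096, "m": 32}, 0.980, 3.0, 4.8, 400),
    Candidate("IVF-PQ", {"nlist": 1024, "m": 16}, 0.900, 3.5, 5.0, 300),
    Candidate("ScaNN", {"leaves": 2000}, 0.985, 1.8, 20.0, 500),
]

front = select(candidates, max_latency_ms=5.0, max_memory_gb=16.0, max_build_s=1000)
for c in front:
    print(c.family, c.params, c.recall, c.latency_ms, c.memory_gb)
```

With these invented values, the ScaNN candidate is excluded by the memory limit and the second IVF-PQ candidate is dominated by the first, so the front contains configurations from two different families. This is the kind of trade-off set from which a deployment would pick one operating point.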