# CameoDB Engine

A high-performance, distributed, shared-nothing hybrid-search database built in Rust 2024 Edition.

CameoDB combines ACID-compliant key-value storage (redb), flexible document modeling, and full-text search (Tantivy) in a multi-tenant, horizontally scalable architecture.
## Key Features

- **Multi-Tenant Architecture**: complete index isolation with dynamic scaling.
- **Atomic Batch Operations**: high-throughput bulk processing with ACID guarantees.
- **Hybrid Storage**: combined KV store (redb) + full-text search (Tantivy).
- **Schema Management**: dynamic schema evolution with type validation.
- **Distributed Ready**: actor-based architecture with consistent hashing.
- **Performance Optimized**: Smart Commits, memory budgets, and adaptive batching.
## Query Language & Syntax

CameoDB features a powerful, strictly typed query parser supporting complex boolean logic, phrase matching, wildcards, and deep JSON traversal.
## Quick Start
For detailed, step-by-step instructions, visit our Interactive Quickstart Guide. Below are the raw commands for rapid setup.
### Option 1: Docker Hub (Recommended)

**1. Start the server**

```bash
# Create the data directory with proper permissions
mkdir -p $(pwd)/data/cameodb

# Pull and run CameoDB from Docker Hub
docker run -d \
  --name cameodb-server \
  --user $(id -u):$(id -g) \
  -p 9480:9480 \
  -p 9580:9580 \
  -v $(pwd)/data/cameodb:/data/cameodb \
  -e RUST_LOG=error \
  goranc/cameodb:latest
```
**2. Run the client**

```bash
# Run the interactive client
docker run --rm -it \
  --name cameodb-client \
  --network host \
  goranc/cameodb:latest \
  client --interactive
```
### Option 2: Build from Source

```bash
# Clone the repository, then start CameoDB
cargo run --bin cameodb

# CameoDB starts on http://localhost:9480 by default
```
## Distributed Architecture Overview

CameoDB is designed as a distributed, shared-nothing cluster:

- **Per-node storage** is handled by the server crate with actors (`NodeOrchestrator`, `MicroshardActor`) on top of redb + Tantivy.
- **Routing & clustering** use a `ClusterCoordinator` actor with a consistent hash ring and a libp2p Kademlia DHT.
- **Remote execution** is powered by Kameo remote actors over a custom libp2p swarm (TCP/QUIC/Noise/Yamux, no mDNS).
- **Scatter-gather search** and multi-node writes are implemented via a `RouterActor` that fans out to peers and aggregates results.
- **Event-driven metadata**: cluster state transitions and persistence are triggered purely by actor messages, with no background polling or timeouts.
- **State reconciliation**: on boot, nodes compare the expected cluster topology from snapshots against actual peer reports, logging discrepancies and converging on the distributed reality.
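To make the consistent-hash-ring idea above concrete, here is a minimal, self-contained Rust sketch. It is not CameoDB's implementation: the `ConsistentRing` name appears in the docs, but the virtual-node count, `DefaultHasher`, and string node IDs are assumptions for illustration.

```rust
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal consistent hash ring: virtual nodes mapped onto a u64 keyspace.
struct ConsistentRing {
    ring: BTreeMap<u64, String>, // hash point -> node id
    vnodes: usize,               // virtual nodes per physical node
}

impl ConsistentRing {
    fn new(vnodes: usize) -> Self {
        Self { ring: BTreeMap::new(), vnodes }
    }

    fn hash_of<T: Hash>(item: &T) -> u64 {
        let mut h = DefaultHasher::new();
        item.hash(&mut h);
        h.finish()
    }

    fn add_node(&mut self, node: &str) {
        for v in 0..self.vnodes {
            self.ring.insert(Self::hash_of(&(node, v)), node.to_string());
        }
    }

    fn remove_node(&mut self, node: &str) {
        self.ring.retain(|_, n| n != node);
    }

    /// Owner = first hash point clockwise from the key's hash (wrapping).
    fn owner(&self, routing_key: &str) -> Option<&str> {
        let h = Self::hash_of(&routing_key);
        self.ring
            .range(h..)
            .next()
            .or_else(|| self.ring.iter().next()) // wrap around the ring
            .map(|(_, n)| n.as_str())
    }
}

fn main() {
    let mut ring = ConsistentRing::new(16);
    ring.add_node("node-a");
    ring.add_node("node-b");
    ring.add_node("node-c");

    let owner_before = ring.owner("book_001").unwrap().to_string();

    // Removing an unrelated node must not remap keys it didn't own:
    // the owner's hash points are untouched, so the key still lands there.
    let victim = if owner_before == "node-a" { "node-b" } else { "node-a" };
    ring.remove_node(victim);
    assert_eq!(ring.owner("book_001").unwrap(), owner_before);
    println!("book_001 -> {owner_before}");
}
```

The key property: removing or adding a node only remaps keys adjacent to that node's hash points, which is what keeps shard movement bounded during topology changes.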
### Connection Pool & Cache Invalidation

The `RemotePeerPool` eliminates repeated swarm registry/DHT lookups on every remote operation:

```text
┌───────────────────────────────────┐
│          RemotePeerPool           │
│  RwLock<HashMap<(Uuid, Channel),  │
│          RemoteActorRef>>         │
├───────────────────────────────────┤
│ get_orchestrator(node, channel)   │──→ cache hit: clone ref
│ get_coordinator(node)             │──→ cache miss: lookup + cache
│ invalidate_peer(node)             │──→ evict all refs for node
│ invalidate_all()                  │──→ full cache clear
└───────────────────────────────────┘
                ▲
                │ invalidate_peer()
┌───────────────┴───────────────┐
│       ClusterCoordinator      │
│ handle(PeerLost { node_id })  │
└───────────────────────────────┘
                ▲
                │ swarm event
        Peer disconnected
```
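The pool's fast-path/slow-path behavior can be sketched as below. This is a hedged simplification, not CameoDB's code: `NodeId` stands in for `Uuid`, a `String` stands in for `RemoteActorRef`, and the closure-based `lookup` stands in for the swarm/DHT resolution.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Stand-ins for the real types (hypothetical simplifications).
type NodeId = u64;
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum Channel { Orchestrator, Coordinator }
type RemoteRef = String;

/// Read-mostly cache keyed by (node, channel); misses fall back to a
/// (simulated) swarm/DHT lookup and populate the map.
struct RemotePeerPool {
    cache: RwLock<HashMap<(NodeId, Channel), RemoteRef>>,
}

impl RemotePeerPool {
    fn new() -> Self {
        Self { cache: RwLock::new(HashMap::new()) }
    }

    fn get(&self, node: NodeId, channel: Channel,
           lookup: impl FnOnce() -> RemoteRef) -> RemoteRef {
        // Fast path: shared read lock, clone the cached ref.
        if let Some(r) = self.cache.read().unwrap().get(&(node, channel)) {
            return r.clone();
        }
        // Slow path: resolve via the swarm, then cache under a write lock.
        let resolved = lookup();
        self.cache.write().unwrap().insert((node, channel), resolved.clone());
        resolved
    }

    /// Evict every cached ref for a node, e.g. on a PeerLost event.
    fn invalidate_peer(&self, node: NodeId) {
        self.cache.write().unwrap().retain(|(n, _), _| *n != node);
    }

    fn invalidate_all(&self) {
        self.cache.write().unwrap().clear();
    }
}

fn main() {
    let pool = RemotePeerPool::new();
    let r = pool.get(7, Channel::Orchestrator, || "peer-7/orchestrator".into());
    // Second call hits the cache: the lookup closure must not run.
    let r2 = pool.get(7, Channel::Orchestrator, || panic!("must not run on hit"));
    assert_eq!(r, r2);
    pool.invalidate_peer(7);
    let r3 = pool.get(7, Channel::Orchestrator, || "peer-7/orchestrator-v2".into());
    assert_eq!(r3, "peer-7/orchestrator-v2");
}
```

The design choice worth noting: reads take only a shared `RwLock` read guard, so concurrent remote operations on warm entries never serialize on the write lock.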
## Operation Routing Workflows
Every client request follows the same top-level path: HTTP handler → RouterActor → ClusterCoordinator routing decision → execute. The routing decision determines whether the operation runs locally, is forwarded to a single remote node (unicast), or is fanned out to all nodes (broadcast).
### Routing Decision Logic

```text
        ┌──────────────────────┐
        │  ClusterCoordinator  │
        │  RouteOperation msg  │
        └─────────┬────────────┘
                  │
         routing_key present?
         ┌────────┴────────┐
        YES                NO
         │                  │
  Hash ring lookup   RoutingDecision::
         │               Broadcast
  owner == local?
   ┌─────┴─────┐
  YES          NO
   │            │
RoutingDecision  RoutingDecision::Remote
  ::Local        { node_id, peer_addr }
```

- **Local**: the owning shard lives on this node. Execute directly.
- **Remote**: the owning shard lives on another node. Forward via a cached `RemoteActorRef`.
- **Broadcast**: no routing key (e.g. search). Fan out to local + all known peers, merge results.
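The decision tree above can be sketched in a few lines of Rust. This is an illustrative simplification: the real `RoutingDecision::Remote` also carries `peer_addr`, and `owner_of` stands in for the consistent-hash-ring lookup.

```rust
/// Simplified routing decision mirroring the Local / Remote / Broadcast split.
#[derive(Debug, PartialEq)]
enum RoutingDecision {
    Local,
    Remote { node_id: u64 }, // real variant also carries peer_addr
    Broadcast,
}

/// Decide where an operation runs; `owner_of` is the hash-ring lookup.
fn route(routing_key: Option<&str>, local_node: u64,
         owner_of: impl Fn(&str) -> u64) -> RoutingDecision {
    match routing_key {
        // No key (e.g. search): fan out to every node.
        None => RoutingDecision::Broadcast,
        Some(key) => {
            let owner = owner_of(key);
            if owner == local_node {
                RoutingDecision::Local
            } else {
                RoutingDecision::Remote { node_id: owner }
            }
        }
    }
}

fn main() {
    // Toy "ring": owner is key length modulo a 3-node cluster.
    let owner = |k: &str| k.len() as u64 % 3;
    assert_eq!(route(None, 0, owner), RoutingDecision::Broadcast);
    assert_eq!(route(Some("ab"), 2, owner), RoutingDecision::Local);
    assert_eq!(route(Some("ab"), 0, owner), RoutingDecision::Remote { node_id: 2 });
}
```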
### Read (Search) Workflow

Searches have no routing key, so they always broadcast to gather results from all nodes.

```text
HTTP POST /api/{index}/search
  │
  ▼
RouterActor::route_and_handle(routing_key=None)
  │
  ▼
RoutingDecision::Broadcast
  │
  ├── LOCAL ──→ Worker Pool (or actor mailbox fallback)
  │     └── OrchestratorEngine::orch_search()
  │           └── Fan out to all local MicroshardActors
  │                 └── spawn_blocking { store.search() }
  │
  └── REMOTE (per peer, up to fanout_limit) ──→ try_remote()
        │
        ▼
      RemotePeerPool::get_orchestrator(node_id)  ◄── cache hit: O(1) RwLock read → HashMap lookup
        │                                        ◄── cache miss: swarm lookup, then cached
        ▼
      remote_ref.ask(&ClientOp::Search)
        │
        ▼
      Remote node executes same local search path
        │
        ▼
┌────────────────────────────────────────────┐
│ Merge: bounded score-aware top-K merge,    │
│ then truncate to the requested limit       │
└────────────────────────────────────────────┘
```
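The final merge step can be sketched as a bounded, score-aware top-K merge over per-node result lists, using a K-sized min-heap so memory stays bounded regardless of fan-out. `Hit` and its fields are illustrative names (assuming NaN-free scores), not CameoDB's actual types.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// A search hit: relevance score plus document id. Scores assumed NaN-free.
#[derive(PartialEq)]
struct Hit { score: f32, doc_id: String }

impl Eq for Hit {}
impl PartialOrd for Hit {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> { Some(self.cmp(other)) }
}
impl Ord for Hit {
    fn cmp(&self, other: &Self) -> Ordering {
        // Reversed so BinaryHeap (a max-heap) pops the *lowest* score.
        other.score.total_cmp(&self.score)
    }
}

/// Bounded top-K merge: keep a K-sized min-heap while draining each
/// node's results, then emit hits sorted by descending score.
fn merge_top_k(per_node: Vec<Vec<Hit>>, k: usize) -> Vec<Hit> {
    let mut heap: BinaryHeap<Hit> = BinaryHeap::with_capacity(k + 1);
    for hits in per_node {
        for hit in hits {
            heap.push(hit);
            if heap.len() > k {
                heap.pop(); // discard the current worst, bounding memory at K
            }
        }
    }
    let mut out: Vec<Hit> = heap.into_vec();
    out.sort_by(|a, b| b.score.total_cmp(&a.score));
    out
}

fn main() {
    let node_a = vec![
        Hit { score: 0.9, doc_id: "a1".into() },
        Hit { score: 0.4, doc_id: "a2".into() },
    ];
    let node_b = vec![
        Hit { score: 0.7, doc_id: "b1".into() },
        Hit { score: 0.1, doc_id: "b2".into() },
    ];
    let top = merge_top_k(vec![node_a, node_b], 2);
    let ids: Vec<&str> = top.iter().map(|h| h.doc_id.as_str()).collect();
    assert_eq!(ids, ["a1", "b1"]);
}
```

Because each node already returns at most `limit` hits, the coordinator touches O(nodes × limit) entries but retains only K at a time.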
### Bulk Write Workflow

Bulk writes are the most complex path: documents are routed individually, then grouped by owning node for batched forwarding.

```text
HTTP POST /api/{index}/_bulk
  │
  ▼
RouterActor::route_and_handle(routing_hint=first_doc.id)
  │
  ▼
Routed to one node (usually local for the first doc)
  │
  ▼
NodeOrchestrator::orch_bulk_write(index, docs[])
  │
  ├── 1. Schema Resolution
  │     └── Fingerprint cache → shard fallback
  │
  ├── 2. Staged Schema Validation
  │     └── Parallel Rayon validation + sequential evolution
  │
  ├── 3. Per-Document Routing (spawn_blocking + Rayon par_iter)
  │     └── For each doc: hash(routing_key) → ConsistentRing → target shard
  │
  ├── 4. Separate Local vs Remote
  │     ├── shard in self.shards → local_docs
  │     └── shard owned by other node → remote_docs (grouped by node_id)
  │
  ├── 5. Phase 3.1: Parallel Local Shard Processing
  │     └── Per-shard MicroshardActor::write_batch()
  │           └── writer_thread → redb WAL + Tantivy index
  │
  └── 6. Phase 3.2: Parallel Remote Forwarding (futures::join_all)
        for each (node_id, docs_for_remote):
          │
          ▼
        NodeOrchestrator::forward_bulk_to_remote()
          │
          ▼
        RemotePeerPool::get_orchestrator(node_id)  ◄── cached lookup
          │
          ▼
        remote_ref.ask(&ClientOp::BulkWrite)
          │
          ▼
        Remote node runs orch_bulk_write() (recursive, same path)
```
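Steps 3 and 4 (per-document routing, then local/remote separation) can be sketched as follows. `shard_of`, `partition`, and the `shard_owner` callback are hypothetical simplifications of the `ConsistentRing` lookup, and the real path runs the routing in parallel via Rayon.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, HashSet};
use std::hash::{Hash, Hasher};

type NodeId = u64;

struct Doc { id: String }

/// hash(routing_key) → shard, a stand-in for the real ring lookup.
fn shard_of(routing_key: &str, num_shards: u64) -> u64 {
    let mut h = DefaultHasher::new();
    routing_key.hash(&mut h);
    h.finish() % num_shards
}

/// Split a bulk batch into docs for local shards vs docs grouped per
/// remote node, mirroring steps 3-4 of the workflow above.
fn partition(
    docs: Vec<Doc>,
    num_shards: u64,
    local_shards: &HashSet<u64>,
    shard_owner: impl Fn(u64) -> NodeId,
) -> (Vec<Doc>, HashMap<NodeId, Vec<Doc>>) {
    let mut local = Vec::new();
    let mut remote: HashMap<NodeId, Vec<Doc>> = HashMap::new();
    for doc in docs {
        let shard = shard_of(&doc.id, num_shards);
        if local_shards.contains(&shard) {
            local.push(doc);                                  // step 4: local_docs
        } else {
            remote.entry(shard_owner(shard)).or_default().push(doc); // remote_docs
        }
    }
    (local, remote)
}

fn main() {
    let docs: Vec<Doc> = (0..8).map(|i| Doc { id: format!("doc_{i}") }).collect();
    let local_shards: HashSet<u64> = [0, 1].into_iter().collect();
    let (local, remote) = partition(docs, 4, &local_shards, |shard| 100 + shard);
    // Every document lands exactly once, either locally or in some node's group.
    let total = local.len() + remote.values().map(Vec::len).sum::<usize>();
    assert_eq!(total, 8);
}
```

Grouping remote documents by owning node is what allows step 6 to forward one batched `ClientOp::BulkWrite` per peer instead of one message per document.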
## HTTP API Reference
CameoDB provides a comprehensive REST API for document management, search, and system administration.
### Search Operations

#### `POST /api/{index}/search`

Search documents within an index with relevance scoring. Returns a single JSON payload.

```bash
curl -s -X POST http://localhost:9480/api/books/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "science fiction space",
    "limit": 10
  }'
```
The fields returned by a search can be restricted in two ways:

- Supplying an explicit list in the payload: `"fields": ["title", "author", "year"]`
- Embedding a return clause at the end of the query: `"query": "space opera return title,author"`
#### `POST /api/{index}/search/stream`

Get search results as a real-time NDJSON stream for large result sets.

```bash
curl -s -X POST http://localhost:9480/api/books/search/stream \
  -H "Content-Type: application/json" \
  -d '{"query": "fantasy adventure"}' \
  --no-buffer
```
### Document Operations

#### `PUT /api/{index}/document`

Insert or update a single document.

```bash
curl -s -X PUT http://localhost:9480/api/books/document \
  -H "Content-Type: application/json" \
  -d '{
    "id": "book_001",
    "routing_key": "book_001",
    "doc": {
      "title": "The Rust Programming Language",
      "author": "Steve Klabnik",
      "publication_year": 2018,
      "genres": ["Programming", "Technical"]
    }
  }'
```
#### `POST /api/{index}/_bulk`

Insert or update multiple documents in a single atomic operation.

```bash
curl -s -X POST http://localhost:9480/api/books/_bulk \
  -H "Content-Type: application/json" \
  -d '[
    {
      "id": "book_002",
      "doc": {
        "title": "Clean Code",
        "author": "Robert C. Martin"
      }
    }
  ]'
```
#### `POST /api/{index}/document/stream`

Insert or update multiple documents using NDJSON streaming for large datasets.

```bash
cat << 'EOF' | curl -s -X POST http://localhost:9480/api/books/document/stream \
  -H "Content-Type: application/json" \
  --data-binary @-
{"id": "book_002", "doc": {"title": "Clean Code", "author": "Robert C. Martin", "genres": ["Programming"]}}
{"id": "book_003", "doc": {"title": "Design Patterns", "author": "Gang of Four", "genres": ["Programming", "Software Engineering"]}}
EOF
```
### Index Management & System

#### `GET /api/{index}/_config`

Retrieve the current schema.

```bash
curl -s http://localhost:9480/api/books/_config
```

#### `DELETE /api/{index}`

Permanently delete an index.

```bash
curl -s -X DELETE http://localhost:9480/api/books
```

#### `GET /_indexes`

List all available indexes.

```bash
curl -s http://localhost:9480/_indexes
```

#### `GET /_cluster/health`

Cluster health check.

```bash
curl -s http://localhost:9480/_cluster/health
```
## Configuration Options

CameoDB is configured via `cameodb.toml`, which mirrors the runtime struct layout:
```toml
[node]
label = "cameo-node-01"
zone = "default"

[network.http]
bind_address = "0.0.0.0"
port = 9480
request_timeout_secs = 30
max_body_size_mb = 200
cors_allowed_origins = ["*"]

[network.cluster]
enabled = true
bind_address = "0.0.0.0"
port = 9580
cluster_name = "cameodb-cluster"
seed_nodes = []
# cluster_nodes = ["/ip4/10.0.1.5/tcp/9580"] # Optional validation list

[storage]
data_paths = ["./data/cameodb"]
disk_usage_threshold_percent = 90
wal_sync = true
wal_segment_size_mb = 64
default_batch_size = 1000
num_shards_init = 4
max_shards_per_node = 8

[search]
indexer_memory_min_mb = 32
indexer_memory_max_mb = 512
total_memory_limit_mb = 4096
memory_pressure_threshold_percent = 80
search_threads = 8
enable_streaming_search = true
max_concurrent_shard_searches = 32
max_concurrent_remote_searches = 8
enable_early_termination = true
supervisor_timeout_secs = 5
default_search_limit = 10
```
- `node` provides human-friendly identity fields (label, zone).
- `network` separates HTTP and cluster transport while clarifying `bind_address`.
- `storage` centralizes shard configuration plus disk thresholds.
- `search` exposes indexer memory budgets, streaming-search settings, concurrency caps, the supervisor timeout, and `default_search_limit`.
## Docker Deployment
CameoDB provides configurations for both single-node and multi-node cluster deployments using Docker Compose.
### 1. Single-Node

Ideal for local development. Uses `docker-compose.yml`.

```bash
mkdir -p data/cameodb
docker-compose -f docker/docker-compose.yml up -d
```

Access: http://localhost:9480
### 2. Multi-Node Cluster

Runs a 3-node cluster behind an NGINX load balancer.

```bash
mkdir -p data/cameodb/node{1,2,3}
docker-compose -f docker/docker-compose-cluster.yml up -d
```

- Load balanced: http://localhost:9480
- Direct node access: ports 9481, 9482, 9483
### Docker Run vs Compose Equivalents

| Docker Run Flag | Docker Compose Equivalent |
|---|---|
| `-p 9480:9480 -p 9580:9580` | `ports: ["9480:9480", "9580:9580"]` |
| `-v $(pwd)/data/cameodb:/data/cameodb` | `volumes: ["../data/cameodb:/data/cameodb"]` |
| `-e RUST_LOG=info` | `environment: ["RUST_LOG=info"]` |
| `--restart unless-stopped` | `restart: unless-stopped` |
| `--user 65532:65532` | `user: "65532:65532"` |
## RPM / DEB Package Building

CameoDB supports building RPM and DEB packages for x86_64 Linux distributions, using cargo-zigbuild or Docker for cross-compilation.

### Automated Build Script (Recommended for CI/CD)

This script handles both RPM and DEB package generation in one run, with persistent caching.

```bash
chmod +x build-dist.sh
./build-dist.sh
```
### Manual RPM Generation (cargo-zigbuild)

```bash
cargo install cargo-zigbuild cargo-generate-rpm

RUSTFLAGS="-C target-feature=+crt-static -C relocation-model=pie -C relro-level=full -C link-arg=-pie -C link-arg=-static" \
  cargo zigbuild --release --target x86_64-unknown-linux-musl --no-default-features

cargo generate-rpm -p crates/server --target x86_64-unknown-linux-musl --auto-req disabled \
  -o target/x86_64-unknown-linux-musl/release/cameodb-0.2.2-1.x86_64.rpm \
  --set-metadata 'package.name="cameodb"'
```