CAMEODB
Core Documentation

CameoDB Engine

A high-performance, distributed, shared-nothing hybrid-search database built in Rust 2024 Edition.

Combines the reliability of ACID-compliant key-value storage (redb), flexible document modeling, and full-text search (Tantivy) in a multi-tenant, horizontally scalable architecture.

Key Features

  • Multi-Tenant Architecture: complete index isolation with dynamic scaling.
  • Atomic Batch Operations: high-throughput bulk processing with ACID guarantees.
  • Hybrid Storage: combined KV store (redb) + full-text search (Tantivy).
  • Schema Management: dynamic schema evolution with type validation.
  • Distributed Ready: actor-based architecture with consistent hashing.
  • Performance Optimized: Smart Commits, memory budgets, and adaptive batching.
  • Query Language & Syntax: a strict-typed query parser supporting complex boolean logic, phrase matching, wildcards, and deep JSON traversal.

Quick Start References

For detailed, step-by-step instructions, visit our Interactive Quickstart Guide. Below are the raw commands for rapid setup.

Option 1: Docker Hub (Recommended)

1. Start Server

# Create data directory with proper permissions
mkdir -p $(pwd)/data/cameodb

# Pull and run CameoDB from Docker Hub
docker run -d \
  --name cameodb-server \
  --user $(id -u):$(id -g) \
  -p 9480:9480 \
  -p 9580:9580 \
  -v $(pwd)/data/cameodb:/data/cameodb \
  -e RUST_LOG=error \
  goranc/cameodb:latest

2. Run Client

# Run interactive client
docker run --rm -it \
  --name cameodb-client \
  --network host \
  goranc/cameodb:latest \
  client --interactive

Option 2: Build from Source

# Build and start CameoDB from a source checkout
cargo run --bin cameodb

# CameoDB starts on http://localhost:9480 by default

Distributed Architecture Overview

CameoDB is designed as a distributed, shared-nothing cluster:

  • Per-node storage is handled by the server crate with actors (NodeOrchestrator, MicroshardActor) on top of redb + Tantivy.
  • Routing & clustering use a ClusterCoordinator actor with a consistent hash ring and libp2p Kademlia DHT.
  • Remote execution is powered by Kameo remote actors over a custom libp2p swarm (TCP/QUIC/Noise/Yamux, no mDNS).
  • Scatter–gather search and multi-node writes are implemented via a RouterActor that fans out to peers and aggregates results.
  • Event-driven metadata: cluster state transitions and persistence are triggered purely by actor messages, with no background polling or timeouts.
  • State reconciliation: on boot, nodes compare the expected cluster topology from snapshots against actual peer reports, logging discrepancies and converging on the distributed reality.
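The consistent hash ring used for routing and clustering can be illustrated with a minimal sketch. This is not CameoDB's actual ConsistentRing; the node labels, replica count, and use of the standard library's DefaultHasher are illustrative assumptions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Minimal consistent hash ring: virtual nodes mapped onto a u64 keyspace.
struct Ring {
    vnodes: BTreeMap<u64, String>, // hash point -> node label
}

impl Ring {
    fn new(nodes: &[&str], replicas: u32) -> Self {
        let mut vnodes = BTreeMap::new();
        for node in nodes {
            // Each physical node gets `replicas` points on the ring,
            // which smooths out key distribution across nodes.
            for r in 0..replicas {
                vnodes.insert(hash(&format!("{node}:{r}")), node.to_string());
            }
        }
        Ring { vnodes }
    }

    /// Owner of a routing key: the first vnode clockwise from the key's hash.
    fn owner(&self, key: &str) -> &str {
        let h = hash(key);
        self.vnodes
            .range(h..)
            .next()
            .or_else(|| self.vnodes.iter().next()) // wrap around the ring
            .map(|(_, node)| node.as_str())
            .unwrap()
    }
}

fn hash<T: Hash + ?Sized>(t: &T) -> u64 {
    let mut s = DefaultHasher::new();
    t.hash(&mut s);
    s.finish()
}
```

Adding or removing a node moves only the keys adjacent to its vnodes, which is what makes rebalancing cheap when the cluster topology changes.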

Connection Pool & Cache Invalidation

The RemotePeerPool eliminates repeated swarm registry/DHT lookups on every remote operation:

                    ┌───────────────────────────────────┐
                    │          RemotePeerPool           │
                    │  RwLock<HashMap<(Uuid, Channel),  │
                    │         RemoteActorRef>>          │
                    ├───────────────────────────────────┤
                    │ get_orchestrator(node, channel)   │──→ cache hit: clone ref
                    │ get_coordinator(node)             │──→ cache miss: lookup + cache
                    │ invalidate_peer(node)             │──→ evict all refs for node
                    │ invalidate_all()                  │──→ full cache clear
                    └───────────────────────────────────┘
                                    ▲
                                    │ invalidate_peer()
                    ┌───────────────┴───────────────┐
                    │      ClusterCoordinator       │
                    │  handle(PeerLost { node_id }) │
                    └───────────────────────────────┘
                                    ▲
                                    │ swarm event
                              Peer disconnected
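The caching pattern above can be sketched in a few lines of Rust. Everything here is simplified: NodeId, Channel, and ActorRef are stand-ins for the real Uuid, channel enum, and RemoteActorRef types, and the expensive swarm/DHT lookup is replaced by a placeholder string.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

type NodeId = u64;      // stand-in for the real Uuid
type Channel = u8;      // stand-in for the real channel enum
type ActorRef = String; // stand-in for a clonable RemoteActorRef

struct PeerPool {
    refs: RwLock<HashMap<(NodeId, Channel), ActorRef>>,
}

impl PeerPool {
    fn new() -> Self {
        PeerPool { refs: RwLock::new(HashMap::new()) }
    }

    /// Cache hit: clone the ref under a shared read lock.
    /// Cache miss: perform the (expensive) lookup, then insert under a write lock.
    fn get_orchestrator(&self, node: NodeId, chan: Channel) -> ActorRef {
        if let Some(r) = self.refs.read().unwrap().get(&(node, chan)) {
            return r.clone(); // fast path, no exclusive locking
        }
        let fresh = format!("actor@{node}/{chan}"); // placeholder for swarm/DHT lookup
        self.refs.write().unwrap().insert((node, chan), fresh.clone());
        fresh
    }

    /// Evict every cached ref for a node, e.g. on a PeerLost event.
    fn invalidate_peer(&self, node: NodeId) {
        self.refs.write().unwrap().retain(|(n, _), _| *n != node);
    }
}
```

The read-mostly RwLock layout means steady-state remote operations only ever take the shared lock; the write lock is needed only on the first lookup per peer or on invalidation.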

Operation Routing Workflows

Every client request follows the same top-level path: HTTP handler → RouterActor → ClusterCoordinator routing decision → execute. The routing decision determines whether the operation runs locally, is forwarded to a single remote node (unicast), or is fanned out to all nodes (broadcast).

Routing Decision Logic

                         ┌──────────────────────┐
                         │  ClusterCoordinator  │
                         │  RouteOperation msg  │
                         └─────────┬────────────┘
                                   │
                         routing_key present?
                           ┌───────┴─────────┐
                          YES                NO
                           │                 │
                    Hash ring lookup    RoutingDecision::
                           │              Broadcast
                    owner == local?
                     ┌─────┴─────┐
                    YES          NO
                     │            │
              RoutingDecision   RoutingDecision::Remote
                ::Local         { node_id, peer_addr }

  • Local: The owning shard lives on this node. Execute directly.
  • Remote: The owning shard lives on another node. Forward via cached RemoteActorRef.
  • Broadcast: No routing key (e.g. search). Fan out to local + all known peers, merge results.
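The three-way decision above can be sketched as a small Rust function. The names and signature are illustrative, not CameoDB's actual API; the hash-ring lookup is assumed to be available as a closure.

```rust
/// Simplified mirror of the three routing outcomes described above.
#[derive(Debug, PartialEq)]
enum RoutingDecision {
    Local,
    Remote { node_id: u64 },
    Broadcast,
}

/// No routing key -> Broadcast; otherwise the hash-ring owner decides Local vs Remote.
fn route(
    routing_key: Option<&str>,
    local_node: u64,
    owner_of: impl Fn(&str) -> u64, // stand-in for the consistent hash ring lookup
) -> RoutingDecision {
    match routing_key {
        None => RoutingDecision::Broadcast,
        Some(key) => {
            let owner = owner_of(key);
            if owner == local_node {
                RoutingDecision::Local
            } else {
                RoutingDecision::Remote { node_id: owner }
            }
        }
    }
}
```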

Read (Search) Workflow

Searches have no routing key, so they always broadcast to gather results from all nodes.

HTTP POST /api/{index}/search
  │
  ▼
RouterActor::route_and_handle(routing_key=None)
  │
  │ RoutingDecision::Broadcast
  │
  ├── LOCAL ──→ Worker Pool (or actor mailbox fallback)
  │              └── OrchestratorEngine::orch_search()
  │                    └── Fan out to all local MicroshardActors
  │                          └── spawn_blocking { store.search() }
  │
  └── REMOTE (per peer, up to fanout_limit) ──→ try_remote()
        │
        ▼
      RemotePeerPool::get_orchestrator(node_id)    ◄── cache hit: O(1)
        ├── RwLock read → HashMap lookup           ◄── cache miss: swarm lookup, then cached
        │
        ▼
      remote_ref.ask(&ClientOp::Search)
        │
        ▼
      Remote node executes same local search path
        │
        ▼
  ┌────────────────────────────────────────────┐
  │  Merge: bounded score-aware top-K merge,   │
  │  then truncate to the requested limit      │
  └────────────────────────────────────────────┘
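The final merge step can be sketched as a bounded top-K merge over per-shard hit lists. This is an illustrative Rust sketch, not CameoDB's implementation; integer scores and u64 doc ids are simplifying assumptions.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Score-aware top-K merge: keep at most `limit` hits across all shard result
/// lists, using a min-heap so each insertion costs O(log limit).
fn merge_top_k(shard_results: Vec<Vec<(u64, i64)>>, limit: usize) -> Vec<(u64, i64)> {
    // Heap of (score, doc_id); Reverse turns the max-heap into a min-heap,
    // so the root is always the weakest hit currently retained.
    let mut heap: BinaryHeap<Reverse<(i64, u64)>> = BinaryHeap::with_capacity(limit + 1);
    for hits in shard_results {
        for (doc, score) in hits {
            heap.push(Reverse((score, doc)));
            if heap.len() > limit {
                heap.pop(); // evict the current lowest-scoring hit
            }
        }
    }
    // Collect the survivors and order them best-first.
    let mut out: Vec<(u64, i64)> = heap.into_iter().map(|Reverse((s, d))| (d, s)).collect();
    out.sort_by(|a, b| b.1.cmp(&a.1));
    out
}
```

Because the heap never holds more than `limit` entries, memory stays bounded regardless of how many shards or peers contribute results.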

Bulk Write Workflow

Bulk writes are the most complex path: documents are routed individually, then grouped by owning node for batched forwarding.

HTTP POST /api/{index}/_bulk
  │
  ▼
RouterActor::route_and_handle(routing_hint=first_doc.id)
  │
  │ Routed to one node (usually local for the first doc)
  ▼
NodeOrchestrator::orch_bulk_write(index, docs[])
  │
  ├── 1. Schema Resolution
  │      └── Fingerprint cache → shard fallback
  │
  ├── 2. Staged Schema Validation
  │      └── Parallel Rayon validation + sequential evolution
  │
  ├── 3. Per-Document Routing (spawn_blocking + Rayon par_iter)
  │      └── For each doc: hash(routing_key) → ConsistentRing → target shard
  │
  ├── 4. Separate Local vs Remote
  │      ├── shard in self.shards → local_docs
  │      └── shard owned by other node → remote_docs (grouped by node_id)
  │
  ├── 5. Phase 3.1: Parallel Local Shard Processing
  │      └── Per-shard MicroshardActor::write_batch()
  │            └── writer_thread → redb WAL + Tantivy index
  │
  └── 6. Phase 3.2: Parallel Remote Forwarding (futures::join_all)
        │
        for each (node_id, docs_for_remote):
          │
          ▼
        NodeOrchestrator::forward_bulk_to_remote()
          │
          ▼
        RemotePeerPool::get_orchestrator(node_id)    ◄── cached lookup
          │
          ▼
        remote_ref.ask(&ClientOp::BulkWrite)
          │
          ▼
        Remote node runs orch_bulk_write() (recursive, same path)
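Step 4 above, separating local from remote documents, amounts to a partition keyed by owning node. A hedged sketch, with doc ids simplified to String and node ids to u64 (CameoDB's actual types differ):

```rust
use std::collections::HashMap;

/// Split routed docs into local work and per-node remote batches, so each
/// peer receives exactly one forwarded bulk request for its group.
fn partition_docs(
    routed: Vec<(String, u64)>, // (doc_id, owning node), already resolved via the ring
    local_node: u64,
) -> (Vec<String>, HashMap<u64, Vec<String>>) {
    let mut local_docs = Vec::new();
    let mut remote_docs: HashMap<u64, Vec<String>> = HashMap::new();
    for (doc, owner) in routed {
        if owner == local_node {
            local_docs.push(doc);
        } else {
            // Group by owning node so forwarding stays batched, not per-doc.
            remote_docs.entry(owner).or_default().push(doc);
        }
    }
    (local_docs, remote_docs)
}
```

Grouping before forwarding is what keeps the remote phase to one request per peer rather than one per document.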

HTTP API Reference

CameoDB provides a comprehensive REST API for document management, search, and system administration.

Search Operations

POST /api/{index}/search

Search documents within an index with relevance scoring. Returns a single JSON payload.

curl -s -X POST http://localhost:9480/api/books/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "science fiction space",
    "limit": 10
  }'

Return fields list: you can ask CameoDB to return only a subset of document fields by either:
  1. Supplying an explicit list in the payload: "fields": ["title", "author", "year"]
  2. Embedding a return clause at the end of the query: "query": "space opera return title,author"
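The embedded return clause can be peeled off a query with a simple string split. This sketch is illustrative only, not CameoDB's parser, and assumes the clause, when present, is introduced by the literal word "return" followed by a comma-separated field list at the end of the query.

```rust
/// Split a trailing `return f1,f2` clause off a query string.
/// Returns the remaining query text and the requested field list, if any.
fn split_return_clause(query: &str) -> (String, Option<Vec<String>>) {
    if let Some((head, fields)) = query.rsplit_once(" return ") {
        let fields: Vec<String> = fields
            .split(',')
            .map(|f| f.trim().to_string())
            .filter(|f| !f.is_empty())
            .collect();
        if !fields.is_empty() {
            return (head.trim().to_string(), Some(fields));
        }
    }
    // No clause found: the whole input is the query.
    (query.trim().to_string(), None)
}
```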

POST /api/{index}/search/stream

Get search results as a real-time stream (NDJSON) for large result sets.

curl -s -X POST http://localhost:9480/api/books/search/stream \
  -H "Content-Type: application/json" \
  -d '{"query": "fantasy adventure"}' \
  --no-buffer

Document Operations

PUT /api/{index}/document

Insert or update a single document.

curl -s -X PUT http://localhost:9480/api/books/document \
  -H "Content-Type: application/json" \
  -d '{
    "id": "book_001",
    "routing_key": "book_001",
    "doc": {
      "title": "The Rust Programming Language",
      "author": "Steve Klabnik",
      "publication_year": 2018,
      "genres": ["Programming", "Technical"]
    }
  }'

POST /api/{index}/_bulk

Insert or update multiple documents in a single atomic operation.

curl -s -X POST http://localhost:9480/api/books/_bulk \
  -H "Content-Type: application/json" \
  -d '[
    {
      "id": "book_002",
      "doc": {
        "title": "Clean Code",
        "author": "Robert C. Martin"
      }
    }
  ]'

POST /api/{index}/document/stream

Insert or update multiple documents using NDJSON streaming for large datasets.

cat << 'EOF' | curl -s -X POST http://localhost:9480/api/books/document/stream \
  -H "Content-Type: application/json" \
  --data-binary @-
{"id": "book_002", "doc": {"title": "Clean Code", "author": "Robert C. Martin", "genres": ["Programming"]}}
{"id": "book_003", "doc": {"title": "Design Patterns", "author": "Gang of Four", "genres": ["Programming", "Software Engineering"]}}
EOF

Index Management & System

GET /api/{index}/_config

Retrieve current schema.

curl -s http://localhost:9480/api/books/_config

DELETE /api/{index}

Permanently delete an index.

curl -s -X DELETE http://localhost:9480/api/books

GET /_indexes

List all available indexes.

curl -s http://localhost:9480/_indexes

GET /_cluster/health

Cluster health check.

curl -s http://localhost:9480/_cluster/health

Configuration Options

CameoDB configuration via cameodb.toml mirrors the runtime struct layout:

[node]
label = "cameo-node-01"
zone = "default"

[network.http]
bind_address = "0.0.0.0"
port = 9480
request_timeout_secs = 30
max_body_size_mb = 200
cors_allowed_origins = ["*"]

[network.cluster]
enabled = true
bind_address = "0.0.0.0"
port = 9580
cluster_name = "cameodb-cluster"
seed_nodes = []
# cluster_nodes = ["/ip4/10.0.1.5/tcp/9580"] # Optional validation list

[storage]
data_paths = ["./data/cameodb"]
disk_usage_threshold_percent = 90
wal_sync = true
wal_segment_size_mb = 64
default_batch_size = 1000
num_shards_init = 4
max_shards_per_node = 8

[search]
indexer_memory_min_mb = 32
indexer_memory_max_mb = 512
total_memory_limit_mb = 4096
memory_pressure_threshold_percent = 80
search_threads = 8
enable_streaming_search = true
max_concurrent_shard_searches = 32
max_concurrent_remote_searches = 8
enable_early_termination = true
supervisor_timeout_secs = 5
default_search_limit = 10

  • node provides human-friendly identity fields (label, zone).
  • network separates HTTP and cluster transport while clarifying bind_address.
  • storage centralizes shard configuration plus disk thresholds.
  • search exposes indexer memory budgets, streaming search settings, concurrency caps, supervisor timeout, and default_search_limit.

Docker Deployment

CameoDB provides configurations for both single-node and multi-node cluster deployments using Docker Compose.

1. Single-Node

Ideal for local development. Uses docker-compose.yml.

mkdir -p data/cameodb
docker-compose -f docker/docker-compose.yml up -d

Access: http://localhost:9480

2. Multi-Node Cluster

Runs a 3-node cluster with NGINX load balancer.

mkdir -p data/cameodb/node{1,2,3}
docker-compose -f docker/docker-compose-cluster.yml up -d

Load Balanced: http://localhost:9480

Direct: ports 9481, 9482, 9483

Docker Run vs Compose Equivalent

Docker Run Flag                         Docker Compose Equivalent
-p 9480:9480 -p 9580:9580               ports: ["9480:9480", "9580:9580"]
-v $(pwd)/data/cameodb:/data/cameodb    volumes: ["../data/cameodb:/data/cameodb"]
-e RUST_LOG=info                        environment: ["RUST_LOG=info"]
--restart unless-stopped                restart: unless-stopped
--user 65532:65532                      user: "65532:65532"

RPM / DEB Package Building

CameoDB supports building RPM and DEB packages for x86_64 Linux distributions using cargo-zigbuild or Docker for cross-compilation.

Automated Build Script (Recommended for CI/CD)

This script handles both RPM and DEB package generation in one run with persistent caching.

chmod +x build-dist.sh
./build-dist.sh

Manual RPM Generation (cargo-zigbuild)

cargo install cargo-zigbuild cargo-generate-rpm

RUSTFLAGS="-C target-feature=+crt-static -C relocation-model=pie -C relro-level=full -C link-arg=-pie -C link-arg=-static" \
cargo zigbuild --release --target x86_64-unknown-linux-musl --no-default-features

cargo generate-rpm -p crates/server --target x86_64-unknown-linux-musl --auto-req disabled \
  -o target/x86_64-unknown-linux-musl/release/cameodb-0.2.2-1.x86_64.rpm \
  --set-metadata 'package.name="cameodb"'