CAMEODB
Language Reference

Query Syntax

Master the CameoDB search syntax. Our query parser supports complex boolean logic, strict phrase matching, wildcards, and deep JSON traversal.

Pseudo-grammar

The fundamental structure of a CameoDB query follows this grammar:

query = '(' query ')'
      | query operator query
      | unary_operator query
      | query query
      | clause

operator = 'AND' | 'OR'
unary_operator = 'NOT' | '-'

clause = field_name ':' field_clause
       | defaultable_clause
       | '*'

field_clause = term | term_prefix | term_set | phrase | phrase_prefix | range | '*'
defaultable_clause = term | term_prefix | term_set | phrase | phrase_prefix

Writing Queries & Escaping

Special Characters

Some characters need to be escaped in non-quoted terms because they are syntactically significant. The special reserved characters are:

+ ^ ` : { } " [ ] ( ) ~ ! \ * SPACE

If these characters appear in your query terms, prefix them with a backslash \. In quoted terms, only the quote character itself ( ' or " ) needs escaping.
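The escaping rule above can be sketched as a small helper. This is only an illustration: escape_term and quote_term are hypothetical names, not part of any CameoDB client, and the reserved set is copied from the list above.

```python
# Reserved characters from the list above; the trailing " " stands for SPACE.
RESERVED = set('+^`:{}"[]()~!\\* ')


def escape_term(term: str) -> str:
    """Backslash-escape every reserved character in an unquoted term."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in term)


def quote_term(term: str) -> str:
    """Wrap a term in double quotes, escaping only embedded double quotes."""
    return '"' + term.replace('"', '\\"') + '"'


print(escape_term("hello world"))  # hello\ world
print(quote_term('say "hi"'))      # "say \"hi\""
```

Quoting is usually the simpler choice when a term contains several reserved characters, since only the quote character itself then needs a backslash.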

Structured Data Constraints

  • Datetime Requirements

    Datetime values default to strict RFC3339 format (e.g., 1970-01-01T00:00:00Z). Other input formats can be configured in the schema mapping.

  • IP Addresses

    Provide addresses as IPv4 or IPv6. CIDR notation is not supported natively, but an inclusive or exclusive range query over the first and last address of a block covers the same IP sweep.
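Since CIDR notation is not accepted directly, a client can translate a block into an inclusive range before sending the query. The sketch below uses Python's standard ipaddress module. The helper name cidr_to_range_query is hypothetical, and only IPv4 is shown because IPv6 addresses contain colons, which would need escaping.

```python
import ipaddress


def cidr_to_range_query(field: str, cidr: str) -> str:
    """Turn an IPv4 CIDR block into an inclusive range query string."""
    # strict=False tolerates host bits being set, e.g. 10.1.2.3/16.
    net = ipaddress.ip_network(cidr, strict=False)
    first, last = net[0], net[-1]
    return f"{field}:[{first} TO {last}]"


print(cidr_to_range_query("ip", "127.0.0.0/24"))
# ip:[127.0.0.0 TO 127.0.0.255]
```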

Data Types & Aliases

CameoDB supports all native Tantivy field types, plus language-specific aliases that make it easier to reuse schemas written for SQL, Python, JavaScript, C/C++, or Rust systems.

Core Raw Types (11 Supported)

text string i64 u64 f64 date boolean bytes ip json facet

Alias (Provided in Schema)   | Maps To Core Type | Source Language
float, double, decimal       | f64               | Python, SQL
integer, int, number, signed | i64               | Python, JavaScript
unsigned, uint               | u64               | C/C++, Rust
bool                         | boolean           | Python, JavaScript
datetime, timestamp          | date              | SQL, Python
binary, blob                 | bytes             | SQL
object, document             | json              | JavaScript, Python
category, tag                | facet             | Common terminology

Composite Types: Array types are fully supported for all raw types except objects. Simply wrap the type in your schema declaration, e.g., array<i64>.
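As an illustration of how the alias table and the array<...> wrapper combine, here is a minimal resolver sketch. The function name resolve_type is hypothetical, and treating type names as case-sensitive is an assumption. The sketch also does not enforce the restriction on arrays of objects.

```python
CORE_TYPES = {"text", "string", "i64", "u64", "f64", "date",
              "boolean", "bytes", "ip", "json", "facet"}

# Alias -> core type, copied from the table above.
ALIASES = {
    "float": "f64", "double": "f64", "decimal": "f64",
    "integer": "i64", "int": "i64", "number": "i64", "signed": "i64",
    "unsigned": "u64", "uint": "u64",
    "bool": "boolean",
    "datetime": "date", "timestamp": "date",
    "binary": "bytes", "blob": "bytes",
    "object": "json", "document": "json",
    "category": "facet", "tag": "facet",
}


def resolve_type(declared: str) -> str:
    """Map a declared schema type (or alias) to its core type name."""
    declared = declared.strip()
    if declared.startswith("array<") and declared.endswith(">"):
        inner = declared[len("array<"):-1]
        return f"array<{resolve_type(inner)}>"
    if declared in CORE_TYPES:
        return declared
    if declared in ALIASES:
        return ALIASES[declared]
    raise ValueError(f"unknown type: {declared!r}")


print(resolve_type("array<double>"))  # array<f64>
```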

Document Mapping & Field Config

The document mapping defines exactly how fields are stored, tokenized, and indexed. CameoDB allows strict typing, but can optionally fall back to a dynamic, schemaless mode.

Variable             | Description                                                          | Default
field_mappings       | Collection of explicit field definitions.                            | []
mode                 | Handling of missing fields. "dynamic" enables schemaless use.        | dynamic
store_source         | Whether the original JSON document is stored directly in the index.  | false
timestamp_field      | Datetime field used for sharding documents in splits.                | None
partition_key        | Field used to route documents into different split partitions.       | null
index_field_presence | Enables exists (field:*) queries for non-fast fields (CPU cost).     | false

Type Configuration Deep-Dive

Text Fields

Tailored for full-text search. Analyzed and tokenized before indexing.

name: body
type: text
tokenizer: default
record: position
fieldnorms: true
fast:
  normalizer: lowercase
  • tokenizers: raw, default, en_stem, chinese_compatible.
  • record: basic (DocIds), freq (+ frequency), position (required for phrase queries).
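Because phrase queries depend on the record setting, it can help to check a text field definition before relying on them. The following is a rough sketch: supports_phrase_queries is a hypothetical helper, and the field definition is assumed to be available as a plain dict with explicit tokenizer and record keys.

```python
TOKENIZERS = {"raw", "default", "en_stem", "chinese_compatible"}
RECORD_LEVELS = {"basic", "freq", "position"}


def supports_phrase_queries(field_cfg: dict) -> bool:
    """Return True if a text field records positions (needed for phrases)."""
    if field_cfg["tokenizer"] not in TOKENIZERS:
        raise ValueError(f"unknown tokenizer: {field_cfg['tokenizer']!r}")
    if field_cfg["record"] not in RECORD_LEVELS:
        raise ValueError(f"unknown record level: {field_cfg['record']!r}")
    return field_cfg["record"] == "position"


body = {"name": "body", "type": "text", "tokenizer": "default", "record": "position"}
print(supports_phrase_queries(body))  # True
```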

Numeric (i64, u64, f64)

Supports column-oriented storage (fast fields) for rapid range queries and aggregations.

name: rating
type: u64
stored: true
indexed: true
fast: true
coerce: true
  • fast: True enables doc-value columnar storage.
  • coerce: True automatically converts stringified numbers to integers/floats during ingestion.
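To make the coerce option concrete, the sketch below mimics the kind of conversion it describes. It is an approximation written for illustration only. The exact rules CameoDB applies (for example to "4.0" in an integer field) are not documented here, and coerce_number is a hypothetical name.

```python
def coerce_number(value, field_type: str):
    """Convert a stringified number to the numeric type of the field."""
    if field_type not in ("i64", "u64", "f64"):
        raise ValueError(f"not a numeric type: {field_type!r}")
    if isinstance(value, str):
        # Assumption: integer fields accept only integer literals.
        value = float(value) if field_type == "f64" else int(value)
    if field_type == "u64" and value < 0:
        raise ValueError("u64 fields cannot hold negative values")
    return value


print(coerce_number("42", "u64"))   # 42
print(coerce_number("3.5", "f64"))  # 3.5
```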

Datetime

Internally stored as i64 nanoseconds in fast fields and seconds in the term dictionary.

name: timestamp
type: datetime
input_formats:
  - rfc3339
  - unix_timestamp
output_format: unix_timestamp_secs
fast: true
fast_precision: milliseconds
  • input_formats: Array of formats tried in order. Supports rfc3339, iso8601, rfc2822, unix_timestamp, or a custom strptime pattern (e.g. %Y %m %d).
  • fast_precision: Determines truncation level for better compression (seconds to nanoseconds).
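The interplay of input_formats (tried in order) and fast_precision (truncation) can be sketched as follows. This is an assumption-laden illustration: only the rfc3339 and unix_timestamp formats are handled, unix timestamps are taken to be seconds, the precision names between seconds and nanoseconds are assumed, and to_fast_value is a hypothetical name.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
NANOS_PER_UNIT = {"seconds": 10**9, "milliseconds": 10**6,
                  "microseconds": 10**3, "nanoseconds": 1}


def parse_rfc3339(value):
    if not isinstance(value, str):
        raise ValueError("rfc3339 expects a string")
    # Replace the "Z" suffix so older Python versions can parse it.
    dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        raise ValueError("rfc3339 requires a UTC offset")
    return dt


def parse_unix_timestamp(value):
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("unix_timestamp expects a number")
    return datetime.fromtimestamp(value, tz=timezone.utc)


PARSERS = {"rfc3339": parse_rfc3339, "unix_timestamp": parse_unix_timestamp}


def to_fast_value(value, input_formats, fast_precision):
    """Parse with the first matching format, return truncated i64 nanoseconds."""
    for name in input_formats:
        try:
            dt = PARSERS[name](value)
        except ValueError:
            continue  # try the next configured format
        delta = dt - EPOCH
        nanos = (delta.days * 86_400 + delta.seconds) * 10**9 + delta.microseconds * 1_000
        unit = NANOS_PER_UNIT[fast_precision]
        return nanos - nanos % unit
    raise ValueError(f"no input format matched {value!r}")


formats = ["rfc3339", "unix_timestamp"]
print(to_fast_value("1970-01-01T00:00:01.234567Z", formats, "milliseconds"))
# 1234000000
```

Truncating to a coarser precision makes neighbouring values identical more often, which is why it helps compression.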

JSON & Objects

Accepts freeform JSON objects or explicit nested composite schemas.

name: parameters
type: json
stored: true
tokenizer: raw
expand_dots: true
fast:
  normalizer: lowercase
  • expand_dots: If true, {"k8s.node.id": "x"} is automatically expanded internally to nested objects so it can be queried natively via k8s.node.id:x.
  • Composite Objects: Can explicitly map nested fields using type: object and providing field_mappings.
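The effect of expand_dots can be pictured with a short transformation. The function below is only a sketch of the idea, not CameoDB's implementation, and it does not define what happens when a dotted key collides with an existing nested key.

```python
def expand_dots(obj):
    """Expand dotted keys such as "k8s.node.id" into nested dicts."""
    if not isinstance(obj, dict):
        return obj
    out = {}
    for key, value in obj.items():
        *parents, leaf = key.split(".")
        node = out
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = expand_dots(value)
    return out


print(expand_dots({"k8s.node.id": "x"}))
# {'k8s': {'node': {'id': 'x'}}}
```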

Querying Nested JSON

Data stored deep inside nested structures (like json or object fields) can be addressed using dots as separators in the field name during queries.

For example, the document {"product": {"attributes": {"color": "red"}}} is matched by:

product.attributes.color:red
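One way to see which field paths a document exposes is to flatten it into path and value pairs. The helper below is a hypothetical client-side sketch, not part of CameoDB, and it assumes keys contain no literal dots.

```python
def flatten(doc: dict, prefix: str = ""):
    """Yield (dotted_path, value) pairs for every leaf of a nested dict."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


doc = {"product": {"attributes": {"color": "red"}}}
for path, value in flatten(doc):
    print(f"{path}:{value}")  # product.attributes.color:red
```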

Caution: Escaping Dots in Keys

If the literal keys of your object contain dots (e.g., {"k8s.component.name": "cameo"}) and you did not set expand_dots: true, CameoDB will try to match a nested structure by default. To remove this ambiguity and match the exact key, you must escape the dots in your query:

k8s\.component\.name:cameo
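
When field paths are built programmatically, the dot escaping can be applied per key segment so that nesting dots and literal dots never get mixed up. The sketch below uses a hypothetical field_path helper and handles only dots, not the other reserved characters.

```python
def field_path(keys) -> str:
    """Join key segments with '.', escaping literal dots inside each key."""
    return ".".join(key.replace(".", "\\.") for key in keys)


# One literal key that happens to contain dots:
print(field_path(["k8s.component.name"]) + ":cameo")
# k8s\.component\.name:cameo

# Three nested keys:
print(field_path(["product", "attributes", "color"]) + ":red")
# product.attributes.color:red
```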

Types of Query Clauses

Exact Term

Matches documents if the targeted field contains a token exactly equal to the provided term.

field:value

Term Prefix

Matches if the field contains a token that starts with the given value. For example, quick* matches the token quickstart but not the token qui.

field:quick*
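
The difference between an exact term and a term prefix can be modelled over the list of tokens a field produced. This is a conceptual sketch only. The real engine looks terms up in its index rather than scanning tokens, and the function names are hypothetical.

```python
def matches_term(tokens, term: str) -> bool:
    """Exact term: some token equals the term."""
    return term in tokens


def matches_prefix(tokens, prefix: str) -> bool:
    """Term prefix: some token starts with the prefix."""
    return any(token.startswith(prefix) for token in tokens)


print(matches_prefix(["quickstart", "guide"], "quick"))  # True
print(matches_prefix(["qui"], "quick"))                  # False
print(matches_term(["quickstart"], "quick"))             # False
```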

Term Set (IN)

Matches if the field contains any of the tokens listed between the brackets.

Performance Note: Similar to OR statements, but highly optimized for large lists of values.

field:IN [ab cd ef]
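
For long value lists, the query string is usually generated. The sketch below builds both the term set form and the equivalent OR chain so the two can be compared. The helper names are hypothetical, and the values are assumed to be already escaped.

```python
def term_set(field: str, values) -> str:
    """Build a term set (IN) clause from already-escaped values."""
    return f"{field}:IN [{' '.join(values)}]"


def or_chain(field: str, values) -> str:
    """Build the equivalent, but less optimized, OR chain."""
    return " OR ".join(f"{field}:{value}" for value in values)


values = ["ab", "cd", "ef"]
print(term_set("field", values))  # field:IN [ab cd ef]
print(or_chain("field", values))  # field:ab OR field:cd OR field:ef
```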

Exact Phrase

Matches if the field contains the exact sequence of tokens. The field must have been configured with record: position.

field:"looks good to me"

The Slop Operator ~N

Appending ~N to a phrase relaxes the match: the tokens may be separated or transposed by up to N positions in total.

"looks to me"~1

Matches "looks good to me" (distance 1).
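
A simplified model of slop is sketched below. It only handles the in-order case, where extra tokens sit between the phrase tokens, and does not model transpositions. It is meant to illustrate the distance idea and is not how CameoDB scores phrases internally.

```python
def matches_with_slop(tokens, phrase, slop: int = 0) -> bool:
    """In-order phrase match allowing up to `slop` skipped positions."""

    def search(idx, start, first):
        if idx == len(phrase):
            return True
        for pos in range(start, len(tokens)):
            if tokens[pos] != phrase[idx]:
                continue
            origin = pos if first is None else first
            # Positions skipped so far, relative to an exact phrase match.
            if (pos - origin) - idx > slop:
                break  # later positions only increase the gap
            if search(idx + 1, pos + 1, origin):
                return True
        return False

    return search(0, 0, None)


doc = "looks good to me".split()
print(matches_with_slop(doc, "looks to me".split(), slop=1))  # True
print(matches_with_slop(doc, "looks to me".split(), slop=0))  # False
```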

Phrase Prefix

Matches an exact sequence, but the last token acts as a prefix wildcard. Note: CameoDB trims phrase prefix results to the first 50 matching terms in storage order for performance.

field:"thanks for your contrib"*
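
The 50-term cap means a very broad prefix may silently miss documents. The sketch below models the expansion step under the assumption that candidate terms are visited in storage order. The name expand_prefix and the list-based term dictionary are illustrative only.

```python
from itertools import islice


def expand_prefix(terms_in_storage_order, prefix: str, limit: int = 50):
    """Return at most `limit` terms that start with the prefix."""
    matching = (t for t in terms_in_storage_order if t.startswith(prefix))
    return list(islice(matching, limit))


terms = [f"contrib{i:03d}" for i in range(80)] + ["other"]
expanded = expand_prefix(terms, "contrib")
print(len(expanded))  # 50
print(expanded[-1])   # contrib049
```

Making the last token longer narrows the expansion and keeps it under the cap.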

Range Queries

Matches tokens between provided bounds. For text fields, ranges are lexicographic on UTF-8 bytes. For numbers/dates, they behave naturally. You must explicitly provide the field name.

Inclusive Bounds [ ]

ip:[127.0.0.1 TO 127.0.0.50]
date:>=2024-01-01

Exclusive Bounds { }

ip:{127.0.0.1 TO 127.0.0.50}
date:>2024-01-01
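
The two bound styles, and the difference between natural and lexicographic ordering, can be checked with a few lines of Python. This sketch only models the comparison semantics described above. The in_range helper is hypothetical.

```python
import ipaddress


def in_range(value, low, high, inclusive: bool = True) -> bool:
    """[low TO high] when inclusive, {low TO high} when exclusive."""
    if inclusive:
        return low <= value <= high
    return low < value < high


ip = ipaddress.ip_address
low, high = ip("127.0.0.1"), ip("127.0.0.50")
print(in_range(ip("127.0.0.1"), low, high, inclusive=True))   # True
print(in_range(ip("127.0.0.1"), low, high, inclusive=False))  # False

# Text ranges compare UTF-8 bytes, so "10" sorts before "9".
print("10".encode("utf-8") < "9".encode("utf-8"))  # True
print(10 < 9)                                      # False
```

The last two lines show why numeric data should be mapped to a numeric type rather than text if range queries are expected to follow numeric order.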

Field Exists

Matches any document where the specified field is set/populated.

field:*

Match All

Returns every document in the index. Cannot be prefixed with a field name.

*

Building Queries & Operators

If no operator is given between two clauses, AND is assumed: a b is equivalent to a AND b.

AND

Matches only if both sides evaluate to true.

OR

Matches if either (or both) sides evaluate to true.

NOT or -

Negates the clause. -term is identical to NOT term.

Grouping & Precedence

Parentheses are used to force the order of evaluation. Without parentheses, NOT takes precedence over everything, and AND takes precedence over OR.

a AND b OR c is interpreted as (a AND b) OR c.

(field1:one OR field1:two) AND field2:three
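
The precedence rules (NOT first, then AND, then OR, with an implicit AND between adjacent clauses) can be illustrated with a toy recursive-descent parser. This is a teaching sketch, not CameoDB's parser. It splits on whitespace, so quoted phrases, ranges, and escaped spaces are not handled.

```python
import re


def tokenize(query: str):
    tokens = []
    for tok in re.findall(r"[()]|[^\s()]+", query):
        if tok.startswith("-"):
            tokens.append("NOT")  # -term is the same as NOT term
            tok = tok[1:]
        if tok:
            tokens.append(tok)
    return tokens


def parse(query: str):
    """Return a nested tuple tree such as ("OR", ("AND", "a", "b"), "c")."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        pos += 1
        return tokens[pos - 1]

    def parse_or():  # lowest precedence
        node = parse_and()
        while peek() == "OR":
            advance()
            node = ("OR", node, parse_and())
        return node

    def parse_and():  # explicit AND, or implicit AND between clauses
        node = parse_not()
        while peek() not in (None, "OR", ")"):
            if peek() == "AND":
                advance()
            node = ("AND", node, parse_not())
        return node

    def parse_not():  # highest precedence
        if peek() == "NOT":
            advance()
            return ("NOT", parse_not())
        tok = advance()
        if tok == "(":
            node = parse_or()
            if advance() != ")":
                raise ValueError("expected ')'")
            return node
        return tok

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {tokens[pos]!r}")
    return tree


print(parse("a AND b OR c"))  # ('OR', ('AND', 'a', 'b'), 'c')
print(parse("a -b"))          # ('AND', 'a', ('NOT', 'b'))
```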

Field Validation Rules

CameoDB strictly validates index field names based on the following regex:

^[@$_\-a-zA-Z][@$_/\.\-a-zA-Z0-9]{0,254}$

In plain language:

  • It needs to have at least one character and at most 255.
  • It can only contain uppercase and lowercase ASCII letters [a-zA-Z], digits [0-9], dots ., hyphens -, underscores _, slashes /, at @ and dollar $ signs.
  • It must not start with a dot, a digit, or a slash.
  • It must be different from CameoDB's reserved mapping names: _source, _dynamic, _field_presence.
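
The rules above can be checked client-side before a schema is submitted. The sketch below reuses the regex exactly as given and adds the reserved-name check. The function name is_valid_field_name is hypothetical.

```python
import re

FIELD_NAME = re.compile(r"^[@$_\-a-zA-Z][@$_/\.\-a-zA-Z0-9]{0,254}$")
RESERVED_NAMES = {"_source", "_dynamic", "_field_presence"}


def is_valid_field_name(name: str) -> bool:
    """Apply the field-name regex and reject reserved mapping names."""
    return FIELD_NAME.fullmatch(name) is not None and name not in RESERVED_NAMES


for candidate in ["@timestamp", "user.name", "1abc", ".hidden", "_source"]:
    print(candidate, is_valid_field_name(candidate))
# @timestamp True
# user.name True
# 1abc False
# .hidden False
# _source False
```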

Because the . character is used to traverse nested JSON objects in queries, avoid dots in your base field names. Otherwise, every search on such a field has to escape them.