Query Syntax
Master the CameoDB search syntax. Our query parser supports complex boolean logic, strict phrase matching, wildcards, and deep JSON traversal.
Pseudo-grammar
The fundamental structure of a CameoDB query follows this grammar:
    query = '(' query ')'
          | query operator query
          | unary_operator query
          | query query
          | clause
    operator = 'AND' | 'OR'
    unary_operator = 'NOT' | '-'
    clause = field_name ':' field_clause
           | defaultable_clause
           | '*'
    field_clause = term | term_prefix | term_set | phrase | phrase_prefix | range | '*'
    defaultable_clause = term | term_prefix | term_set | phrase | phrase_prefix
Writing Queries & Escaping
Special Characters
Some characters need to be escaped in non-quoted terms because they are syntactically significant. The special reserved characters are:
If these characters appear in your query terms, prefix them with a backslash \. In quoted terms, only the quote character itself ( ' or " ) needs escaping.
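The two escaping rules above can be sketched as a pair of small helpers. This is only an illustration: the function names are hypothetical, and `RESERVED` is a placeholder set chosen for the demo, not CameoDB's authoritative list of reserved characters.

```python
def escape_unquoted(term: str, reserved: str) -> str:
    """Backslash-escape every reserved character in a non-quoted term."""
    return "".join("\\" + ch if ch in reserved else ch for ch in term)


def quote_term(term: str, quote: str = '"') -> str:
    """Wrap a term in quotes, escaping only the quote character itself."""
    return quote + term.replace(quote, "\\" + quote) + quote


# ASSUMPTION: placeholder set; substitute the reserved characters for your deployment.
RESERVED = ':()"\''

print(escape_unquoted("a:b", RESERVED))  # a\:b
print(quote_term('say "hi"'))            # "say \"hi\""
```

Quoting is usually the simpler choice when a term contains several reserved characters, since only the quote character then needs attention.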
Structured Data Constraints
- Datetime Requirements: Defaults to strict RFC3339 format (e.g., 1970-01-01T00:00:00Z), but highly configurable in the schema mapping.
- IP Addresses: Provide as IPv4 or IPv6. CIDR notation is not supported natively, but standard inclusive/exclusive range queries work perfectly for IP sweeps.
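Because CIDR notation is not accepted directly, a client can translate a CIDR block into its first and last address and query that range instead. The sketch below uses Python's standard `ipaddress` module. The helper names are hypothetical, and the `field:[low TO high]` spelling of an inclusive range is an assumption that should be checked against your CameoDB version.

```python
import ipaddress


def cidr_bounds(cidr: str) -> tuple[str, str]:
    """Return the first and last address of a CIDR block as strings."""
    # strict=False tolerates host bits being set, e.g. 192.168.1.17/28.
    network = ipaddress.ip_network(cidr, strict=False)
    return str(network.network_address), str(network.broadcast_address)


def ip_range_clause(field: str, cidr: str) -> str:
    """Build an inclusive range clause covering the whole CIDR block."""
    # ASSUMPTION: inclusive ranges are written as field:[low TO high].
    low, high = cidr_bounds(cidr)
    return f"{field}:[{low} TO {high}]"


print(cidr_bounds("10.0.0.0/24"))  # ('10.0.0.0', '10.0.0.255')
```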
Data Types & Aliases
CameoDB supports all native Tantivy field types, plus convenient language-specific aliases for broader compatibility when ingesting schemas from other SQL or Python systems.
Core Raw Types (11 Supported)
| Alias (Provided in Schema) | Maps To Core Type | Source Language |
|---|---|---|
| float, double, decimal | f64 | Python, SQL |
| integer, int, number, signed | i64 | Python, JavaScript |
| unsigned, uint | u64 | C/C++, Rust |
| bool | boolean | Python, JavaScript |
| datetime, timestamp | date | SQL, Python |
| binary, blob | bytes | SQL |
| object, document | json | JavaScript, Python |
| category, tag | facet | Common terminology |
Composite Types: Array types are fully supported for all raw types except objects. Simply wrap the type in your schema declaration, e.g., array<i64>.
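To make the alias table concrete, here is a minimal sketch of how a client-side tool might normalize declared types before submitting a schema. The mapping is copied from the table above. `resolve_type` is a hypothetical helper, not part of CameoDB, and passing unknown names through unchanged is an assumption.

```python
# Alias -> core type, transcribed from the table above.
ALIASES = {
    "float": "f64", "double": "f64", "decimal": "f64",
    "integer": "i64", "int": "i64", "number": "i64", "signed": "i64",
    "unsigned": "u64", "uint": "u64",
    "bool": "boolean",
    "datetime": "date", "timestamp": "date",
    "binary": "bytes", "blob": "bytes",
    "object": "json", "document": "json",
    "category": "facet", "tag": "facet",
}


def resolve_type(declared: str) -> str:
    """Map an alias (optionally wrapped in array<...>) to its core type."""
    declared = declared.strip()
    if declared.startswith("array<") and declared.endswith(">"):
        inner = resolve_type(declared[len("array<"):-1])
        return f"array<{inner}>"
    # ASSUMPTION: names not in the alias table are already core types.
    return ALIASES.get(declared, declared)


print(resolve_type("array<integer>"))  # array<i64>
```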
Document Mapping & Field Config
The document mapping defines exactly how fields are stored, tokenized, and indexed. CameoDB allows strict typing, but can optionally fall back to a dynamic, schemaless mode.
| Variable | Description | Default |
|---|---|---|
| field_mappings | Collection of explicit field definitions. | [] |
| mode | Handling of missing fields. "dynamic" enables schemaless use. | dynamic |
| store_source | Whether the original JSON document is stored directly in the index. | false |
| timestamp_field | Datetime field used for sharding documents in splits. | None |
| partition_key | Field used to route documents into different split partitions. | null |
| index_field_presence | Enables exists (field:*) queries for non-fast fields (CPU cost). | false |
Type Configuration Deep-Dive
Text Fields
Tailored for full-text search. Analyzed and tokenized before indexing.
    name: body
    type: text
    tokenizer: default
    record: position
    fieldnorms: true
    fast:
      normalizer: lowercase
- tokenizers: raw, default, en_stem, chinese_compatible.
- record: basic (DocIds), freq (+ frequency), position (required for phrase queries).
Numeric (i64, u64, f64)
Supports column-oriented storage (fast fields) for rapid range queries and aggregations.
    name: rating
    type: u64
    stored: true
    indexed: true
    fast: true
    coerce: true
- fast: True enables doc-value columnar storage.
- coerce: True automatically converts stringified numbers to integers/floats during ingestion.
Datetime
Internally stored as i64 nanoseconds in fast fields and seconds in the term dictionary.
    name: timestamp
    type: datetime
    input_formats:
      - rfc3339
      - unix_timestamp
    output_format: unix_timestamp_secs
    fast: true
    fast_precision: milliseconds
- input_formats: Array evaluated in order. Supports iso8601, rfc2822, unix_timestamp, or custom strptime (e.g. %Y %m %d).
- fast_precision: Determines truncation level for better compression (seconds to nanoseconds).
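The two options above can be illustrated with a rough client-side model: formats are tried in the order given and the first successful parse wins, and `fast_precision` drops everything below the chosen unit. This is a sketch of the idea, not CameoDB's implementation. The function names are hypothetical, only three format kinds are handled, `microseconds` is assumed to be a valid precision, and naive strptime results are assumed to be UTC.

```python
from datetime import datetime, timezone

# ASSUMPTION: the precisions between "seconds" and "nanoseconds" include these units.
NANOS_PER_UNIT = {
    "seconds": 1_000_000_000,
    "milliseconds": 1_000_000,
    "microseconds": 1_000,
    "nanoseconds": 1,
}


def parse_datetime(value: str, input_formats: list[str]) -> datetime:
    """Try each configured format in order and return the first success."""
    for fmt in input_formats:
        try:
            if fmt == "rfc3339":
                # fromisoformat() only accepts a trailing "Z" on Python 3.11+.
                return datetime.fromisoformat(value.replace("Z", "+00:00"))
            if fmt == "unix_timestamp":
                return datetime.fromtimestamp(float(value), tz=timezone.utc)
            # Anything else is treated as a strptime pattern, read as UTC.
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"{value!r} matches none of {input_formats}")


def truncate_nanos(nanos: int, fast_precision: str) -> int:
    """Zero out everything below the configured precision."""
    unit = NANOS_PER_UNIT[fast_precision]
    return nanos - nanos % unit


print(truncate_nanos(1_700_000_000_123_456_789, "milliseconds"))
# 1700000000123000000
```

Coarser precision leaves more trailing zeros in the stored values, which is what makes the column compress better.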
JSON & Objects
Accepts freeform JSON objects or explicit nested composite schemas.
    name: parameters
    type: json
    stored: true
    tokenizer: raw
    expand_dots: true
    fast:
      normalizer: lowercase
- expand_dots: If true, {"k8s.node.id": "x"} is automatically expanded internally to nested objects so it can be queried natively via k8s.node.id:x.
- Composite Objects: Can explicitly map nested fields using type: object and providing field_mappings.
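The effect of `expand_dots` can be modeled with a short recursive function that splits dotted keys into nested dictionaries. It is a sketch of the transformation described above, not the engine's code. The Python function `expand_dots` is a hypothetical helper, and conflicting keys (for example `"a"` and `"a.b"` in the same object) are not handled.

```python
def expand_dots(obj: dict) -> dict:
    """Expand keys such as "k8s.node.id" into nested dictionaries."""
    expanded: dict = {}
    for key, value in obj.items():
        if isinstance(value, dict):
            value = expand_dots(value)  # expand nested objects as well
        *parents, leaf = key.split(".")
        node = expanded
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return expanded


print(expand_dots({"k8s.node.id": "x"}))
# {'k8s': {'node': {'id': 'x'}}}
```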
Querying Nested JSON
Data stored deep inside nested structures (like json or object fields) can be addressed using dots as separators in the field name during queries.
For example, the document {"product": {"attributes": {"color": "red"}}} is matched by:

    product.attributes.color:red
If the literal keys of your object contain dots (e.g., {"k8s.component.name": "cameo"}) and you did not set expand_dots: true, CameoDB will try to match a nested structure by default. To remove this ambiguity and match the exact key, you must escape the dots in your query:

    k8s\.component\.name:cameo
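When queries are generated programmatically, it helps to build the field path from the individual JSON keys so that dots inside a key are escaped and dots between keys are not. The helpers below are hypothetical and assume the backslash is the escape character for a literal dot.

```python
def field_path(*keys: str) -> str:
    """Join JSON keys into a query field path, escaping literal dots."""
    # ASSUMPTION: a literal dot inside a key is escaped with a backslash.
    return ".".join(key.replace(".", "\\.") for key in keys)


def term_clause(value: str, *keys: str) -> str:
    """Build a field:value clause for a (possibly nested) JSON key."""
    return f"{field_path(*keys)}:{value}"


print(term_clause("red", "product", "attributes", "color"))
# product.attributes.color:red
print(term_clause("cameo", "k8s.component.name"))
# k8s\.component\.name:cameo
```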
Types of Query Clauses
Exact Term
Matches documents if the targeted field contains a token exactly equal to the provided term.
Term Prefix
Matches if the field contains a token starting with the value. Will match quickstart but not qui.
Term Set (IN)
Matches if the document contains any of the tokens provided in the array.
Performance Note: Similar to OR statements, but highly optimized for large lists of values.
Exact Phrase
Matches if the field contains the exact sequence of tokens. The field must have been configured with record: position.
The Slop Operator ~N
Allows matching a sequence with some distance/transposition.
Matches "looks good to me" (distance 1).
Phrase Prefix
Matches an exact sequence, but the last token acts as a prefix wildcard. Note: CameoDB trims phrase prefix results to the first 50 matching terms in storage order for performance.
Range Queries
Matches tokens between provided bounds. For text fields, ranges are lexicographic on UTF-8 bytes. For numbers/dates, they behave naturally. You must explicitly provide the field name.
Inclusive Bounds [ ]
Exclusive Bounds { }
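Lexicographic ordering on UTF-8 bytes often surprises people who store numbers in text fields. The snippet below models it with plain Python byte strings. `in_range` is a hypothetical helper that treats both bounds as inclusive or both as exclusive.

```python
def in_range(token: str, low: str, high: str, inclusive: bool = True) -> bool:
    """Compare a token against bounds using UTF-8 byte order."""
    t, lo, hi = (s.encode("utf-8") for s in (token, low, high))
    return lo <= t <= hi if inclusive else lo < t < hi


# Byte order, not numeric order: "10" sorts before "2".
print(sorted(["9", "10", "2"], key=lambda s: s.encode("utf-8")))  # ['10', '2', '9']
print(in_range("10", "1", "2"))                   # True
print(in_range("2", "1", "2", inclusive=False))   # False
```

If numeric ordering is what you need, index the value as a numeric type rather than as text.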
Field Exists
Matches any document where the specified field is set/populated.
Match All
Returns every document in the index. Cannot be prefixed with a field name.
Building Queries & Operators
Implicitly, if no operator is provided between clauses, AND is assumed.
AND
Matches only if both sides evaluate to true.
OR
Matches if either (or both) sides evaluate to true.
NOT
Negates the clause. -term is identical to NOT term.
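For generated queries, spelling out every operator and parenthesizing every group avoids relying on the implicit AND. The helper names below (`and_`, `or_`, `not_`) are hypothetical. They only concatenate strings using the AND, OR, and NOT keywords and the parentheses from the grammar.

```python
def and_(*clauses: str) -> str:
    """Combine clauses so that all of them must match."""
    return "(" + " AND ".join(clauses) + ")"


def or_(*clauses: str) -> str:
    """Combine clauses so that at least one must match."""
    return "(" + " OR ".join(clauses) + ")"


def not_(clause: str) -> str:
    """Negate a single clause or an already parenthesized group."""
    return f"NOT {clause}"


query = and_(or_("field1:one", "field1:two"), not_("field2:three"))
print(query)  # ((field1:one OR field1:two) AND NOT field2:three)
```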
Grouping & Precedence
Parentheses are used to force the order of evaluation. Without parentheses, NOT takes precedence over everything, and AND takes precedence over OR.
a AND b OR c is interpreted as (a AND b) OR c.
(field1:one OR field1:two) AND field2:three
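Python's own `not`, `and`, and `or` follow the same precedence order described here, so a small truth-table check can show exactly when the placement of parentheses changes the result. This is an analogy using Python booleans, not a call into the CameoDB parser.

```python
from itertools import product


def implicit(a: bool, b: bool, c: bool) -> bool:
    # No parentheses: AND binds tighter than OR, i.e. (a AND b) OR c.
    return a and b or c


def or_first(a: bool, b: bool, c: bool) -> bool:
    # Parentheses force the OR to be evaluated first.
    return a and (b or c)


differing = [
    values
    for values in product([False, True], repeat=3)
    if implicit(*values) != or_first(*values)
]
print(differing)  # [(False, False, True), (False, True, True)]
```

The two groupings disagree only when a is false and c is true, which is exactly the case where a stray OR can silently widen a result set.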
Field Validation Rules
CameoDB strictly validates index field names based on the following regex:
In plain language:
- It needs to have at least one character.
- It can only contain uppercase and lowercase ASCII letters [a-zA-Z], digits [0-9], dots ., hyphens -, underscores _, slashes /, at @ and dollar $ signs.
- It must not start with a dot or a digit.
- It must be different from CameoDB's reserved mapping names: _source, _dynamic, _field_presence.
Because the . character is used for JSON nested object traversal during queries, it is highly recommended to avoid using dots directly in your base field names to prevent having to escape them on every search.
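The plain-language rules can be turned into a client-side pre-check. The pattern below is reconstructed from those rules only and is not CameoDB's actual regex. In particular, it assumes no maximum length and allows any permitted character other than a dot or digit in first position. `is_valid_field_name` is a hypothetical helper.

```python
import re

RESERVED_NAMES = {"_source", "_dynamic", "_field_presence"}

# ASSUMPTION: reconstructed from the plain-language rules, not the official pattern.
# First character: any allowed character except a dot or a digit.
FIELD_NAME = re.compile(r"[A-Za-z_\-/@$][A-Za-z0-9._\-/@$]*")


def is_valid_field_name(name: str) -> bool:
    """Check a field name against the rules listed above."""
    return FIELD_NAME.fullmatch(name) is not None and name not in RESERVED_NAMES


for candidate in ["user_id", "k8s.node", "1abc", ".hidden", "_source", ""]:
    print(repr(candidate), is_valid_field_name(candidate))
```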