query

query:​* components specify lists of documents in a declarative way. Typically, you pass them to the documents:​by​Query stage to execute document searches or to document​Content to highlight search terms in document content.

You can use the following query:​* components in your analysis requests:

query:​all

Matches all documents in the index.

query:​complement

Negates the query you provide.

query:​composite

Composes a list of queries using the AND or OR operator.

query:​filter

Narrows down the matches of the query you provide to the set of documents also matched by the filter query.

query:​for​Document​Fields

Invokes the query​Builder on field values of one document you provide and returns the query.

query:​for​Field​Values

Matches documents containing any of the provided values in one or more content fields.

query:​for​Labels

Matches documents containing labels you provide.

query:fromDocuments

A query that matches the documents you provide.

query:fromQueryBuilder

Invokes the query​Builder on a constant set of inputs and returns the query.

query:​string

Parses text queries using the Lucene query parser of your choice.


query:​reference

References a query:​* component defined in the request or in the project's default components.


query:​all

A query matching all documents in the index.

{
  "type": "query:all"
}
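
As a minimal sketch, query:all can serve as the query of any stage that accepts a query:* component. The fragment below assumes the documents:sample stage and its limit property behave as in the sampling example under query:fromDocuments, and draws a random sample from the whole index. The stage name and the limit value are arbitrary.

```json
{
  "stages": {
    "sample": {
      "type": "documents:sample",
      "limit": 1000,
      "query": {
        "type": "query:all"
      }
    }
  }
}
```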

query:​complement

Negates the set of documents from the query you provide.

{
  "type": "query:complement",
  "query": null
}

query

Type
query
Default
null
Required
yes

The query to negate. Any documents not matching this query will be returned.
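
For illustration, the following fragment wraps a query:string component to select every document that does not match the search term. The term itself is an arbitrary placeholder.

```json
{
  "type": "query:complement",
  "query": {
    "type": "query:string",
    "query": "cats"
  }
}
```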

query:​composite

Composes a list of queries using the A​N​D or O​R operators.

{
  "type": "query:composite",
  "operator": "OR",
  "queries": []
}

Note that certain query component implementations (like query:​string) may offer built-in Boolean operations that are more efficient. This component should be used to combine documents from different query implementations.

operator

Type
string
Default
"OR"
Constraints
one of [OR, AND]
Required
no

Declares the way documents from queries are combined. The operator property supports the following values:

O​R

Produces the union of all unique documents from all queries.

A​N​D

Produces the intersection of all documents from all queries. A document must appear in all queries to appear in the output.

queries

Type
array of query
Default
[]
Required
no

A list of query:* components to compose.
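
The sketch below combines two different query implementations, which is the case this component is meant for: a query:string and a query:forLabels. With the AND operator, only documents matched by both queries are kept. The search term, field name, and text are borrowed from other examples on this page and serve only as placeholders.

```json
{
  "type": "query:composite",
  "operator": "AND",
  "queries": [
    {
      "type": "query:string",
      "query": "dogs"
    },
    {
      "type": "query:forLabels",
      "fields": {
        "type": "featureFields:simple",
        "fields": [
          "abstract$phrases"
        ]
      },
      "labels": {
        "type": "labels:fromText",
        "text": "yellow cats and blue dogs"
      }
    }
  ]
}
```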

query:​filter

Narrows down the matches of the query you provide to the set of documents also matched by the filter query.

{
  "type": "query:filter",
  "filter": null,
  "query": null
}

The query:filter component acts similarly to the query:composite component with the AND operator. The difference is that filter queries do not contribute to document scores.

Here is an example request using the query:​filter component and searching for occurrences of cats and dogs, where the document score is only computed for the hits on dogs.

{
  "comment": "query:filter component example (content and highlighting stages for demonstration)",
  "components": {
    "query": {
      "type": "query:filter",
      "query": {
        "type": "query:string",
        "query": "dogs"
      },
      "filter": {
        "type": "query:string",
        "query": "cats"
      }
    }
  },
  "stages": {
    "documents": {
      "type": "documents:byQuery",
      "query": {
        "type": "query:reference",
        "use": "query"
      },
      "limit": 10
    },
    "content": {
      "type": "documentContent",
      "fields": {
        "type": "contentFields:grouped",
        "groups": [
          {
            "fields": [
              "title", "abstract"
            ],
            "config": {
              "maxValues": 3,
              "maxValueLength": 160,
              "highlighting": {
                "enabled": true,
                "startMarker": "⁌%s⁍",
                "endMarker": "⁌\\%s⁍"
              }
            }
          }
        ]
      },
      "queries": {
        "q1": {
          "type": "query:reference",
          "use": "query"
        }
      }
    }
  }
}

filter

Type
query
Default
null
Required
yes

Any query:* component that acts as an AND (conjunctive) clause but does not contribute to scoring.

query

Type
query
Default
null
Required
yes

Any query:​* component reference which takes part in document scoring.

query:​for​Document​Fields

Invokes the query​Builder on field values of one document you provide and returns the query.

{
  "type": "query:forDocumentFields",
  "documents": {
    "type": "documents:reference",
    "auto": true
  },
  "queryBuilder": {
    "type": "queryBuilder:reference",
    "auto": true
  }
}

documents

Type
documents
Default
{
  "type": "documents:reference",
  "auto": true
}
Required
no

The document for which to build the query.

The selector must return exactly one document.

query​Builder

Type
queryBuilder
Default
{
  "type": "queryBuilder:reference",
  "auto": true
}
Required
no

The query builder to use to build the query.
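
One possible way to satisfy the one-document requirement is to cap a documents:byQuery selector at a single result, as sketched below. This assumes that the search matches at least one document and that the default queryBuilder reference resolves to a query builder defined elsewhere in the request or project. The search term is a placeholder.

```json
{
  "type": "query:forDocumentFields",
  "documents": {
    "type": "documents:byQuery",
    "query": {
      "type": "query:string",
      "query": "photon"
    },
    "limit": 1
  }
}
```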

query:​for​Field​Values

Matches documents containing any of the provided values in one or more content fields.

{
  "type": "query:forFieldValues",
  "fields": {
    "type": "contentFields:reference",
    "auto": true
  },
  "values": []
}

The typical use case for this type of query is selecting large numbers (thousands) of documents based on their identifiers or some other unique field values. An equivalent Boolean string query will be less efficient.

fields

Type
contentFields
Default
{
  "type": "contentFields:reference",
  "auto": true
}
Required
no

A reference to the contentFields:* component providing the set of field names to scan for the presence of values. At least one field is required.

values

Type
array of string
Default
[]
Required
no

An array of field values to match.

Note that a "field value" is actually the value stored in the inverted index. A field with an analyzer that tokenizes strings into multiple values (or otherwise manipulates them) will result in index values that differ from those passed on input. We recommend using this type of query for literal fields only.
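
As a hedged sketch, the fragment below selects documents by a handful of identifier values. The values are made up. The fields property is left out so that the default contentFields:reference applies, which assumes the reference resolves to a literal (non-tokenized) identifier field in your project.

```json
{
  "type": "query:forFieldValues",
  "values": [
    "doc-0001",
    "doc-0002",
    "doc-0003"
  ]
}
```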

query:​for​Labels

Matches documents containing labels from any labels:​* component you provide.

{
  "type": "query:forLabels",
  "fields": {
    "type": "featureFields:reference",
    "auto": true
  },
  "labels": {
    "type": "labels:reference",
    "auto": true
  },
  "minOrMatches": 1,
  "operator": "OR"
}

In this example request, we search for any documents that contain any existing labels present in an explicit snippet of text. Such a scenario can be useful for looking up documents that are similar to the provided text (a basic more-like-this functionality).

{
  "comment": "query:forLabels component example (content and highlighting stages for demonstration)",
  "components": {
    "query": {
      "type": "query:forLabels",
      "fields":{
        "type": "featureFields:simple",
        "fields": [
          "abstract$phrases"
        ]
      },
      "labels": {
        "type": "labels:fromText",
        "text": "yellow cats and blue dogs"
      },
      "operator": "OR",
      "minOrMatches": 2
    }
  },
  "stages": {
    "documents": {
      "type": "documents:byQuery",
      "query": {
        "type": "query:reference",
        "use": "query"
      },
      "limit": 10
    },
    "content": {
      "type": "documentContent",
      "fields": {
        "type": "contentFields:grouped",
        "groups": [
          {
            "fields": [
              "title", "abstract"
            ],
            "config": {
              "maxValues": 3,
              "maxValueLength": 160,
              "highlighting": {
                "enabled": true,
                "startMarker": "⁌%s⁍",
                "endMarker": "⁌\\%s⁍"
              }
            }
          }
        ]
      },
      "queries": {
        "q1": {
          "type": "query:reference",
          "use": "query"
        }
      }
    }
  }
}

fields

Type
featureFields
Default
{
  "type": "featureFields:reference",
  "auto": true
}
Required
no

An array of one or more feature fields.

labels

Type
labels
Default
{
  "type": "labels:reference",
  "auto": true
}
Required
no

The source of labels.

min​Or​Matches

Type
integer
Default
1
Constraints
value > 0
Required
no

Sets the minimum number of labels that must match in a document for it to be included in the result. This setting applies to O​R-type queries only (disjunction queries).

operator

Type
string
Default
"OR"
Constraints
one of [OR, AND]
Required
no

Declares the way labels should be composed:

O​R

Produces documents matching any of the labels.

A​N​D

Produces documents matching all the labels.

query:​from​Documents

Extracts the search query from the documents stage you provide.

{
  "type": "query:fromDocuments",
  "buildFromDocumentIds": false,
  "documents": {
    "type": "documents:reference",
    "auto": true
  }
}

If the input documents originate from a search query, such as query:string, this query becomes equal to that underlying search query. Otherwise, for example when the documents come from documents:byId or documents:embeddingNearestNeighbors, this query becomes a synthetic query matching exactly the ids of the input documents.

query:​from​Documents has two practical use cases:

  • Highlighting query occurrences in a union of document lists. The following request illustrates this use case:

    {
      "name": "Using query:fromDocuments for query occurrence highlighting",
      "comment": "The primary use case for query:fromDocuments is highlighting query occurrences in a union of documents.",
      "stages": {
        "documents1": {
          "type": "documents:byQuery",
          "query": {
            "type": "query:string",
            "query": "photon"
          }
        },
        "documents2": {
          "type": "documents:byQuery",
          "query": {
            "type": "query:string",
            "query": "electron"
          }
        },
        "union": {
          "type": "documents:composite",
          "selectors": [
            {
              "type": "documents:reference",
              "use": "documents1"
            },
            {
              "type": "documents:reference",
              "use": "documents2"
            }
          ],
          "operator": "OR"
        },
        "content": {
          "type": "documentContent",
          "limit": 10,
          "documents": {
            "type": "documents:reference",
            "use": "union"
          },
          "queries": {
            "q1": {
              "type": "query:fromDocuments",
              "documents": {
                "type": "documents:reference",
                "use": "documents1"
              }
            },
            "q2": {
              "type": "query:fromDocuments",
              "documents": {
                "type": "documents:reference",
                "use": "documents2"
              }
            }
          }
        }
      }
    }

    Using query:​from​Documents to highlight query occurrences in a union of multiple lists of documents.

    While query highlighting in the above request could be implemented by referencing the corresponding queries in both the documents:byQuery stages and the documentContent.queries map, this is not possible in general. In particular, the documents:rwmd stage generates a query that cannot be constructed in any other way, so query:fromDocuments is the only way to access that query for highlighting purposes. See the similar document retrieval tutorial for real-world examples.

  • Filtering by sampled document set. To improve the performance of certain requests, you can use the documents:sample stage to take a random sample of a set of documents and process only that sample rather than the whole set. For stages requiring a query on input, you can use the query:​from​Documents component to convert the random sample of documents into a query.

    The following example uses query:​from​Documents to create a consistent sample of documents falling within two overlapping time periods.

    {
      "name": "Using query:fromDocuments with documents:sample.",
      "components": {
        "rangeQuery0": {
          "type": "query:string",
          "query": "created:[2015-01-01 TO 2017-01-01]"
        },
        "rangeQuery1": {
          "type": "query:string",
          "query": "created:[2016-01-01 TO 2018-01-01]"
        }
      },
      "stages": {
        "sample": {
          "type": "documents:sample",
          "limit": 10000,
          "query": {
            "type": "query:composite",
            "queries": [
              {
                "type": "query:reference",
                "use": "rangeQuery0"
              },
              {
                "type": "query:reference",
                "use": "rangeQuery1"
              }
            ],
            "operator": "OR"
          }
        },
        "documents0": {
          "type": "documents:byQuery",
          "query": {
            "type": "query:filter",
            "query": {
              "type": "query:reference",
              "use": "rangeQuery0"
            },
            "filter": {
              "type": "query:fromDocuments",
              "documents": {
                "type": "documents:reference",
                "use": "sample"
              },
              "buildFromDocumentIds": true
            }
          }
        },
        "documents1": {
          "type": "documents:byQuery",
          "query": {
            "type": "query:filter",
            "query": {
              "type": "query:reference",
              "use": "rangeQuery1"
            },
            "filter": {
              "type": "query:fromDocuments",
              "documents": {
                "type": "documents:reference",
                "use": "sample"
              },
              "buildFromDocumentIds": true
            }
          }
        }
      }
    }

    Using query:​from​Documents for random sampling of documents.

    In the components section, the request defines two queries that determine the boundaries of two overlapping time periods. The sample stage samples 10k documents covering the union of the time periods. Finally, the documents0 and documents1 stages select the sample of documents for the two time periods, using query:fromDocuments in the query:filter.filter property. Note that the request sets the buildFromDocumentIds property to true in both filter queries. This causes Lingo4G to build queries matching only the documents selected at the sampling stage rather than passing on the original search query provided to the sample stage.

    Note that for overlapping time periods, sampling from each period individually can overrepresent certain periods. The above request avoids this problem by sampling only once, from the union of all time periods.

build​From​Document​Ids

Type
boolean
Default
false
Required
no

If true, builds a query that matches the input documents by internal identifiers. Otherwise, returns the original query used by the input documents stage.

documents

Type
documents
Default
{
  "type": "documents:reference",
  "auto": true
}
Required
no

The document selector from which to extract the query.

query:​from​Query​Builder

Invokes the query​Builder on a constant set of inputs and returns the query.

{
  "type": "query:fromQueryBuilder",
  "queryBuilder": {
    "type": "queryBuilder:reference",
    "auto": true
  }
}

The primary use case of this component is combining multiple user-provided variable values into a single more complex query or reusing the same user input to build multiple queries.

query​Builder

Type
queryBuilder
Default
{
  "type": "queryBuilder:reference",
  "auto": true
}
Required
no

The query builder to call.

This component invokes the query builder with an empty set of inputs, so the query builder you provide must use the input property of each query​Builder​Variable to

query:​string

Parses text queries using the Apache Lucene query parser of your choice.

{
  "type": "query:string",
  "query": "",
  "queryParser": {
    "type": "queryParser:project",
    "queryParserKey": ""
  }
}

Text queries provide a very powerful and flexible way of selecting a subset of documents matching the provided criteria. The type of query​Parser will determine how the query text is interpreted. The query parsers chapter lists all available query parsers and provides examples of their query syntax.

In the example request below, we search for all occurrences of the word cat preceding the word dog by no more than 15 word positions. Note the query:string component, which is defined in the components section and referenced from the documents:byQuery stage.

{
  "comment": "query:string component example (content and highlighting stages for demonstration)",
  "components": {
    "query": {
      "type": "query:string",
      "query": "fn:maxWidth(15 fn:ordered(cat dog))"
    }
  },
  "stages": {
    "documents": {
      "type": "documents:byQuery",
      "query": {
        "type": "query:reference",
        "use": "query"
      },
      "limit": 10
    },
    "content": {
      "type": "documentContent",
      "fields": {
        "type": "contentFields:grouped",
        "groups": [
          {
            "fields": [
              "title", "abstract"
            ],
            "config": {
              "maxValues": 3,
              "maxValueLength": 160,
              "highlighting": {
                "enabled": true,
                "startMarker": "⁌%s⁍",
                "endMarker": "⁌\\%s⁍"
              }
            }
          }
        ]
      },
      "queries": {
        "q1": {
          "type": "query:reference",
          "use": "query"
        }
      }
    }
  }
}

query

Type
string
Default
<empty string>
Required
no

The text query to pass to the query parser.

query​Parser

Type
queryParser
Default
{
  "type": "queryParser:project",
  "queryParserKey": ""
}
Required
no

The name of the query parser to use. If blank, the default query parser definition from the project descriptor is used.
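
To pick a specific parser, set queryParserKey explicitly. In the sketch below, "enhanced" is a hypothetical key used only for illustration. Substitute a query parser key that your project descriptor actually defines.

```json
{
  "type": "query:string",
  "query": "photon",
  "queryParser": {
    "type": "queryParser:project",
    "queryParserKey": "enhanced"
  }
}
```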

Consumers of query:​*

The following stages and components take query:​* as input:

Stage or component                       Property
debug:explain                            query
dictionary:queryTerms                    query
documentContent                          queries
documentPairs:duplicates                 query
documents:byQuery                        query
documents:embeddingNearestNeighbors      filterQuery
documents:sample                         query
documents:vectorFieldNearestNeighbors    filterQuery
query:complement                         query
query:composite                          queries
query:filter                             query, filter