values

The values:* stages return string values, typically document field values.

You can use the following values stages in your analysis requests:

values:fromDocumentField

For each document, retrieves one or more values of the document field of your choice.


values:reference

References the results of another values:* stage defined in the request.


values:fromDocumentField

Returns one or more field values for each document in the referenced document set. The default structure of this stage is shown below.

{
  "type": "values:fromDocumentField",
  "documents": {
    "type": "documents:reference",
    "auto": true
  },
  "fieldName": null,
  "multipleValues": "REQUIRE_EXACTLY_ONE",
  "threads": "auto"
}

The output of this stage contains a list of string values (or a list of arrays of string values). The entries of the output array are index-aligned with the list of input documents.

For example, the following request provides a list of values from the (required) title field for documents matching the photon query.

{
  "stages": {
    "values": {
      "type": "values:fromDocumentField",
      "documents": {
        "type" : "documents:byQuery",
        "limit": 5,
        "query": {
          "type": "query:string",
          "query": "photon"
        }
      },
      "multipleValues": "REQUIRE_EXACTLY_ONE",
      "fieldName": "title"
    }
  }
}

The result of the above request, run on the reference arXiv index:

{
  "result" : {
    "values" : {
      "values" : [
        "Jets in Photon-Photon Collisions",
        "Studying 750 GeV Di-photon Resonance at Photon-Photon Collider",
        "Two-photon interference of temporally separated photons",
        "Photon-Photon Interactions via Rydberg Blockade",
        "Photon-Photon and Photon-Hadron Physics at Relativistic Heavy Ion Colliders"
      ]
    }
  }
}
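
Because the output is index-aligned with the input documents, each value can be paired with its document by position. The following is a minimal Python sketch of that pairing, not part of the API. It parses the response shown above; the doc_ids list is hypothetical and stands in for whatever document identifiers your application obtained, in the same order, for the same document set.

```python
import json

# Response body copied from the example above.
response_text = """
{
  "result": {
    "values": {
      "values": [
        "Jets in Photon-Photon Collisions",
        "Studying 750 GeV Di-photon Resonance at Photon-Photon Collider",
        "Two-photon interference of temporally separated photons",
        "Photon-Photon Interactions via Rydberg Blockade",
        "Photon-Photon and Photon-Hadron Physics at Relativistic Heavy Ion Colliders"
      ]
    }
  }
}
"""

# Hypothetical identifiers (assumption): they stand in for the documents
# selected by the request, listed in the same order as the values.
doc_ids = [101, 102, 103, 104, 105]

# The first "values" key is the stage name used in the request,
# the second is the output key of the stage.
titles = json.loads(response_text)["result"]["values"]["values"]

# Index alignment only holds if both lists describe the same documents.
if len(titles) != len(doc_ids):
    raise ValueError("values and documents are not index-aligned")

title_by_doc = dict(zip(doc_ids, titles))
for doc_id, title in title_by_doc.items():
    print(doc_id, title)
```

The length check is a cheap guard: if the two lists come from different document sets, pairing by position would silently produce wrong results.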

Note that for multi-valued fields, the response will contain a list of values for each document. Here is a similar request, collecting all values of the category field (which is multi-valued):

{
  "stages": {
    "values": {
      "type": "values:fromDocumentField",
      "documents": {
        "type" : "documents:byQuery",
        "limit": 5,
        "query": {
          "type": "query:string",
          "query": "photon"
        }
      },
      "multipleValues": "COLLECT_ALL",
      "fieldName": "category"
    }
  }
}

The result of the above request, run on the reference arXiv index:

{
  "result" : {
    "values" : {
      "values" : [
        [
          "hep-ph"
        ],
        [
          "hep-ph",
          "hep-ex"
        ],
        [
          "quant-ph"
        ],
        [
          "quant-ph"
        ],
        [
          "hep-ph"
        ]
      ]
    }
  }
}
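
A nested result like the one above usually needs flattening before it can be summarized. As an illustration only, the following Python sketch counts how often each category occurs across the five documents. The input list is copied from the response above; the variable names are arbitrary.

```python
from collections import Counter
from itertools import chain

# One inner list per document, copied from the COLLECT_ALL response above.
categories_per_doc = [
    ["hep-ph"],
    ["hep-ph", "hep-ex"],
    ["quant-ph"],
    ["quant-ph"],
    ["hep-ph"],
]

# Flatten the per-document lists and count each category.
category_counts = Counter(chain.from_iterable(categories_per_doc))

for category, count in category_counts.most_common():
    print(category, count)
```

For this input, hep-ph occurs three times, quant-ph twice and hep-ex once.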

documents

Type: documents
Default:
{
  "type": "documents:reference",
  "auto": true
}
Required: no

The documents:* stage that supplies the documents from which field values should be retrieved.

fieldName

Type: string
Default: null
Required: yes

The name of the field whose values should be retrieved.

multipleValues

Type: string
Default: "REQUIRE_EXACTLY_ONE"
Constraints: one of [COLLECT_FIRST, REQUIRE_EXACTLY_ONE, COLLECT_ALL]
Required: no

Specifies how many values the stage should expect and retrieve for each document.

The multipleValues property supports the following values:

COLLECT_FIRST

Retrieve and return only the first value of a field. This option can be applied to single-valued and multi-valued fields that always have at least one value. The response will contain an array of strings.

REQUIRE_EXACTLY_ONE

Retrieve and return the value of a single-valued field. This option can be applied to fields that always have exactly one value. The response will contain an array of strings.

COLLECT_ALL

Retrieve and return all values of a field. This option can be applied to all fields. The response will contain an array of arrays of values. An empty array is returned for documents that have no value for the requested field.
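
The three options can be compared side by side with a small local model. The Python sketch below is illustrative only and is not the stage's implementation: select_values is a hypothetical helper, and raising ValueError when a document violates the option's precondition is an assumption, since this section does not state how the stage itself reacts in that case.

```python
def select_values(field_values, mode="REQUIRE_EXACTLY_ONE"):
    """Model what one document contributes to the output for a given mode.

    field_values is the list of values one document has for the field.
    """
    if mode == "COLLECT_ALL":
        # Works for any field; documents without values yield an empty list.
        return list(field_values)
    if mode == "COLLECT_FIRST":
        # Assumption: a missing value is treated as an error in this sketch.
        if not field_values:
            raise ValueError("COLLECT_FIRST needs at least one value")
        return field_values[0]
    if mode == "REQUIRE_EXACTLY_ONE":
        # Assumption: zero or several values are treated as an error here.
        if len(field_values) != 1:
            raise ValueError("REQUIRE_EXACTLY_ONE needs exactly one value")
        return field_values[0]
    raise ValueError(f"unknown multipleValues option: {mode}")


# Example: a document with two categories, and a document with none.
print(select_values(["hep-ph", "hep-ex"], "COLLECT_FIRST"))
print(select_values(["hep-ph", "hep-ex"], "COLLECT_ALL"))
print(select_values([], "COLLECT_ALL"))
```

The sketch also shows why the output shape differs between options: COLLECT_FIRST and REQUIRE_EXACTLY_ONE contribute a single string per document, while COLLECT_ALL contributes a list per document.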

threads

Type: threads
Default: auto
Required: no

The number of CPU threads used for computing aggregations. Leave at the default value.

Consumers of values:*

The following stages and components take values:* as input:

Stage or component: Property
clusters:byValues
  • values
documents:contrastScore
  • documentTimestamps
  • contextTimestamps