values

The values:* stages return string values, typically document field values.

You can use the following values stages in your analysis requests:

values:fromDocumentField

For each document, retrieves one or more values of the document field of your choice.


values:reference

References the results of another values:* stage defined in the request.


values:fromDocumentField

Returns one or more field values for each document in the referenced documents set. The default structure of this stage is shown below.

{
  "type": "values:fromDocumentField",
  "documents": {
    "type": "documents:reference",
    "auto": true
  },
  "fieldName": null,
  "multipleValues": "REQUIRE_EXACTLY_ONE",
  "threads": "auto"
}

The output of this stage contains a list of string values (or arrays of string values). The output array's entries are index-aligned with the list of input documents.

For example, the following request provides a list of values from the (required) title field for documents matching the photon query.

{
  "stages": {
    "values": {
      "type": "values:fromDocumentField",
      "documents": {
        "type" : "documents:byQuery",
        "limit": 5,
        "query": {
          "type": "query:string",
          "query": "photon"
        }
      },
      "multipleValues": "REQUIRE_EXACTLY_ONE",
      "fieldName": "title"
    }
  }
}

The result of the above request, on the reference Arxiv index:

{
  "result" : {
    "values" : {
      "values" : [
        "Photons, Photon Jets and Dark Photons at 750 GeV and Beyond",
        "Final States in Photon-Photon and Photon-Proton Interactions",
        "Two-Photon Processes and Photon Structure",
        "Photon-photon dispersion of TeV gamma rays and its role for photon-ALP conversion",
        "Statistics of photon-subtracted and photon-added states"
      ]
    }
  }
}
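Because the returned values are index-aligned with the input documents, client code can pair them by position. The following Python sketch assumes the response JSON has already been parsed into a dictionary shaped like the result above. The `doc_ids` list and the `pair_values` helper are hypothetical names used only for illustration; they are not part of the API.

```python
from typing import Any


def pair_values(doc_ids: list[str], response: dict[str, Any],
                stage: str = "values") -> dict[str, Any]:
    """Pair each document id with the value found at the same index."""
    values = response["result"][stage]["values"]
    if len(values) != len(doc_ids):
        raise ValueError("values are expected to be index-aligned with documents")
    return dict(zip(doc_ids, values))


# Parsed response, abbreviated from the example result above.
response = {
    "result": {
        "values": {
            "values": [
                "Two-Photon Processes and Photon Structure",
                "Statistics of photon-subtracted and photon-added states",
            ]
        }
    }
}

# Hypothetical identifiers, listed in the order the documents were selected.
doc_ids = ["doc-a", "doc-b"]
paired = pair_values(doc_ids, response)
```

The length check is a defensive assumption of this sketch: a mismatch would mean the identifiers and the values come from different document selections.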

Note that for multi-valued fields, the response will contain a list of values for each document. Here is a similar request, collecting all values of the category field (which is multi-valued):

{
  "stages": {
    "values": {
      "type": "values:fromDocumentField",
      "documents": {
        "type" : "documents:byQuery",
        "limit": 5,
        "query": {
          "type": "query:string",
          "query": "photon"
        }
      },
      "multipleValues": "COLLECT_ALL",
      "fieldName": "category"
    }
  }
}

The result of the above request, on the reference Arxiv index:

{
  "result" : {
    "values" : {
      "values" : [
        [
          "hep-ph",
          "hep-ex"
        ],
        [
          "hep-ex"
        ],
        [
          "hep-ph"
        ],
        [
          "astro-ph.HE",
          "hep-ph"
        ],
        [
          "quant-ph"
        ]
      ]
    }
  }
}
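A nested result like the one above is often flattened on the client side, for example to count how frequently each category occurs. The sketch below is a minimal illustration in Python that copies the arrays from the example result; the variable names are chosen here and carry no special meaning.

```python
from collections import Counter
from itertools import chain

# Per-document category arrays, copied from the example result above.
category_values = [
    ["hep-ph", "hep-ex"],
    ["hep-ex"],
    ["hep-ph"],
    ["astro-ph.HE", "hep-ph"],
    ["quant-ph"],
]

# Flatten the array of arrays and count occurrences of each category.
counts = Counter(chain.from_iterable(category_values))
top = counts.most_common(2)  # [('hep-ph', 3), ('hep-ex', 2)]
```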

documents

Type
documents
Default
{
  "type": "documents:reference",
  "auto": true
}
Required
no

The source list of documents:* from which field values should be retrieved.

fieldName

Type
string
Default
null
Required
yes

The field name whose values should be retrieved.

multipleValues

Type
string
Default
"REQUIRE_EXACTLY_ONE"
Constraints
one of [COLLECT_FIRST, REQUIRE_EXACTLY_ONE, COLLECT_ALL]
Required
no

Specifies how many values are expected to be retrieved from each document.

The multipleValues property supports the following values:

COLLECT_FIRST

Retrieve and return only the first value of a field. This option can be applied to single-valued and multi-valued fields that always have at least one value. The response will contain an array of strings.

REQUIRE_EXACTLY_ONE

Retrieve and return the value of a single-valued field. This option can be applied to fields that always have exactly one value. The response will contain an array of strings.

COLLECT_ALL

Retrieve and return all values of a field. This option can be applied to all fields. The response will contain an array of arrays of values. An empty array of values is returned for documents which have no associated value for the requested field.
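Since COLLECT_FIRST and REQUIRE_EXACTLY_ONE produce an array of strings while COLLECT_ALL produces an array of arrays, client code that handles all three modes may want a single shape. The following Python sketch normalizes either shape to a list of lists; `as_lists` is a hypothetical helper name and not part of the API.

```python
from typing import Union

# One entry per document: a single string or a list of strings.
FieldValues = Union[str, list[str]]


def as_lists(values: list[FieldValues]) -> list[list[str]]:
    """Wrap single string values so every document maps to a list of strings."""
    return [[v] if isinstance(v, str) else list(v) for v in values]


# Array-of-strings shape (COLLECT_FIRST or REQUIRE_EXACTLY_ONE).
single = as_lists(["hep-ph", "quant-ph"])

# Array-of-arrays shape (COLLECT_ALL), including a document with no value.
multi = as_lists([["hep-ph", "hep-ex"], []])
```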

threads

Type
threads
Default
auto
Required
no

The number of CPU threads used for computing aggregations. Leave at the default value.
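The properties described above can also be assembled programmatically before a request is sent. The Python sketch below builds the stage as a plain dictionary and checks the two constraints stated in this section: fieldName is required and multipleValues must be one of the three listed values. The `from_document_field` helper is a hypothetical name; how the serialized request is submitted is outside the scope of this sketch.

```python
import json
from typing import Any, Optional

MULTIPLE_VALUES = ("COLLECT_FIRST", "REQUIRE_EXACTLY_ONE", "COLLECT_ALL")


def from_document_field(field_name: str,
                        documents: Optional[dict[str, Any]] = None,
                        multiple_values: str = "REQUIRE_EXACTLY_ONE") -> dict[str, Any]:
    """Build a values:fromDocumentField stage as a dictionary."""
    if not field_name:
        raise ValueError("fieldName is required")
    if multiple_values not in MULTIPLE_VALUES:
        raise ValueError(f"multipleValues must be one of {MULTIPLE_VALUES}")
    stage: dict[str, Any] = {
        "type": "values:fromDocumentField",
        "fieldName": field_name,
        "multipleValues": multiple_values,
    }
    # When documents is omitted, the stage falls back to its documented default.
    if documents is not None:
        stage["documents"] = documents
    return stage


# The document selector is copied from the example requests in this section.
photon_documents = {
    "type": "documents:byQuery",
    "limit": 5,
    "query": {"type": "query:string", "query": "photon"},
}

request = {
    "stages": {
        "values": from_document_field("category", photon_documents, "COLLECT_ALL")
    }
}
body = json.dumps(request, indent=2)
```

Validating on the client is a design choice of this sketch; it only surfaces obvious mistakes before the request leaves the application.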

Consumers of values:*

The following stages and components take values:* as input:

Stage or component: Property
clusters:byValues
  • values
documents:contrastScore
  • documentTimestamps
  • contextTimestamps