documentLabels

The document​Labels stage retrieves labels contained in each of the documents you provide on input. Use the output of this stage for display purposes. To collect or aggregate labels from multiple documents, use labels:​from​Documents instead.

Use document​Labels stage results only for presentation purposes.

If you need to collect an aggregate list of labels occurring in a set of documents, use the labels:​from​Documents stage.

document​Labels

Selects and returns the best-ranking labels for each document.

{
  "type": "documentLabels",
  "documents": {
    "type": "documents:reference",
    "auto": true
  },
  "labelCollector": {
    "type": "labelCollector:topFromFeatureFields",
    "fields": {
      "type": "featureFields:reference",
      "auto": true
    },
    "labelFilter": {
      "type": "labelFilter:reference",
      "auto": true
    },
    "labelListFilter": {
      "type": "labelListFilter:truncatedPhrases"
    },
    "minTf": 0,
    "minTfMass": 1,
    "tieResolution": "AUTO"
  },
  "limit": "unlimited",
  "maxLabels": 10,
  "start": 0
}

For example, the following request selects the top 3 documents matching the query photon and retrieves up to 3 most frequent labels contained in each document:

{
  "stages": {
    "documents": {
      "type": "documents:byQuery",
      "query": {
        "type": "query:string",
        "query": "photon"
      },
      "limit": 3
    },
    "documentLabels": {
      "type": "documentLabels",
      "maxLabels": 3
    }
  }
}

The response for the above request contains an array with entries referencing each document and listing top-scoring labels in that document:

"documentLabels": {
  "documents": [
    {
      "id": 188201,
      "labels": [
        {
          "label": "photon",
          "weight": 16
        },
        {
          "label": "photon-jet",
          "weight": 5
        },
        {
          "label": "e⁺e",
          "weight": 5
        }
      ]
    },
    {
      "id": 62168,
      "labels": [
        {
          "label": "photon-photon",
          "weight": 3
        },
        {
          "label": "photon-proton",
          "weight": 3
        }
      ]
    },
    {
      "id": 252264,
      "labels": [
        {
          "label": "photon",
          "weight": 4
        },
        {
          "label": "virtual",
          "weight": 4
        }
      ]
    }
  ]
}

documents

Type
documents
Default
{
  "type": "documents:reference",
  "auto": true
}
Required
no

A mandatory reference to any documents:* component or stage, providing documents for which labels should be retrieved.

label​Collector

Type
labelCollector
Default
{
  "type": "labelCollector:topFromFeatureFields",
  "labelFilter": {
    "type": "labelFilter:reference",
    "auto": true
  },
  "labelListFilter": {
    "type": "labelListFilter:truncatedPhrases"
  },
  "fields": {
    "type": "featureFields:reference",
    "auto": true
  },
  "minTf": 0,
  "minTfMass": 1,
  "tieResolution": "AUTO"
}
Required
no

A labelCollector:* component used to collect and score labels in each document.

limit

Type
limit
Default
unlimited
Required
no

The maximum number of documents to return labels for.

The value must be an integer >= 0 or the string unlimited, in which case the stage will return labels for all documents returned by the documents component.

max​Labels

Type
integer
Default
10
Constraints
value >= 0
Required
no

Maximum number of labels to return (per document). The actual number of labels may exceed this limit if the tail of the label list has ranking score ties: in this case all labels with the same score will be returned.

start

Type
integer
Default
0
Constraints
value >= 0
Required
no

If greater than zero, skips over the initial number of documents returned by the documents component.