documentScorer

The documentScorer:* components compute scores for documents according to different criteria. Combined with the documents:scored stage, they let you select the top-scoring documents by the criterion of your choice.

You can use the following document​Scorer:​* components in your analysis requests:

document​Scorer:​by​Document​Neighbors

For each input document, computes the document's neighbors (similar documents) and aggregates the neighbors' weights to compute the score of the input document.

document​Scorer:​by​Values

For each input document, counts the values collected by the value collector you provide.

document​Scorer:​by​Vector​Similarity

Scores documents by their similarity to the vector you provide.


document​Scorer:​reference

References a document​Scorer:​* component defined in the request or in the project's default components.


document​Scorer:​by​Document​Neighbors

For each input document, computes the document's neighbors (similar documents) and aggregates the neighbors' weights to compute the score of the input document.

{
  "type": "documentScorer:byDocumentNeighbors",
  "documentNeighbors": {
    "type": "documentNeighbors:reference",
    "auto": true
  },
  "threads": "auto",
  "weightAggregation": "COUNT"
}

You can use the document​Neighbors:​by​Query component to generate document neighbors based on search queries specific to each input document.

document​Neighbors

Type
documentNeighbors
Default
{
  "type": "documentNeighbors:reference",
  "auto": true
}
Required
no

The component that generates the neighbors (similar documents) of each input document.

limit

Type
limit
Required
no

The maximum number of neighbors to request for each input document.

threads

Type
threads
Default
auto
Required
no

Controls the number of threads Lingo4G uses to compute document neighbors.

weight​Aggregation

Type
weightAggregation
Default
"COUNT"
Required
no

The aggregation function Lingo4G uses to combine the weights of the input document's neighbors into the document's score.

See the weightAggregation documentation for the available options.
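
As an illustration of how these properties combine, the following sketch caps the neighbor list at 10 documents per input document and keeps the default COUNT aggregation. The value 10 is an arbitrary example, and the sketch assumes that limit accepts a plain integer. The documentNeighbors property is omitted, so its default reference applies.

```json
{
  "type": "documentScorer:byDocumentNeighbors",
  "limit": 10,
  "threads": "auto",
  "weightAggregation": "COUNT"
}
```

Under these assumptions, an input document's score would presumably be the number of neighbors found for it, which is at most 10.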

document​Scorer:​by​Values

For each input document, counts the values collected by the value collector you provide. The number of values becomes the document's score.

{
  "type": "documentScorer:byValues",
  "threads": "auto",
  "unique": false,
  "valueCollector": null
}

threads

Type
threads
Default
auto
Required
no

The number of processing threads to use.

unique

Type
boolean
Default
false
Required
no

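If true, the scorer counts only the unique values returned by the collector. If false, every returned value counts, including duplicates. For example, if the collector returns the values a, b and a for a document, the score is 2 when unique is true and 3 when it is false.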

value​Collector

Type
valueCollector
Default
null
Required
yes

The value collector that supplies the values to count. The default is null, but the property is required, so you must set it explicitly.
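
A minimal sketch of a scorer that counts distinct values. The valueCollector shown here is an assumption: it presumes that a valueCollector:reference component exists and accepts auto, mirroring the documentNeighbors:reference and vector:reference defaults used elsewhere on this page. Substitute the value collector your request actually defines.

```json
{
  "type": "documentScorer:byValues",
  "unique": true,
  "valueCollector": {
    "type": "valueCollector:reference",
    "auto": true
  }
}
```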

document​Scorer:​by​Vector​Similarity

Scores documents by their similarity to the vector you provide.

{
  "type": "documentScorer:byVectorSimilarity",
  "vector": {
    "type": "vector:reference",
    "auto": true
  },
  "vectors": {
    "type": "vectors:reference",
    "auto": true
  }
}

vector

Type
vector
Default
{
  "type": "vector:reference",
  "auto": true
}
Required
no

The vector against which to score the documents.

vectors

Type
vectors
Default
{
  "type": "vectors:reference",
  "auto": true
}
Required
no

Vectors corresponding to the documents you want to score.

Consumers of document​Scorer:​*

The following stages and components take document​Scorer:​* as input:

Stage or component    Property
documents:scored      scorer
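
To show how a scorer plugs into its consumer, here is a minimal sketch of a documents:scored stage whose scorer property holds a documentScorer:byDocumentNeighbors component with the default settings listed earlier. Only type and scorer are taken from this page. Any other properties of documents:scored are left out and would fall back to whatever defaults the stage defines.

```json
{
  "type": "documents:scored",
  "scorer": {
    "type": "documentScorer:byDocumentNeighbors",
    "documentNeighbors": {
      "type": "documentNeighbors:reference",
      "auto": true
    },
    "threads": "auto",
    "weightAggregation": "COUNT"
  }
}
```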