Components

A component provides a piece of configuration for a stage or another component, such as the search query to execute or the list of document fields to use for label extraction.

To introduce the concept of components, let's have a look at the one-stage request we introduced at the start of this tutorial:

{
  "stages": {
    "documents": {
      "type": "documents:byQuery",
      "query": {
        "type": "query:string",
        "query": "photon"
      },
      "limit": 50
    }
  }
}

On closer examination, you'll notice that the query property contains a definition that looks like an in-lined stage: it has a type and a number of other properties. However, the query alone does not perform any text operations; it only provides the configuration required by the documents:byQuery stage. We call such configuration-bearing elements components.

Much like with stages, you can in-line components directly where they are required or use component references.

Component references

To use a component by reference, put the definition of the component in the components section of the request and then use an explicit or auto component reference wherever the component is required.

For the sake of example, the following request rewrites the document search request by extracting the query into the components section:

{
  "components": {
    "query": {
      "type": "query:string",
      "query": "photon"
    }
  },
  "stages": {
    "documents": {
      "type": "documents:byQuery",
      "query": {
        "type": "query:reference",
        "use": "query"
      },
      "limit": 50
    }
  }
}

In most cases, keeping query components in-line makes more sense because the same query is rarely required at multiple points of the request. On the other hand, components of other types, such as label filters or feature field lists, can benefit a lot from referencing, especially in combination with default component definitions.

Default components

Default components are the analysis components you define in the analysis_v2/components section of the project descriptor. Lingo4G makes the default components available for every request as if you defined them explicitly in the components section of the request.

To see default components in action, execute the following label extraction request in JSON Sandbox:

{
  "stages": {
    "documents": {
      "type": "documents:byQuery",
      "query": {
        "type": "query:string",
        "query": "photon"
      },
      "limit": 1000
    },
    "labels": {
      "type": "labels:fromDocuments"
    }
  }
}

Then, activate the diagram tab and enable the Show default components option:

Label extraction request diagram with default components shown.

The diagram shows two default components called fields and labelFilter. The fields component defines the list of document fields from which to extract labels. The labelFilter component defines which labels to filter out during label extraction.

Our request does not mention any of the components explicitly. To understand why the request works, let's have a look at the default value of the labelAggregator property of the labels:fromDocuments stage:

{
  "type": "labelAggregator:topWeight",
  "labelCollector": {
    "type": "labelCollector:topFromFeatureFields",
    "labelFilter": {
      "type": "labelFilter:reference",
      "auto": true
    },
    "fields": {
      "type": "featureFields:reference",
      "auto": true
    },
    "minWeight": 0,
    "minWeightMass": 1,
    "tieResolution": "AUTO"
  },
  "maxLabelsPerDocument": 10,
  "minAbsoluteDf": 1,
  "minRelativeDf": 0,
  "maxRelativeDf": 1,
  "tieResolution": "AUTO",
  "threads": "auto"
}

Notice that two properties of the labelCollector, labelFilter and fields, default to auto references. When you execute the request, Lingo4G resolves these references to the corresponding default components.

Default components and auto references as default property values are a powerful combination you can use to preconfigure all stages to use project-specific defaults, such as feature fields, content fields or label filters.

You can view the default component definitions either by looking at the analysis_v2/components section of the project descriptor or by inspecting the request object returned as part of the response JSON. The latter includes all stage and component properties, including the default ones you did not define in the request you submitted.

Shadowing default components

To use a different definition of a default component in a specific request, you can shadow the definition by explicitly defining a component with the same name in the components section of the request.

For example, assuming that the project descriptor defines a default component called labelFilter, you can disable label filtering by shadowing that component with one that accepts all labels:

{
  "components": {
    "labelFilter": {
      "type": "labelFilter:acceptAll"
    }
  }
}
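
The fragment above contains only the components section. As a sketch, a complete request that shadows labelFilter could combine it with the label extraction request used earlier in this section. This assumes the labels:fromDocuments stage keeps its default labelAggregator, so its auto reference picks up the shadowing definition instead of the project default:

```json
{
  "components": {
    "labelFilter": {
      "type": "labelFilter:acceptAll"
    }
  },
  "stages": {
    "documents": {
      "type": "documents:byQuery",
      "query": {
        "type": "query:string",
        "query": "photon"
      },
      "limit": 1000
    },
    "labels": {
      "type": "labels:fromDocuments"
    }
  }
}
```

Because the stage definitions stay untouched, any other auto reference to labelFilter in the same request would resolve to the accept-all definition as well.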

Overriding default components

To prevent Lingo4G from using a default component for a specific stage, override the corresponding property of the stage to replace the auto reference with an inline component definition or an explicit reference.

For example, to disable filtering during label extraction, override the labelFilter property of the label collector:

{
  "stages": {
    "documents": {
      "type": "documents:byQuery",
      "query": {
        "type": "query:string",
        "query": "photon"
      },
      "limit": 1000
    },
    "labels": {
      "type": "labels:fromDocuments",
      "labelAggregator": {
        "type": "labelAggregator:topWeight",
        "labelCollector": {
          "type": "labelCollector:topFromFeatureFields",
          "labelFilter": {
            "type": "labelFilter:acceptAll"
          }
        }
      }
    }
  }
}
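
The request above overrides the auto reference with an inline definition. The other option, an explicit reference, might look like the following sketch. The component name noFilter is hypothetical, and the use property on labelFilter:reference is an assumption made by analogy with the query:reference example earlier in this section:

```json
{
  "components": {
    "noFilter": {
      "type": "labelFilter:acceptAll"
    }
  },
  "stages": {
    "documents": {
      "type": "documents:byQuery",
      "query": {
        "type": "query:string",
        "query": "photon"
      },
      "limit": 1000
    },
    "labels": {
      "type": "labels:fromDocuments",
      "labelAggregator": {
        "type": "labelAggregator:topWeight",
        "labelCollector": {
          "type": "labelCollector:topFromFeatureFields",
          "labelFilter": {
            "type": "labelFilter:reference",
            "use": "noFilter"
          }
        }
      }
    }
  }
}
```

Since noFilter does not share a name with any default component, it shadows nothing: only the label collector that references it explicitly is affected.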

Note that if the request contains other stages with auto references, Lingo4G resolves those references using the default component. To change the definition of a default component across the whole request, use shadowing.