Clustering Workbench

You can use Clustering Workbench to quickly try and tune Lingo3G clustering on your data. If Workbench suits your needs, you can use it as a text mining and research tool.

Installation and running

If you don't mind the limitations of the online version of Workbench, installation is not required. To run Workbench on your own machine:

  1. Install Lingo3G on your machine.

  2. Start the Lingo3G Document Clustering Server (DCS) application located in the dcs/ folder of your Lingo3G installation.

    • On Windows, run the dcs.cmd script.
    • On Linux and Mac, run the dcs script.

    If the DCS starts successfully, you should see a terminal window with messages similar to the following:

    16:59:55: DCS context initialized [algorithms: [Lingo3G], templates: [frontend-default]]
    16:59:55: Service started on port 8080.
    16:59:55: The following contexts are available:
      http://localhost:8080/          DCS Root
      http://localhost:8080/doc       Documentation
      http://localhost:8080/frontend  End-user apps
      http://localhost:8080/javadoc   Java API Javadoc
      http://localhost:8080/service   REST API

  3. Open http://localhost:8080/frontend/#/workbench in a modern browser.
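
If you script the startup, you can read the context addresses from the DCS messages rather than hard-coding them. The following is a minimal sketch in Python, assuming the messages keep the layout shown in step 2. The function names parse_contexts and workbench_url are illustrative and not part of Lingo3G.

```python
import re

# Startup messages copied from the example in step 2.
SAMPLE_LOG = """\
16:59:55: DCS context initialized [algorithms: [Lingo3G], templates: [frontend-default]]
16:59:55: Service started on port 8080.
16:59:55: The following contexts are available:
  http://localhost:8080/          DCS Root
  http://localhost:8080/doc       Documentation
  http://localhost:8080/frontend  End-user apps
  http://localhost:8080/javadoc   Java API Javadoc
  http://localhost:8080/service   REST API
"""


def parse_contexts(log_text):
    """Return a {description: url} dict with one entry per context line."""
    contexts = {}
    for line in log_text.splitlines():
        # A context line is a URL, two or more spaces, then a description.
        match = re.match(r"\s*(https?://\S+)\s{2,}(.+?)\s*$", line)
        if match:
            url, description = match.groups()
            contexts[description] = url
    return contexts


def workbench_url(log_text):
    """Build the Workbench address from the 'End-user apps' context."""
    return parse_contexts(log_text)["End-user apps"] + "/#/workbench"


if __name__ == "__main__":
    print(workbench_url(SAMPLE_LOG))
```

With the sample messages, the script prints the same address as step 3. If your DCS runs on a different port, the parsed address reflects that port.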

User interface highlights

If you'd like to learn how to use Workbench to cluster your own data, see the Trying Lingo3G section. For some Clustering Workbench tips and tricks, see below.

Data source choice
Figure: Data source choice in Lingo3G Clustering Workbench (light and dark themes).

Use the data source choice section to choose the data for clustering and to run the clustering process.

The Cluster button turns blue when you modify any parameter in the parameters panel. This signals that you need to re-run clustering for the parameter changes to take effect.

Clusters view
Figure: Clusters view of Lingo3G Workbench (light and dark themes).

Use the list, treemap and pie-chart tabs to choose how clusters are presented.

Use the icons to invoke additional tools for the current view, such as treemap interaction help, export of the visualization to JPEG and visualization display settings.

Documents view
Figure: Documents view of Lingo3G Workbench (light and dark themes).

The documents view shows the documents belonging to the cluster you select. Press the icon in the top right corner to configure which document fields to show for each document.

Documents view configuration
Figure: Documents view configuration in Lingo3G Workbench (light and dark themes).

If the documents you submit for clustering contain multiple fields, you can use the documents view configuration to choose which fields to show.

For each field you can choose one of the following display roles:

title
Shows in bold at the top of the document. Works well for document titles and other short text fields.
body
Shows under the title fields. Works well for document bodies and other longer text fields. Workbench truncates body fields if they exceed the maximum number of characters per document that you configure.
id
Shows under the body fields. Use for document identifiers.
tag
Shows under the id fields. Use for short multi-valued document fields, such as tags or lists of authors.
property
Shows under the tag fields. Use for short single-valued document fields, such as dates, numbers or booleans.

Workbench tries to determine the best document display configuration based on the distribution of the field values in your data set. You can tune that configuration if needed.
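
To make the ordering of display roles concrete, here is a minimal sketch in Python that lays out one document according to a role assignment. It is an illustration only and not Workbench code: the roles dictionary format, the render_document function and the max_body_chars limit are hypothetical, and truncation is applied per body field for simplicity.

```python
# Order in which roles appear within a document, following the descriptions above.
ROLE_ORDER = ["title", "body", "id", "tag", "property"]


def truncate(text, limit):
    """Cut text to at most `limit` characters, marking the cut with '...'."""
    return text if len(text) <= limit else text[:limit].rstrip() + "..."


def render_document(doc, roles, max_body_chars=80):
    """Return display lines for one document.

    doc            -- {field name: value}; tag fields hold lists of values
    roles          -- {field name: display role} (hypothetical format)
    max_body_chars -- hypothetical limit applied to each body field
    """
    lines = []
    for role in ROLE_ORDER:
        for field, field_role in roles.items():
            if field_role != role or field not in doc:
                continue
            value = doc[field]
            if role == "title":
                lines.append(str(value))
            elif role == "body":
                lines.append(truncate(str(value), max_body_chars))
            elif role == "tag":
                lines.append(f"{field}: " + ", ".join(str(v) for v in value))
            else:  # id and property fields are short single values
                lines.append(f"{field}: {value}")
    return lines


if __name__ == "__main__":
    doc = {
        "year": 2020,
        "authors": ["A. Smith", "B. Jones"],
        "abstract": "A long abstract. " * 10,
        "name": "Sample paper",
        "doi": "10.0000/example",
    }
    roles = {
        "year": "property",
        "authors": "tag",
        "abstract": "body",
        "name": "title",
        "doi": "id",
    }
    for line in render_document(doc, roles):
        print(line)
```

Regardless of the order of fields in the input, the title comes first, followed by body, id, tag and property fields.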

Parameters panel
Figure: Parameters panel in Lingo3G Clustering Workbench (light and dark themes).

Use the parameters panel to change parameters of the data source and the clustering algorithm. Click the (?) icon for a description of a specific parameter.

Once you finish changing parameter values, press the Cluster button to re-run clustering.

Results export
Figure: Clusters and documents export in Lingo3G Clustering Workbench (light and dark themes).

Use the export tool to save the current documents and clusters in Excel, OpenOffice, CSV or JSON format.

Label dictionaries
Figure: Label dictionaries editor in Lingo3G Clustering Workbench (light and dark themes).

Use the text box in the Dictionaries section to edit label exclusion dictionaries. Use the glob, exact and regexp tabs to choose the label matching mechanism. The glob syntax should serve most label filtering needs.

Click syntax for a syntax overview. See Label matchers for the complete documentation.

Click Copy JSON to copy the JSON representation of the dictionaries to the clipboard, ready for pasting into JSON requests and code.
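
To illustrate the difference between the exact and regexp matching mechanisms, here is a rough sketch in Python. It is not Lingo3G's implementation. It assumes that an exact entry must equal the whole label and that a regexp entry must match the whole label. The glob mechanism is omitted because its syntax is specific to Lingo3G. Python's regular expression dialect also differs in details from the one Lingo3G uses, so treat the patterns as illustrative. The filter_labels function is hypothetical.

```python
import re


def filter_labels(labels, exact=(), regexp=()):
    """Return the labels not excluded by any exact or regexp entry.

    exact  -- entries that exclude a label when equal to the whole label
    regexp -- patterns that exclude a label when they match the whole label
    """
    exact_entries = set(exact)
    patterns = [re.compile(p) for p in regexp]
    kept = []
    for label in labels:
        if label in exact_entries:
            continue  # excluded by an exact entry
        if any(p.fullmatch(label) for p in patterns):
            continue  # excluded by a regexp entry
        kept.append(label)
    return kept


if __name__ == "__main__":
    labels = ["Data Mining", "Click Here", "Page 12", "Text Clustering"]
    print(filter_labels(labels, exact=["Click Here"], regexp=[r"Page \d+"]))
```

In this sketch, one exact entry removes a single known label, while one regexp entry removes a whole family of labels such as page numbers.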

Parameter search
Figure: Lingo3G Clustering Workbench parameters search (light and dark themes).

Press the Filters button to enable filtering of parameters. Type part of a parameter name into the search box to show matching parameters.

Advanced parameters
Figure: Advanced parameters switch in Lingo3G Clustering Workbench (light and dark themes).

Toggle the advanced parameters button to access all the available parameters, including those designed for expert users.

Parameters JSON export
Figure: Parameters JSON export in Lingo3G Clustering Workbench (light and dark themes).

Use the Parameters Export tool to copy the current Lingo3G parameters as JSON ready to paste into REST API requests.
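
As a sketch of how the exported JSON might be used, the Python code below wraps it into an HTTP request for the REST API context at http://localhost:8080/service. Several details are assumptions to verify against the REST API documentation: the /cluster endpoint path, the layout of the request body, and the exported JSON being the parameters object itself. The build_cluster_request function and the parameter name in the example are hypothetical. The code only builds the request and does not send it.

```python
import json
import urllib.request

# Address of the REST API context listed in the DCS startup messages.
SERVICE_URL = "http://localhost:8080/service"


def build_cluster_request(parameters_json, documents):
    """Build, but do not send, a clustering request.

    parameters_json -- JSON text copied from the Parameters Export tool
    documents       -- list of {field name: value} dicts to cluster
    """
    body = {
        "algorithm": "Lingo3G",                     # name from the startup messages
        "parameters": json.loads(parameters_json),  # assumed body layout
        "documents": documents,
    }
    return urllib.request.Request(
        SERVICE_URL + "/cluster",                   # assumed endpoint path
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # "hypotheticalParameter" is a placeholder, not a real Lingo3G parameter.
    request = build_cluster_request(
        '{"hypotheticalParameter": 5}',
        [{"title": "First document"}, {"title": "Second document"}],
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Once the DCS is running and you have confirmed the request format, you can send the request with urllib.request.urlopen(request).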