Clustering Engine Manual for version 2.0.0-beta2

This manual provides detailed information about the Carrot Search Lingo3G document clustering engine. It includes a general overview of Lingo3G and describes the Lingo3G Java API, the HTTP/REST service API, tuning attributes and configuration files.

What is Lingo3G?

Lingo3G is a text clustering engine that can organize small to medium collections of documents into clearly labeled thematic groups called clusters. Clustering is performed in real time and fully automatically, based only on the provided text fields of each document. Lingo3G's unique algorithm ensures high-quality semantic results are delivered quickly.

Lingo3G can turn, for example, search result titles and snippets into groups like these:

Search result titles and snippets (on the left) for the query "salsa" and the corresponding cluster labels (on the right).

Lingo3G is a programming component (Java library). Programming knowledge is required to use it. The distribution package comes with examples and demonstration applications.

What's in the box?

The Lingo3G distribution comes with:

Additionally, several downstream projects provide integration between Carrot2 and popular document retrieval services: