"Cascaded Analysis Broker" for error-tolerant linguistic analysis


If you are looking for documentation on the DTA::CAB web-service, start here.


Command-line utilities for local analysis & conversion

HTTP Server/Client Stuff

XML-RPC Server/Client Stuff (deprecated)


... are available on CPAN.

Batteries Not Included: The source code distribution alone is not sufficient to set up and run a complete CAB analysis pipeline at your local site. You will also need various language models and additional resources which are not themselves part of CAB (which aspires to be language-agnostic) and are therefore not included in the source code distribution. See Jurish (2012) and the source code documentation for more details.


I would appreciate CAB users citing its use in any related publications. As a general CAB-related reference, please cite:

Other CAB-related publications include:

Related Packages

  • GFSM: finite-state library
  • GFSM::XL: finite-state cascade lookup library
  • moot: HMM utility suite
  • Taxi::Mysql: flexible document indexing system with some DTA::CAB-like features
  • Lingua::LTS: LTS ruleset compiler/interpreter with standalone transducer lookup
  • unicruft: transliteration C library