(NOTE: this plugin is still a work in progress)

This plugin provides a number of processing resources and an API that allows the construction of a new (on the fly) document from a mix of various annotation types and features in a source document. Optionally, original annotations from the source document can be mapped forward into the new "virtual" document and annotations created for the new virtual document can be mapped back into the original document.
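The idea of a virtual document with offset mapping can be illustrated with a small, self-contained sketch. This is not the plugin's API: the names `Span`, `build_virtual` and `map_back`, the single-space separator, and the example text are all assumptions made for illustration. It only shows how text picked from annotations (or from a feature value such as a lemma) could be concatenated into a new document while remembering which source range each piece came from.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical stand-in for an annotation picked from the source document.
@dataclass
class Span:
    start: int                   # start offset in the source text
    end: int                     # end offset in the source text
    text: Optional[str] = None   # feature value to use instead of the covered text

Mapping = List[Tuple[int, int, int, int]]  # (virt_start, virt_end, src_start, src_end)

def build_virtual(source: str, spans: List[Span], sep: str = " ") -> Tuple[str, Mapping]:
    """Concatenate the chosen pieces and record where each one came from."""
    parts: List[str] = []
    mapping: Mapping = []
    pos = 0
    for span in sorted(spans, key=lambda s: s.start):
        if parts:  # separator between pieces (an assumption of this sketch)
            parts.append(sep)
            pos += len(sep)
        piece = span.text if span.text is not None else source[span.start:span.end]
        parts.append(piece)
        mapping.append((pos, pos + len(piece), span.start, span.end))
        pos += len(piece)
    return "".join(parts), mapping

def map_back(mapping: Mapping, v_start: int, v_end: int) -> Optional[Tuple[int, int]]:
    """Map a range in the virtual text to the source range of all pieces it overlaps."""
    hits = [(s, e) for (vs, ve, s, e) in mapping if vs < v_end and v_start < ve]
    if not hits:
        return None
    return min(s for s, _ in hits), max(e for _, e in hits)

source = "The cats were running fast"
spans = [Span(4, 8, "cat"), Span(14, 21, "run")]  # tokens replaced by an assumed lemma feature
virtual, mapping = build_virtual(source, spans)
print(virtual)                   # cat run
print(map_back(mapping, 4, 7))   # (14, 21)
```

An annotation created on "run" in the virtual text (offsets 4 to 7) maps back to "running" in the original (offsets 14 to 21). An annotation spanning several pieces maps back to the range that covers all of them, which is one possible policy among several.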

The processing resources provided are:

  • Annotation by Specification PR: uses an annotation specification list to pick annotations and create a new output annotation. In effect this implements an annotation matching process in which priority is more important than match length.
  • Copy Virtual Document PR: writes the virtual document created from an annotation specification list to a new corpus. The annotation specification list defines which annotations, features, etc. to pick from the original document to create the new virtual document.
  • Create Virtual Document PR (will be renamed): this replaces each document in a corpus by the virtual document created according to an annotation specification list.
  • Export Contained Annotations PR: a simple PR that exports the content of an annotationtype.feature to text documents for each input document.
  • Indirect Language Analyser PR: this PR runs another PR on a virtual document created from the original document according to an annotation specification list and maps the annotations created back to the original document. This can be used to run any PR on a "virtual view" of a document and create annotations on the original document according to which parts of the original correspond to the virtual document.
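The phrase "priority is more important than match length" in the Annotation by Specification PR can be read as follows; this is one interpretation sketched in plain Python, not the plugin's implementation. The function name `select_by_priority`, the tuple representation of annotations, and the type names `Person` and `Lookup` are assumptions. The specification list is treated as an ordered list of annotation types, and at each position the candidate whose type comes first in that list wins, even when a lower-priority candidate is longer.

```python
from typing import List, Tuple

Ann = Tuple[int, int, str]  # (start, end, type): hypothetical annotation representation

def select_by_priority(annotations: List[Ann], spec: List[str]) -> List[Ann]:
    """Pick non-overlapping annotations, preferring earlier spec entries over longer matches."""
    rank = {typ: i for i, typ in enumerate(spec)}
    # Order by start offset, then by priority, and only then by length (longest first).
    candidates = sorted(
        (a for a in annotations if a[2] in rank),
        key=lambda a: (a[0], rank[a[2]], -(a[1] - a[0])),
    )
    selected: List[Ann] = []
    cursor = 0  # end offset of the last selected annotation
    for start, end, typ in candidates:
        if start >= cursor:
            selected.append((start, end, typ))
            cursor = end
    return selected

annotations = [(0, 10, "Lookup"), (0, 4, "Person"), (12, 15, "Lookup")]
spec = ["Person", "Lookup"]  # Person has priority over Lookup
print(select_by_priority(annotations, spec))
# [(0, 4, 'Person'), (12, 15, 'Lookup')]
```

A longest-match strategy would have chosen the ten-character `Lookup` at offset 0; here the shorter `Person` is selected because it appears first in the specification list.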

This project is hosted on Google Code as gateplugin-virtualdocuments.


This plugin is available under the GNU Lesser General Public License.

See also: Other GATE Plugins and Resources