大象传媒

Citron

Citron is an experimental quote extraction system created by 大象传媒 R&D.

License: Apache-2.0


Citron is an experimental quote extraction and attribution system created by 大象传媒 R&D, based on work developed by the School of Informatics at the University of Edinburgh.

It can be used to extract quotes from text documents, attributing them to the appropriate speaker and resolving pronouns where necessary. It supports direct quotes (with quotation marks), indirect quotes (without quotation marks), and mixed quotes (which have both direct and indirect parts). Note that the output can contain a significant number of errors and omissions, so extracted quotes should be checked against the input text.
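The three quote types can be illustrated with a minimal Python sketch. This is not Citron's implementation or API: the `Quote` record, its fields, and the `kind` rule are hypothetical, and the sketch assumes that a quote's type can be read off simply from which parts of its text sit inside double quotation marks.

```python
import re
from dataclasses import dataclass

# Matches text enclosed in straight or curly double quotation marks.
QUOTED_SPAN = re.compile(r'"[^"]+"|\u201c[^\u201d]+\u201d')


@dataclass
class Quote:
    """Hypothetical record for one attributed quote (not Citron's data model)."""
    speaker: str
    text: str

    @property
    def kind(self) -> str:
        """Classify the quote by how much of its text is inside quotation marks."""
        if not QUOTED_SPAN.search(self.text):
            return "indirect"
        # Remove the quoted spans, then ignore whitespace and punctuation left over.
        remainder = QUOTED_SPAN.sub("", self.text).strip(" .,;:")
        return "direct" if not remainder else "mixed"


if __name__ == "__main__":
    examples = [
        Quote("Alice", '"We will win."'),
        Quote("Alice", "that they would win"),
        Quote("Alice", 'that the result was "a disaster" for the team'),
    ]
    for quote in examples:
        print(quote.speaker, quote.kind, quote.text)
```

A real system also has to find the quote boundaries and the speaker in running text, which this sketch takes as given.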

You can run Citron using the pre-trained model or train your own model. You can also evaluate its performance.

Training and evaluating models requires data in Citron's Annotation Format. Citron provides pre-processing scripts to extract suitable data from the . Alternatively, you can create your own data using the Citron Annotator app.

Technical details and potential applications are discussed in: .