Project Abstract:
Maximum Entropy has been used in several statistical classification
problems in Natural Language Processing (NLP). This project uses
the Maximum Entropy approach to build an Information Extraction
(IE) system for extracting information about events such as terrorist
events. Here, the IE task is modeled as a classification problem
in which information extracted from a document is used to fill
the slots of a template. Manually
generated templates are used as the training set for this system,
but the amount of manually labeled data that can be provided is
very limited. Our approach therefore also uses weakly labeled data:
collections of news articles that describe the same event and are
thus a rich source of paraphrases. Weakly labeled data is freely
available and hence provides ample training examples. A basic
Maximum Entropy based IE system was developed, and the
improvement in performance from using weakly labeled data was
investigated.
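
The classification framing described above can be illustrated with a
minimal sketch. It is not the project's implementation: the slot names
(PERPETRATOR, LOCATION, NONE), the feature set, the toy training tokens,
and the plain gradient-ascent training loop are all assumptions chosen
for illustration. Each candidate token is assigned to a template slot
using the standard maximum entropy (multinomial logistic) form, where
the probability of a label is proportional to the exponential of the
summed weights of the active features.

```python
import math
from collections import defaultdict

# Hypothetical slot labels; the real template slots are not given in the abstract.
LABELS = ["PERPETRATOR", "LOCATION", "NONE"]

def featurize(token, prev_token):
    """Hypothetical binary features for a candidate token and its left context."""
    return [
        f"word={token.lower()}",
        f"prev={prev_token.lower()}",
        f"is_capitalized={token[:1].isupper()}",
    ]

def predict_proba(weights, feats):
    """Maximum entropy form: p(y | x) = exp(sum_i w[f_i, y]) / Z(x)."""
    scores = {y: sum(weights[(f, y)] for f in feats) for y in LABELS}
    max_score = max(scores.values())  # subtract the max for numerical stability
    exp_scores = {y: math.exp(s - max_score) for y, s in scores.items()}
    z = sum(exp_scores.values())
    return {y: v / z for y, v in exp_scores.items()}

def train(examples, epochs=200, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in examples:
            probs = predict_proba(weights, feats)
            for y in LABELS:
                # Gradient per active feature: observed count minus expected count.
                grad = (1.0 if y == gold else 0.0) - probs[y]
                for f in feats:
                    weights[(f, y)] += lr * grad
    return weights

def classify(weights, token, prev_token):
    """Return the most probable slot label for a candidate token."""
    probs = predict_proba(weights, featurize(token, prev_token))
    return max(probs, key=probs.get)

# Made-up toy training data: (token, previous token, slot label).
TOY_DATA = [
    ("guerrillas", "by", "PERPETRATOR"),
    ("rebels", "by", "PERPETRATOR"),
    ("Bogota", "in", "LOCATION"),
    ("Lima", "in", "LOCATION"),
    ("bomb", "a", "NONE"),
    ("yesterday", "exploded", "NONE"),
]

examples = [(featurize(tok, prev), label) for tok, prev, label in TOY_DATA]
weights = train(examples)

if __name__ == "__main__":
    print(classify(weights, "Medellin", "in"))
    print(classify(weights, "militants", "by"))
```

In this framing, additional training pairs derived from weakly labeled
articles would simply be appended to the example list before training.
How such pairs are obtained from the articles is not specified in the
abstract and is not sketched here.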