Publication Date

1995

Abstract

One of the central knowledge sources of an information extraction (IE) system IS a dictionary of linguistic patterns that can be used to identify references to relevant information in a text Automatic creation of conceptual dictionaries is important for portability and scalability of an IE system This paper describes CRYSTAL, a system which automatically induces a dictionary of "concept-node definitions" sufficient to identify relevant information from a training corpus Each of these concept-node definitions is generalized as far as possible without producing errors, so that a minimum number of dictionary entries cover the positive training instances Because it tests the accuracy of each proposed definition, CRYSTAL can often surpass human intuitions in creating reliable extraction rules.

Comments

This paper was harvested from CiteSeer

Share

COinS