Extracting Symmetric Patterns from Plain Text

This page contains the source code for the Davidov & Rappoport (2006) algorithm (dr06), which extracts symmetric patterns (SPs) from plain text.
Symmetric patterns are useful for a range of NLP tasks, including lexical acquisition, word clustering, word classification and word similarity.
The code provided here extracts the set of symmetric patterns automatically from plain text.
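To make the notion of a symmetric pattern concrete, here is a minimal Python sketch. It is not the dr06 implementation and not part of this package; it only illustrates the idea that a pattern such as "X and Y" tends to match word pairs in both orders ("cats and dogs", "dogs and cats"), whereas a pattern such as "X such as Y" usually does not. The function names pattern_pairs and symmetry_ratio, and the simple ratio used as a symmetry score, are assumptions made for this illustration.

```python
import re
from collections import Counter


def pattern_pairs(text, pattern):
    """Count (first, second) word pairs that match `pattern` in `text`.

    `pattern` is a lowercase template with the placeholders X and Y,
    e.g. "X and Y". Hypothetical helper, for illustration only.
    Matches do not overlap, so in "a and b and c" only ("a", "b") is found.
    """
    # Placeholders become single-word capture groups; other tokens are literals.
    parts = [r"(\w+)" if tok in ("X", "Y") else re.escape(tok)
             for tok in pattern.split()]
    regex = re.compile(r"\b" + r"\s+".join(parts) + r"\b")
    return Counter(m.groups() for m in regex.finditer(text.lower()))


def symmetry_ratio(pairs):
    """Fraction of distinct pairs (x, y) whose reverse (y, x) also occurs.

    A simplified score assumed for this sketch, not the measure from the paper.
    """
    if not pairs:
        return 0.0
    symmetric = sum(1 for (x, y) in pairs if (y, x) in pairs)
    return symmetric / len(pairs)


if __name__ == "__main__":
    corpus = ("Cats and dogs play. Dogs and cats sleep. "
              "Salt and pepper. Birds such as eagles fly.")
    for template in ("X and Y", "X such as Y"):
        pairs = pattern_pairs(corpus, template)
        print(template, dict(pairs), symmetry_ratio(pairs))
```

On a real corpus, a pattern with a high ratio would be a candidate symmetric pattern under this simplified score; the actual selection criteria used by dr06 are described in the paper cited below.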

The algorithm was developed at the Hebrew University of Jerusalem by Dmitry Davidov (1975-2010) and Ari Rappoport. The code in this package was developed by .
A complete description of dr06 can be found in the paper:

Efficient Unsupervised Discovery of Word Categories Using Symmetric Patterns and High Frequency Words; Dmitry Davidov and Ari Rappoport. In Proceedings of ACL-Coling 2006.

Usage

For usage, see the README file.

Download

The code is available here.

License

The code is distributed under the GNU GENERAL PUBLIC LICENSE (a detailed license file is included).

Credit

If you make use of this code, please cite one of the following papers:

Efficient Unsupervised Discovery of Word Categories Using Symmetric Patterns and High Frequency Words; Dmitry Davidov and Ari Rappoport. In Proceedings of ACL-Coling 2006.
Symmetric Pattern Based Word Embeddings for Improved Word Similarity Prediction; Roy Schwartz, Roi Reichart and Ari Rappoport. In Proceedings of CoNLL 2015.
Minimally Supervised Classification to Semantic Categories using Automatically Acquired Symmetric Patterns; Roy Schwartz, Roi Reichart and Ari Rappoport. In Proceedings of Coling 2014.

Feedback

For any questions or feedback, please email