Extracting Symmetric Patterns from Plain Text

This page contains the source code for the algorithm of Davidov & Rappoport (2006), referred to here as dr06, which extracts symmetric patterns (SPs) from plain text.
Symmetric patterns are useful for a range of NLP tasks, including lexical acquisition, word clustering, word classification and word similarity.
The code in this package extracts the set of patterns automatically from plain text.
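
To illustrate what "symmetric" means here, the following is a minimal sketch and not the dr06 implementation. It assumes, for simplicity, that a pattern is a single word between two slots ("X and Y"), and it scores a pattern by the fraction of its word pairs that also occur in the reverse order. The function name symmetry_scores, the toy corpus and the scoring rule are all hypothetical choices made for this example; see the paper for the actual pattern definition and symmetry measures.

```python
import re
from collections import defaultdict


def symmetry_scores(sentences, patterns):
    """Score each single-word pattern "X <word> Y" by how often its pairs are mirrored.

    Illustrative assumption only: the score is the fraction of distinct (x, y)
    pairs for which the reversed pair (y, x) was also observed with the same
    pattern word. This is not the measure used by dr06.
    """
    pairs = defaultdict(set)
    for sentence in sentences:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        # Slide a three-token window: left slot, pattern word, right slot.
        for i in range(len(tokens) - 2):
            x, mid, y = tokens[i:i + 3]
            if mid in patterns and x != y:
                pairs[mid].add((x, y))

    scores = {}
    for mid, seen in pairs.items():
        mirrored = sum(1 for (x, y) in seen if (y, x) in seen)
        scores[mid] = mirrored / len(seen)
    return scores


# Toy corpus, invented for this example.
corpus = [
    "cats and dogs are pets",
    "dogs and cats are pets",
    "cats like dogs",
    "a bowl of soup",
]
scores = symmetry_scores(corpus, {"and", "of", "like"})
print(scores)
```

On this toy corpus, "and" is fully mirrored ("cats and dogs", "dogs and cats") and gets the highest score, while "of" and "like" appear in one direction only and score zero.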

The algorithm was developed at The Hebrew University of Jerusalem by Dmitry Davidov (1975-2010) and Ari Rappoport. The code in this package was developed by .
A complete description of dr06 can be found in the paper:

Efficient Unsupervised Discovery of Word Categories Using Symmetric Patterns and High Frequency Words. Dmitry Davidov and Ari Rappoport. In Proceedings of ACL-Coling 2006.


For usage, see the README file.


The code is available here.


The code is distributed under the GNU GENERAL PUBLIC LICENSE (a detailed license file is included).


If you make use of this code, please cite one of the following papers:


For any questions or feedback, please email