Extracting Symmetric Patterns from Plain Text
This page contains the source code for the Davidov & Rappoport (2006) algorithm (dr06), which extracts symmetric patterns (SPs) from plain text.
Symmetric patterns are useful for a range of NLP tasks, including lexical acquisition, word clustering, word classification and word similarity.
The following code extracts the set of patterns automatically from plain text.
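To make the notion concrete, the sketch below illustrates what makes a pattern "symmetric": word pairs that appear in it tend to appear in both orders ("cats and dogs" as well as "dogs and cats"). This is a toy illustration only, not the dr06 implementation, which is described in the paper and the README. The function names `pattern_pairs` and `symmetry_score`, the toy corpus, and the simple scoring rule are all assumptions made for this example.

```python
def pattern_pairs(tokens, middle):
    """Collect (x, y) pairs from every occurrence of 'x <middle...> y' in tokens."""
    n = len(middle)
    pairs = set()
    # Slide a window of n + 2 tokens: one word, the pattern words, one word.
    for i in range(len(tokens) - n - 1):
        if tokens[i + 1 : i + 1 + n] == middle:
            pairs.add((tokens[i], tokens[i + 1 + n]))
    return pairs


def symmetry_score(tokens, middle):
    """Fraction of observed pairs (x, y) whose reverse (y, x) is also observed.

    Hypothetical scoring rule for illustration; dr06 uses its own measures.
    """
    pairs = pattern_pairs(tokens, middle)
    if not pairs:
        return 0.0
    both = sum(1 for (x, y) in pairs if x != y and (y, x) in pairs)
    return both / len(pairs)


# Toy corpus (invented for this example).
corpus = (
    "cats and dogs . dogs and cats . red or blue . blue or red . "
    "king of spain . capital of france"
)
tokens = corpus.split()

print(symmetry_score(tokens, ["and"]))  # 1.0: both orders are observed
print(symmetry_score(tokens, ["of"]))   # 0.0: no pair appears reversed
```

In this toy setting "X and Y" scores as fully symmetric while "X of Y" does not, which matches the intuition behind symmetric patterns; real corpora require frequency thresholds and more robust measures.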
The algorithm was developed at The Hebrew University of Jerusalem by Dmitry Davidov (1975-2010) and Ari Rappoport.
The code in this package was developed by .
A complete description of the dr06 algorithm can be found in the paper:
Efficient Unsupervised Discovery of Word Categories Using Symmetric Patterns and High Frequency Words;
Dmitry Davidov and Ari Rappoport. In proceedings of ACL-Coling 2006
Usage
For usage, see the README file.
Download
The code is available here.
License
The code is distributed under the GNU GENERAL PUBLIC LICENSE (a detailed license file is included).
Credit
If you make use of this code, please cite one of the following papers:
- Efficient Unsupervised Discovery of Word Categories Using Symmetric Patterns and High Frequency Words; Dmitry Davidov and Ari Rappoport. In proceedings of ACL-Coling 2006.
- Symmetric Pattern Based Word Embeddings for Improved Word Similarity Prediction; Roy Schwartz, Roi Reichart and Ari Rappoport. In proceedings of CoNLL 2015.
- Minimally Supervised Classification to Semantic Categories using Automatically Acquired Symmetric Patterns; Roy Schwartz, Roi Reichart and Ari Rappoport. In proceedings of Coling 2014.
Feedback
For any questions or feedback, please email