Dependency Random Indexing (DRI) extends Random Indexing by restricting a word's context to the set of words with which it has a syntactic relationship. Full word co-occurrence models have shown that this restricted interpretation of a context can improve the semantic representations. DRI uses the same approximation technique as Random Indexing to project this full co-occurrence space into a space of significantly lower dimensionality. This projection is done through the use of index vectors, each of which is sparse and nearly orthogonal to all other index vectors. The summation of a word's index vectors corresponds directly to that word's occurrence in a context.
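The index-vector idea described above can be illustrated with a minimal sketch. It is not this class's implementation: the class name IndexVectorSketch, its method names, and the choice of a ternary vector with a fixed number of +1/-1 entries are assumptions made for illustration only.

```java
import java.util.Random;

/** Hypothetical illustration of sparse index vectors; not part of this library. */
public class IndexVectorSketch {

    /** Builds a sparse ternary vector: all zeros except nonZero entries set to +1 or -1. */
    public static int[] indexVector(int length, int nonZero, Random rng) {
        if (nonZero > length)
            throw new IllegalArgumentException("nonZero must not exceed length");
        int[] v = new int[length];
        int placed = 0;
        while (placed < nonZero) {
            int pos = rng.nextInt(length);
            if (v[pos] == 0) {
                // Alternate signs so the vector holds a balanced mix of +1 and -1.
                v[pos] = (placed % 2 == 0) ? 1 : -1;
                placed++;
            }
        }
        return v;
    }

    /** Accumulates the index vector of one co-occurring word into a semantic vector. */
    public static void add(int[] semantic, int[] index) {
        for (int i = 0; i < semantic.length; i++)
            semantic[i] += index[i];
    }

    /** Dot product; close to zero for two independently drawn sparse vectors. */
    public static int dot(int[] a, int[] b) {
        int sum = 0;
        for (int i = 0; i < a.length; i++)
            sum += a[i] * b[i];
        return sum;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int length = 2000;
        int[] first = indexVector(length, 8, rng);
        int[] second = indexVector(length, 8, rng);

        // The semantic vector of a word is the running sum of its contexts' index vectors.
        int[] semantic = new int[length];
        add(semantic, first);
        add(semantic, second);

        System.out.println("dot(first, first)  = " + dot(first, first));
        System.out.println("dot(first, second) = " + dot(first, second));
    }
}
```

Because each vector has only a handful of non-zero entries in a long vector, two independently drawn vectors rarely overlap, which is what makes them nearly orthogonal.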
While Random Indexing uses permutations of these index vectors to encode lexical position, a shallow form of syntactic structure, DRI extends the notion of permutations to allow for the encoding of dependency relationships. Through this modification, the set of relationships between any two co-occurring words in a sentence can be encoded, as can the distance between the two words. Under this model, each possible dependency relationship could have its own permutation function, as could each possible distance between co-occurring words.
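As a rough sketch of how a per-relationship permutation might work, the example below assigns each dependency relation its own rotation of the index vector. The class RelationPermutationSketch, its methods, and the use of rotation as the permutation are hypothetical; they are not the DependencyPermutationFunction interface, whose actual contract is not shown here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of relation-specific permutations; not part of this library. */
public class RelationPermutationSketch {

    /** Maps each relation label to its own rotation amount, assigned on first use. */
    private final Map<String, Integer> shifts = new HashMap<>();

    public int shiftFor(String relation) {
        Integer shift = shifts.get(relation);
        if (shift == null) {
            shift = shifts.size() + 1;
            shifts.put(relation, shift);
        }
        return shift;
    }

    /** Returns a copy of the index vector rotated by the relation's shift. */
    public int[] permute(int[] index, String relation) {
        int n = index.length;
        int shift = shiftFor(relation) % n;
        int[] out = new int[n];
        for (int i = 0; i < n; i++)
            out[(i + shift) % n] = index[i];
        return out;
    }

    public static void main(String[] args) {
        RelationPermutationSketch sketch = new RelationPermutationSketch();
        int[] index = {1, 0, 0, -1, 0, 0};

        // The same index vector yields a different vector for each relation,
        // so the relation is recoverable from the accumulated sum.
        System.out.println(Arrays.toString(sketch.permute(index, "nsubj")));
        System.out.println(Arrays.toString(sketch.permute(index, "dobj")));
    }
}
```

The same scheme could key the shift on the distance between two words instead of, or in addition to, the relation label. Rotation is chosen here only because it is the simplest permutation to write down.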
This class defines the following configurable properties, which may be set either through the System properties or through the {@link DependencyRandomIndexing#DependencyRandomIndexing(DependencyExtractor,DependencyPermutationFunction,Properties)} constructor:
{@value #DEPENDENCY_ACCEPTOR_PROPERTY}
{@value #DEPENDENCY_PATH_LENGTH_PROPERTY}
{@value #VECTOR_LENGTH_PROPERTY}