Flags for sequence classifiers. Documentation for general flags and flags for NER can be found in the Javadoc of {@link edu.stanford.nlp.ie.NERFeatureFactory}. Documentation for the flags for Chinese word segmentation can be found in the Javadoc of {@link edu.stanford.nlp.wordseg.ChineseSegmenterFeatureFactory}.
MAKE SURE TO ONLY ADD NEW VARIABLES AT THE END OF THE LIST OF VARIABLES (and not to change existing variables)! Otherwise you usually break all currently serialized classifiers!!! Search for "ADD VARIABLES ABOVE HERE" below.
Property Name | Type | Default Value | Description |
useQN | boolean | true | Use Quasi-Newton (L-BFGS) optimization to find minimum. NOTE: Need to set this to false if using other minimizers such as SGD. |
QNsize | int | 25 | Number of previous iterations of Quasi-Newton to store (this increases memory use, but speeds convergence by letting the Quasi-Newton optimization more effectively approximate the second derivative). |
QNsize2 | int | 25 | Number of previous iterations of Quasi-Newton to store (used when pruning features, after the first iteration; the first iteration uses QNsize). |
useInPlaceSGD | boolean | false | Use SGD (tweaking weights in place) to find minimum (more efficient than the old SGD, faster to converge than Quasi-Newton if there is a very large number of samples). Implemented for CRFClassifier. NOTE: Remember to set useQN to false. |
tuneSampleSize | int | -1 | If this number is greater than 0, specifies the number of samples to use for tuning (default is 1000). |
SGDPasses | int | -1 | If this number is greater than 0, specifies the number of SGD passes (over the entire training set) to do before giving up (default is 50). Can be smaller if sample size is very large. |
useSGD | boolean | false | Use SGD to find minimum (can be slow). NOTE: Remember to set useQN to false. |
useSGDtoQN | boolean | false | Use SGD (SGD version selected by useInPlaceSGD or useSGD) for a certain number of passes (SGDPasses) and then switch to QN. Gives the quick initial convergence of SGD, with the desired convergence criterion of QN (there is some ramp-up time for QN). NOTE: Remember to set useQN to false. |
evaluateIters | int | 0 | If this number is greater than 0, evaluates on the test set every so often while minimizing. Implemented for CRFClassifier. |
evalCmd | String | | If specified (and evaluateIters is set), runs the specified command-line command during evaluation (instead of the default CoNLL-like NER evaluation). |
evaluateTrain | boolean | false | If specified (and evaluateIters is set), also evaluate on training set (can be expensive) |
tokenizerOptions | String | (null) | Extra options to supply to the tokenizer when creating it. |
tokenizerFactory | String | (null) | A different tokenizer factory to use if the ReaderAndWriter in question uses tokenizers. |
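The optimizer flags in the table interact: several rows note that useQN must be set to false when an SGD variant is selected. The following is a minimal sketch of that rule using only java.util.Properties. The class name OptimizerFlagsSketch, its helper methods, and the numeric values chosen for SGDPasses and evaluateIters are hypothetical and for illustration only; they are not part of the library, and how the properties are handed to a classifier is not shown here.

```java
import java.util.Properties;

public class OptimizerFlagsSketch {

  /** Builds a property set that selects in-place SGD followed by Quasi-Newton. */
  static Properties sgdThenQnProps() {
    Properties props = new Properties();
    props.setProperty("useQN", "false");         // per the table notes: disable when using SGD
    props.setProperty("useSGDtoQN", "true");     // SGD first, then switch to QN
    props.setProperty("useInPlaceSGD", "true");  // which SGD version to run first
    props.setProperty("SGDPasses", "20");        // illustrative value, not a recommendation
    props.setProperty("QNsize", "25");           // table default, stated explicitly
    props.setProperty("evaluateIters", "10");    // illustrative value
    return props;
  }

  /** Reads a boolean flag, falling back to the default listed in the table. */
  static boolean flag(Properties props, String name, boolean defaultValue) {
    return Boolean.parseBoolean(props.getProperty(name, Boolean.toString(defaultValue)));
  }

  /** Returns false when useQN is still true although an SGD variant was requested. */
  static boolean optimizerFlagsConsistent(Properties props) {
    boolean useQN = flag(props, "useQN", true);
    boolean anySgd = flag(props, "useSGD", false)
        || flag(props, "useInPlaceSGD", false)
        || flag(props, "useSGDtoQN", false);
    return !(useQN && anySgd);
  }

  public static void main(String[] args) {
    Properties props = sgdThenQnProps();
    System.out.println("consistent: " + optimizerFlagsConsistent(props));
  }
}
```

A check of this kind only mirrors the NOTE entries in the table; it does not reproduce whatever validation the classifier itself may perform.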