--test_classifiers

Switch

--test_classifiers

Description

Splits the data into a test set and a training set, trains a classifier on the training set, and predicts outcomes for the test set.

Argument and Default Value

None

Details

This switch splits the data into a test set (1/5) and a training set (4/5), then creates a classification model (a.k.a. a classifier) on the training data only. It then predicts the outcome class for the data in the test set and reports accuracies for the created model. This technique is called out-of-sample prediction, and is used to avoid overfitting. It is usually better to use --nfold_test_classifiers, which does the same thing as --test_classifiers but multiple times. Alternatively, you can manually create test and training sets by splitting your data in MySQL. If you do this, it's preferable to put "wordy" users in the training set to boost accuracy.
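The 1/5 versus 4/5 split and the out-of-sample accuracy described above can be illustrated with a minimal, standard-library-only sketch. This is not the toolkit's implementation: the function names (holdout_split, fit_centroids, predict, accuracy), the nearest-centroid classifier, the random shuffle, and the toy data are all assumptions chosen to keep the example small.

```python
import random


def holdout_split(items, parts=5, seed=42):
    """Shuffle items and hold out 1/parts of them as the test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = len(shuffled) // parts
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def fit_centroids(train):
    """Stand-in classifier: compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the class whose centroid is closest to the feature vector."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, test):
    """Fraction of test examples whose predicted class matches the true class."""
    correct = sum(predict(centroids, f) == y for f, y in test)
    return correct / len(test)


# Hypothetical toy data: two well-separated classes with two features each.
rng = random.Random(0)
data = [([rng.gauss(0, 1), rng.gauss(0, 1)], 0) for _ in range(50)]
data += [([rng.gauss(5, 1), rng.gauss(5, 1)], 1) for _ in range(50)]

train, test = holdout_split(data)   # 4/5 training, 1/5 test
model = fit_centroids(train)        # fit on the training data only
test_accuracy = accuracy(model, test)  # evaluate on held-out data only
print(f"train={len(train)} test={len(test)} accuracy={test_accuracy:.2f}")
```

Because the model never sees the test examples during fitting, the reported accuracy reflects how well it generalizes rather than how well it memorized the training data.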

Other Switches

Required Switches: -d, -g, -t, -f, --outcome_table, --outcomes

Optional Switches: --group_freq_thresh, --no_standardize, --model, --sparse, etc.

Example Commands