When selecting the Training and Test Data in the Data and Analysis Control panel, a filter may be applied to either the training or test dataset. This is useful if the same dataset is going to be used for training and test data (90% of it for training data and 10% of it for test data, for example), or if a subset of either of the datasets is intended for use rather than the whole set of data.
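As an illustration of the 90%/10% case, the sketch below uses Python's built-in sqlite3 module as a stand-in database. The table name `policies`, the columns `id` and `region`, and the choice of splitting on `id % 10` are all assumptions made for the example; the application itself only needs the where clause text, and other ways of splitting the data may suit a real table better.

```python
import sqlite3

# Hypothetical stand-in table with 100 rows (ids 1..100).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policies (id INTEGER PRIMARY KEY, region TEXT)")
conn.executemany(
    "INSERT INTO policies (id, region) VALUES (?, ?)",
    [(i, "North" if i % 2 else "South") for i in range(1, 101)],
)

# Two complementary where clauses over the same table:
# every tenth record goes to the test data, the rest to training.
training_where = "id % 10 <> 0"
test_where = "id % 10 = 0"


def count_rows(where_clause):
    """Count the rows of the hypothetical table that pass the filter."""
    query = f"SELECT COUNT(*) FROM policies WHERE {where_clause}"
    return conn.execute(query).fetchone()[0]


training_count = count_rows(training_where)
test_count = count_rows(test_where)
print(training_count, test_count)
```

Because the two clauses are exact opposites, no record appears in both the training and the test data.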
To add a filter, click the "Click to add a filter" label. To edit an existing filter, click its where clause.
This opens the Where Clause Builder window, where the user can either type a where clause directly, if comfortable with SQL syntax, or use the drop-down boxes to help build it.
The first drop-down contains a list of all the fields available in the Training/Test Table already selected. Use it to select the field to filter on.
The next drop-down contains the SQL operators that can be used in the where clause against the field that has just been selected.
The final drop-down contains all the unique values of the selected field.
After a selection has been made in all three drop-downs, a valid where clause is filled in to the right. At this point the user can press the "Validate" button to see what effect the where clause has on the Training/Test Data, or continue building the where clause by selecting another operator.
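The way the three selections combine into a where clause can be sketched as follows. This is only an illustration, not the application's own code: the helper names `sql_literal` and `build_condition`, the fields `region` and `age`, and the use of `AND` to join two conditions are assumptions for the example.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (numbers bare, text quoted)."""
    if isinstance(value, (int, float)):
        return str(value)
    # Standard SQL escapes a single quote inside a string by doubling it.
    return "'" + str(value).replace("'", "''") + "'"


def build_condition(field, operator, value):
    """Join one field, one operator, and one value into a condition."""
    return f"{field} {operator} {sql_literal(value)}"


# One condition per pass through the three drop-downs.
first = build_condition("region", "=", "North")
second = build_condition("age", ">=", 25)

# Continue building by joining the conditions with another operator.
where_clause = f"{first} AND {second}"
print(where_clause)  # region = 'North' AND age >= 25
```

Typing the same text directly into the Where Clause Builder should be equivalent for a user comfortable with SQL syntax.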
When the where clause is complete, the "Validate" button can be used to check that the filter is valid and to display how many records remain in the Training/Test dataset after the filter is applied [1]. Pressing the OK button also validates the where clause before closing the Where Clause Builder window.
[1] Note that the where clause affects only the dataset used in the MultiRate analysis; it does not physically change the selected SQL table.
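The two checks described above (is the filter valid, and how many records remain) can be approximated outside the application with a count query. The sketch below again uses Python's sqlite3 module and a hypothetical `policies` table; the `validate` function is an assumption for illustration and is not how MultiRate is known to implement its "Validate" button.

```python
import sqlite3

# Hypothetical stand-in table with 100 rows, half of them in region 'North'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policies (id INTEGER PRIMARY KEY, region TEXT)")
conn.executemany(
    "INSERT INTO policies (id, region) VALUES (?, ?)",
    [(i, "North" if i % 2 else "South") for i in range(1, 101)],
)


def validate(where_clause):
    """Return the number of records passing the filter, or None if it is invalid."""
    try:
        query = f"SELECT COUNT(*) FROM policies WHERE {where_clause}"
        return conn.execute(query).fetchone()[0]
    except sqlite3.Error:
        # The database rejected the clause (bad syntax, unknown field, ...).
        return None


remaining = validate("region = 'North'")
invalid = validate("no_such_field = 1")

# A SELECT only reads: the underlying table still holds every record.
total = conn.execute("SELECT COUNT(*) FROM policies").fetchone()[0]
print(remaining, invalid, total)
```

The filter only narrows what a query returns, so the table keeps all of its records regardless of how many pass the where clause.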