To ensure that your results are valid, consider the following guidelines when you collect data, perform the analysis, and interpret your results.
- The response variable should be categorical
- Categorical variables contain a finite, countable number of categories or distinct groups. Categorical data may or may not have a logical order. For example, categorical variables include gender, material type, and payment method.
- If your response variable has two categories, such as pass and fail, then the response is binary.
- If your response variable contains three or more categories, then the response is multinomial.
The data for the response variable must be either text values or numeric values. Date/time values are not allowed.
If your response variable is continuous, use Random Forests® Regression.
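The response rules above can be checked before running the analysis. The following is a minimal Python sketch, not part of the software itself: the function name `response_kind` is hypothetical, and it assumes missing values are stored as `None` and skipped when counting categories.

```python
from datetime import date, time


def response_kind(values):
    """Classify a response column as 'binary' or 'multinomial'.

    Assumption (illustrative only): missing values are represented as
    None and are ignored when counting categories.
    """
    observed = [v for v in values if v is not None]
    for v in observed:
        # datetime is a subclass of date, so this also rejects datetimes.
        if isinstance(v, (date, time)):
            raise TypeError("date/time values are not allowed in the response")
        if not isinstance(v, (str, int, float)):
            raise TypeError(f"response values must be text or numeric, got {v!r}")
    n_categories = len(set(observed))
    if n_categories < 2:
        raise ValueError("the response needs at least two categories")
    return "binary" if n_categories == 2 else "multinomial"


print(response_kind(["pass", "fail", "pass"]))       # two categories
print(response_kind(["cash", "card", "check", "card"]))  # three categories
```

The sketch only counts distinct values. It cannot tell whether a numeric column with many distinct values is really a continuous measurement, so that judgment remains with the analyst.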
- Predictor variables may be continuous or categorical
- You can use a combination of continuous and categorical predictors; however, each predictor column must be the same length as the response column. Missing values are allowed.
- All continuous predictors must be numeric.
- Categorical predictors can be text or numeric values.
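The predictor requirements can be sketched as a pre-flight check in the same way. This is an illustrative Python sketch under stated assumptions, not the software's own validation: `check_predictors` is a hypothetical helper, columns are plain lists, missing values are `None`, and the caller says which predictors are continuous.

```python
def check_predictors(response, predictors, continuous):
    """Return a list of problems found; an empty list means all checks passed.

    response:   list of response values
    predictors: dict mapping predictor name to its list of values
    continuous: set of predictor names to treat as continuous
    Assumption (illustrative only): missing values are stored as None.
    """
    problems = []
    for name, column in predictors.items():
        # Every predictor column must match the response column length.
        if len(column) != len(response):
            problems.append(
                f"{name}: length {len(column)} != response length {len(response)}"
            )
        present = [v for v in column if v is not None]  # missing values allowed
        if name in continuous:
            # Continuous predictors must be numeric.
            if any(isinstance(v, bool) or not isinstance(v, (int, float))
                   for v in present):
                problems.append(f"{name}: continuous predictor has non-numeric values")
        else:
            # Categorical predictors may be text or numeric.
            if any(not isinstance(v, (str, int, float)) for v in present):
                problems.append(f"{name}: categorical predictor must be text or numeric")
    return problems


response = ["pass", "fail", "pass"]
predictors = {
    "temperature": [20.5, None, 22.0],       # continuous, one missing value
    "material": ["steel", "alu", "steel"],   # categorical text
    "batch": [1, 2],                         # too short
}
for problem in check_predictors(response, predictors, {"temperature"}):
    print(problem)
```

Returning a list of problems instead of raising on the first one lets all column issues be reported in a single pass.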