A scientist at a food chemistry laboratory analyzes 60 soybean flour samples. For each sample, the scientist determines the moisture and fat content, and records near-infrared (NIR) spectral data at 88 wavelengths. The scientist randomly selects 54 of the 60 samples and uses partial least squares (PLS) regression to estimate the relationship between the responses (moisture and fat) and the predictors (the 88 NIR wavelengths). The scientist uses the remaining 6 samples as a test data set to evaluate the predictive ability of the model.
You can use this data to demonstrate Partial Least Squares Regression.
Worksheet column | Description | Variable type |
---|---|---|
C1-C88 | NIR spectral data at 88 wavelengths for the 54 samples used to fit the model | Predictor |
Moisture | Moisture of each flour sample | Response |
Fat | Fat content of each flour sample | Response |
C91-C178 | NIR spectral data at 88 wavelengths for the 6 samples used as a test set | Predictor |
Moisture2 | Moisture of each test set flour sample | Response |
Fat2 | Fat content of each test set flour sample | Response |