Document Type
Article
Publication Date
6-2026
Publication Title
Data in Brief
Abstract
Machine learning has become an increasingly important tool for overcoming agricultural challenges by enabling efficient and consistent classification of crop-related data. Training such supervised models requires high-quality labeled datasets. This work presents a dataset consisting of raw and preprocessed hyperspectral imaging (HSI) files capturing reflectance in the visible to near-infrared range (400–1000 nm) from two problematic weed species on California’s Central Coast: annual sowthistle (Sonchus oleraceus) and little mallow (Malva parviflora). Hyperspectral imaging provides rich spectral-spatial data cubes that can support the development of deep learning models and autonomous technology for precision weed management. Plants were grown in a greenhouse under five conditions: standard, drought, overwatering, excess fertilizer, and no fertilizer. Custom MATLAB scripts were used for preprocessing, including k-means clustering to define regions of interest (ROIs) and extraction of spectral metrics. Data visualization was performed using Wolfram Language and MATLAB. The dataset includes raw and ENVI-formatted hyperspectral cubes along with preprocessed MATLAB outputs, supporting spectral feature engineering, benchmark development, and exploratory machine learning workflows for controlled-environment stress classification.
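The preprocessing summarized in the abstract (k-means clustering to define ROIs, followed by extraction of spectral metrics) can be illustrated with a minimal sketch. This is not the authors' MATLAB code: it is written in Python with NumPy, it runs on a synthetic cube rather than the dataset files, and every name and value in it (the 20 x 20 scene, the 61 bands, the flat background and step-shaped plant spectra, the use of NDVI at roughly 670 nm and 800 nm to pick the plant cluster) is an assumption made for illustration.

```python
import numpy as np


def kmeans(X, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    # Start from the first sample, then repeatedly add the sample farthest
    # from all centers chosen so far.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its members (keep it if empty).
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels, centers


# Synthetic stand-in for a reflectance cube (rows x cols x bands).
# All values here are hypothetical, not taken from the dataset.
rng = np.random.default_rng(0)
wavelengths = np.linspace(400, 1000, 61)           # nm, 10 nm spacing
H, W = 20, 20
true_mask = np.zeros((H, W), dtype=bool)
true_mask[6:14, 6:14] = True                       # hypothetical plant region
background = np.full(wavelengths.size, 0.2)        # flat background spectrum
plant = np.where(wavelengths < 700, 0.05, 0.5)     # crude red-edge step
cube = np.where(true_mask[..., None], plant, background)
cube = cube + rng.normal(0.0, 0.01, size=cube.shape)

# Cluster per-pixel spectra into two groups.
pixels = cube.reshape(-1, wavelengths.size)
labels, centers = kmeans(pixels, k=2)

# Assumed ROI rule: the cluster whose center has the higher NDVI is the plant.
red = np.abs(wavelengths - 670).argmin()
nir = np.abs(wavelengths - 800).argmin()
center_ndvi = (centers[:, nir] - centers[:, red]) / (centers[:, nir] + centers[:, red])
roi_mask = (labels == center_ndvi.argmax()).reshape(H, W)

# Example spectral metrics: mean ROI spectrum and its NDVI.
roi_spectrum = cube[roi_mask].mean(axis=0)
roi_ndvi = (roi_spectrum[nir] - roi_spectrum[red]) / (roi_spectrum[nir] + roi_spectrum[red])
print("ROI pixels:", int(roi_mask.sum()), "NDVI:", round(float(roi_ndvi), 3))
```

For real data, the synthetic cube would be replaced by an array loaded from one of the ENVI-formatted files, and the number of clusters and the rule for choosing the plant cluster would need to be checked against the actual scenes.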
Recommended Citation
Brunnengraeber, Elijah; Cassidy, Tyler; Woodbridge, Dylan; Jani, Arun D.; and Sharma, Arun K., "Handheld Hyperspectral Imaging Dataset of Annual Sowthistle and Little Mallow Under Abiotic Stress for Machine Learning" (2026). Biology, Agriculture and Chemistry Faculty Publications and Presentations. 58.
https://digitalcommons.csumb.edu/biochem_fac/58
Comments
Published in Data in Brief by Elsevier Inc. Available via doi: 10.1016/j.dib.2026.112858.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)