• Repositories
Below are links to dataset repositories for machine learning tasks such as regression and classification. They include some soft sensor datasets (soft sensors are one of my areas of research). The datasets are also distinguished by whether they are static or dynamic and by whether they come from a soft sensor application.
• General Datasets Repositories
• Soft Sensors Datasets
Download: here;
Description: Stationary; extracted from a real wastewater treatment plant (WWTP); more info can be found on page 7 here or here;
Data Info: Continuous; Stationary; Number of inputs: 8; Number of samples: 1000; Output: Fluorine at the effluent stage;
Objective: Predict fluorine at the effluent stage;
In case of publication please cite: Francisco Souza, Rui Araújo, Tiago Matias, Jérôme Mendes. A Multilayer-Perceptron Based Method for Variable Selection in Soft Sensor Design. Journal of Process Control, 23(10):1371-1378, November 2013.
Debutanizer Column:
Download: Data for SRU Unit and Debutanizer Column (original link);
Description: Stationary; extracted from a real debutanizer plant; more info can be found on page XX of the Fortuna et al. Book;
Data Info: Continuous; Stationary; Number of inputs: 7; Number of samples: 2393; Output: Butane concentration;
Objective: Predict the butane concentration in a debutanizer column;
In case of publication cite the original book: Fortuna et al. Book.
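As a rough illustration of how the "Data Info" fields above translate into a modelling setup, the sketch below splits a table with 7 inputs and 1 output into inputs and target, then makes a train/test split that preserves row order. Everything here is an assumption made for illustration: the function names are hypothetical, the output is assumed to sit in the column right after the inputs, the 70/30 split is arbitrary, and the rows are random placeholders rather than the real data, whose file format is not described on this page.

```python
import random


def split_inputs_output(rows, n_inputs):
    """Split each row into inputs and output.

    Assumption: the output is stored in the column right after the inputs.
    """
    X = [row[:n_inputs] for row in rows]
    y = [row[n_inputs] for row in rows]
    return X, y


def ordered_split(X, y, train_fraction=0.7):
    """Use the first rows for training and the rest for testing (no shuffling)."""
    n_train = int(len(X) * train_fraction)
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])


# Placeholder rows standing in for the real file (random values, NOT real data).
N_SAMPLES, N_INPUTS = 2393, 7
rng = random.Random(0)
rows = [[rng.random() for _ in range(N_INPUTS + 1)] for _ in range(N_SAMPLES)]

X, y = split_inputs_output(rows, N_INPUTS)
(train_X, train_y), (test_X, test_y) = ordered_split(X, y)

print(len(X), "samples with", len(X[0]), "inputs each")
print(len(train_X), "training rows,", len(test_X), "test rows")
```

Keeping the row order in the split is a design choice, not something the dataset requires: process data is usually logged over time, so a shuffled split may give optimistic error estimates.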
• Regression Datasets
Boston Housing data set;
Description: regression dataset, also available here;
Further information about this dataset and its application in our research can be found in the following publication: Symone G. Soares, Carlos H. Antunes, and Rui Araújo. A genetic algorithm for designing neural network ensembles. In Proc. Genetic and Evolutionary Computation Conference (GECCO 2012), a recombination of the 21st International Conference on Genetic Algorithms (ICGA) and the 17th Annual Genetic Programming Conference (GP), pages 681–688, Philadelphia, USA, July 7-11, 2012. ACM.