A MapReduce relational-database index-selection tool
The physical design of data storage is a critical administrative task for optimizing system performance, and selecting indices properly is a fundamental part of that design. Index-selection optimization has been widely studied in database management systems (DBMSs). However, current DBMSs are not well suited to much of the data produced today, and several systems have been developed to handle such data. These systems still need an index-selection optimization approach; the need is even greater because they process Big Data. Under these circumstances, developing an index-selection tool for large-scale systems is a vital requirement. This thesis focuses on the index-selection process in HadoopDB. Its main contribution is a tool that uses data mining techniques to recommend an optimal index-set configuration. The evaluation shows a significant improvement in task running time under the index-set configuration recommended by the tool.
Language
- English
Degree
- Master of Science
Program
- Computer Science
Granting Institution
- Ryerson University
LAC Thesis Type
- Thesis