Abstract—The real estate market is difficult to understand but can yield high profits with the right techniques. People spend years in the market gaining the experience needed to evaluate a property and assess its price based on several factors, and those factors have to be extracted from large datasets. This report presents an approach to automating that process: the algorithm should extract the important factors and apply them to new data to predict the prices of new properties without human assistance. The project was divided into several parts. First, the data were collected from the City of Edmonton, then scanned, grouped, and prepared for classification. Next, unsupervised clustering techniques were used to divide the data into classes according to the patterns or groups present in the data. Finally, supervised learning with decision trees and ensemble bagged trees was applied to predict the classes of new data. The report concludes by interpreting the results and explaining how each technique was used to perform prediction through classification.
Class | Min | Max |
---|---|---|
Class 1 | 0 | 50000 |
Class 2 | 50000 | 100000 |
Class 3 | 100000 | 150000 |
Class 4 | 150000 | 200000 |
Class 5 | 200000 | 250000 |
Class 6 | 250000 | 300000 |
Class 7 | 300000 | 400000 |
Class 8 | 400000 | 500000 |
Class 9 | 500000 | 1.5 M |
Class 10 | 1.5 M | 10 M |
Class 11 | 10 M | ∞ |
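
The class boundaries in the table above can be applied to a property value with a simple lookup. The following is a minimal sketch, not the implementation used in this project. It assumes each class is a half-open interval [Min, Max), so that a boundary value such as 50000 falls into the higher class; the table itself does not state which side is inclusive. The names `UPPER_BOUNDS` and `price_class` are hypothetical and introduced only for illustration.

```python
from bisect import bisect_right

# Upper bounds (Max column) of Classes 1 to 10, copied from the table.
# Class 11 has no upper bound, so it needs no entry here.
UPPER_BOUNDS = [
    50_000, 100_000, 150_000, 200_000, 250_000, 300_000,
    400_000, 500_000, 1_500_000, 10_000_000,
]


def price_class(value):
    """Return the class number (1 to 11) for a property value.

    Assumption: intervals are half-open, [Min, Max), so a value equal
    to a boundary is assigned to the higher class.
    """
    if value < 0:
        raise ValueError("property value must be non-negative")
    # bisect_right counts how many upper bounds are <= value,
    # which is exactly the number of classes the value has passed.
    return bisect_right(UPPER_BOUNDS, value) + 1


if __name__ == "__main__":
    for v in (0, 49_999, 50_000, 350_000, 1_500_000, 10_000_000):
        print(v, "-> Class", price_class(v))
```

If the opposite convention is intended, with each Max value belonging to the lower class, replacing `bisect_right` with `bisect_left` gives intervals of the form (Min, Max] for all values above zero.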