International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 11 Issue: 08 | Aug 2024
p-ISSN: 2395-0072
www.irjet.net
A KNN-Linear Regression Fusion Approach for Improved Real Estate Price Estimation
Prit J. Kanadiya1, Pramila M. Chawan2
1 Final Year B. Tech, Department of Computer Engineering and Information Technology, VJTI Mumbai, Maharashtra, India
2 Associate Professor, Department of Computer Engineering and Information Technology, VJTI Mumbai, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract – In this study, we introduce a novel hybrid approach for house price prediction by integrating K-Nearest Neighbors (KNN) with Linear Regression. Our method leverages the strengths of both models to enhance predictive accuracy. Initially, we evaluate the effectiveness of geospatial features in Linear Regression and K-Nearest Neighbors for predicting house prices in Mumbai. We utilize two distinct datasets: one containing traditional features such as the number of bedrooms, square footage, and other property characteristics, and another incorporating geospatial data represented by latitude and longitude. Building on this analysis, we propose a method that first identifies the K nearest houses using KNN and then applies Linear Regression on this localized subset to predict the price of a test property. Our hybrid model demonstrates significant improvements in predictive performance, highlighting the critical role of spatial information in real estate valuation.
Key Words: House price prediction, Hybrid Prediction Model, Geospatial features, Predictive analysis, Feature Engineering, Machine learning
1. INTRODUCTION
Predicting house prices is a key challenge in machine learning with broad applications in real estate and urban planning. Despite significant advancements in this field, accurately forecasting property values remains difficult due to the numerous factors that influence prices. Traditionally, house price prediction models rely on features such as the number of bedrooms, the size of the property in square feet, and the building’s age. However, these factors alone often fail to capture the full picture of property value.
One crucial aspect that traditional models might overlook is the exact location of a property. Location, represented by latitude and longitude, can significantly impact house prices by reflecting factors like connectivity, neighborhood quality, and proximity to amenities. For example, the value of a house in a well-connected area with good schools and nearby public transport can differ greatly from that of a similar house located in a less desirable area. It might not be possible to capture these factors explicitly, but the location of a house models them implicitly. This spatial information can provide insights into aspects like accessibility and local services that are difficult to quantify but crucial for accurate price predictions.
In this study, we propose a new method that combines K-Nearest Neighbors (KNN) with Linear Regression to enhance house price prediction. We test this hybrid approach by comparing it with Linear Regression and KNN. Using two datasets, one with traditional features like the number of bedrooms and property size and another with geospatial features such as latitude and longitude, we aim to assess how incorporating geospatial data improves the accuracy of house price predictions. The paper is structured as follows: Section 2 reviews related literature to contextualize our approach. Section 3 details the methodology, which covers the dataset, feature engineering, the evaluation of the impact of geospatial features, and our proposed hybrid model. Section 4 presents and interprets the results, and Section 5 concludes the paper and discusses the implications of our findings.
2. LITERATURE REVIEW
House price prediction can be modeled as a supervised learning problem, where the goal is to predict the price of a house from its characteristics. Formally, we have a dataset D = {(Xi, Yi) | i = 1, 2, ..., n}, where Xi ∈ R^m is the feature vector of the i-th house and Yi ∈ R is the corresponding house price. Our goal is to learn a hypothesis function f that maps the features Xi of a house to its price Yi. We seek the optimal hypothesis f that minimizes the cost function J(θ).
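One standard way to write the cost function J(θ) is as the average per-example loss over the n houses in the dataset. In the sketch below, the 1/n normalization is an assumption, and θ is taken to denote the parameters of the hypothesis f, which the text implies but does not state.

```latex
% Empirical-risk form of the cost (a sketch; the 1/n factor is assumed).
% theta: parameters of the hypothesis f; X_i, Y_i: features and price of house i.
J(\theta) = \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f(X_i),\, Y_i\bigr)
```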
Here, ℓ(f(Xi), Yi) is the loss function, which measures the error between the predicted price f(Xi) and the actual price Yi.
© 2024, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal | Page 688