Atma Mani
I am an avid geospatial scientist and engineer with a background in software development, thermal and hyperspectral remote sensing, and several other areas of geospatial technology.
One of the fundamental questions in real estate is the question of ‘where’. Numerous studies indicate that the place you live can affect a multitude of wellness factors, including your life expectancy. While looking for a place to live, home buyers try to weigh several factors such as cost, distance to major facilities, noise, air quality, community, neighborhood, school district, and risk from natural calamities. Such analysis is not limited to house hunting. Business analysts and entrepreneurs run a similar multi-criteria analysis for a multitude of problems, such as finding a suitable spot for a new grocery store, dentist office, or coffee shop. In this talk, using house hunting as an example spatial analysis problem, we will explore how to read spatial and non-spatial data in Python as Pandas DataFrame objects, perform exploratory and statistical analysis, and visualize the results on a map in a Jupyter notebook. We will then score properties based on our criteria. Finally, we will train a machine learning model (in scikit-learn) to learn our preferences and let it make predictions for us in the future. We will use the free ArcGIS API for Python to perform spatial analysis and learn how it can easily interoperate with popular data analysis libraries in the scientific Python and geospatial Python ecosystems.
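As a rough illustration of the scoring step, the sketch below ranks a handful of listings with a weighted sum of min-max scaled criteria in Pandas. The listings, column names, weights, and the `score_listings` helper are all hypothetical and chosen only for this example; the talk's actual criteria and scoring scheme may differ.

```python
import pandas as pd

# Hypothetical listings; every value here is made up for illustration.
listings = pd.DataFrame(
    {
        "price": [450_000, 520_000, 390_000, 610_000],
        "commute_km": [12.0, 5.0, 20.0, 8.0],
        "school_rating": [7, 9, 5, 8],
    },
    index=["A", "B", "C", "D"],
)

# Assumed preference weights: a negative weight means "lower is better".
weights = {"price": -0.5, "commute_km": -0.3, "school_rating": 0.2}


def score_listings(df, weights):
    """Return a 0-1 score per row from min-max scaled, weighted criteria."""
    cols = list(weights)
    span = df[cols].max() - df[cols].min()
    # Guard against constant columns to avoid dividing by zero.
    scaled = (df[cols] - df[cols].min()) / span.where(span != 0, 1)
    # Flip "lower is better" criteria so that 1 is always the best value.
    for col, weight in weights.items():
        if weight < 0:
            scaled[col] = 1 - scaled[col]
    magnitudes = pd.Series(weights).abs()
    return (scaled * magnitudes).sum(axis=1) / magnitudes.sum()


listings["score"] = score_listings(listings, weights)
shortlist = listings.sort_values("score", ascending=False).head(2)
print(shortlist)
```

Min-max scaling keeps criteria with very different units (dollars, kilometers, ratings) comparable before weighting; other normalizations would work equally well here.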
Buying a house is a huge financial and personal undertaking for most people. Whether we realize it or not, many of the decisions we make are heavily influenced by the location of the houses. In this talk, I show how Python's data analysis and geospatial analysis packages can be used to analyze the whole gamut of available listings in a market, evaluate and score properties based on various attribute and spatial parameters, and arrive at a shortlist. I extend this by showing how the process can be used to build a machine learning model that learns our preferences and continues to learn as more data is fed to it. I conclude with ideas for future work and a look at how the rest of the industry is progressing in this field.
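The preference-learning idea can be sketched with scikit-learn as follows. This is a minimal example under stated assumptions: the `history` table, its `liked` labels, and the choice of logistic regression are hypothetical stand-ins, since the description does not say which features or estimator the talk uses.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical history of listings we reviewed, labeled 1 if we liked them.
history = pd.DataFrame(
    {
        "price": [450_000, 520_000, 390_000, 610_000, 480_000, 700_000],
        "commute_km": [12.0, 5.0, 20.0, 8.0, 6.0, 25.0],
        "school_rating": [7, 9, 5, 8, 8, 4],
        "liked": [1, 1, 0, 0, 1, 0],
    }
)
features = ["price", "commute_km", "school_rating"]

# Standardize the features, then fit a simple linear classifier.
# Logistic regression is an assumption; any scikit-learn classifier fits here.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(history[features], history["liked"])

# Two new, made-up listings to evaluate with the trained model.
new_listings = pd.DataFrame(
    {
        "price": [470_000, 680_000],
        "commute_km": [7.0, 22.0],
        "school_rating": [8, 5],
    }
)
predictions = model.predict(new_listings[features])
print(predictions)
```

As more listings are labeled, refitting the pipeline on the larger `history` table is the simplest way to let the model keep learning.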