Orthogonalization and machine learning methods for residential energy estimation with social and economic indicators

Publication Year
2020

Type
Journal Article

Abstract

This study identifies the key factors that influence residential energy use and examines how data transformations and analysis methods affect residential energy models built from social, economic, and demographic indicators at the ZIP code level, for Atlanta, GA and for the entire state of Georgia. Orthogonalization algorithms, machine learning and variable selection techniques, and ordinary least squares (OLS) are used to generate models of annual energy use for each ZIP code. For electricity, modeling log-transformed annual use with orthogonalization yielded better estimates than the other transformations [R² = 0.80, normalized root mean squared error (NRMSE) = 0.33, parameters = 15]; the natural gas estimates were better still (R² = 0.95, NRMSE = 0.15, parameters = 9). As expected, both models showed that socio-demographic factors are significant predictors. For natural gas, income and household make-up are the most important factors, while the electricity model draws on a broader variety of indicator types. Although the electricity model accounts for 80% of the variation in electricity use, its NRMSE remained moderately high (0.33). When electricity use was separated into two clusters (high and low usage), the high-use clusters appeared to match the morphology of the interstate infrastructure. These results show that electricity use, unlike natural gas use, is influenced by the morphology of the interstate roadway infrastructure as well as by socio-demographic factors.
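
The workflow named in the abstract (orthogonalize the indicators, fit OLS to log-transformed annual use, report R² and NRMSE) can be illustrated with a minimal sketch. This is not the authors' code: QR decomposition stands in for the unspecified orthogonalization algorithm, the function names are hypothetical, and NRMSE is normalized by the standard deviation of the observations, which is an assumption because the abstract does not say which normalization was used.

```python
import numpy as np

def orthogonalize(X):
    """Center the indicator columns and return an orthonormal basis (QR).

    Assumption: QR is used here only as one common orthogonalization;
    the abstract does not name the algorithm the study applied.
    """
    Xc = X - X.mean(axis=0)
    Q, R = np.linalg.qr(Xc)  # reduced mode: Q is (n, k), R is (k, k)
    return Q, R

def fit_log_ols(X, y):
    """OLS of log(y) on orthogonalized indicators; returns fitted log values."""
    Q, _ = orthogonalize(X)
    log_y = np.log(y)
    intercept = log_y.mean()
    # Columns of Q are orthonormal and sum to zero, so each coefficient
    # is a plain projection of the centered response onto that column.
    coef = Q.T @ (log_y - intercept)
    return intercept + Q @ coef, coef

def r_squared(obs, pred):
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def nrmse(obs, pred):
    """RMSE divided by the standard deviation of obs (assumed normalization)."""
    return np.sqrt(np.mean((obs - pred) ** 2)) / np.std(obs)
```

Called on a matrix of ZIP-code indicators `X` and a vector of positive annual energy use `y`, `fit_log_ols(X, y)` returns fitted values on the log scale, which can be scored with `r_squared(np.log(y), fitted)` and `nrmse(np.log(y), fitted)`.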

Journal
Applied Energy