LS-SPP: A LSTM-Based Solar Power Prediction Method from Weather
Forecast Information
Nhat-Tuan Pham∗, Nhu-Y Tran-Van†, Kim-Hung Le‡
University of Information Technology, Vietnam National University Ho Chi Minh City
Ho Chi Minh City, Vietnam
Email: ∗17521219@gm.uit.edu.vn, †17521287@gm.uit.edu.vn, ‡hunglk@uit.edu.vn
Abstract—Solar radiation is an unlimited source of clean energy with huge exploitation potential. To exploit this valuable resource effectively, solar forecasting has been shown to improve the incorporation of renewable energy into the grid system. Accurate solar prediction yields useful information to ensure the power grid's stability, gain the advantages of renewable energy, and minimize mineral resource consumption. In this paper, we introduce a novel deep learning model, namely LSTM-Based Solar Power Prediction (LS-SPP), built on the long short-term memory recurrent neural network (LSTM-RNN). The proposed model stacks two LSTM layers to produce high prediction accuracy from historical meteorological time series. Our practical experiments on real datasets show that the LS-SPP model achieves up to 96.78% accuracy, higher than the best competitor's reported 94.19%.

Index Terms—Solar power prediction, Long short-term memory, Industrial Internet of Things
I. INTRODUCTION

Solar power is a renewable, infinite, and environmentally friendly energy source that lowers pollutants and greenhouse gas emissions. According to the European Photovoltaic Industry Association (EPIA), solar PV installations have attracted strong investment: the total installed solar PV capacity globally in 2014 reached 177 GW, and CO2 emissions have decreased by about 53 million tons per year [1]. In Africa, many nations, especially those around the deserts, receive a great deal of sunlight every day. These countries have an opportunity to develop solar technology across the region. The distribution of PV potential is almost uniform in Africa, with most countries receiving about 2000 kWh/m² every year. Asia alone contributed 66.66% of the global solar power capacity installed in 2016, with about 50% coming from China [2].
Forecasting the capacity of renewable power sources, especially solar and wind power, has become more critical, along with benefits such as supplying power to the electrical system and taking advantage of on-site energy sources [3]. The deployment and connection of solar power plants to the national grid system also affects the grid's operations. Firstly, the output power of solar PV is not stable and frequently changes with high variation. For example, peak summer days may lead to higher output from solar plants, whereas rainy days generate small electrical output. A direct consequence is that the electricity system must keep a high redundant capacity to ensure an adequate supply of power to the system load [4]. Secondly, solar power is often interrupted suddenly. There is no dynamic reserve like rotating generators, so joining the grid at high density reduces the system's rotational inertia, degrading storage capacity and the stability of the grid system. Due to this uncertainty, accurate forecasting of renewable power sources' capacity plays a significant economic role, ensuring efficiency and stability. Greater insight into predicted solar values allows grid operators to manage variable output proactively and thus integrate solar resources into the existing grid at lower costs [5].
In recent years, there has been renewed interest in applying machine learning to solar forecasting. A linear time series prediction model was proposed to predict solar energy values over a horizon of up to 36 hours with a 15-minute observation interval, based on global radiation forecasts issued every 3 hours by the Danish Meteorological Institute [6]. Random forest regression models have been introduced and give positive results in day-ahead solar energy prediction based on weather data [7]. A model using the Expanded Extreme Learning Machine (EELM) was shown to predict solar energy about 5 minutes and 1 hour ahead with data collected from the National Renewable Energy Laboratory (NREL) [8]. An Artificial Neural Network (ANN) was developed for 24 hourly solar PV production predictions in Amman, Jordan, and gave better results than Extreme Learning Machines (ELM) [9]. ANNs can also be used to predict small-scale solar PV systems with 750 W solar panels [10]. Prediction models based on information obtained from weather forecasting and cloud cover have been developed for solar forecasting [11]. Building on previous studies, Support Vector Machines (SVM) were applied to hourly prediction with data provided by the National Weather Service (NWS) [12]. A predictive model based on images from different satellites applies SVM to 4 years of satellite data to configure the input and output data sets [13]. The least-squares SVM model predicts using the atmospheric transmissivity history as input data and returns the solar level based on the latitude of the place and the time of day [14]. A hybrid model applying heterogeneous regression algorithms was used to predict solar power supply capacity up to 6 hours ahead, based on past data in Rockhampton, Australia [15]. Hybrid models have also been proposed that combine the discrete wavelet transform (DWT) and the Auto-Regressive Moving Average (ARMA).
Machine learning is the process by which an algorithm changes its performance in response to data; the learning algorithm creates a set of rules based on inferences from the data [16]. It is easy to apply to various scenarios but can produce low performance and accuracy. Deep learning models, inspired by the neural architecture of the human brain, have been developed to overcome this problem. Common neural networks in deep learning are Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) [17]. Models using CNNs often have high complexity and heavy processing, leading to resource consumption, whereas RNNs are designed to process sequential or temporal data [18]. This suggests that RNNs are suitable for research and development in solar energy prediction. In this study, we aim to increase solar prediction accuracy by proposing a novel RNN model. The main idea is to use memory to save information from previous processing steps in order to make the most accurate prediction at the current step. To do this, the long short-term memory (LSTM) network, a particular form of RNN, is leveraged to avoid long-term dependency problems in historical solar data, quickly and appropriately improving predictive accuracy in many contexts.
In our evaluation, the proposed model is compared with existing models such as linear regression and random forest regression over a practical dataset provided by HI-SEAS, consisting of meteorological data collected from the HI-SEAS weather station over 4 months from September to December 2016. The experimental results show that our proposed model outperforms its competitors: its explained variance score is 96.78%, while the best competitor (Random Forest Regression) achieves about 94.19%.
The rest of this paper is organized as follows. We briefly introduce LSTM and then describe our proposal in Section II. Section III describes the model's implementation in detail, including the prediction network's training, data processing, and experimental results. In Section IV, we conclude our work.
II. THE LS-SPP METHOD
In this section, we briefly explain the LSTM model before describing in detail how our proposal produces an effective solar forecast.
A. LSTM model
LSTM is an enhanced version of RNN designed to overcome the vanishing gradient problem in backpropagation [19]. In more detail, the backpropagation of a small gradient value over time leads to forgetting what was seen before (short-term memory). LSTM has internal gates that regulate the flow of information, learning and deciding which important data are cached. This means that LSTM can learn how to retain only relevant information to make predictions. As a result, the predictions produced by LSTM achieve high accuracy.
Fig. 1. LSTM memory unit.
LSTM operations are based on the state of the cell and the different gates. The cell state carries relevant information during sequence processing. The gates decide whether information is memorized or discarded from the cell state during the training process; they include the Forget gate, the Input gate, and the Output gate. Figure 1 illustrates the LSTM memory unit.
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$  (1)

$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$  (2)

$\tilde{N}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$  (3)

$N_t = f_t \ast N_{t-1} + i_t \ast \tilde{N}_t$  (4)

$O_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$  (5)

$h_t = O_t \ast \tanh(N_t)$  (6)

The Forget gate uses the sigmoid function to determine which data are removed from memory. The values from the hidden state ($h_{t-1}$) and the current input ($x_t$) are passed to this function ($f_t$); data are dropped if the result is closer to 0 and retained if it is closer to 1, as represented in (1). The Input gate is responsible for updating the cell state: the data are passed through the functions ($i_t$) and ($\tilde{N}_t$) in (2) and (3). In ($i_t$), the data go through the sigmoid function [0, 1], and in ($\tilde{N}_t$) they pass through the tanh function [-1, 1]; the two results are then multiplied together, and values of the activation close to 1 are saved for later use. The cell memory updates itself by multiplying the output of the Forget gate with the cell state of the previous step and then adding the value from the Input gate ($N_t$), following (4). The Output gate is tasked with returning results based on the value of memory: the data pass through the sigmoid function ($O_t$) in (5), whereas the values from the cell state are processed by a tanh function. The next hidden state carries this value as ($h_t$), following (6).
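For reference, the following is a minimal NumPy sketch of a single LSTM step following equations (1)-(6); the weight matrices and dimensions are illustrative placeholders, not the trained parameters of LS-SPP.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, N_prev, W_f, W_i, W_c, W_o, b_f, b_i, b_c, b_o):
    """One LSTM step following equations (1)-(6).

    x_t: current input, h_prev: previous hidden state, N_prev: previous
    cell state. Each W_* acts on the concatenation [h_prev, x_t].
    """
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)             # (1) forget gate
    i_t = sigmoid(W_i @ z + b_i)             # (2) input gate
    N_tilde = np.tanh(W_c @ z + b_c)         # (3) candidate cell state
    N_t = f_t * N_prev + i_t * N_tilde       # (4) cell state update
    o_t = sigmoid(W_o @ z + b_o)             # (5) output gate
    h_t = o_t * np.tanh(N_t)                 # (6) new hidden state
    return h_t, N_t
```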
B. LS-SPP model
In this article, we evaluated and analyzed many models with different parameters to improve solar prediction accuracy. These models include Elastic Regression [20], Gradient Boosting Regression [21], Decision Tree Regression [22], XGBoost Regression [23], and Random Forest Regression [24]. The experiments show that our proposal produces better results than its competitors. LS-SPP has memory, which makes processing large datasets more accurate. Besides, it is also efficient without requiring knowledge about the relationships between features or classes. As shown in Figure 2, the Input layer feeds into two LSTM layers before reaching the Dropout and Dense layers. The proposed model enhances the accuracy of the predicted value and accelerates the time-series calculation process.
Fig. 2. The proposed model summary.
In more detail, we propose a deep learning model including two LSTM layers, one Dropout layer, and one Dense layer. The Input layer has input and output values of shape 10x1, including the features necessary for learning and training. In the learning process, the results returned from the previous layer become the next layer's input. The first LSTM layer consists of 224 kernels with input values from the Input layer to extract the data's attributes. The next LSTM layer has input values of shape 10x224, 96 kernels, and output values of shape 1x96. To limit overfitting, the Dropout layer randomly removes cell units during the LSTM learning process; these cell units neither receive nor transmit information, which reduces the number of parameters and thus minimizes the algorithm's training time and complexity. Table I describes the number of parameters in each layer. The total number of parameters in the proposed model is 325,857, where the first LSTM layer has 202,496 parameters, the next LSTM layer has 123,264 parameters, and the Dense layer has 97 parameters. The Dense layer is responsible for transforming the 96 attributes into a single figure for the predicted level of solar radiation energy.
TABLE I. The proposed model layers and their params

Layer               Params
LSTM (224 units)    202,496
LSTM (96 units)     123,264
Dropout             0
Dense (1 unit)      97
Total params        325,857
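A minimal Keras sketch consistent with the layer configuration and parameter counts described above; the dropout rate and optimizer are assumptions, as the paper does not report them.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

def build_ls_spp(timesteps=10, features=1):
    model = Sequential([
        # First LSTM layer: 224 kernels, returns the full sequence so the
        # second LSTM receives a 10x224 input (202,496 params).
        LSTM(224, return_sequences=True, input_shape=(timesteps, features)),
        # Second LSTM layer: 96 kernels, outputs a 1x96 vector (123,264 params).
        LSTM(96),
        # Dropout randomly disables units to limit overfitting;
        # the rate 0.2 is an assumption, not reported in the paper.
        Dropout(0.2),
        # Dense layer maps the 96 attributes to one radiation value (97 params).
        Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

model = build_ls_spp()
model.summary()  # total params: 325,857
```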
III. EVALUATION
A. Data Description

Our evaluation data are gathered from the Hawai'i Space Exploration Analog and Simulation (HI-SEAS) organization, via NASA's Hackathon on Solar Radiation Prediction. HI-SEAS is a research station that simulates Mars and Moon missions to collect and analyze data. The dataset consists of meteorological data collected from the HI-SEAS weather station over four months, from September to December 2016 [25]. The columns in the data include temperature (°F), humidity (%), barometric pressure (Hg), wind direction (°), wind speed (mph), sunrise and sunset times (Hawaii time), and solar radiation (W/m²). The total number of samples in the dataset is 32,686, with a 5-minute interval between samples. Sample solar radiation data over 24 hours are shown in Figure 3. The goal is to predict solar radiation based on values in the past.
Fig. 3. Solar radiation in 24 hours.
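As an illustration, the dataset can be loaded and inspected along these lines; the file name SolarPrediction.csv and the column names are assumptions based on the Kaggle listing [25] and the description above.

```python
import pandas as pd

# Assumed file name of the HI-SEAS download from the Kaggle dataset [25];
# the exact name may differ depending on the download.
df = pd.read_csv("SolarPrediction.csv")

print(df.shape)                     # expected: 32,686 samples at 5-minute intervals
print(df.columns.tolist())          # temperature, humidity, pressure, wind, etc.
print(df["Radiation"].describe())   # target variable, in W/m^2
```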
After exploring and analyzing the data to extract key variables and determine optimal factor settings, we added several columns to maximize data insights and improve the prediction accuracy. The "DayLengthinsec" column determines the duration of daylight in a day, calculated by taking "TimeSunSet" − "TimeSunRise" and converting the result to seconds. "time in sec" converts the collection time of each sample to seconds based on the value of the "Time" column. The two columns "Month" and "Day of Month" are derived from the value of the "Data" column to identify the month and the day of that month. The predicted value returned is Radiation. Training uses 21,899 samples, and 10,787 samples are kept for testing.
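A possible pandas sketch of the derived features and the train/test split described above; the column names and datetime formats are assumptions that would need to match the actual file.

```python
import pandas as pd

df = pd.read_csv("SolarPrediction.csv")  # see the loading sketch above

def add_features(df):
    """Derive the extra columns described in the text (names are assumptions)."""
    sunrise = pd.to_datetime(df["TimeSunRise"], format="%H:%M:%S")
    sunset = pd.to_datetime(df["TimeSunSet"], format="%H:%M:%S")
    # "DayLengthinsec": daylight duration = TimeSunSet - TimeSunRise, in seconds.
    df["DayLengthinsec"] = (sunset - sunrise).dt.total_seconds()

    t = pd.to_datetime(df["Time"], format="%H:%M:%S")
    # "time in sec": collection time of each sample converted to seconds.
    df["time_in_sec"] = t.dt.hour * 3600 + t.dt.minute * 60 + t.dt.second

    date = pd.to_datetime(df["Data"])
    # "Month" and "Day of Month" derived from the "Data" column.
    df["Month"] = date.dt.month
    df["DayOfMonth"] = date.dt.day
    return df

df = add_features(df)

# Chronological split into 21,899 training and 10,787 test samples.
train, test = df.iloc[:21899], df.iloc[21899:]
```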
B. Index of Performance

This section describes the performance evaluation criteria used for the prediction of solar radiation. In this paper, two criteria are selected to evaluate the error and accuracy of the LSTM model and to compare it with other models: the Mean Squared Error (MSE) and the Explained Variance Score (EVS).
$MSE \equiv \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$  (7)

$EVS \equiv 1 - \frac{\mathrm{Var}(Y_i - \hat{Y}_i)}{\mathrm{Var}(Y_i)}$  (8)
MSE is used to find errors or deviations in the learning process and is one of the most popular measures of the mean error value; its main purpose is to test and compare the difference between the actual and the predicted values. In (7), n is the size of the dataset, and $Y_i$ and $\hat{Y}_i$ are the actual and predicted values at the i-th time step, respectively.

EVS evaluates the performance of a model by measuring the difference between the predicted results and the actual data, which indicates the model's accuracy. According to the formula in (8), the highest value a model can achieve is 1.
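Both criteria are available in scikit-learn; a minimal sketch of how they would be computed on held-out predictions (the arrays below are toy placeholders, not the paper's data):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, explained_variance_score

def evaluate(y_true, y_pred):
    """Compute the two criteria used in this paper."""
    mse = mean_squared_error(y_true, y_pred)          # equation (7)
    evs = explained_variance_score(y_true, y_pred)    # equation (8); best value is 1
    return mse, evs

# Toy example with placeholder values.
mse, evs = evaluate(np.array([3.0, 5.0, 2.5]), np.array([2.5, 5.0, 3.0]))
print(f"MSE = {mse:.4f}, EVS = {evs:.4f}")
```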
C. Results

The evaluation criteria mentioned above are applied in this section to show the efficiency and accuracy of the proposed model. The training process shows that the error is significantly reduced from around the 200th epoch and the predictions get increasingly close to the actual values. Loss results during the training process are shown in Fig. 4. The proposed model gives low error and fast convergence in the learning process.

Fig. 4. Model training loss.
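Continuing from the sketches above, training could proceed along these lines; the epoch count, batch size, and the existence of pre-built 10-step input windows (X_train, y_train, X_test, y_test) are assumptions, since the paper only notes that the loss stabilizes after roughly the 200th epoch.

```python
import matplotlib.pyplot as plt

# X_train/y_train and X_test/y_test are assumed to be 10-step input windows
# and radiation targets built from the chronological split above.
history = model.fit(
    X_train, y_train,
    validation_data=(X_test, y_test),
    epochs=300,        # assumption; loss is reported to settle after ~200 epochs
    batch_size=64,     # assumption; not reported in the paper
    verbose=0,
)

# Plot the training curve corresponding to Fig. 4.
plt.plot(history.history["loss"], label="training loss")
plt.plot(history.history["val_loss"], label="validation loss")
plt.xlabel("epoch")
plt.ylabel("MSE loss")
plt.legend()
plt.show()
```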
Our proposed model achieves an accuracy of 96.78% and an MSE of 0.0021. This means that it can accurately predict future values by learning from historical data. We visualize sample predicted and actual values in Figure 5. As we can see, the LS-SPP predictions closely follow the ground-truth values, except for a few individual points.
Fig. 5. Predicted and actual data samples.
To demonstrate the superiority of our proposal, we also compare it with existing solar prediction algorithms. Our evaluation is based on the algorithms presented in, and the source code available on, GitHub [26]. Comparative values are obtained after running experiments on the same dataset and are summarized in Table II. We note from Table II that the MSE of our proposal is much lower than that of its competitors: the MSE of LS-SPP is about 0.0021, whereas the best competitor's is 5674.32 (Random Forest Regression). In addition, our proposal has a high EVS rating compared to the rest of the models. The EVS result for LS-SPP is 0.96, while the best results among the listed models are only 0.94 and 0.92, by Random Forest Regression and Gradient Boosting Regression, respectively. Lasso Regression is the lowest, at about 0.62. In short, we can easily see that LS-SPP outperforms all of its competitors.
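For reference, a sketch of how the baseline regressors in Table II could be re-run with scikit-learn on the same data; X_train_flat/X_test_flat are assumed flat (non-sequence) feature matrices, and hyperparameters are library defaults, so exact numbers may differ from [26].

```python
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error, explained_variance_score

baselines = {
    "Random Forest Regression": RandomForestRegressor(),
    "Gradient Boosting Regression": GradientBoostingRegressor(),
    "Decision Tree Regression": DecisionTreeRegressor(),
    "Elastic Net Regression": ElasticNet(),
    "Lasso Regression": Lasso(),
}

for name, reg in baselines.items():
    reg.fit(X_train_flat, y_train)        # flat tabular features, not 10-step windows
    pred = reg.predict(X_test_flat)
    print(name,
          "MSE:", round(mean_squared_error(y_test, pred), 2),
          "EVS:", round(explained_variance_score(y_test, pred), 4))
```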
IV. CONCLUSION

This article aims at building a solar forecasting model using deep learning, namely LS-SPP. Prediction accuracy is a key factor influencing the integration of solar energy into the grid system, and precise solar energy forecasting supports the move towards renewable energy plants that reduce greenhouse gas emissions. Our proposed model makes predictions based on time-series data learned in the past.
TABLE II. Comparing MSE and EVS values between LS-SPP and competitors

Model                          MSE        EVS (%)
LS-SPP (proposed)              0.0021     96.78
Random Forest Regression       5674.32    94.19
Gradient Boosting Regression   6594.22    93.25
XGBoost Regression             7193.13    92.64
Decision Tree Regression       10771.67   88.98
Ada Boost Regression           14649.23   85.02
Neural Network Model           15695.12   83.95
Elastic Net Regression         36504.79   62.67
Lasso Regression               36505.27   62.67
It was evaluated experimentally on historical meteorological time-series data provided by HI-SEAS. From the experimental results, LS-SPP shows the best results compared to other machine learning models: the proposed method's accuracy of up to 96.78% is 2.59 percentage points higher than Random Forest Regression's 94.19%. Besides, the proposed method's error is 0.0021, which is much lower than the 5674.32 of Random Forest Regression.
Our future work will focus on using different model architectures and analysis techniques to improve prediction accuracy, developing a model that is adaptable in many contexts, and applying it to IoT devices and edge computing. Additionally, we will research and exploit other renewable energy sources (e.g., wind, water) that take advantage of clean energy and integrate them into the grid system.
REFERENCES
[1] A. Jäger-Waldau, "The photovoltaic business: Manufacturers and markets," Clean Electricity from Photovoltaics, p. 613, 2015.
[2] "Global trends in solar energy sector." https://www.bioenergyconsult.com/trends-in-solar-energy-sector/.
[3] M. A. Delucchi and M. Z. Jacobson, "Providing all global energy with wind, water, and solar power, part II: Reliability, system and transmission costs, and policies," Energy Policy, vol. 39, no. 3, pp. 1170–1190, 2011.
[4] P. V. Gomes, J. T. Saraiva, L. Carvalho, B. Dias, and L. W. Oliveira, "Impact of decision-making models in transmission expansion planning considering large shares of renewable energy sources," Electric Power Systems Research, vol. 174, p. 105852, 2019.
[5] J. Cochran, P. Denholm, B. Speer, and M. Miller, "Grid integration and the carrying capacity of the US grid to incorporate variable renewable energy," tech. rep., National Renewable Energy Lab. (NREL), Golden, CO (United States), 2015.
[6] P. Bacher, H. Madsen, and H. A. Nielsen, "Online short-term solar power forecasting," Solar Energy, vol. 83, no. 10, pp. 1772–1783, 2009.
[7] S.-G. Kim, J.-Y. Jung, and M. K. Sim, "A two-step approach to solar power generation prediction based on weather data using machine learning," Sustainability, vol. 11, no. 5, p. 1501, 2019.
[8] S. Mishra, L. Tripathy, P. Satapathy, P. K. Dash, and N. Sahani, "An efficient machine learning approach for accurate short term solar power prediction," in 2020 International Conference on Computational Intelligence for Smart Power System and Sustainable Energy (CISPSSE), pp. 1–6, 2020.
[9] S. Al-Dahidi, M. Louzazni, and N. Omran, "A local training strategy-based artificial neural network for predicting the power production of solar photovoltaic systems," IEEE Access, vol. 8, pp. 150262–150281, 2020.
[10] E. Izgi, A. Öztopal, B. Yerli, M. K. Kaymak, and A. D. Şahin, "Short–mid-term solar power prediction by using artificial neural networks," Solar Energy, vol. 86, no. 2, pp. 725–733, 2012.
[11] N. Sharma, J. Gummeson, D. Irwin, and P. Shenoy, "Cloudy computing: Leveraging weather forecasts in energy harvesting sensor systems," in 2010 7th Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON), pp. 1–9, 2010.
[12] N. Sharma, P. Sharma, D. Irwin, and P. Shenoy, "Predicting solar generation from weather forecasts using machine learning," in 2011 IEEE International Conference on Smart Grid Communications (SmartGridComm), pp. 528–533, 2011.
[13] H. S. Jang, K. Y. Bae, H.-S. Park, and D. K. Sung, "Solar power prediction based on satellite images and support vector machine," IEEE Transactions on Sustainable Energy, vol. 7, no. 3, pp. 1255–1263, 2016.
[14] J. Zeng and W. Qiao, "Short-term solar power prediction using a support vector machine," Renewable Energy, vol. 52, pp. 118–127, 2013.
[15] M. R. Hossain, A. M. Oo, and A. S. Ali, "Hybrid prediction method of solar power using different computational intelligence algorithms," in 2012 22nd Australasian Universities Power Engineering Conference (AUPEC), pp. 1–6, IEEE, 2012.
[16] K. Smagulova and A. P. James, "A survey on LSTM memristive neural network architectures and applications," The European Physical Journal Special Topics, vol. 228, no. 10, pp. 2313–2324, 2019.
[17] M. Liang and X. Hu, "Recurrent convolutional neural network for object recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3367–3375, 2015.
[18] Z. C. Lipton, J. Berkowitz, and C. Elkan, "A critical review of recurrent neural networks for sequence learning," arXiv preprint arXiv:1506.00019, 2015.
[19] S. Xingjian, Z. Chen, H. Wang, D.-Y. Yeung, W.-K. Wong, and W.-c. Woo, "Convolutional LSTM network: A machine learning approach for precipitation nowcasting," in Advances in Neural Information Processing Systems, pp. 802–810, 2015.
[20] J. D. Tucker, J. R. Lewis, and A. Srivastava, "Elastic functional principal component regression," Statistical Analysis and Data Mining: The ASA Data Science Journal, vol. 12, no. 2, pp. 101–115, 2019.
[21] Y. Zhang and A. Haghani, "A gradient boosting method to improve travel time prediction," Transportation Research Part C: Emerging Technologies, vol. 58, pp. 308–324, 2015.
[22] M. Xu, P. Watanachaturaporn, P. K. Varshney, and M. K. Arora, "Decision tree regression for soft classification of remote sensing data," Remote Sensing of Environment, vol. 97, no. 3, pp. 322–336, 2005.
[23] T. Chen and C. Guestrin, "XGBoost: A scalable tree boosting system," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794, 2016.
[24] P. F. Smith, S. Ganesh, and P. Liu, "A comparison of random forest regression and multiple linear regression for prediction in neuroscience," Journal of Neuroscience Methods, vol. 220, no. 1, pp. 85–91, 2013.
[25] "Solar radiation prediction." https://www.kaggle.com/dronio/SolarEnergy/.
[26] shashanksira, "Solar-radiation-prediction." https://github.com/shashanksira/Solar-Radiation-Prediction, 2017.