Pauli Murto (1998), Neural network models for short-term load forecasting


Department of Engineering Physics and Mathematics

NEURAL NETWORK MODELS FOR SHORT-TERM LOAD FORECASTING

Pauli Murto

Supervisor: Professor Raimo P. Hämäläinen

Instructor: Lic. Tech. Arto Juusela

Helsinki, January 5, 1998


Author: Pauli Murto

forecasting by a large number of researchers. This work studies the applicability of this kind of model. The work is intended to be a basis for a real forecasting application.

First, a literature survey was conducted on the subject. Most of the reported models are based on the so-called Multi-Layer Perceptron (MLP) network. There are numerous model suggestions, but the large variation and lack of comparisons make it difficult to directly apply the proposed methods. It was concluded that a comparative study of different model types seems necessary.

Several models were developed and tested on the real load data of a Finnish electric utility. Most of them use an MLP network to identify the assumed relation between the future load and the earlier load and temperature data.

The models were divided into two classes. First, forecasting the load for a whole day at once was studied. Then hourly models, which are able to update the forecasts as new data arrives, were considered.

The test results showed that the hourly models are more suitable for a forecasting application. The forecasting errors were smaller than with a SARIMAX model, which was tested for comparison. The work suggests that this kind of hourly neural network model should be implemented for thorough on-line testing in order to reach a final opinion on its applicability.

(MLP) networks

(filled by the secretary)


Author: Pauli Murto

for short-term forecasting of electricity consumption. This work studies the applicability of such models. The work is intended as a basis for an actual forecasting application.

The work began with a study of the literature on the subject. Most of the methods presented in the literature are based on the so-called Multi-Layer Perceptron (MLP) network. Although many models have been proposed, their variability and the lack of mutual comparison make it difficult to apply the methods directly. It was concluded that an experimental comparison of different model types is necessary.

Models of different types were developed and studied on the real consumption data of a Finnish electric utility. Most of the models are based on the Multi-Layer Perceptron network, which was used to model the assumed dependence of future consumption values on earlier consumption and temperature values.

The models were divided into two classes. First, forecasting the whole daily consumption curve at once was studied. Then hourly models, with which the forecast can be updated whenever new data arrives, were considered.

The test results showed that the hourly models are more suitable for a forecasting application. The forecasting errors were smaller than with a SARIMAX model, which was examined as a point of comparison. The work proposes that an hourly neural network model of this kind should be taken into thorough trial use in order to form a final view of its applicability.

(filled in by the department secretary)


I have prepared this thesis at the Control Systems unit of ABB Power Oy. I want to thank the head of the unit, Aimo Sorsa, for providing the opportunity to do the work. I would also like to thank my instructor Arto Juusela and supervisor Raimo P. Hämäläinen. Juha Toivari deserves my thanks for many practical hints.

Particularly, I want to thank Tuomas Raivio for critical comments and valuable advice regarding this work.

Pauli Murto

Helsinki, January 5, 1998


1 INTRODUCTION
1.1 Background
1.2 Purpose of the work
1.3 Structure of the work

2 LOAD FORECASTING
2.1 Factors affecting the load
2.2 Properties of the load curve
2.3 Possible approaches
    Classifications of methods
    Some of the most popular methods
    Time-of-day models
    Regression models
    Stochastic time series models
    State-space models
    Expert systems

3 NEURAL NETWORKS IN LOAD FORECASTING
3.1 Multi-Layer Perceptron network (MLP)
    Description of the network
    Learning
    Generalization
3.2 MLP networks in load forecasting
3.3 Literature survey
    Basic MLP-models
    Peak, valley, and total load forecasting
    Hourly forecasting
    Unsupervised learning models
    Other reported approaches
    Summary

4 FORECASTING THE DAILY LOAD PROFILE
4.1 Forecasting with peak, valley, and average loads
    Description of the MLP network models
    Predicting the shape of the load curve
    Classifying the days
    The load shape based on peak- and valley loads
    Error measure
    Peak, valley, and average load forecasts
    Load shape forecasts
    Combining peak, valley, and average load forecasts with load shape predictions
4.2 Forecasting daily load on the basis of the previous day
    Basic idea
    Using Kohonen's self organizing feature map
    Overview of the model
    Test results
    A simple selection model
4.3 Summary

5 MODELS FOR HOURLY FORECASTING
5.1 Choosing the structure of the models
5.2 The model forecasting hour by hour
    Description of the model
    Results in forecasting one day at a time
    Forecasting without temperature data
    Including the temperature
    The average errors for different lead times
    Results in forecasting for one week at once
5.3 Comparison to a SARIMAX model
5.4 Utilizing only the most recent load values
    Test results
    One network for all hours
    Separate networks for different hours
5.5 Summary

6 CONCLUSIONS


1 Introduction

1.1 Background

Load forecasting is one of the central functions in power systems operations. The motivation for accurate forecasts lies in the nature of electricity as a commodity and trading article; electricity cannot be stored, which means that for an electric utility, an estimate of the future demand is necessary for managing production and purchasing in an economically reasonable way. In Finland, the electricity markets have recently opened, which is increasing the competition in the field.

Load forecasting methods can be divided into very short-, short-, mid- and long-term models according to the time span (see for example Karanta 1990). In very short-term forecasting the prediction time can be as short as a few minutes, while in long-term forecasting it is from a few years up to several decades. This work concentrates on short-term forecasting, where the prediction time varies between a few hours and about one week.

Short-term load forecasting (STLF) has lately been a very commonly addressed problem in power systems literature. One reason is that recent scientific innovations have brought in new approaches to solve the problem. The development in computer technology has broadened the possibilities for these and other methods working in a real-time environment. Another reason may be that there is an international movement towards greater competition in electricity markets (Räsänen 1995).

Even if many forecasting procedures have been tested and proven successful, none has achieved a strong stature as a generally applied method. A reason is that the circumstances and requirements of a particular situation have a significant influence on choosing the appropriate model. The results presented in the literature are usually not directly comparable to each other.

A majority of the recently reported approaches are based on neural network techniques (see section 3.3). Many researchers have presented good results. The attraction of the methods lies in the assumption that neural networks are able to learn properties of the load which would otherwise require careful analysis to discover. However, the development of the methods is not finished, and the lack of comparative results on different model variations is a problem. Therefore, to make use of the techniques in a real application, a comparative analysis of the properties of different model types seems necessary.

1.2 Purpose of the work

This work studies the applicability of different neural network models to short-term load forecasting. The approach is comparative. The models are divided into two classes: models forecasting the load for one whole day at a time, and models forecasting ahead hour by hour. Testing is carried out on the real load data of a Finnish electric utility. The objective is to arrive at suggestions for choosing the most appropriate model(s).

As there is a need to forecast the load accurately at all time spans, another goal is to study the performance of the models for different lead times. Intuitively, it seems possible that different models should be preferred for different time spans even within the short-term forecasting range.

The work provides the basis for an automatic forecasting application to be used in a real-time environment. The requirements for this are derived from its intended use within an energy management system (EMS) developed by ABB Power Oy. Some properties are considered important:

- The model should be automatic and able to adapt quickly to changes in the load behavior.

- The model is intended for use in many different cases. This means that generality

This work does not study forecasting for special days, such as religious and legal holidays. Special days have different consumption profiles from ordinary days, which makes forecasting very difficult for them. When implementing a real application, a means to take these days into account has to be found. The most common approach, but not necessarily the best one, is to treat them as Sundays.

1.3 Structure of the work

The following chapter concentrates on the subject of load forecasting in general. First, the properties of the load curve of an electric utility and different factors affecting the load are discussed. Then possible approaches to the problem are considered. The most popular conventional methods are briefly introduced.

The third chapter discusses neural network models and their use in load forecasting. First, a short general introduction to neural networks is given. Then, the most popular network type, the Multi-Layer Perceptron network (MLP), is described. The basic idea in applying MLP based methods to the problem at hand is given. A literature survey on neural network short-term load forecasting models is carried out at the end of the chapter.

As the models presented in the literature are intended either for forecasting the whole daily load curve at once or for forecasting the load hourly, this division is used here in testing different models.

Chapter four takes two approaches to forecasting the daily load curve. The first one uses an MLP network to forecast daily peak (maximum), valley (minimum), and average load values. The shape of the load curve is predicted separately. The second approach forecasts the whole load curve on the basis of the previous day.

In chapter five, the hourly models are taken into consideration. Two different methods using MLP networks are studied. The first one is intended for an arbitrary number of lead hours, and the other one only for short lead times. To get a more reliable opinion on the performance of the models, a seasonal ARIMAX model is also tested for comparison.

2 Load forecasting

2.1 Factors affecting the load

Generally, the load of an electric utility is composed of very different consumption units. A large part of the electricity is consumed by industrial activities. Another part is of course used by private people in the form of heating, lighting, cooking, laundry, etc. Also many services offered by society demand electricity, for example street lighting and railway traffic.

Factors affecting the load depend on the particular consumption unit. The industrial load is usually mostly determined by the level of production. The load is often quite steady, and it is possible to estimate its dependency on different production levels. However, from the point of view of the utility selling electricity, the industrial units usually add uncertainty to the forecasts. The problem is the possibility of unexpected events, like machine breakdowns or strikes, which can cause large unpredictable disturbances in the load level.

In the case of private people, the factors determining the load are much more difficult to define. Each person behaves in his own individual way, and human psychology is involved in each consumption decision. Many social and behavioral factors can be found. For example, big events, holidays, and even TV programs affect the load (Gross and Galiana 1987, Karanta 1990, Kim et al. 1995). The weather is the most important individual factor, the reason largely being the electric heating of houses, which becomes more intensive as the temperature drops (Kallio 1985).

As a large part of the consumption is due to private people and other small electricity customers, the usual approach in load forecasting is to concentrate on the aggregate load of the whole utility. This is also the approach taken in this work. This reduces the number of factors that can be taken into account, the most important being (Gross and Galiana 1987):

- In the short run, the meteorological conditions cause large variation in this aggregated load. In addition to the temperature, wind speed, cloud cover, and humidity also have an influence (see, e.g., Chow and Leung 1996, Kallio 1985, Khotanzad et al. 1996).

- In the long run, the economic and demographic factors play the most important role in determining the evolution of the electricity demand.

- From the point of view of forecasting, the time factors are essential. By these, various seasonal effects and cyclical behaviors (daily and weekly rhythms) as well as occurrences of legal and religious holidays are meant.

The other factors causing disturbances can be classified as random factors. These are usually small in the case of individual consumers, although large social events and popular TV programs add uncertainty to the forecasts. Industrial units, on the other hand, can cause relatively large disturbances.

Only short-term forecasting will be dealt with in this work, and the time span of the forecasts will not range further than about one week ahead. Therefore, the economic and demographic factors will not be discussed.

The decision to combine all consumption units into one aggregate load means that the forecasting rests largely on the past behavior of the load. Time factors play the key role in the analysis of this work. In the next section, the load behavior of a Finnish electric utility is examined, and some basic properties of the time series are discussed.

2.2 Properties of the load curve

In this work, the load curve to be forecast consists of hourly load values, which are in reality hourly averages. This means that the load curve can be seen as a time series of real numbers, each being the average load of one hour. Although the number of observations is restricted to 24 per day, the models studied can be applied with slight modifications to cases where the interval between observations is shorter.

The hourly electric load demand of a Finnish electric utility is used throughout this work as the test case. The hourly temperature data from the influential district is also available. The data ranges from May 1996 to August 1997, so the length of the data set is about 15 months.

For more thorough testing, load data of an even longer time period would be preferred. The problem is the fact that some of the models studied in the work use the load data of the whole year for parameter estimation. Therefore, for the model testing, only the summer data is left. In the case studied in this work, more data is,

The load curve over one year is shown in figure 2.1. The seasonal trend can be easily seen; in the winter, the average load is about twice as high as in the summer. The extent of this property is a special characteristic of Finland's load conditions, and is due to great differences between the weather conditions of different seasons of the year.

Figure 2.1: load over the period May 24, 1996 to May 23, 1997

There are also shorter cyclical effects, which can be seen from the autocorrelation function of the time series. This is shown in figure 2.2. The peaks at lags 24, 48, 72, … indicate the daily rhythm, and the peaks at the multiples of 168 mean that a weekly rhythm also exists.
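As an illustration of how such peaks arise, the following Python sketch computes a sample autocorrelation function for a synthetic hourly series. The series (a daily and a weekly sinusoid plus noise) and the function name `sample_autocorrelation` are assumptions made for this example; the utility data used in this work is not reproduced here.

```python
import numpy as np

def sample_autocorrelation(series, max_lag):
    """Sample autocorrelation r(k) for lags k = 0..max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

# Synthetic hourly "load" (an assumption, not the thesis data):
# a daily rhythm, a weaker weekly rhythm, and some noise.
rng = np.random.default_rng(0)
hours = np.arange(24 * 7 * 20)  # 20 weeks of hourly values
load = (100.0
        + 20.0 * np.sin(2 * np.pi * hours / 24)
        + 10.0 * np.sin(2 * np.pi * hours / 168)
        + rng.normal(0.0, 2.0, hours.size))

acf = sample_autocorrelation(load, max_lag=200)
print("r(12) =", round(acf[12], 3))
print("r(24) =", round(acf[24], 3))
print("r(168) =", round(acf[168], 3))
```

In this synthetic case the autocorrelation is negative at lag 12 (half a day), high at lag 24, and higher still at lag 168, where the daily and weekly rhythms are both in phase.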


Figure 2.2: The sample autocorrelation function of the load series shown in figure 2.1.

The weekly rhythm originates from the rhythm of working days and weekends obeyed by most people. On working days social activities are at a higher level than on Saturdays and Sundays, and therefore the load is also higher. In figure 2.3, the load over two successive weeks in April 1997 is shown. The series begins with five quite similar patterns, which are the load curves of Monday to Friday. Then two different patterns for Saturday and Sunday follow. This same weekly pattern is then repeated.

Figure 2.3: The load over the period April 14 – April 27, 1997 The first day is Monday.


The daily rhythm, on the other hand, results from the synchronous behavior of people during the day. Most people sleep at night, and therefore the load is low at night hours. Also during the day, many activities tend to be simultaneous for a majority of people (working time, lunch hour, TV watching, etc.). The daily rhythm changes throughout the year. In figure 2.4, load curves of typical Wednesdays at different parts of the year are shown.

Figure 2.4: The load curves of four Wednesdays at different seasons.

As seen in figure 2.3, there are of course differences also between the days in the same season. Therefore, in load forecasting, days are often divided into several day types, each of which has its own characteristic load pattern. It is clear that Saturdays and Sundays have different load curves than other days. Often Mondays and/or Fridays are also separated from other working days, because the closeness of the weekend can have a slight effect on the load (see, e.g., Kim et al. 1995). A more difficult question is the classification of the special days (for example legal and religious holidays). Sometimes they are classified in the same category with Sundays (e.g. Hsu and Yang 1991). However, different special days have different load profiles.

In this work, classification of the days will be used in all forecasting models. The guiding principle is that there are three distinct classes: 1) Mondays-Fridays, 2) Saturdays, and 3) Sundays. The classification of the special days is not examined.
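The three-class division can be expressed in a few lines of Python. This is only a sketch: the function name `day_class` and the numeric class labels are chosen for the example, and special days are deliberately not handled, as in the text.

```python
import datetime

def day_class(date):
    """Return 1 for Monday-Friday, 2 for Saturday, 3 for Sunday.

    Special days (legal and religious holidays) are not treated separately.
    """
    weekday = date.weekday()  # Monday == 0, ..., Sunday == 6
    if weekday <= 4:
        return 1
    return 2 if weekday == 5 else 3

# April 14, 1997 is the Monday that starts the period shown in figure 2.3.
print(day_class(datetime.date(1997, 4, 14)))
print(day_class(datetime.date(1997, 4, 19)))
print(day_class(datetime.date(1997, 4, 20)))
```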

[Figure 2.4, panel for April 16, 1997; horizontal axis: Hours]

2.3 Possible approaches

Classifications of methods

The system load to be forecast is a random nonstationary process composed of thousands of individual components. Therefore, the range of possible approaches to the forecasting is wide. Usually the only possibility is to take a macroscopic view of the problem, and try to model the future load as a reflection of earlier behavior. This still leaves the field open to very different solutions. Due to the nature of the load, the only objective method to evaluate the approaches is through experimental evidence.

There are many ways to classify the approaches. One possibility is to categorize the models into two basic classes: time-of-day (nondynamic) models and dynamic models (Gross and Galiana 1987). In time-of-day models, the load is expressed at once as a discrete time series consisting of the predicted values for each hour of the forecasting period. Often, the load is modeled as a sum of a standard load curve (SLC) and a residual (Bunn and Farmer 1985). The dynamic models, on the other hand, recognize the fact that the load is not only a function of the time of the day, but also of its most recent behavior. In these models the forecasting is therefore typically performed recursively; the forecast for a certain hour requires forecasts for the preceding hours.

Other possible classifications of approaches are, for example, the divisions between deterministic / stochastic models, aggregated load / consumer category models, and peak load / load shape models (Karanta and Ruusunen 1991, Gross and Galiana 1987).

The deterministic models provide only the forecast values, not a measure for the forecasting error. The stochastic models, on the other hand, provide the forecast as the expectation of the identified stochastic process. They allow calculations on statistical properties of the forecasting error (which of course rely on the assumptions made on the model).

All considerations in this work concentrate on aggregated load. Attempts have also been made to divide the consumers into categories and forecast each category separately (see, e.g., Allera and McGowan 1986, Broehl 1981, Goh et al. 1983). The forecast for the total load of the utility is then obtained by adding up the categories. The problem is the large amount of work needed in constructing adequate models for each consumer category.

The division between peak load models and load shape models is quite fundamental. The peak load models only forecast the daily peak loads, and load shape models forecast load values for all hours (or half-hours). This work concentrates on load shape models, although peak load forecasting is treated in chapter 4 as the first step of creating the hourly forecast.

Some of the most popular methods

Time-of-day models

In the simplest form, a time-of-day model takes the previous week's actual load pattern as a model to predict the present week's load. Alternatively, a set of load patterns is stored for typical weeks with different weather conditions. These are then heuristically combined to create the forecast.
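The simplest variant, copying the previous week, can be sketched as follows. The function name `previous_week_forecast` and the assumption that the history is a plain array of hourly loads ending at the forecasting moment are made for this example only.

```python
import numpy as np

HOURS_PER_WEEK = 168

def previous_week_forecast(load_history):
    """Forecast the coming week as a copy of the last 168 observed hourly loads."""
    history = np.asarray(load_history, dtype=float)
    if history.size < HOURS_PER_WEEK:
        raise ValueError("need at least one full week of hourly loads")
    return history[-HOURS_PER_WEEK:].copy()

# Usage with a dummy history of 400 hourly values.
forecast = previous_week_forecast(np.arange(400.0))
print(forecast.shape, forecast[0], forecast[-1])
```

Such a forecast ignores the weather and the most recent load behavior altogether, which is the limitation the stored-pattern and dynamic approaches try to address.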

More commonly, a time-of-day model is of the form (Gross and Galiana 1987):

$$
z(t) = \sum_{i=1}^{N} \alpha_i f_i(t) + v(t) \qquad (2.1)
$$

where the load z(t) at time t is expressed as a weighted sum of explicit time functions f_i(t), usually sinusoids with a period of 24 or 168 hours. The coefficients α_i are slowly varying constants, usually estimated through linear regression or exponential smoothing. The modeling error v(t) is assumed to be white noise.
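A minimal Python sketch of estimating the coefficients by ordinary linear regression is given below. The choice of time functions (a constant plus one sine and cosine pair for each of the periods 24 and 168) and all function names are assumptions for the example; the cited models may use other functions or exponential smoothing instead.

```python
import numpy as np

def time_function_matrix(t, periods=(24.0, 168.0)):
    """Columns f_i(t): a constant, then a sine/cosine pair for each period."""
    t = np.asarray(t, dtype=float)
    columns = [np.ones_like(t)]
    for period in periods:
        columns.append(np.sin(2.0 * np.pi * t / period))
        columns.append(np.cos(2.0 * np.pi * t / period))
    return np.column_stack(columns)

def fit_time_of_day_model(t, z, periods=(24.0, 168.0)):
    """Least-squares estimate of the coefficients alpha_i."""
    F = time_function_matrix(t, periods)
    alpha, *_ = np.linalg.lstsq(F, np.asarray(z, dtype=float), rcond=None)
    return alpha

def predict_time_of_day_model(t, alpha, periods=(24.0, 168.0)):
    """Evaluate the weighted sum of time functions at the hours t."""
    return time_function_matrix(t, periods) @ alpha

# Usage on a noise-free synthetic series with known coefficients.
t = np.arange(336)
z = 50.0 + 5.0 * np.sin(2 * np.pi * t / 24) + 3.0 * np.cos(2 * np.pi * t / 168)
alpha = fit_time_of_day_model(t, z)
print(np.round(alpha, 6))
```

Because the forecast depends only on the hour index, it can be produced for any horizon at once, which is what distinguishes this model class from the dynamic models.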

Spectral decomposition is another time-of-day model. The model has basically the same form as in (2.1), but the time functions f_i(t) represent the eigenfunctions corresponding to the autocorrelation function of the load time series. Functions of this kind can in principle represent the colored random loads with greater precision than arbitrarily selected time functions.

Time-of-day models have been suggested, for example, by Sharma (1974) and Thompson (1976). An example of applying spectral decomposition can be found in Laing (1985).

Regression models

Regression models normally assume that the load can be divided into a standard load component and a component linearly dependent on some explanatory variables. The model can be written:

$$
z(t) = b(t) + \sum_{i=1}^{n} a_i y_i(t) + \varepsilon(t)
$$

where b(t) is the standard load, ε(t) is a white noise component, and y_i(t) are the independent explanatory variables. The most typical explanatory variables are weather factors.
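The following Python sketch fits such a model with a single explanatory variable. It assumes, for illustration only, that the standard load b(t) takes one value for each hour of the day and that temperature is the only explanatory variable; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def fit_regression_model(hour_of_day, temperature, load):
    """Least-squares fit of load = b[hour] + a * temperature + noise.

    Returns the 24 standard-load values b and the temperature coefficient a.
    """
    hour_of_day = np.asarray(hour_of_day, dtype=int)
    n = hour_of_day.size
    X = np.zeros((n, 25))
    X[np.arange(n), hour_of_day] = 1.0   # one indicator column per hour of day
    X[:, 24] = temperature               # the explanatory variable y_1(t)
    coef, *_ = np.linalg.lstsq(X, np.asarray(load, dtype=float), rcond=None)
    return coef[:24], coef[24]

# Synthetic example: the load falls by 2 units per degree of temperature.
rng = np.random.default_rng(0)
hours = np.arange(24 * 30) % 24
temp = rng.normal(0.0, 5.0, hours.size)
b_true = 100.0 + 10.0 * np.sin(2 * np.pi * np.arange(24) / 24)
load = b_true[hours] - 2.0 * temp
b_hat, a_hat = fit_regression_model(hours, temp, load)
print(round(float(a_hat), 6))
```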

A typical regression model has been used by Räsänen and Ruusunen (1992). They model different consumer categories by separate regression models. The load is divided into a rhythm component and a temperature dependent component. The rhythm component corresponds to the load of a certain hour at the average temperature of the modeling period.

More complicated model variations have also been proposed. Some models use earlier load values as explanatory variables in addition to external variables (e.g. Papalexopoulos and Hesterberg 1990).

Regression models are among the oldest methods suggested for load forecasting. They are quite insensitive to occasional disturbances in the measurements. The easy implementation is another strength. The serial correlation, which is typical when regression models are used on time series, can cause problems.

Stochastic time series models

This is a very popular class of dynamic forecasting models (see, e.g., Karanta and Ruusunen 1991, Piggott 1985, Hagan and Behr 1987). There are many names encountered in the literature for the class, for example ARMA (autoregressive moving average) models, ARIMA (autoregressive integrated moving average) models, Box-Jenkins method, linear time series models, etc. A general treatment of the model type can be found in, e.g., Pindyck and Rubinfeld (1991).

The basic principle is that the load time series can first be transformed into a stationary time series (i.e. invariant with respect to time) by a suitable differencing

assume that the properties of the time series remain unchanged for the period used in model estimation, and all disturbances are due to this white noise component contained in the identified process.

The basic ARIMA model can be written:

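$$
\phi(B)\,\nabla^{d} z(t) = \theta(B)\,a(t)
$$

where

z(t), t = 1, …, N is the modeled time series

a(t), t = 1, …, N is a white noise sequence

$$
\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}
$$

$$
\theta(B) = 1 - \theta_1 B - \dots - \theta_q B^{q}
$$

$$
\nabla^{d} = (1 - B)^{d}
$$

B is the backward shift operator (B^n z(t) = z(t-n))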

This basic ARIMA model is not by itself suitable for describing the load time series, since the load series incorporates seasonal variation. Therefore, differencing with the period of the seasonal variation (usually 24 and 168) is required. The model then obtained is called a seasonal ARIMA (SARIMA) model and can be written (Box and Jenkins 1976):

$$
\phi(B)\,\Phi(B^{S})\,\nabla^{d}\,\nabla_{S}^{D} z(t) = \theta(B)\,\Theta(B^{S})\,a(t)
$$

where

$$
\nabla_{S}^{D} = (1 - B^{S})^{D}
$$
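The effect of the differencing operators can be tried out directly. The Python sketch below applies (1 - B^S)^D to a series; the function name `seasonal_difference` and the synthetic series (a daily sinusoid plus a linear trend) are assumptions for the example, and the sketch covers only the differencing step, not the estimation of a SARIMA model.

```python
import numpy as np

def seasonal_difference(z, season, order=1):
    """Apply (1 - B^season)^order to z; each pass shortens the series by `season`."""
    z = np.asarray(z, dtype=float)
    for _ in range(order):
        z = z[season:] - z[:-season]
    return z

# Synthetic series: a daily rhythm plus a linear trend of 0.5 per hour.
t = np.arange(200)
z = 10.0 * np.sin(2 * np.pi * t / 24) + 0.5 * t

d24 = seasonal_difference(z, season=24)   # removes the daily rhythm
d = seasonal_difference(d24, season=1)    # ordinary differencing, (1 - B)
print(d24[:3], d[:3])
```

Differencing with period 24 removes the daily sinusoid and turns the trend into the constant 24 x 0.5 = 12; one further ordinary difference leaves a series of zeros.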

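An external input variable, such as temperature in the case of load time series, can also be included in the model. Such a variant of the ARIMA model is called an ARIMAX model, and can in general be written:

$$
\phi(B)\,z(t) = \omega(B)\,u(t) + \theta(B)\,a(t)
$$

where u(t) is the external input variable and ω(B) is a polynomial in the backward shift operator B.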

The stochastic time series models have many attractive features. First, the theory of the models is well known and therefore it is easy to understand how the forecast is composed. The properties of the model are easy to calculate; the estimate for the variance of the white noise component allows confidence intervals for the forecasts to be created.

The model identification is also relatively easy. Established methods for diagnostic checks are available. Moreover, the estimation of the model parameters is quite straightforward, and the implementation is not difficult.

The weakness of the stochastic models is in the adaptability. In reality, the load behavior can change quite quickly at certain parts of the year. Since in ARIMA models the forecast for a certain hour is in principle a function of all earlier load values, the model cannot adapt to the new conditions very quickly, even if the model parameters are estimated recursively. A forgetting factor can be used to give more weight to the most recent behavior and thereby improve the adaptability.
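The role of the forgetting factor can be illustrated with recursive least squares on a simple linear regression, used here as a stand-in for the recursive estimation of ARIMA parameters. The function name, the initial value `delta`, and the synthetic data with an abrupt change in the true coefficient are all assumptions for the example.

```python
import numpy as np

def recursive_least_squares(X, y, forgetting=1.0, delta=1000.0):
    """Recursive least squares; forgetting < 1 discounts old observations."""
    n_params = X.shape[1]
    theta = np.zeros(n_params)
    P = delta * np.eye(n_params)      # large initial value: weak prior on theta
    for x_t, y_t in zip(X, y):
        Px = P @ x_t
        gain = Px / (forgetting + x_t @ Px)
        theta = theta + gain * (y_t - x_t @ theta)
        P = (P - np.outer(gain, Px)) / forgetting
    return theta

# The true coefficient jumps from 0.5 to 0.9 halfway through the data.
rng = np.random.default_rng(1)
x = rng.normal(size=(400, 1))
true_coef = np.where(np.arange(400) < 200, 0.5, 0.9)
y = true_coef * x[:, 0] + rng.normal(0.0, 0.01, 400)

theta_plain = recursive_least_squares(x, y, forgetting=1.0)
theta_forget = recursive_least_squares(x, y, forgetting=0.95)
print(theta_plain, theta_forget)
```

Without forgetting, the final estimate is a compromise between the old and the new coefficient; with a forgetting factor of 0.95 it follows the new value closely, at the cost of a noisier estimate when conditions are stable.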

Another problem is the handling of anomalous load conditions. If the load behavior is abnormal on a certain day, this deviation from the normal conditions will be reflected in the forecasts into the future. A possible solution to the problem is to replace the abnormal load values in the load history by the corresponding forecast values.

The state vector at time t is x(t), and u(t) is a weather variable based input vector. w(t) is a vector of random white noise inputs. Matrices A, B, and the vector c are assumed

models is the possibility to use a priori information in parameter estimation via Bayesian techniques. Yet, they point out that the advantages are not very clear and more experimental comparisons are needed.

Expert systems

Expert systems are heuristic models, which are usually able to take both quantitative and qualitative factors into account. Many models of this type have been proposed since the mid 1980's. A typical approach is to try to imitate the reasoning of a human operator. The idea is then to reduce the analogical thinking behind the intuitive forecasting to formal steps of logic (Rahman and Bhatnagar 1988, Rahman and Hazim 1993).

A possible method for a human expert to create the forecast is to search the history database for a day that corresponds to the target day with regard to the day type, social factors and weather factors. Then the load values of this similar day are taken as the basis for the forecast. An expert system can thereby be an automated version of this kind of search process (Jabbour et al. 1988).
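An automated similar-day search could look roughly like the Python sketch below. It is not the method of the cited systems: the function name, the use of a Euclidean distance over numeric weather features, and the restriction to days of the same day type are assumptions made for illustration.

```python
import numpy as np

def most_similar_day(history_features, history_day_types,
                     target_features, target_day_type):
    """Index of the historical day of the same type closest to the target day."""
    features = np.asarray(history_features, dtype=float)
    day_types = np.asarray(history_day_types)
    candidates = np.flatnonzero(day_types == target_day_type)
    if candidates.size == 0:
        raise ValueError("no historical day of the requested day type")
    target = np.asarray(target_features, dtype=float)
    distances = np.linalg.norm(features[candidates] - target, axis=1)
    return int(candidates[np.argmin(distances)])

# Four historical days described by mean temperature only (hypothetical values).
temperatures = [[-5.0], [2.0], [10.0], [3.0]]
day_types = [1, 1, 2, 1]   # 1 = Monday-Friday, 2 = Saturday
print(most_similar_day(temperatures, day_types, [9.0], 1))
```

The load curve of the returned day would then serve as the starting point for the forecast. In the usage example the day with 10.0 degrees is closest in temperature but is a Saturday, so the search returns the working day with 3.0 degrees instead.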

On the other hand, the expert system can consist of a rule base defining relationships between external factors and daily load shapes. Recently, a popular approach has been to develop rules on the basis of fuzzy logic (see, e.g., Hsu and Ho 1992, Kim et al. 1995, Momoh and Tomsovic 1995).

The heuristic approach in arriving at solutions makes the expert systems attractive for system operators; the system can provide the user with the line of reasoning followed by the model (Asar and McDonald 1994).

3 Neural networks in load forecasting

Neural networks, or artificial neural networks (ANN) as they are often called, refer to a class of models inspired by biological nervous systems. The models are composed of many computing elements, usually denoted neurons, working in parallel. The elements are connected by synaptic weights, which are allowed to adapt through a learning process. Neural networks can be interpreted as adaptive machines, which can store knowledge through the learning process.

The research in the field has a history of many decades, but after a diminishing interest in the 1970's, a massive growth started in the early 1980's. Today, neural networks have applications, for example, in pattern recognition, identification, speech recognition, vision, classification, and control systems.

There are many types of neural network models, the common feature in them being the connection of the ideas to biological systems. The models can be categorized in many ways. One possibility is to classify them on the basis of the learning principle.

A neural network uses either supervised or unsupervised learning. In supervised learning, the network is provided with example cases and desired responses. The network weights are then adapted in order to minimize the difference between network outputs and desired outputs. In unsupervised learning the network is given only input signals, and the network weights change through a predefined mechanism, which usually groups the data into clusters of similar data.

The most common network type using supervised learning is a feed-forward (signal transfer) network. The network is given an input signal, which is transferred forward through the network. Eventually, an output signal is produced. The network can be understood as a mapping from the input space to the output space, and this mapping is defined by the free parameters of the model, which are the synaptic weights connecting the neurons. The most popular of all neural networks, the Multi-Layer Perceptron network (MLP), is of this type. This network is described in section 3.1.

The ANN models are researched in connection with many power system applications, short-term forecasting being one of the most typical areas. Most of the suggested models use MLP networks (see, e.g., Park et al. 1991a, Lee and Park 1992, Ho et al. 1992, Peng et al. 1992, Chen et al. 1992, Lu et al. 1993, Asar and McDonald 1994, Mohammed et al. 1995, Khotanzad et al. 1995). The attraction of MLP has been explained by the ability of the network to learn complex relationships between input and output patterns, which would be difficult to model with conventional algorithmic methods. In the models, inputs to the network are generally present and past load values and outputs are future load values. The network is trained using actual load data from the past.

In addition to MLP forecasters, models based on unsupervised learning have also been suggested for load forecasting (see, e.g., Hsu and Yang 1991, Djukanovic et al. 1993, Lamedica et al. 1996). The purpose of these models can be the classification of the days into different day types, or choosing the most appropriate days in the history to be used as the basis for the actual load forecasting.

3.1 Multi-Layer Perceptron network (MLP)

Description of the network

The Multi-Layer Perceptron network is the most popular neural network type, and most of the reported neural network short-term load forecasting models are based on it. The basic unit (neuron) of the network is a perceptron. This is a computation unit which produces its output by taking a linear combination of the input signals and transforming it with a function called the activation function. The output of the perceptron as a function of the input signals can thus be written:

$$y = \sigma\left(\sum_{i=1}^{n} w_i x_i + \theta\right)$$

where

y is the output

x_i are the input signals

w_i are the neuron weights

θ is the bias term (another neuron weight)

σ is the activation function

Possible forms of the activation function are the linear function, step function, logistic function and hyperbolic tangent function.
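As an illustration, the perceptron output described above can be sketched in a few lines of Python. This is a minimal sketch, not code from this work: the function names are hypothetical, and it assumes that the bias term θ is added to the weighted sum before the activation function is applied.

```python
import math


def logistic(s):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))


def step(s):
    """Step function: 1 for non-negative arguments, 0 otherwise."""
    return 1.0 if s >= 0 else 0.0


def perceptron_output(x, w, theta, activation=math.tanh):
    """Output of one neuron: activation(sum_i w_i * x_i + theta).

    Assumption: the bias theta is added to the weighted sum.
    """
    s = sum(wi * xi for wi, xi in zip(w, x)) + theta
    return activation(s)


# Example with two input signals and each activation function in turn.
x = [1.0, 2.0]
w = [0.5, -0.25]
for name, f in [("linear", lambda s: s), ("step", step),
                ("logistic", logistic), ("tanh", math.tanh)]:
    print(name, perceptron_output(x, w, 0.1, activation=f))
```

Only the choice of activation changes between the four cases; the linear combination of the inputs is the same in all of them.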


The MLP network consists of several layers of neurons. Each neuron in a certain layer is connected to each neuron of the next layer. There are no feedback connections. A three-layer MLP network is illustrated in figure 3.1.

Figure 3.1: A three-layer MLP network.

As an N-dimensional input vector is fed to the network, an M-dimensional output vector is produced. The network can be understood as a function from the N-dimensional input space to the M-dimensional output space. This function can be written in the form:

$$y = f(x; W) = \sigma\left(W^{n}\,\sigma\left(W^{n-1} \cdots \sigma\left(W^{1} x\right) \cdots\right)\right)$$

where

y is the output vector

x is the input vector

W^i is a matrix containing the neuron weights of the i:th hidden layer. The neuron weights are considered as free parameters.

The most often used MLP network consists of three layers: an input layer, one hidden layer, and an output layer. The activation function used in the hidden layer is usually nonlinear (sigmoid or hyperbolic tangent), and the activation function in the output layer can be either nonlinear (a nonlinear-nonlinear network) or linear (a nonlinear-linear network).
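The structure of such a three-layer nonlinear-linear network can be sketched as follows. The sketch is illustrative only: the function names and the explicit bias vectors b1 and b2 are assumptions made here, and the weights in the usage example are arbitrary.

```python
import math


def layer(x, W, b, activation):
    """Apply one fully connected layer: activation(W x + b), row by row."""
    return [activation(sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def mlp_forward(x, W1, b1, W2, b2):
    """Three-layer MLP: tanh hidden layer followed by a linear output layer."""
    hidden = layer(x, W1, b1, math.tanh)
    return layer(hidden, W2, b2, lambda s: s)


# Usage: N = 2 inputs, H = 3 hidden neurons, M = 1 output (arbitrary weights).
W1 = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.1]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.3, -0.2, 0.5]]
b2 = [0.05]
print(mlp_forward([0.5, -0.5], W1, b1, W2, b2))
```

Replacing the identity function in the output layer with a nonlinear function would give the nonlinear-nonlinear variant.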


A neural network of this type can be understood as a function approximator. It has been proved that, given a sufficient number of hidden layer neurons, it can approximate any continuous function on a compact set to arbitrary accuracy (Funahashi 1989, Hornik et al. 1989).

Learning

The network weights are adjusted by training the network. It is said that the network learns through examples. The idea is to give the network input signals and desired outputs. For each input signal the network produces an output signal, and the learning aims at minimizing the sum of squares of the differences between desired and actual outputs. From here on, we call this function the sum of squared errors.

The learning is carried out by repeatedly feeding the input-output patterns to the network. One complete presentation of the entire training set is called an epoch. The learning process is usually performed on an epoch-by-epoch basis until the weights stabilize and the sum of squared errors converges to some minimum value.

The most commonly used learning algorithm is the back-propagation algorithm. This is a specific technique for implementing the gradient descent method in the weight space, where the gradient of the sum of squared errors with respect to the weights is approximated by propagating the error signals backwards in the network. The derivation of the algorithm is given, for example, in Haykin (1994). Some specific methods to accelerate the convergence are also explained there.
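The epoch-by-epoch minimization of the sum of squared errors can be illustrated with a deliberately small case. The sketch below applies batch gradient descent to a single hyperbolic tangent neuron, so no backward propagation through hidden layers is needed; it is not the back-propagation algorithm for a full MLP, and the learning rate, epoch count and names are assumptions chosen for illustration.

```python
import math


def neuron(x, w, theta):
    """Output of a single tanh neuron."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + theta)


def sse(patterns, w, theta):
    """Sum of squared errors over all (input, desired output) patterns."""
    return sum((d - neuron(x, w, theta)) ** 2 for x, d in patterns)


def train_epochs(patterns, w, theta, rate=0.1, epochs=500):
    """Batch gradient descent: one weight update per epoch."""
    w = list(w)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_theta = 0.0
        for x, d in patterns:
            y = neuron(x, w, theta)
            # Derivative of (d - y)^2 with respect to the weighted sum;
            # the derivative of tanh is 1 - y^2.
            delta = -2.0 * (d - y) * (1.0 - y * y)
            for i, xi in enumerate(x):
                grad_w[i] += delta * xi
            grad_theta += delta
        w = [wi - rate * g for wi, g in zip(w, grad_w)]
        theta -= rate * grad_theta
    return w, theta


patterns = [([0.0], -0.5), ([1.0], 0.5)]
w, theta = train_epochs(patterns, [0.0], 0.0)
print(sse(patterns, [0.0], 0.0), sse(patterns, w, theta))
```

In an MLP the same update is used for every weight, and back-propagation supplies the corresponding delta terms for the hidden layer neurons.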

A more powerful algorithm is obtained by using an approximation of Newton's method called Levenberg-Marquardt (see, e.g., Bazaraa 1993). In applying the algorithm to network training, the derivatives of the error of each training case with respect to each network weight are approximated and collected in a matrix. This matrix represents the Jacobian of the minimized function. The Levenberg-Marquardt approximation is used in this work to train the MLP networks.
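For reference, the Levenberg-Marquardt weight update is commonly written as follows. The notation is an assumption made here, not taken from this work: e is the vector of errors (desired minus actual outputs) over the training cases, J is the Jacobian of e with respect to the weights w, I is the identity matrix, and μ > 0 is a damping parameter.

```latex
\Delta w = -\left(J^{T} J + \mu I\right)^{-1} J^{T} e
```

With a small μ the step approaches the Gauss-Newton step, and with a large μ it approaches a short gradient descent step.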

In essence, the learning of the network is nothing but estimating the model parameters. In the case of the MLP model, however, the dependency of the output on the model parameters is very complicated compared with the most commonly used mathematical models (for example regression models). This is the reason why iterative learning on the training set is required in order to find suitable parameter values. There is no way to be sure of finding the global minimum of the sum of squared errors. On the other hand, the complicated nonlinear nature of the input-output dependency makes it possible for a single network to adapt to a much larger range of different relations than, for example, regression models. That is why the term learning is used in connection with neural network models of this kind.

Generalization means that the network output is correct (or close enough) for an input which has not been included in the training set.

A typical problem with network models is overfitting, also called memorization in the network literature. This means that the network learns the input-output patterns of the training set, but at the same time unintended relations are stored in the synaptic weights. Therefore, even though the network provides correct outputs for the input patterns of the training set, the response can be unexpected for only slightly different input data.

Generalization is influenced by three factors: the size and efficiency of the training set, the model structure (architecture of the network), and the physical complexity of the problem at hand (Haykin 1994). The last of these cannot be controlled, so the means to prevent overfitting are limited to affecting the first two factors.

The larger the training set, the less likely overfitting is. However, the training set should only include input-output patterns that correctly reflect the real process being modeled. Therefore, all invalid and irrelevant data should be excluded.

The effect of the model structure on generalization can be seen in two ways. First, the selection of the input variables is essential. The input space should be reduced to a reasonable size compared to the size of the training set. If the dimension of the input space is large, the set of observations can be too sparse for proper generalization. Therefore, no unnecessary input variables should be included, because the network can learn dependencies on them that do not really exist in the real process. On the other hand, all factors having a clear effect on the output should be included.

The larger the number of free parameters in the model, the more likely overfitting is. We then speak of over-parameterization. Each hidden layer neuron brings a certain number of free parameters into the model, so in order to avoid over-parameterization, the number of hidden layer neurons should not be too large. There is a rough rule of thumb for a three-layered MLP (Oja). Let

H = number of hidden layer neurons

N = size of the input layer

M = size of the output layer

T = size of the training set

The number of free parameters is roughly W = H(N+M). This should be smaller than the size of the training set, preferably about T/5. Thereby, the size of the hidden layer should be approximately:

$$H \approx \frac{T}{5(N+M)}$$
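The rule of thumb is simple arithmetic, as the following sketch shows. The function name and the example figures (365 training days, 11 input nodes, one output node) are illustrative assumptions, not values reported as used in this work.

```python
def hidden_layer_size(T, N, M, ratio=5):
    """Rule-of-thumb hidden layer size H = T / (ratio * (N + M)).

    T: size of the training set, N: input layer size, M: output layer size.
    The result is rounded to the nearest integer and is at least 1.
    """
    return max(1, round(T / (ratio * (N + M))))


# Example: 365 training patterns, 11 inputs, 1 output.
# 365 / (5 * 12) is about 6.08, so roughly 6 hidden neurons.
print(hidden_layer_size(365, 11, 1))
```

The rule only gives an order of magnitude; the final hidden layer size still has to be checked by validation.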

In order to be sure of proper generalization, the network model, like any mathematical model, has to be validated. This is a step in system identification which should follow the choosing of the model structure and the estimation of the parameters. The validation of a neural network model can be carried out on the principle of a standard tool in statistics known as cross-validation. This means that a data set which has not been used in parameter estimation (i.e. in training the network) is used for evaluating the performance of the model.
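The principle of validating on unused data can be sketched as below. This is an illustrative sketch with hypothetical names: the most recent part of the data is set aside for validation, and the mean absolute percentage error is used as the performance measure, which is an assumption made here.

```python
def holdout_split(patterns, validation_fraction=0.2):
    """Split patterns in time order into (training, validation) parts."""
    n_val = max(1, int(round(len(patterns) * validation_fraction)))
    return patterns[:-n_val], patterns[-n_val:]


def mean_abs_percentage_error(actual, forecast):
    """Average of |actual - forecast| / |actual|, in percent."""
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Usage: the model is fitted on `train` only and evaluated on `valid`.
days = list(range(10))
train, valid = holdout_split(days, 0.2)
print(train, valid)
print(mean_abs_percentage_error([100.0, 200.0], [110.0, 180.0]))
```

Keeping the validation data strictly after the training data in time mimics the real forecasting situation, where only past observations are available.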

3.2 MLP networks in load forecasting

The idea behind the use of MLP models in load forecasting is simple: it is assumed that future load is dependent on past load and external factors (e.g. temperature), and the MLP network is used to approximate this dependency. The inputs to the network consist of those temperature values and past load values, and the output is the target load values (for example, a load value of a certain hour, load values of many future hours, the peak load of a day, or the total load of a day).


Therefore, the building of an MLP model for load forecasting can be seen as a nonlinear system identification problem. Determining the model structure consists of selecting the input variables and deciding the network structure. The parameter estimation is carried out by training the network on historical load data. This requires choices concerning the learning algorithm and appropriate training data. The model validation is carried out by testing on load data which has not been used in training.

However, modeling with neural networks is different from modeling with linear system models. The nonlinearity and the great adaptability of the network models make it possible to use specific indicators as input variables. In the case of load forecasting, the hour of the day and the day type of the target hour, for instance, can be included as binary codes in the network input. The network model can be understood to be based on pattern recognition functions, where different input patterns are mapped in different ways. This makes the models very different from, for example, ARIMA models, which assume that the load time series can be made stationary (i.e. invariant with respect to time) with suitable filters. The handling of special load conditions is easier for neural network models than for ARIMA models.

Another matter supporting neural network models is the relatively rapid changing of the characteristics of the load behavior. This is a problem with statistical models, because they cannot always keep up with sudden changes in the dependencies of the load. For example, the beginning of a holiday season can change the load behavior rapidly. As neural network models are in essence based on pattern recognition functions, they can in principle be hoped to recognize the changed conditions without re-estimating the parameters. This requires, of course, that conditions corresponding to the new situation have been used in training, and that network inputs contain the information necessary for recognizing the conditions.

On the other hand, a problem with MLP models is the black-box-like description of the dependencies of the future values on the past behavior. Understanding the model is very difficult; common sense can hardly be applied in order to see how the outputs depend on the inputs. The response of the model to an input pattern which is very different from any experienced during the learning can be unexpected. This can happen in new conditions, even if the model is validated with test data.


Another problem is the lack of general procedures to build the models. As the MLP models do not assume a specific functional form for the modeled relations, selecting the appropriate model structure is more heuristic than, for example, in the case of ARIMA models.

3.3 Literature survey

In this section, the literature on neural networks in load forecasting is surveyed. This literature offers the background for the rest of the work. Although there are many articles on the subject, and many quite sophisticated solutions have been proposed, the large variation in these and the lack of comparative studies make it impossible to apply them as such.

The interest in applying neural networks to electric load forecasting began in 1990. Most of the approaches reported since are based on the use of an MLP network as an approximator of an unknown nonlinear relation. However, the number of different ways to use this type of network seems unlimited on the basis of the articles.

The survey is divided into three subsections. In the first one, the articles discussing the normal MLP solutions are reviewed. In the second one, the models using unsupervised learning techniques are considered. Finally, a wide range of remaining variations is discussed in the third one.

Basic MLP-models

The literature about short-term load forecasting with MLP neural network models can be roughly divided into three categories with regard to the forecasting target. These different model types are intended for:

- forecasting daily peak-, valley- or total load

- forecasting the whole daily load curve at one time

- forecasting the load of the next hour

The models of the first two categories are static in the sense that the forecasts are not adapted during the day. The third model type is usually used recursively in order to forecast further than just one hour ahead. The model is dynamic, since the forecast can be updated every time new data arrives.
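The recursive use of a next-hour model can be sketched as follows. The sketch is illustrative: the function names are hypothetical, the model is assumed to take only the most recent load values as input, and a simple averaging function stands in for a trained network.

```python
def recursive_forecast(history, one_step_model, n_lags, horizon):
    """Forecast `horizon` hours ahead with a one-hour-ahead model.

    Each forecast is appended to the input window, so later forecasts
    are based partly on earlier forecasts instead of measured loads.
    """
    window = list(history[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        next_load = one_step_model(window)
        forecasts.append(next_load)
        window = window[1:] + [next_load]
    return forecasts


def mean_model(window):
    """Stand-in for a trained network: average of the input window."""
    return sum(window) / len(window)


print(recursive_forecast([1.0, 2.0, 3.0], mean_model, n_lags=2, horizon=2))
```

When a new measured load arrives, the function is simply called again with the extended history, which is what makes the approach dynamic.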


There are also many other factors that make the models different from each other. These differences can be, for example, in:

- the use of the weather data

- the other input variables

- network architecture

- training algorithm

- selection of the training data

In the following, the models forecasting only peak-, valley-, or total loads are treated first, followed by the hourly forecasting models (also called load shape models).

Peak, valley, and total load forecasting

There are a few articles on forecasting peak-, valley- or total loads of the day with an MLP network. The highest load values are usually the most crucial for the electric utilities to know in advance. Also, forecasts for the peak-, valley-, and total loads of a day can be used as the first step in forecasting the whole daily load curve.

Park et al. (1991a) study peak- and total load forecasting with a very basic three-layer MLP. The input variables to the network include only the maximum, minimum, and average temperatures of the target day. Thereby, the network is used in modeling only the dependencies on temperature values; no previous load values are used as predictors. The results are very good, the average forecasting errors over five different test sets being 2.04 % for the peak load forecasts and 1.68 % for the total load forecasts. The article also discusses hourly forecasting.

Ho et al. (1992) use a much larger network structure for peak- and valley load forecasting. The 46 input variables for the three-layer MLP include the forecast high temperatures for the target day in three different areas, three recorded temperature values of the previous day, and three temperature values and the peak (valley) load value for each of the past ten days of the same day type as the target day. 60 hidden layer neurons are used. This massive network structure is trained on only 30 input-output patterns. The training algorithm is a variation of back-propagation, where the momentum is adapted. Results are good, but only provided for a few test days.


The motivation of Ho et al. in forecasting peak- and valley loads is to use them in connection with a rule-based expert system developed in another work of theirs. This system provides 24 normalized hourly values for each day type, which can be scaled to match the peak- and valley forecasts.
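How a normalized daily shape can be scaled to match peak and valley forecasts can be sketched as follows. The exact normalization used by Ho et al. is not described here, so the sketch assumes a simple linear rescaling in which the lowest shape value is mapped to the valley forecast and the highest to the peak forecast; the function name is hypothetical.

```python
def scale_shape(shape, peak, valley):
    """Linearly rescale a daily load shape to given peak and valley loads.

    Assumption: min(shape) maps to `valley` and max(shape) maps to `peak`.
    """
    lo, hi = min(shape), max(shape)
    if hi == lo:
        raise ValueError("a flat shape cannot be scaled to two levels")
    return [valley + (peak - valley) * (s - lo) / (hi - lo) for s in shape]


# Three-point shape for brevity; a real shape would have 24 hourly values.
print(scale_shape([0.0, 0.5, 1.0], peak=100.0, valley=60.0))
```

The same rescaling applies to a shape obtained by averaging historical load curves of the same day type.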

Peng et al. (1992) forecast the daily total load, and present a simple search procedure for selecting the most relevant training cases. The proposed network architecture is a modification of a normal three-layered MLP, where the input nodes have direct linear connections to the output layer in addition to the normal hidden layer connections. The five inputs to the network are the forecast maximum and minimum temperatures of the target day, and the maximum and minimum temperatures as well as the total load of the previous day.

Asar and McDonald (1994) compare different input structures and data normalizations in peak load forecasting. The best results were obtained with the simplest input structure, which only contains the peak loads of the previous day and the days one week and one month before. The input structures containing temperature values did not improve the results. The end of the article is devoted to a discussion on the possibilities of a hybrid system combining neural networks and supervisory expert systems.

Hourly forecasting

The hourly load forecasts can be obtained either by forecasting the whole daily load curve at one time with a multi-output network (a static approach), or by forecasting the load with a single-output network for one hour at a time (a dynamic approach). At least two articles discussed in the following compare these approaches.

Lee and Park (1992) propose two different models. In the static model, the day is divided into three parts, which have separate network weights. The input to the network includes the load values of the corresponding part of the day on a few previous days of the same type. The weekday load patterns and weekend-day load patterns are treated separately. In the dynamic model, the input to the network includes the load values of a few previous hours, and of these same hours on a few previous days. Neither of the models uses temperature data. The results with both models are quite similar, but the dynamic model appears to be slightly better in the tests.


Lu et al. (1993) use historical load data of two electric utilities in different parts of the world to investigate whether MLP models are system dependent and/or case dependent. Static and dynamic models are used for both cases. The inputs to the networks include past load and temperature data and also the hour-of-day and day-of-week information. Training sets with a length of one or two months are used. The results indicate that there are no firm criteria for selecting a suitable network structure. Models are not unique, and different systems require different model structures. The dynamic approach again seems slightly better than the static one.

Other articles on hourly models are by Park et al. (1991b) and Chen et al. (1992). These use the dynamic approach. In the former, the input to the network only contains load and temperature values of the two previous hours, the forecast temperature for the target hour, and the hour of the day (the target hour). The average forecasting errors are less than 2 % for all five test sets.

Chen et al. (1992) use a non-fully connected network structure. This means that the hidden layer neurons are divided into certain clusters, and each input neuron is connected only to some clusters. The overall network is a combination of these smaller supporting networks. There are 31 input nodes containing past load and temperature data, temperature-forecast data, and hour-of-day and day-of-week information. The results show an improvement over an ARIMA model, which is used as a comparison.

Unsupervised learning models

Applying models based on unsupervised learning is proposed in a few articles. The idea is usually to classify the daily load profiles into different day types. The forecasting itself is then performed by a network based on supervised learning. In all the referenced articles using unsupervised learning, a static approach is taken; the whole load curve of a day is forecast at once.

Hsu and Yang (1991) use Kohonen's self-organizing map to identify the different day types. The daily load forecast is obtained in two phases. In the first phase, the forecast for the daily load shape is obtained by averaging some load patterns of the days belonging to the same day type as the target day. In the second phase, the daily peak- and valley loads are forecast with an MLP network. The actual forecast is then obtained by combining the load shape with the peak- and valley forecasts. The classifications obtained with the self-organizing map are not always unique, because it can be difficult to say whether neurons close to each other on the map represent separate day types or not. On the other hand, the classification results seem quite obvious; the result for the load data of the Taiwan power system for May 1986 suggests that the days divide into the categories: 1) Sundays and holidays, 2) Mondays and days after holidays, 3) Saturdays, 4) weekdays except holidays.

Djukanovic et al. (1993) discuss the supervised and unsupervised learning concepts using a functional link net, which allows supervised and unsupervised learning with the same net configuration and with the same data structure. The input to the network includes the 24 load values of the day before the target day, the maximum, minimum, and average temperatures of this day, temperature forecasts for the target day, the tariff season of the year, and the day of the week. In the training phase, the network uses unsupervised learning to classify the data into clusters and supervised learning for the actual forecasting within each cluster. In the forecasting phase, the target day is classified into one of the existing clusters on the basis of the load and temperature data of the previous day. The actual forecast is then created within this cluster.

Piras et al. (1996) suggest a structure where an unsupervised learning model called neural gas is used in preprocessing the data into clusters. The system is thereby divided into submodels, which utilize normal MLP networks in approximating nonlinear relations. The resulting outputs are summed by a weighted fuzzy average, which allows a smooth transition between the models.

Lamedica et al. (1996) propose a normal three-layered MLP with an input structure containing the load data of all hours of the two preceding days, and three binary codes indicating the day type. The output consists of the 24 load values of the target day. The model is reported to work well on normal days, but the unsatisfactory accuracy in anomalous load conditions, such as vacation periods and long weekends, is the motivation for a more detailed day type classification. Kohonen's SOM is used for the classification, and cluster codes of the target day and the two preceding days are included in the input pattern of the MLP network. In the forecasting stage, the advance classification of the target day is left to a human operator. The model is very different from the ones suggested by Djukanovic et al. (1993) and Piras et al. (1996), because the supervised training is performed with a single network for all clusters, and the cluster code is included as an input to the network. The results indicate that the normal MLP provides better results for normal load conditions, but the use of SOM classification improves the accuracy in anomalous conditions.

There is one article proposing a very different use of Kohonen's SOM (Baumann et al. 1993). There, the network is used directly in forecasting the daily load curve instead of classifying load patterns. The network is trained on load curves consisting of two successive days. The forecast for a day is obtained by associating the load curve of the previous day with a certain neuron on the map. This neuron directly provides the forecast for the target day. The model is described in more detail in section 4.2, where some test results obtained with the test data of this work are also presented.

Other reported approaches

The rest of the articles have various suggestions for utilizing neural network techniques. The variation of the approaches is large, and it is not easy to draw direct conclusions from them. Many of the articles present modular solutions. This means that the forecasting system consists of modules, which are individual neural network models combined somehow to produce the forecasts. There can, for example, be a separate module for modeling the temperature effect only. The modules can work together in the sense that each forecast value is obtained as a composition of separate modules, or the modules can work separately so that different modules produce forecasts in different situations.

Khotanzad et al. (1995) present a solution where there are separate modules for modeling the weekly, daily, and hourly trends. Each module consists of several networks: one network for each day of the week in the weekly and daily modules, and one network for each hour of the day in the hourly module. The outputs of the modules are combined by adaptive filters to arrive at the final forecast. The system is reported to be in on-line use at several utilities across the US.

Mohammed et al. (1995) propose a system consisting of different MLP networks for different days and seasons. The input structure varies from network to network, consisting in any case of past load and temperature values, and temperature forecasts. The day is divided into five periods, each of which is allocated a separate network architecture. The load is forecast hourly. There is a particular adaptation mechanism to train the networks on-line. The system is implemented for Florida Power and Light.


Sforna et al. (1995) introduce a system consisting of two neural forecasters embedded in the same environment, but able to act separately if needed. These modules can also be replaced with different traditional statistical routines. The first module forecasts the daily load curve with a normal MLP. The second module works on-line, and corrects the static forecast on the basis of the most recent information. This uses a recurrent neural network, where the neurons of the first hidden layer are connected to themselves in addition to the second layer neurons.

Chow et al. (1996) present a neural network module for weather compensation. The idea is to forecast the deviation of the load at a certain hour from the load of the corresponding hour on the previous day. The network has 24 output nodes, so the forecast is obtained 24 hours ahead at once.

Many authors feel that making the load forecasts as accurate as possible requires utilizing external information, such as knowledge of different social activities. Some articles propose the use of concepts of fuzzy set theory combined with neural networks for this purpose. There are two possible approaches (Momoh et al. 1995). First, fuzzy logic can be used in providing a neural network with numerical input data based on human expertise. Second, fuzzy rules can be used to make corrections to the neural network output on the basis of human expertise.

The model explained by Srinivasan et al. (1995) is of the former type. There, a fuzzy front-end models the quantitative and qualitative knowledge about the system, and a neural network models the relationship between the fuzzy inputs and outputs. The output of the network is then defuzzified to obtain the load profile for the target day.

Kim et al. (1995), on the other hand, suggest a model of the latter type. The forecasting procedure is divided into two steps. In the first one, a provisional forecast for the load is obtained using a normal MLP neural network with one output node. In the second step, fuzzy expert systems are applied to estimate the correction to the load due to temperature change and the possible holiday nature of the day. Taking other factors, such as election days, the rainy season, or television programs, into account is considered a future development.

Fuzzy concepts are also used in a model called the fuzzy neural network. Bakirtzis et al. (1995) propose such an integrated neural-network-based fuzzy decision system. There, a fuzzification interface, fuzzy rule base, fuzzy inference machine and defuzzification interface perform a mapping from the non-fuzzy input space to the non-fuzzy output space. This kind of fuzzy system can approximate any continuous function to an arbitrary degree of accuracy. The system can be represented by a layered network, and the adapting of the model parameters is performed through a training process similar in nature to that of an MLP network. Short-term forecasting results are reported to be similar to those of neural networks, but the training of the fuzzy neural network is faster.

Summary

The literature presents many model types. Most of the models are based on feed-forward type MLP networks, but models using unsupervised learning, fuzzy concepts and recurrent networks, to give a few examples, are also presented. A common approach is to build a modular system, where separate modules concentrate on specific tasks.

There is a single feature which clearly divides the models into two distinct classes. Namely, some models are based on the idea of producing the whole load curve of a day at one time, while others are able to forecast the hourly load ahead at any time of the day. In most of the articles, one of these model types is used without giving specific reasons for the choice. There is not much comparison between the two approaches.

The comparison of the approaches presented in the literature is difficult. Most of the articles present a specific solution to the problem, but justifying the choices concerning the utilized methods is often not given much attention. Since the load conditions are different in each case, the direct comparison of the forecasting errors is quite meaningless.

This is the reason for the comparative approach taken in this work. The goal is to obtain comparable information on the performance of the basic model types. This kind of approach is seen as necessary for the purpose of building a real application suited to the defined needs.

Another thing lacking in the articles is the analysis of the performance at different lead times. In this work, the idea of using separate modules for different lead times is considered. In particular, the possibility of improving the accuracy for the closest hours with a separate model will be examined.


4 Forecasting the daily load profile

In this and the following chapters, several different neural network models are built and tested. The models are mainly based on the most established ideas presented in the literature. The implementation is carried out using the Matlab Neural Network Toolbox.

In this chapter, the problem of load forecasting is approached by making the forecasts for one whole day at a time. The approach is static in the sense that the forecast is not updated during the day. The more dynamic models are treated in the next chapter.

As already discussed in chapter 3, the forecasting of the load on a daily basis with neural network techniques has been reported in many variations. Here, two approaches are taken. In section 4.1, an MLP network is used in forecasting the daily peak-, valley-, and average loads. The shape of the load curve is predicted separately by averaging some load shapes in the history. A similar approach has been taken by, e.g., Hsu and Yang (1991), but they use only peak- and valley loads.

A more exceptional method based on the association of a load curve with the load of the previous day is tested in section 4.2 (Baumann and Germond 1993). This method uses Kohonen's self-organizing feature map (see, e.g., Kohonen 1987 or Kohonen 1997 for a detailed description). The usefulness of the self-organizing map in this case is found questionable, since a very simple model based on the same idea, also introduced in that section, provided better results than the SOM model.

4.1 Forecasting with peak, valley, and average loads

Description of the MLP network models

The idea in using the MLP network in this chapter is to identify the assumed dependency of the daily peak-, valley-, and average load on earlier load and temperature data. The network is trained on the past data and is hoped to learn to approximate this unknown dependency.

The data over a period of one whole year is used for the network training. The peak-, valley- and average loads over the training period, from May 24, 1996 to May 23, 1997, are shown in figure 4.1. The temperature data for the same time period is also used. The seasonal trend in the load can be easily seen. Also, the weekly load structure can be seen in the form of lower load values on weekends than on working days.

Figure 4.1: Peak-, valley- and average loads over the period May 24, 1996-May 23, 1997.

It was concluded that using networks with only one output node gives better results than forecasting peak-, valley- and average loads with one multi-output network. This means that separate networks are used in all cases.

Other features to be decided about the architecture of the network are the input variables and the number of hidden layer neurons. For the input variables, the following symbols are used:


To forecast the maximum load of a certain day, at least the maximum loads of the previous day and of the corresponding day of the previous week can be considered potential input variables. The temperature data of those days may also be useful if temperature forecasts for the target day are available. Maximum, minimum, and average temperatures are considered for this purpose.

Eight different input structures are tested separately for peak, valley, and average loads. They are numbered from 1 to 8 and are listed below for maximum load forecasting. If the target is the minimum or average load, all maximum load values are replaced by the corresponding minimum or average values, respectively.

Output: Lmax(i)

Input structures:

1. Lmax(i−1), Lmax(i−7), Lmax(i−8), Tave(i), Tave(i−1), Tave(i−7), Tave(i−8)
2. Lmax(i−1), Lmax(i−7), Tave(i), Tave(i−1), Tave(i−7)
3. Lmax(i−1), Tmin(i), Tave(i), Tmax(i), Tmin(i−1), Tave(i−1), Tmax(i−1)
4. Lmax(i−1)
5. Lmax(i−1), Tave(i) − Tave(i−1)
6. Lmax(i−1), Tmax(i) − Tmax(i−1)
7. Lmax(i−1), Tmin(i) − Tmin(i−1)
8. Lmax(i−1), Lmax(i−7), Lmax(i−8), Tmax(i), Tmax(i−1), Tmax(i−7), Tmax(i−8)
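The lag-based structures above can be written down as lists of (series, day offset) pairs. The sketch below covers only structures 2 and 4 as examples; the dictionary layout, the series names, and the function name are assumptions made for illustration. The difference-based structures 5 to 7 would need an extra subtraction step that is not shown.

```python
# (series name, lag in days) pairs for two of the eight structures.
# A lag of 0 refers to the target day i, e.g. a temperature forecast.
INPUT_STRUCTURES = {
    2: [("Lmax", 1), ("Lmax", 7), ("Tave", 0), ("Tave", 1), ("Tave", 7)],
    4: [("Lmax", 1)],
}

def build_inputs(series, structure, i):
    """Collect the input values for target day i (index into daily lists)."""
    values = []
    for name, lag in INPUT_STRUCTURES[structure]:
        if i - lag < 0:
            raise IndexError("not enough history for the requested lag")
        values.append(series[name][i - lag])
    return values
```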

In addition to the inputs listed above, each input structure contains four extra nodes. These take binary values and inform the network of the day type of the target day. The day type classes are: 1) Mondays, 2) Tuesdays-Fridays, 3) Saturdays, and 4) Sundays. Informing the network about the day type is important, because Saturdays and Sundays have much lower peak loads than working days.
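The four day type nodes can be sketched as a one-of-four encoding of the target date. The sketch assumes that the binary values are 0 and 1 and that the class follows from the weekday alone; public holidays are not handled, and the function name is illustrative.

```python
import datetime

def day_type_inputs(date):
    """Return four binary inputs: Monday, Tuesday-Friday, Saturday, Sunday."""
    weekday = date.weekday()  # Monday is 0, Sunday is 6
    if weekday == 0:
        day_class = 0
    elif weekday <= 4:
        day_class = 1
    elif weekday == 5:
        day_class = 2
    else:
        day_class = 3
    # Exactly one node is set to 1 (assumed 0/1 coding).
    return [1 if k == day_class else 0 for k in range(4)]
```

For example, the first day of the training period, May 24, 1996, was a Friday, so `day_type_inputs(datetime.date(1996, 5, 24))` gives `[0, 1, 0, 0]`.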

A hyperbolic tangent function is used as the activation function in all feed-forward neural networks of this work. This function maps into the interval [-1,1]. To let the network training converge within a reasonable time, the desired output values should be scaled onto this interval (see, e.g., Haykin 1991). All temperature and load values are therefore scaled linearly between -1 and 1.
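The linear scaling and its inverse can be sketched as follows. The choice of the bounds (for example, the minimum and maximum of the training data) is an assumption, since the text does not state which bounds are used; the function names are illustrative.

```python
def make_scaler(low, high):
    """Build functions mapping [low, high] linearly onto [-1, 1] and back."""
    if high <= low:
        raise ValueError("high must be greater than low")

    def scale(value):
        return 2.0 * (value - low) / (high - low) - 1.0

    def unscale(scaled):
        return low + (scaled + 1.0) * (high - low) / 2.0

    return scale, unscale
```

The inverse mapping is needed to convert a network output back into a load value.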

Predicting the shape of the load curve

Classifying the days

The load shape is predicted here by averaging load curves of similar days in the load history. Days therefore have to be grouped into classes of different day types.
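Averaging load shapes requires some normalization so that days with different load levels can be combined. The sketch below assumes one possible choice, in which each curve is mapped to the range from 0 (valley) to 1 (peak) and the averaged shape is stretched back with forecast peak and valley values. This normalization is an assumption for illustration and is not necessarily the one used in this work.

```python
def normalize_shape(day_loads):
    """Map a daily load curve so that its valley is 0 and its peak is 1."""
    peak, valley = max(day_loads), min(day_loads)
    if peak == valley:
        raise ValueError("flat load curve cannot be normalized")
    return [(v - valley) / (peak - valley) for v in day_loads]

def average_shape(similar_days):
    """Average the normalized shapes of several days of the same class."""
    shapes = [normalize_shape(day) for day in similar_days]
    hours = len(shapes[0])
    return [sum(s[h] for s in shapes) / len(shapes) for h in range(hours)]

def rescale_shape(shape, peak, valley):
    """Turn a normalized shape into loads using forecast peak and valley."""
    return [valley + s * (peak - valley) for s in shape]
```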

Hsu and Yang (1991) used self-organizing feature maps to classify the days into groups, with each group comprising the days with similar load patterns. Their results are, however, not surprising, and similar classifications can also be made without neural networks. They present a classification for the load data of Taiwan Power Company for May 1986, the classes being: 1) Sundays and holidays, 2) Mondays and days after holidays, 3) Saturdays, and 4) weekdays except holidays.

It should be noted that classification with a self-organizing feature map does not solve the obvious problem: the class of the target day has to be known in advance. The feature map cannot classify the day before the load values of the day are known.
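A calendar rule, in contrast, needs only the date of the target day. As an illustration, the four classes reported by Hsu and Yang can be reproduced with a few comparisons. The precedence between overlapping cases (for example, a Saturday that follows a holiday) and the holiday set passed in are assumptions of this sketch.

```python
import datetime

def classify_day(date, holidays=frozenset()):
    """Return the class (1-4) of a date from the calendar alone.

    1: Sundays and holidays, 2: Mondays and days after holidays,
    3: Saturdays, 4: other weekdays. The order of the checks is an assumption.
    """
    previous = date - datetime.timedelta(days=1)
    if date in holidays or date.weekday() == 6:
        return 1
    if date.weekday() == 5:
        return 3
    if date.weekday() == 0 or previous in holidays:
        return 2
    return 4
```

Because the rule depends only on the date, the class of the target day is available before any load values of that day are measured.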

Here, simple reasoning is used instead of more sophisticated methods. Two different classifications are used:
