
MPhil in Engineering for Sustainable Development

global challenges, engineering solutions


Jorge Mayer Romero

Accurate short-term prediction of wind farm power output using data from SCADA: Opportunities in improving grid integration with the application of big data analytics and machine learning

This work explores the potential of applying machine learning tools to forecast the short-term power output of wind farms. Starting from the premise that renewable energy is a key enabler of sustainable development, the specific opportunities and challenges of wind power are discussed. The technical challenges of integrating variable-output power sources into the grid are then reviewed. Short-term forecasting of wind power facilitates grid integration through efficient balancing of power plants, so wind farm operators are presented with a series of economic incentives to communicate accurate power output forecasts to grid operators. The present research focuses on a forecast horizon of 5 minutes into the future. Several tools exist to estimate the power output of a wind farm; deterministic models such as the blade element method offer great detail on the physical principles governing power generation but are affected by the stochastic nature of wind patterns.

Nevertheless, the availability of data from current state-of-the-art turbine SCADA systems creates an opportunity to implement non-deterministic models based on big data analytics and machine learning, generating predictions with low-cost computing power and accessible software tools. To this end, the SCADA data from a wind farm is analyzed, pre-processed and fed to a series of machine learning methods with the aim of predicting whole-farm power output. Key aspects of this procedure are an adequate selection of accuracy and error metrics to evaluate each model, together with its ease of implementation. The work relied on the development of computational tools to analyze, visualize and pre-process the data, and on a supervised learning time series regression framework. Three traditional closed-end statistical models are tested together with a set of neural network configurations. Multi-criteria analysis is used to find which model performs best in terms of effectiveness, efficiency and practicality. The work concludes that there is a tradeoff between model complexity and practicality. No model is selected as a best-for-all-conditions solution; however, specific details on each model's strengths, weaknesses and optimal conditions are discussed.
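To illustrate what a supervised learning time series regression framing can look like, the following is a minimal sketch rather than the dissertation's actual pipeline. It assumes a farm-level power series sampled every 5 minutes, so a one-step-ahead target matches the 5-minute horizon. The helper names (`make_supervised`, `mae`, `rmse`), the lag count, the synthetic data, and the two baseline models (persistence and a least-squares linear fit) are all hypothetical choices made for illustration; the abstract does not specify the models, features, or metrics that were used.

```python
import numpy as np

def make_supervised(series, n_lags, horizon=1):
    """Turn a 1-D series into (X, y): each row of X holds n_lags consecutive
    past values, and y is the value `horizon` steps after the last lag."""
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - n_lags - horizon + 1
    if n_rows <= 0:
        raise ValueError("series too short for the requested lags and horizon")
    X = np.stack([series[i:i + n_lags] for i in range(n_rows)])
    y = series[n_lags + horizon - 1:]
    return X, y

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

# Synthetic stand-in for whole-farm power (assumption: NOT real SCADA data).
# One step is treated as 5 minutes, so 288 steps make up one day.
rng = np.random.default_rng(0)
t = np.arange(2000)
power = 50.0 + 30.0 * np.sin(2 * np.pi * t / 288) + rng.normal(0.0, 3.0, t.size)

# Six lags (the previous 30 minutes) predict the value one step ahead.
X, y = make_supervised(power, n_lags=6, horizon=1)

# Chronological split: never shuffle a time series before splitting.
split = int(0.8 * len(y))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Baseline 1: persistence, i.e. the forecast equals the last observed value.
persistence_pred = X_test[:, -1]

# Baseline 2: linear regression on the lags, fitted by ordinary least squares.
A_train = np.column_stack([X_train, np.ones(len(X_train))])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
linear_pred = np.column_stack([X_test, np.ones(len(X_test))]) @ coef

for name, pred in [("persistence", persistence_pred), ("linear", linear_pred)]:
    print(f"{name}: MAE={mae(y_test, pred):.2f}  RMSE={rmse(y_test, pred):.2f}")
```

Evaluating every candidate model on the same held-out, chronologically later data with the same error metrics gives a common basis for comparing effectiveness; efficiency and practicality would need to be assessed separately.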