See the previous post on AEMO for a general overview of handling quantitative (as opposed to categorical) case data.
If you have Ruby5 (Cloud edition) you can access this job, called GridwatchUK, at the rcspublic customer account. If you want to examine or execute the scripts (in VB.net), you will need to download first, since you can’t run an IDE in the cloud.
To see just the outputs, download this Excel file:
I’ll try to keep the XLSX and Ruby job reasonably up to date, at least quarterly unless something interesting happens.
There are 131 charts, one per sheet, in PNG format, arranged as:
Sequential (16 charts)
Cyclic (12 charts)
Wind Breakouts (19 charts)
The Nets collapse the source series as
All Gas = Open + Closed Gas Turbine
Hydro/Pump = Hydro+Pumped
All ICT = sum of all interconnectors, and
Other = Oil+Biomass+Other
The All charts are unnetted, that is, all sources are shown as their own series.
Use the Net set for overview, and the All set for the breakout of the nets. The All set is much better for looking at the ICTs, since +/- as imports/exports may sum to zero (in the worst case) and so not be visible as the All ICT net.
Within Nets and All there is a further subdivision to average and share (as percentage of total, not of the measured demand).
And within each of these the charts first show sequential periods (chronological from left to right) and then the cycles as Hour of Day (the Duck Curve), Day of Week, etc. The last two cyclic charts are actually hybrids, as Hour of Day and Day of Week within each year (2011-2018).
The Wind Breakouts show all of the above but for Wind only. Since only a single source series, average and share are shown on the same chart.
The series for Open Gas, Oil, and the Irish and East-West ICTs have only a tiny data presence, and hence will be barely visible (if at all) as plots.
The Nets and All charts have the generation types as stacked bars, with Demand as a black line series.
All plot points are labelled with their data value where resolution permits.
The Day_ charts are in pairs – unsmoothed (dense and cluttered, but shows variability), and smoothed at MA=7 with the first six points hidden. For unsmoothed daily close-ups and five-minute resolution, see the LastNDay and LastNDay5 charts.
For the northern hemisphere Quarter and Season align, so Q1=Winter, Q2=Spring, Q3=Summer and Q4=Autumn.
The LastNDay5 charts, because at minimum resolution, give the highest series amplitudes, so these charts have Y1 extended to 55,000 MW for averages and 120% for shares. Demand can exceed 100% when the ICTs are negative.
Chart series are otherwise not clipped when exceeding the Y1 bounds so that the Y1 range can be kept as consistent as possible – with averages at 0 to 50,000 MW and shares as 0 to 100%.
There is a distinct regime shift at 2016. Coal drops dramatically, Gas and the ICTs increase accordingly, new ICTs are added such that total supply thereafter exceeds the measured demand, and solar is introduced (as estimates). The regime shift is obvious in the sequential charts. For the cyclic charts, mixing 2011/2015 in with 2016/2019 would therefore be quite misleading, so all pure cyclic charts come in pairs, the first filtered to 2011/2015, and the second to 2016 ff.
The source case data was downloaded as a single file to midnight 16Jan19 (at time of writing) from
This site is pro bono, so if you download for yourself, be sure to make a donation.
(The last date will be extended when the job is next updated).
The full source file is presently about 100 megabytes, in CSV format. Each case is a five minute average, mostly 24*12=288 cases per day, for 801,882 cases in total. Some days have missing cases, and some days have more than 288 cases (due to more than one case being recorded within the same five minute period) as this chart (excluding the first half-day at 27May11) shows:
The column headers and the first case (ID#4) are
id, timestamp, demand, frequency, coal, nuclear, ccgt, wind, pumped, hydro, biomass, oil, solar, ocgt, french_ict, dutch_ict, irish_ict, ew_ict, other, north_south, scotland_england
4, 2011-05-27 15:50:04, 38874, 50.132, 9316, 8221, 18239, 1253, 309, 636, 0, 0, 0, 0, 912, 0, -414, 0, 0, 0, 0
(There are no cases IDs 1/3.)
Apart from id, timestamp and frequency, all values are in megawatts.
Unfortunately, the data does not arrive as clean as one would hope. Gridwatch is an aggregator of the primary data sources, so the dirt originates from the various suppliers, and how to deal with it is our responsibility. The following strategies are adopted:
- Any value in scientific notation, eg IDs 646150/646151
646150, 2017-07-24 17:48:31, 32612, 50.037998, 550, 7733, 15460, 3036, 522, 298, 2026, 0, 2070, 0, 1998, 0, -144, -4.617889289524495e26, 0, 1294, 558
646151, 2017-07-24 17:53:32, 32612, 50.037998, 550, 7733, 15460, 3036, 522, 298, 2026, 0, 2070, 0, 1998, 0, -144, -4.617889289524495e26, 0, 1267, 586
has been converted to 0.
- There are completely unphysical jumps in Demand, eg
7134, 2011-06-21 10:55:01, 39913, 50.004002…
7135, 2011-06-21 11:02:09, 160377, 49.993…
7136, 2011-06-21 11:05:29, 160585, 50.002998…
7137, 2011-06-21 11:11:41, 160585, 50.044998…
7138, 2011-06-21 11:16:51, 40646, 49.987…
Demand cannot jump from 40 GW to 160 GW and back again in the space of 15 minutes. Unphysical demand is defined (arbitrarily, but conservatively) as a jump up of more than 75% or a step down of more than 50%, compared to the average of the previous two values. Where this happens, the unphysical values are replaced with the average, and the cleaned data written to the variable demandc. There are still some strange bumps and dips here and there, but I cannot determine if real or not.
- There are two unphysical values for solar, at IDs 638924/638925:
638924,2017-06-29 15:35:32,33316,50.046001,1572,7946,14997,4584,299,367, 1447,0,1420000,…
638925,2017-06-29 15:40:33,33316,49.919998,1572,7946,14997,4584,299,367, 1447,0,1420000,…
These are both changed to 1420.
Some ICT values exceed the carrying capacity of the lines, eg
623954,2017-05-08 15:30:31,33281,49.967999,1106,7095,16486,2833,533,188,2047,0, 4700,0,1995,998,0,-46004024,…
The aberrant values are always negative, so any ICT < -3000 is set to 0.
Wrangling the final charts from the CSV is not a trivial exercise. For all the details, see the VB.Net VSX project. Some interesting techniques are exampled – particularly how to get a full set of period and cyclic variables from a time stamp string (see Variables.vb), and how to generate repeating charts under different parent conditions and states (Reports.vb). If you execute Reports.vb, you will see that the Share reports generate almost instantly. This is because the average tables are already in RAM – the only difference is quantity versus percents, and percents are always calculated on the fly at display or Save time.