Ruby is primarily a survey analysis tool, with default behaviour generally oriented towards making it as easy as possible to count the number of numbers (as categorical cross tabulations). But many surveys also have quantitative case data (price paid, distance travelled, container volumes etc) and so functionality must include ways to get totals and averages in the same sorts of ways as you would when working with columns of numbers in Excel.
The typical scenario for quantitative data is to take an XLSX file (usually as CSV) and turn it into a Ruby job.
To demonstrate the techniques, we have set up three public quantitative jobs:
- AEMO (Australian Energy Market Operator), the Australian eastern power grid demand (in gigawatts) and spot price (30 minute intervals since 1999)
- GridWatchUK, the UK power supply by all sources against total demand (5 minute intervals since May 2007)
- StockMarkets, the major international stock markets aligned at daily resolution
There will be later posts on GridWatch and StockMarkets.
If you have Ruby5 (Cloud edition) you can access these jobs at the rcspublic customer account. If you want to examine or execute the scripts (in VB.net), you will need to download first, since you can’t run an IDE in the cloud.
To see just the outputs, download these Excel files:
By State shows demand and price as a pair of series. There is a set of charts for each state.
All States compares the states as five series (one per state) on each of demand and price.
The data for the preceding month becomes available within the first few days of the current month, so I’ll aim to update the XLSX and the Ruby job by the second week of each month.
The source case data was downloaded from
There is one file per month per state from January 1999, with Tasmania starting in May 2005. The files are all of the form
The above shows the time, demand and price for the first six cases as 30 minute periods for NSW on 1Jan1999. PERIODTYPE=TRADE is ignored. From these 1,124 files (as of end of 2018), I assemble a single file per month, merged horizontally so that each 30 minute period is a single row for all states. Importing all, up to 1Jan2019, gives a Ruby job of 350,626 cases. At 48 30 minute periods per day, times 365.25 days per year, times 20 years = expected 350,640. Coverage therefore, with only 14 missing periods (and none since 2001), is excellent. Fortunately, all the states line up – none has any data in isolation.
From the date/time string, we need to extract a set of useful period variables: minute, hour, day, week, month, quarter, year, climatology year (Dec-Nov), seasons, and all the cycles, as hour of day, day of week, week of year, month of year, and season of year. See the Variables.vb script for details on how this was achieved.
I then associate each of the Demand and Price values with a state code, as 2=NSW, 3=VIC, 4=QLD, 5=SA and 7=TAS, using the coded increment notation – 2*27.64 means to increment the cell indexed by code 2 by 27.64. To get the average across all cases within a period, set column%s ON, and (this is the important bit) Properties | Flags | Percents as Proportions ON. The effect of this flag is to un-multiply by 100, converting the percentage to a ratio of sum of increments/number of cases, that is, the average price or demand within the X axis period. Again, the details are in Variables.vb.
The charts are all specified in Reports.vb. Arrays and loops have been used to reduce the specification for all 283 charts to about 250 lines of script.
The scaled reports rescale each series to a range between -1 and +1. Thus, all scaled series have the same amplitude (max at 1, min at -1), regardless of market share (NSW biggest, Tasmania smallest). This makes it easy to compare maxima and minima across states.
I’m not an energy analyst, so I’ll leave interpretation of and commentary on the charts to others, but it is obvious that there is a change in series behaviour from 2015. I have therefore split the cyclic charts into 1999-2018 (all data), 1999-2014 (generally stable) and 2015-2018 (increasingly dynamic).
And if you are an energy analyst, or understand this market, I’d like to know why overall demand peaked in 2008, except for Queensland where demand from 2015 ff is accelerating, apparently at Victoria’s expense.