Eight Ways to Calculate a Code Mean


This is an interesting way to investigate some of the computation facilities in Ruby.

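Taking the NCHLD=Number of Children variable in the Ruby Demo job, the eight ways (numbered at the right in bold) in Diagnostic view mode are:

[Screenshot: the eight code-mean rows in Diagnostic view mode]

The GUI specification is:

[Screenshot: GUI table specification]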

And the script equivalent (in VBScript syntax) is:

   name="EightWaysToCodeMean"
   top="Year(1/4)"
   side="NCHLD(1/6;tot;cmn;#cmn#(*);@cmn#NCHLD(*);" & _
         "#((c1*1+c2*2+c3*3+c4*4+c5*5+c6*6)/sum#(1/6));" & _
         "@(v1*1+v2*2+v3*3+v4*4+v5*5+v6*6)/v7)@," & _
         "-       ---Self-weighted---," & _
         "[NCHLD],NCHLD(#cwf/cuf)@,[]," & _
         "-       ---Coded increment---," & _
         "incNCHLD(#c1/cwf)," & _
         "incNCHLD(1)"
   filt="NCHLD(*)"
   wght=""
   Set rep = rub.GenTab(name,top,side,filt,wght)

Note that Allow% is off for all side nodes except the last.

The last two side nodes use a coded increment construction, incNCHLD, which increments a constructed code 1 by the number of children.

The relevant table properties are Column% ON, Flags | Data Modifiers | Force Percents to Proportions (divides by 100) ON, and 2DP for percentages. These settings matter for Way8 only; they are set here so that I can get all eight ways into a single table specification.

Way1 – Pseudo-code – cmn

This is the pseudo-code cmn, which is always available for any variable and is calculated against defined codes (if categorical) or all values (if uncoded).

Way2 – Codeframe Function – cmn#(*)

This is the codeframe function version of cmn, which additionally allows scoping for the operands. This is the form to use if you want the mean built into the codeframe with some values excluded, such as 0 or 98/99 = None/DK, or if you want banded means such as Top, Middle, Bottom Box, e.g. cmn#(1/2)=Small Family Mean.
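As a rough illustration of what scoping does arithmetically, here is a minimal Python sketch (plain Python, not Ruby script syntax). The responses, the 99 = DK code and the helper `scoped_mean` are all hypothetical; the only point is that values outside the scope contribute to neither the numerator nor the denominator.

```python
def scoped_mean(values, scope):
    """Mean of the values that fall inside scope; None if no value qualifies."""
    kept = [v for v in values if v in scope]
    return sum(kept) / len(kept) if kept else None

# Hypothetical NCHLD responses, with 99 standing in for a DK code.
responses = [1, 2, 2, 3, 1, 99, 4, 2]

# Roughly what a mean scoped to codes 1 to 6 would do: the 99 is ignored.
all_codes = scoped_mean(responses, range(1, 7))

# Roughly what a banded mean over codes 1 to 2 would do.
small_family = scoped_mean(responses, range(1, 3))
```

Here `all_codes` is 15/7 and `small_family` is 8/5, because the DK value and the out-of-band values are dropped before averaging.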

Way3 – Variable Function – cmn#NCHLD(*)

This is the variable function. The expression is stand-alone, so you could have a table of means from several different variables. Scoping is also supported.

Way4 – Codeframe Arithmetic – #((c1*1+c2*2+c3*3+c4*4+c5*5+c6*6)/sum#(1/6))

This is a codeframe expression which calculates the mean from first principles. The advantage is you can change the codes used, the code weights and the denominator for custom analyses.
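The first-principles calculation can be sketched in Python as follows (an illustration of the arithmetic, not Ruby script). The per-code counts and the function `code_mean` are hypothetical; the optional `weights` and `denominator` arguments mirror the idea that the code weights and the denominator can be changed for custom analyses.

```python
# Hypothetical number of respondents in each code (code -> count).
counts = {1: 40, 2: 55, 3: 30, 4: 12, 5: 5, 6: 2}

def code_mean(counts, weights=None, denominator=None):
    """Sum of count * weight over the codes, divided by the denominator.

    By default each code is weighted by its own value and the
    denominator is the total count over the same codes.
    """
    if weights is None:
        weights = {code: code for code in counts}
    numerator = sum(counts[code] * weights[code] for code in counts)
    if denominator is None:
        denominator = sum(counts.values())
    return numerator / denominator

mean = code_mean(counts)  # (40*1 + 55*2 + 30*3 + 12*4 + 5*5 + 2*6) / 144

# Custom analysis: treat code 6 as "6 or more" with an assumed weight of 7.
custom = code_mean(counts, weights={1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 7})
```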

Way5 – Vector Expression – @(v1*1+v2*2+v3*3+v4*4+v5*5+v6*6)/v7

Similar to Way4, but with further flexibility to pick up any arbitrary table rows. The denominator, v7 (vector 7), is the totals row.

Way6 – Self-Weighted – [NCHLD],NCHLD(#cwf/cuf)

Ways 4 and 5 are not much use if there are hundreds or thousands of codes or values, and with tens or hundreds of thousands of unique values, Ways 1 to 3 can be slow to evaluate. Crosstabs don’t care much about the number of unique values. Noting that Ways 4 and 5 are really just an explicit form of self-weighting (each code is weighted by itself), the sum of values divided by the number of values can be calculated as the ratio of Cases Weighted Filtered to Cases Unweighted Filtered, that is, as cwf/cuf.
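The self-weighting identity is easy to check outside Ruby. The Python sketch below uses hypothetical values and assumes no other weight is in play: weighting each case by its own value makes the weighted count equal to the sum of values, so the ratio to the unweighted count is the mean, and it agrees with the code-by-code calculation.

```python
from collections import Counter

# Hypothetical number-of-children values, one per respondent.
values = [1, 2, 2, 3, 1, 4, 2, 6]

cwf = sum(values)   # weighted count when each case's weight is its own value
cuf = len(values)   # unweighted count: every case counts once
self_weighted_mean = cwf / cuf

# The same result from first principles: sum of code * count over total count.
counts = Counter(values)
first_principles = sum(code * n for code, n in counts.items()) / sum(counts.values())
```

Nothing in the self-weighted version loops over distinct codes, which is why the number of unique values does not matter.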

Way7 – Coded Increment – incNCHLD(#c1/cwf)

Another way to do self-weighting is by coded increments. A constructed code 1 carries the number of children as an increment. At crosstab time, the c1 vector is the sum of increments = the total number of children across all respondents, and the mean is total children / respondents = c1/cwf.
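A minimal Python sketch of the accumulation, assuming hypothetical (year, number of children) cases and no weighting, so that each respondent adds 1 to the base. The dictionary names `c1` and `cwf` are borrowed from the expression above purely as labels; this is not how Ruby is implemented.

```python
from collections import defaultdict

# Hypothetical cases: (year column, number of children).
cases = [(1, 2), (1, 3), (2, 1), (2, 1), (2, 4), (3, 2)]

c1 = defaultdict(int)    # per column: sum of increments carried by code 1
cwf = defaultdict(int)   # per column: number of respondents

for year, nchld in cases:
    c1[year] += nchld    # code 1 is incremented by the number of children
    cwf[year] += 1       # one more respondent in this column

# Mean per column = total children / respondents.
means = {year: c1[year] / cwf[year] for year in sorted(cwf)}
```

With these cases the per-column means are 2.5, 2.0 and 2.0 for years 1, 2 and 3.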

Way8 – Percents as Proportions – c1

A table based on cwf does the division at Way7 for you, but also multiplies by 100. The multiplication by 100 can be prevented by setting the table flag property Force Percents to Proportions ON.
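The arithmetic behind this can be sketched in Python with hypothetical totals; the variable names are illustrative only. A column percentage is 100 times the cell over the base, so dividing it by 100 again leaves the plain ratio, which here is the mean.

```python
# Hypothetical totals for one column.
c1 = 12    # sum of increments (total children)
cwf = 5    # base (respondents)

column_percent = 100 * c1 / cwf      # what a percentage cell would display
as_proportion = column_percent / 100 # percent forced back to a proportion

way7 = c1 / cwf                      # the explicit division used in Way7
```

Both `as_proportion` and `way7` come out at 2.4, while the unadjusted percentage would read 240.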
