Despite our best efforts to establish Triple-S as the preferred format for survey data interchange between the collection system and Ruby, the SAV (or the logically identical SPS/ASC pair) continues to reign supreme. Although SPSS is rapidly losing market share elsewhere, it seems entrenched in MR, for the near term at least. The appeal of a SAV would seem to be:

  1. Can do extra stats not possible in Ruby
  2. Easy to manipulate variables in Ruby, then back-import to the SAV for client delivery
  3. Is the lingua franca in MR by default
  4. Can verify Ruby tables in SPSS to assure clients that Ruby processing is sound
  5. Large (if slowly decreasing) pool of SPSS expertise within MR.

So, we have to import a lot of SAVs or SPS/ASC pairs. The problem is that, to avoid a one-to-one match of SPSS to Ruby variables (which makes a job practically unusable in Ruby), a blend file has to be written to configure the variables for multi-response and loop structures at import time. And this is a pain. I know because I’ve done it many, many times.

The last time I needed to write a blend file it was for a huge questionnaire with hundreds of multi-response sets, so I had a think about how to automate the process, at least for multi-response. What I came up with was this:

  1. Check that all dichotomous variables in the SAV have the same format
  2. Copy the Name column to Excel
  3. Copy the Values column to Excel

Save the sheet as a tab-delimited text file called VarList.txt in the Source subdirectory of your job.
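For illustration only, the saved file might look something like the sample below: one variable per line, with the name and the values separated by a tab. The variable names are made up to match the blend file shown further down, and the exact text of the values cell is an assumption about how SPSS pastes it; all the subroutine needs is that the line contains {0, No}.

```text
QUOTA1_01	{0, No}...
QUOTA1_02	{0, No}...
S1a_1	{0, No}...
S1a_2	{0, No}...
AGE	{1, 18-24}...
```

Here the last line would be ignored, because its first code is not {0, No}.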

Note that in this SAV all multi-response sets are stored in the same way, with {0, No} as the first code. That string is what identifies a variable as a candidate for blending. This VB.Net subroutine…

Sub WriteBlendFile()

   ' sourcedir is assumed to be declared elsewhere and to end with a path separator
   Dim fpath As String = sourcedir & "VarList.txt"
   Dim infile As New System.IO.StreamReader(fpath, True)
   Dim line, subarray(), subsubarray(), varname, varparams As String
   Dim blendvarlist As List(Of String()) = New List(Of String())

   While Not infile.EndOfStream   ' collect a (blend target, width) pair for each dichotomous variable
       line = infile.ReadLine
       varparams = ""
       If InStr(line, "{0, No}") > 0 Then
           subarray = Split(line, vbTab)
           varname = subarray(0)
           subsubarray = Split(varname, "_")
           If UBound(subsubarray) = 1 Then     '' like QUOTA_01
               varparams = subsubarray(0)
               If IsNumeric(subsubarray(1)) Then
                   If Left(subsubarray(1), 1) = "0" Then       '' like q4_01
                       '' store the blend target varname and the #width
                       varparams = varparams & vbTab & subsubarray(1).Length
                   Else
                       '' no leading zero, so no fixed width
                       varparams = varparams & vbTab & "-1"
                   End If
               End If
           ElseIf UBound(subsubarray) = 2 Then   '' like Q4_1_01
               varparams = subsubarray(0) & "_" & subsubarray(1)
               If IsNumeric(subsubarray(2)) Then
                   If Left(subsubarray(2), 1) = "0" Then        '' like Q4_1_01
                       '' store the blend target varname and the #width
                       varparams = varparams & vbTab & subsubarray(2).Length
                   Else
                       '' no leading zero, so no fixed width
                       varparams = varparams & vbTab & "-1"
                   End If
               End If
           Else
               '' no underscore, or more than two: not handled here
           End If

           '' a tab is present only when a numeric suffix was found
           If InStr(varparams, vbTab) > 0 Then blendvarlist.Add(Split(varparams, vbTab))
       End If
   End While
   infile.Close()


   fpath = sourcedir & "ToyotaBCT.bln"
   Dim outfile As New System.IO.StreamWriter(fpath, False, System.Text.Encoding.UTF8)
   '' assumes the variables of each set sit next to each other in VarList.txt
   For i As Integer = 0 To blendvarlist.Count - 1
       '' first variable of a new set found, so write the entry
       If i = 0 OrElse blendvarlist(i)(0) <> blendvarlist(i - 1)(0) Then
           outfile.WriteLine("[" & blendvarlist(i)(0) & "]")
           If blendvarlist(i)(1) <> "-1" Then
               outfile.WriteLine("pattern=" & blendvarlist(i)(0) & "_#" & blendvarlist(i)(1))
           Else
               outfile.WriteLine("pattern=" & blendvarlist(i)(0) & "_#")
           End If
           outfile.WriteLine("label=L")
           outfile.WriteLine()     '' blank line between entries
       End If
   Next
   outfile.Close()

End Sub

…writes this blend file

[QUOTA1]
pattern=QUOTA1_#2
label=L

[QUOTA6]
pattern=QUOTA6_#2
label=L

[S1a]
pattern=S1a_#
label=L
… etc

The above subroutine handles suffixes with a leading zero, like _01 and _02, as well as unpadded ones like _1 and _2, for names with one or two underscores. It is not production code, however, and you will probably have to modify it for your circumstances.
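One possible modification, if your names have more than two underscores, is to split at the last underscore instead of counting segments. The sketch below is an assumption about how that could be done, not part of the original routine: ParseBlendTarget is a hypothetical helper, and it accepts plain digit suffixes only, which is stricter than IsNumeric.

```vbnet
Imports System

Module BlendNameParser

    ' Hypothetical helper: splits a variable name at its LAST underscore.
    ' Returns Nothing when there is no all-digit suffix to blend on.
    ' Result(0) is the blend target; Result(1) is the suffix width when the
    ' suffix starts with a zero, or "-1" when it does not.
    Function ParseBlendTarget(varname As String) As String()
        Dim pos As Integer = varname.LastIndexOf("_"c)
        If pos <= 0 OrElse pos = varname.Length - 1 Then Return Nothing

        Dim target As String = varname.Substring(0, pos)
        Dim suffix As String = varname.Substring(pos + 1)

        ' Accept plain digit suffixes only
        For Each ch As Char In suffix
            If Not Char.IsDigit(ch) Then Return Nothing
        Next

        Dim width As String = If(suffix.StartsWith("0"), suffix.Length.ToString(), "-1")
        Return New String() {target, width}
    End Function

    Sub Main()
        For Each name As String In {"QUOTA1_01", "S1a_1", "Q4_1_01", "AGE"}
            Dim parsed As String() = ParseBlendTarget(name)
            If parsed Is Nothing Then
                Console.WriteLine(name & " -> not a blend candidate")
            Else
                Console.WriteLine(name & " -> " & parsed(0) & ", width " & parsed(1))
            End If
        Next
    End Sub

End Module
```

The two-element result has the same shape as the entries in blendvarlist, so a helper like this could stand in for the two UBound branches in the reading loop.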

Blend files work best when the SAV is internally consistent. Some simple guidelines are:

  • Names as var_1, var_2, … and not var_01, var_02, …
  • Multi-response and grid/cube variable sets all consistently named
  • Variable descriptions all in the same format, e.g.
Q1_1 Brand Last Bought – McDonald’s
Q1_2 Brand Last Bought – Hungry Jack’s

Here, the format is <varname>_<var index> <var description> <hyphen> <code label>.

We do NOT want something like

Q1_2 Brand Last Bought – Hungry – Jack’s

(has two hyphens)

or

Q1_3 Wendy’s – Brand Last Bought

(var description and code label reversed)

If the SAV (or SPS) file is internally consistent, then you should be able to script a blend file writer appropriate for each job.
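Before scripting the writer, it may help to check the descriptions first. The sketch below is one way to flag the two problems shown above, under some assumptions: FindInconsistentLabels is a hypothetical helper, it is given the description text of one set without the leading variable name, and the separator is a space, an en dash and a space, as in the examples.

```vbnet
Imports System
Imports System.Collections.Generic

Module LabelCheck

    ' Assumed separator: space, en dash, space, as in the examples above.
    Const Sep As String = " – "

    ' Returns the labels in one variable set that break the
    ' "<var description> – <code label>" convention.
    Function FindInconsistentLabels(labels As IEnumerable(Of String)) As List(Of String)
        Dim bad As New List(Of String)
        Dim expected As String = Nothing
        For Each label As String In labels
            Dim parts As String() = label.Split(New String() {Sep}, StringSplitOptions.None)
            If parts.Length <> 2 Then
                bad.Add(label)          ' no separator, or more than one
            ElseIf expected Is Nothing Then
                expected = parts(0)     ' first well-formed label sets the description
            ElseIf parts(0) <> expected Then
                bad.Add(label)          ' description differs, possibly reversed
            End If
        Next
        Return bad
    End Function

    Sub Main()
        Dim labels As String() = {
            "Brand Last Bought – McDonald's",
            "Brand Last Bought – Hungry – Jack's",
            "Wendy's – Brand Last Bought"
        }
        For Each label As String In FindInconsistentLabels(labels)
            Console.WriteLine("Check: " & label)
        Next
    End Sub

End Module
```

Because the first well-formed label is taken as the reference, a set whose first label is itself reversed would have every other label flagged instead, so the output is a prompt for a manual look and not a verdict.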
