Cross Survey View Details

This article describes the restrictions that apply when you map variables across surveys. It also explains the consequences of mapping variables with mismatched scales.

Data Type Restrictions

Like survey questions, cross survey variables (also called virtual survey variables in this article) have a data type. A cross survey variable’s data type is the type of the first question to which you map the variable. For example, if that first question is a Whole Number, then your cross survey variable is a Whole Number.

Once the variable’s data type is set, you can map only to other survey questions of the same data type.

This rule has some flexibility. All numeric types are treated as simply numeric: variable mappings do not distinguish between “Whole Numbers”, “Decimal Numbers”, and “Whole Numbers = 0”. Likewise, mappings do not distinguish between Text type questions (short text) and commentary questions (long text); both are treated as text.

The following table summarizes which type mappings are allowed.

 

Data Type         | Can Map To…
Yes/No            | Yes/No
Text              | Text
Whole Numbers     | Whole Numbers, Whole Numbers = 0, Decimal Numbers
Whole Numbers = 0 | Whole Numbers, Whole Numbers = 0, Decimal Numbers
Decimal Numbers   | Whole Numbers, Whole Numbers = 0, Decimal Numbers
Date/Time         | Date/Time
Date              | Date
Time              | Time
Currency          | Currency
Checkall Summary  | Checkall Summary
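
The table can also be read as a simple rule: two types are compatible when they are identical, or when both belong to the same interchangeable family (numeric or text). The following Python sketch illustrates that reading only. The names NUMERIC_TYPES, TEXT_TYPES, type_family, and can_map are hypothetical and are not part of Illume, and the "Commentary" label is an assumed name for the long-text question type.

```python
# Hypothetical sketch of the type-mapping rules; not an Illume API.

# Numeric types are interchangeable for mapping purposes.
NUMERIC_TYPES = {"Whole Numbers", "Whole Numbers = 0", "Decimal Numbers"}

# Short text and long text are both treated as text.
# "Commentary" is an assumed label for the long-text type.
TEXT_TYPES = {"Text", "Commentary"}


def type_family(data_type: str) -> str:
    """Collapse interchangeable data types into a single family name."""
    if data_type in NUMERIC_TYPES:
        return "Numeric"
    if data_type in TEXT_TYPES:
        return "Text"
    # Every other type (Yes/No, Date, Time, ...) only matches itself.
    return data_type


def can_map(variable_type: str, question_type: str) -> bool:
    """Return True if a variable of variable_type may map to question_type."""
    return type_family(variable_type) == type_family(question_type)


print(can_map("Whole Numbers", "Decimal Numbers"))  # True
print(can_map("Date", "Date/Time"))                 # False
```

Note that Date, Time, and Date/Time are distinct types here, so none of them can map to another.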

Scale Values for Virtual Survey Variables

When you create a cross survey variable, the variable’s scale is set to the scale of the first question to which you map the variable.

For example, assume you create a cross survey variable called GENDER to collect gender information from Survey_A and Survey_B.

This is how the gender question appears on Survey_A (response values appear in parentheses):

What is your gender?

  • Male (1)
  • Female (2)

This is how the question appears on Survey_B (response values in parentheses):

What is your gender?

  • Male (1)
  • Female (2)
  • Other (3)

If you map the cross survey variable to the gender question on Survey_A first, then the scale for the variable will have two values: Male (1) and Female (2). If you map it to the gender question on Survey_B first, the scale will have three values: Male (1), Female (2) and Other (3).

This has implications when you view query results. If you map the variable first to the item on Survey_A, so that it has only the Male/Female scale, you will see results similar to the following when you query:

 

Response Value | Response Label | Count | Percent
1              | Male           | 98    | 49%
2              | Female         | 100   | 50%
Unknown        | Unknown        | 2     | 1%

 

The two Unknown responses came from Survey_B respondents who chose Other (3). Because the scale of the virtual survey’s GENDER variable includes only Male (1) and Female (2), all responses outside of the scale are lumped together and reported simply as “Unknown.”

In this case, map GENDER to the Survey_B question first, because its scale is broader and includes every value used on Survey_A.
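
The effect of mapping order can be sketched in a few lines of Python. This is an illustration of the behavior described above, not Illume's implementation. The function name summarize is hypothetical, and the pooled response list is made up so that its counts match the example (98 Male, 100 Female, 2 Other).

```python
from collections import Counter


def summarize(variable_scale, responses):
    """Tabulate responses against a variable's scale.

    Any response value that is not on the scale is counted as "Unknown".
    Returns rows of (value, label, count, percent).
    """
    counts = Counter(v if v in variable_scale else "Unknown" for v in responses)
    total = len(responses)
    rows = [(value, label, counts[value]) for value, label in variable_scale.items()]
    if counts["Unknown"]:
        rows.append(("Unknown", "Unknown", counts["Unknown"]))
    return [(value, label, n, round(100 * n / total)) for value, label, n in rows]


scale_a = {1: "Male", 2: "Female"}              # Survey_A scale
scale_b = {1: "Male", 2: "Female", 3: "Other"}  # Survey_B scale

# Pooled responses from both surveys (illustrative data only).
responses = [1] * 98 + [2] * 100 + [3] * 2

# Mapped to Survey_A first: the two "3" answers fall outside the scale.
for row in summarize(scale_a, responses):
    print(row)

# Mapped to Survey_B first: every answer has a label.
for row in summarize(scale_b, responses):
    print(row)
```

With scale_a, the last row is ("Unknown", "Unknown", 2, 1). With scale_b, the same two responses are reported as (3, "Other", 2, 1).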

The following case represents a more problematic scenario. Assume again that we create a virtual survey variable called GENDER that maps to gender questions on Survey_C and Survey_D.

This is how the question appears on Survey_C:

What is your gender?

  • Male (1)
  • Female (2)

This is how it appears on Survey_D:

What is your gender?

  • Female (1)
  • Male (2)

These questions have mismatched scales: the same response values carry opposite labels on the two surveys. Mapping both questions to the same virtual survey variable yields incorrect results.

Again, the virtual survey variable uses the scale of the first item mapped as its own scale. If you map first to the question on Survey_C, the scale will be Male (1), Female (2).

The problem here is that all of the “1” answers collected on Survey_D will be reported as Male, when in fact, anyone choosing option 1 on Survey_D was indicating “Female.”

Assume, for example, that Survey_C had 50 respondents, and they were all men. Survey_D had 100 respondents, and they were all women. When you get the summary statistics for the GENDER variable on this virtual survey, you will see something like this:

Response Value | Response Label | Count | Percent
1              | Male           | 150   | 100%
2              | Female         | 0     | 0%

 

All of the “Female” responses from Survey_D are incorrectly represented as Male due to the mismatched scales.
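
The mislabeling can be reproduced with a short Python sketch. It is only an illustration of the scenario described above, not of how Illume stores or queries data, and all variable names are hypothetical.

```python
from collections import Counter

scale_c = {1: "Male", 2: "Female"}  # Survey_C, mapped first
scale_d = {1: "Female", 2: "Male"}  # Survey_D, same values, opposite labels

# Survey_C: 50 respondents, all men, so all chose value 1.
survey_c_values = [1] * 50
# Survey_D: 100 respondents, all women, so all chose value 1 as well.
survey_d_values = [1] * 100

# The variable takes the scale of the first question mapped.
variable_scale = scale_c

# What the query reports: every value is labeled with the variable's scale.
reported = Counter(variable_scale[v] for v in survey_c_values + survey_d_values)

# What respondents meant: each value labeled with its own survey's scale.
intended = Counter(
    [scale_c[v] for v in survey_c_values] + [scale_d[v] for v in survey_d_values]
)

print(dict(reported))  # {'Male': 150}
print(dict(intended))  # {'Male': 50, 'Female': 100}
```

The stored values are identical in both tallies; only the labels applied to them differ, which is why the error cannot be detected from the query results alone.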

There is currently no solution or workaround for this issue.

The best preventive measure is to ensure that question scales are consistent from survey to survey. Illume’s repository was designed specifically to enforce this kind of consistency. Repository questions include version numbers and strict controls to ensure they produce meaningful results when queried across surveys.
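
If you keep your own record of each question's scale, you can compare scales before you map. The following Python sketch shows one way to do that outside of Illume. The function name compare_scales and the dictionary representation of a scale are assumptions made for this example.

```python
def compare_scales(first, other):
    """Compare the scale of the first-mapped question with another question.

    Returns a pair (conflicts, missing):
      conflicts: values present on both scales with different labels
      missing:   values on the other scale that the first scale lacks
                 (these would be reported as Unknown)
    """
    conflicts = {
        value: (first[value], other[value])
        for value in first.keys() & other.keys()
        if first[value] != other[value]
    }
    missing = {value: other[value] for value in other.keys() - first.keys()}
    return conflicts, missing


survey_a = {1: "Male", 2: "Female"}
survey_b = {1: "Male", 2: "Female", 3: "Other"}
survey_c = {1: "Male", 2: "Female"}
survey_d = {1: "Female", 2: "Male"}

# Survey_A first, then Survey_B: no conflicts, but value 3 has no label.
print(compare_scales(survey_a, survey_b))

# Survey_C first, then Survey_D: both values conflict.
print(compare_scales(survey_c, survey_d))
```

A non-empty conflicts result corresponds to the mismatched-scale case, which should not be mapped. A non-empty missing result suggests mapping the question with the broader scale first.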