Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

filter - Select random 50% of sample, but only 1 person per couple

Essentially, I'm trying to do stratified random sampling. I want to run an analysis on data from heterosexual couples. I need to select a random 50% of the women and a random 50% of the men, such that no selected man is married to a selected woman. I know how to filter out a random percentage of the total sample, but not how to ensure that only one person per household is selected.

My data look like this:

couple  person  gender  Q1   Q2   Q3   Q4   Q5
1       1       0       3.5  4.2  2.3  3.3  4.3
1       2       1       3.2  2.5  2.1  3.7  5.6
2       1       1       3.7  2.6  3.3  4.2  5.1
2       2       0       3.0  3.5  2.1  3.6  5.4

It's in long format, so each row represents a person and there are two people per couple.

EDITED for more details

hhid = couple
hhidpn = person
ragender = gender (1 = male, 2 = female)
SPAQ1-8 = items 1-8 of a self-perceptions of aging scale

[1]: https://i.stack.imgur.com/yKQ7k.png
question from:https://stackoverflow.com/questions/66051822/select-random-50-of-sample-but-only-1-person-per-couple


1 Answer


My suggestion is to randomly divide the couples in two equal groups, and then select the women in one group and the men in the other group.

First I'll reconstruct your example data to demonstrate on:

data list list/couple person gender (3f1) Q1   Q2   Q3   Q4    Q5 (5f3.1).
begin data
1  1  0   3.5  4.2  2.3  3.3  4.3
1  2  1   3.2  2.5  2.1  3.7  5.6
2  1  1   3.7  2.6  3.3  4.2  5.1
2  2  0   3.0  3.5  2.1  3.6  5.4
3  1  0   3.5  4.2  2.3  3.3  4.3
3  2  1   3.2  2.5  2.1  3.7  5.6
4  1  1   3.7  2.6  3.3  4.2  5.1
4  2  0   3.0  3.5  2.1  3.6  5.4
end data.

Now that we have a dataset, we can do the sampling you need:

EDITED for a shorter process - using a version of @rossum's suggestion:

* first we give each couple a random number, and then use it to sort the couples randomly.
sort cases by couple.
compute randorder=uniform(100).
if couple=lag(couple) randorder=lag(randorder).
sort cases by randorder.

* now we create a running index for the couples, and use it to select males or females according to odd or even index.
compute coupleNum=1.
if $casenum>1 coupleNum=lag(coupleNum)+(couple<>lag(couple)).
compute selected=(mod(coupleNum, 2)=gender).
exe.
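
A quick sanity check on the result can be sketched as follows, assuming the variables created above (couple, gender, selected). The name nSelected is a hypothetical helper introduced only for this check. If the sampling worked, the crosstab should show about half of each gender selected, and nSelected should equal 1 on every row.

```spss
* Check 1: within each gender, about half of the cases should be selected.
crosstabs /tables=gender by selected.

* Check 2: add the number of selected persons per couple to every row.
* nSelected is a hypothetical helper variable; it should be 1 throughout.
aggregate /outfile=* mode=addvariables
  /break=couple
  /nSelected=sum(selected).
frequencies variables=nSelected.
```

Any couple showing 0 or 2 here would point to a gender coding that does not match the formula used for selected.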

Now that you have created your selection variable, you can use it with FILTER or with SELECT IF to continue to the analysis.
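
As a rough illustration of the two options, the sketch below uses DESCRIPTIVES on Q1 to Q5 purely as a stand-in for whatever analysis you actually intend to run. FILTER keeps the unselected cases in the file and only excludes them from procedures, while SELECT IF removes them from the active dataset.

```spss
* Option 1: filter. Unselected cases stay in the file but are ignored.
filter by selected.
descriptives variables=Q1 to Q5.
filter off.

* Option 2: select. Unselected cases are dropped from the active dataset.
select if selected=1.
descriptives variables=Q1 to Q5.
```

FILTER is usually the safer choice while you are still exploring, since nothing is deleted and FILTER OFF restores the full sample.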

EDIT: The above code works when gender is coded 0/1. The edit to the question shows that gender is actually coded 1/2, so the final computation of selected should add 1 to the odd/even index before comparing it with gender:

compute selected=(mod(coupleNum, 2)+1=gender).
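
Putting this together for the variable names given in the question, a minimal sketch might look like the following. It assumes hhid identifies the couple and ragender is coded 1 = male, 2 = female, as described in the question's edit. The SET SEED line and its value are my own addition so the draw can be repeated; it applies to the default random number generator, so adjust it if your session uses a different one.

```spss
* Assumes hhid = couple id and ragender coded 1 = male, 2 = female.
* The seed value is arbitrary and only makes the random draw repeatable.
set seed=20210205.

* Give both members of each couple the same random number, then shuffle.
sort cases by hhid.
compute randorder=rv.uniform(0,1).
if hhid=lag(hhid) randorder=lag(randorder).
sort cases by randorder.

* Number the couples in their shuffled order.
compute coupleNum=1.
if $casenum>1 coupleNum=lag(coupleNum)+(hhid<>lag(hhid)).

* Odd couples: mod+1 = 2, so the woman is selected. Even: the man.
compute selected=(mod(coupleNum,2)+1=ragender).
execute.
```

With an odd number of couples, one gender ends up with one more selected person than the other, so the split is only approximately 50/50.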
