Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Aggregate and reshape from long to wide

I asked this question earlier, but the reply I received did not do what I wanted, and at the time I used Stata to do the job. However, as I routinely work with such data, I would like to do this in R. I have a data set of daily hospital admissions by age, sex and diagnosis, and I want to aggregate and reshape it from long to wide. How can I achieve this? Sample data and the required output are shown below. The column headers in the required output are built from prefixes for sex, age and diagnosis. Thanks.

Sample data

structure(list(diag = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L), .Label = c("card", "cere"), class = "factor"), sex = structure(c(1L, 
1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 
1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("Female", "Male"), class = "factor"), 
    age = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 
    1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("35-64", 
    "65-74"), class = "factor"), admissions = c(1L, 1L, 0L, 0L, 
    6L, 6L, 6L, 1L, 4L, 0L, 0L, 0L, 4L, 6L, 5L, 2L, 2L, 4L, 1L, 
    0L, 6L, 5L, 6L, 4L), bdate = structure(c(1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 
    3L, 3L, 3L, 3L, 3L), .Label = c("1987-01-01", "1987-01-02", 
    "1987-01-03"), class = "factor")), .Names = c("diag", "sex", 
"age", "admissions", "bdate"), row.names = c(NA, -24L), class = "data.frame")

Required output

structure(list(date = structure(1:3, .Label = c("01jan1987", 
"02jan1987", "03jan1987"), class = "factor"), f3564card = c(1L, 
4L, 2L), f6574card = c(1L, 0L, 4L), m3564card = c(0L, 0L, 1L), 
    m6574card = c(0L, 0L, 0L), f3564cere = c(6L, 4L, 6L), f6574cere = c(6L, 
    6L, 5L), m3564cere = c(6L, 5L, 6L), m6574cere = c(1L, 2L, 
    4L)), .Names = c("date", "f3564card", "f6574card", "m3564card", 
"m6574card", "f3564cere", "f6574cere", "m3564cere", "m6574cere"
), class = "data.frame", row.names = c(NA, -3L))

1 Answer


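Your data are already in long format, so dcast() from "reshape2" can cast them to wide directly (df here is the sample data from the question), like this:

library(reshape2)  # dcast() is in reshape2, not in the older reshape package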
dcast(df, bdate ~ sex + age + diag, value.var = "admissions")
#        bdate Female_35-64_card Female_35-64_cere Female_65-74_card Female_65-74_cere
# 1 1987-01-01                 1                 6                 1                 6
# 2 1987-01-02                 4                 4                 0                 6
# 3 1987-01-03                 2                 6                 4                 5
#   Male_35-64_card Male_35-64_cere Male_65-74_card Male_65-74_cere
# 1               0               6               0               1
# 2               0               5               0               2
# 3               1               6               0               4
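
If you also want the short column names from the required output, one possible approach is to put diag first in the formula so the columns come out in the requested order, and then rebuild the names from their parts. This is only a sketch: the data frame is rebuilt with expand.grid() so the snippet runs on its own, and the names wide and parts are made up for illustration.

```r
library(reshape2)

# Rebuild the sample data from the question (same rows, same order);
# expand.grid() varies the first variable fastest.
df <- expand.grid(
  age   = c("35-64", "65-74"),
  sex   = c("Female", "Male"),
  diag  = c("card", "cere"),
  bdate = c("1987-01-01", "1987-01-02", "1987-01-03")
)
df$admissions <- c(1, 1, 0, 0, 6, 6, 6, 1,
                   4, 0, 0, 0, 4, 6, 5, 2,
                   2, 4, 1, 0, 6, 5, 6, 4)

# Cast with diag varying slowest, so the card columns come before cere.
wide <- dcast(df, bdate ~ diag + sex + age, value.var = "admissions")

# Column names look like "card_Female_35-64"; split them into parts and
# reassemble as <first letter of sex><age without hyphen><diag>.
parts <- do.call(rbind, strsplit(names(wide)[-1], "_", fixed = TRUE))
names(wide)[-1] <- paste0(tolower(substr(parts[, 2], 1, 1)),
                          sub("-", "", parts[, 3], fixed = TRUE),
                          parts[, 1])

names(wide)
```

The date column is left as bdate in its original format here; reformatting it to something like 01jan1987 is a separate step.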

I don't see any aggregation in your sample output: each combination of date, sex, age and diagnosis occurs exactly once. If aggregation is required, you can do it with the fun.aggregate argument of dcast.
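
As a minimal sketch of what that could look like, assuming you wanted to drop the age split and sum admissions over the age groups: leave age out of the formula and pass fun.aggregate = sum. The data frame is rebuilt with expand.grid() only to keep the snippet self-contained, and agg is a made-up name.

```r
library(reshape2)

# Rebuild the sample data from the question (same rows, same order).
df <- expand.grid(
  age   = c("35-64", "65-74"),
  sex   = c("Female", "Male"),
  diag  = c("card", "cere"),
  bdate = c("1987-01-01", "1987-01-02", "1987-01-03")
)
df$admissions <- c(1, 1, 0, 0, 6, 6, 6, 1,
                   4, 0, 0, 0, 4, 6, 5, 2,
                   2, 4, 1, 0, 6, 5, 6, 4)

# age is not in the formula, so each output cell now covers two rows
# (one per age group); fun.aggregate says how to combine them.
agg <- dcast(df, bdate ~ sex + diag,
             value.var = "admissions", fun.aggregate = sum)
agg
```

Without fun.aggregate, dcast falls back to counting rows (length) when a cell has more than one value, so it is worth setting it explicitly whenever the formula leaves out one of the identifying variables.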

