Aggregate and NA values

help... Published in 2017-12-04 12:46:44Z

This question already has an answer here:

  • Calculate mean across rows with NA values in R 2 answers

I have a data frame with 1000 observations.
For each observation I have five variables. I'd like to create a new variable that aggregates those five variables.
I typed the following:

df$aggr_variable <- (1/5)*(var1+var2+var3+var4+var5)

I got the new aggregated variable, but also a problem. If, say, observation 839 has a missing value (NA) in var2 but values for the other four variables, the aggregated variable is NA for that observation.

How can I leave out the NAs in the five variables without dropping the whole observation when one variable contains an NA?

YQ.Wang replied on 2017-12-04 13:04:22Z

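According to your aggregate equation, you are computing the mean of these five variables for each sample (row). Passing na.rm=TRUE to mean() drops the missing values in a row before averaging, so a row with one NA is averaged over its four remaining values: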

#some reproducible data
df <- data.frame(var1=rnorm(20,10,5),var2=rnorm(20,5,1),var3=rnorm(20,30,1),
#generates some NAs:
df[11,5] <- NA
df[8,3] <- NA
df[9,1] <- NA
df[17,2] <- NA
df[11,2] <- NA

#aggregate by row-wise mean, ignoring NAs
df$aggr_variable <- apply(df,1,function(x){mean(x,na.rm=TRUE)})
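
For what it's worth, the same row-wise mean can probably be written more compactly with base R's rowMeans(), which also takes na.rm. The sketch below is self-contained so it can be pasted and run as is. The distribution parameters for var4 and var5 are placeholders I made up, since they are not visible above, and the set.seed() call is only there to make the random data repeatable. Selecting the five columns by name keeps the result column out of the average if the line is run a second time.

```r
# Self-contained sketch; var4/var5 parameters are assumed placeholders.
set.seed(1)  # fixed seed so the random data is repeatable
df <- data.frame(var1 = rnorm(20, 10, 5),
                 var2 = rnorm(20, 5, 1),
                 var3 = rnorm(20, 30, 1),
                 var4 = rnorm(20, 0, 1),   # assumed, not from the original post
                 var5 = rnorm(20, 0, 1))   # assumed, not from the original post

# Introduce some NAs, including two in the same row
df[11, 5] <- NA
df[11, 2] <- NA
df[8, 3]  <- NA

# Average only the five source columns, skipping NAs within each row
vars <- c("var1", "var2", "var3", "var4", "var5")
df$aggr_variable <- rowMeans(df[, vars], na.rm = TRUE)
```

Note that with na.rm=TRUE the divisor is the number of non-missing values in the row, not always 5, and a row in which all five variables are NA comes back as NaN rather than NA.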