Published in 2017-12-07 21:21:11Z
So I have a dataset in R (Framingham Heart Study data), and I am trying to assign the BMI groups "underweight," "normal," "overweight," and "obese." It has over 11,000 observations and 38 variables/columns, so it would be hard to post some of the data here (I hope this won't be too much trouble to answer without it). The dataset is called frm and I am trying to subset in the following way:

    frm$BMIGRP <- NA  # creating the new variable (this part works and creates a BMIGRP column with all NA values)
    frm$BMIGRP[which(as.numeric(frm$BMI) < 18.5)] <- "underweight"

However, there are missing values in the BMI variable (indicated with a ".", which I have also tried changing to NA). When I subset this way for each group, only some of the underweight values get assigned to "underweight", and a lot of the NA / "." values get assigned to "underweight" as well. It then tells me there are only 10 "normal" weight subjects and about 11,000 in the obese category, which is just not true, because I can view the data set. If done correctly, this should create the four groups with several hundred to several thousand observations in each category. But I am only getting 10 normal, 71 underweight, and ~11,000 obese.

I'm just not sure where I'm going wrong with this, or whether there is a different way I can create a new variable and assign it in the same kind of way. Any help is very much appreciated. I should also mention that this is the code my professor gave us as an example in our lab session, and I am basically copying and pasting it with the appropriate replacements for my data set. This is my first question on this website, so I apologize if it is incomplete or if I need to give more information. Thanks!

#2 leeum replied on 2017-12-08 04:30:16Z

Reading your code, it seems the column is not numeric. This should work:

    # as.character() first, in case BMI was read in as a factor
    frm$BMI <- as.numeric(as.character(frm$BMI))
    frm$BMIGRP[frm$BMI < 18.5] <- "underweight"
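To illustrate why the counts can come out so wrong, here is a minimal sketch on made-up values. It assumes the "." entries caused R to read BMI as a factor, which the question does not confirm. The names bmi_raw, codes, bmi and grp are hypothetical stand-ins, not columns of frm.

```r
# Toy stand-in for frm$BMI, assuming the "." entries made R read it as a factor.
bmi_raw <- factor(c("17.2", "22.5", ".", "31.0", "27.8"))

# as.numeric() on a factor returns the internal level codes (small integers),
# not the numbers that are printed, so comparisons against 18.5 are meaningless.
codes <- as.numeric(bmi_raw)

# Converting to character first recovers the real values; "." becomes NA.
# suppressWarnings() hides the "NAs introduced by coercion" warning.
bmi <- suppressWarnings(as.numeric(as.character(bmi_raw)))

# Assign the group on the recovered values; which() drops the NA comparisons.
grp <- rep(NA_character_, length(bmi))
grp[which(bmi < 18.5)] <- "underweight"
```

Comparing codes with bmi side by side (for example with data.frame(bmi_raw, codes, bmi)) shows whether this is what happened in the real data; class(frm$BMI) would confirm it directly.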
Like @leeum said, check that BMI is numeric. If you want to make a new category column based on BMI, look at case_when from dplyr. So maybe this is what you wanted:

    library(dplyr)

    frm <- frm %>%
      # as.character() first, in case BMI was read in as a factor
      mutate(BMI = as.numeric(as.character(BMI))) %>%
      mutate(BMIGRP = case_when(
        BMI < 18.5 ~ 'underweight',
        BMI < 25   ~ 'healthy weight',  # reached only when BMI >= 18.5
        BMI < 30   ~ 'overweight',      # reached only when BMI >= 25
        BMI >= 30  ~ 'obese'
      ))

The mutate(BMI = as.numeric(as.character(BMI))) converts the BMI column to numeric. Then mutate(BMIGRP = case_when(...)) creates a new column called BMIGRP and assigns 'underweight', 'healthy weight', 'overweight' or 'obese' based on BMI. The conditions are checked in order and the first match wins, so every non-missing BMI falls into exactly one group. If no condition applies (for example, when BMI is NA), an NA is assigned.
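For comparison, the same grouping can be sketched in base R with cut(). This assumes cut-offs of 18.5, 25 and 30 with half-open intervals, so that values such as 24.95 or exactly 30 always land in a group. The vector bmi is made-up data, not the real frm$BMI.

```r
# Made-up BMI values, including a missing one and two boundary cases.
bmi <- c(17.2, 22.5, NA, 24.95, 30, 27.8)

# right = FALSE makes each interval closed on the left and open on the right:
# [-Inf, 18.5), [18.5, 25), [25, 30), [30, Inf)
grp <- cut(
  bmi,
  breaks = c(-Inf, 18.5, 25, 30, Inf),
  labels = c("underweight", "healthy weight", "overweight", "obese"),
  right  = FALSE
)

# Count observations per group, keeping NA visible as its own category.
counts <- table(grp, useNA = "ifany")
```

cut() returns a factor with the labels in the given order, which is handy for tables and plots; wrap it in as.character() if a plain character column is preferred.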