I have a dataset in R (Framingham Heart Study data), and I am trying to assign each subject to one of four BMI groups: "underweight", "normal", "overweight", or "obese".
It has over 11,000 observations and 38 variables/columns, so it would be kind of hard to post some of the data here (I hope this won't be too much trouble to answer without it).
The dataset is called frm and I am trying to subset in the following way:
frm$BMIGRP <- NA #Creating new variable (this part works and creates a BMIGRP column with all NA values)
frm$BMIGRP[which(as.numeric(frm$BMI) < 18.5)] <- "underweight"
However, the BMI variable contains missing values (recorded as "." in the data, which I have also tried changing to NA).
When I run this kind of assignment for each group, only some of the underweight subjects are labeled "underweight", and many of the NA / "." values are labeled "underweight" as well. The result also shows only 10 "normal" weight subjects and about 11,000 in the "obese" category, which cannot be right given what I see when I view the data set.
If done correctly, this should create the four groups with several hundred to several thousand observations in each category. But I am only getting 10 normal, 71 underweight, and ~11,000 obese.
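Since I can't post the real data, here is a small made-up example of the kind of result I am after. It is only a sketch under assumptions I have not confirmed: the toy values are invented, I am guessing that BMI may have been read in as a factor because of the "." entries (in which case as.numeric() would return level codes rather than BMI values), and the 25 and 30 cutoffs are the usual BMI thresholds, not something taken from my code above.

```r
# Hypothetical toy data: "." marks a missing BMI, as in my file.
bmi_raw <- c("17.2", "22.5", ".", "27.8", "31.4", "24.0")

# Assumption: the column was read in as a factor because of the "." entries.
bmi_factor <- factor(bmi_raw)

# as.numeric() on a factor returns the internal level codes, not the BMI values.
codes <- as.numeric(bmi_factor)

# Going through character first gives the real numbers; "." becomes NA
# (the coercion warning is expected, so it is suppressed here).
bmi <- suppressWarnings(as.numeric(as.character(bmi_factor)))

# cut() assigns all four groups in one step and leaves missing BMI as NA.
# right = FALSE makes each interval closed on the left, e.g. [18.5, 25).
bmi_grp <- cut(bmi,
               breaks = c(-Inf, 18.5, 25, 30, Inf),
               labels = c("underweight", "normal", "overweight", "obese"),
               right = FALSE)

# Compare the raw strings, the level codes, the recovered values, and the groups.
print(data.frame(bmi_raw, codes, bmi, bmi_grp))
```

If the factor guess is right, the codes column would bear no relation to the actual BMI values, which might explain the strange counts, but I have not verified that this is what is happening in frm.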
I'm just not sure where I'm going wrong with this or if there is a different way I can create a new variable and assign it in the same kind of way. Any help is very much appreciated.
I should also mention that the frm$BMIGRP assignment code above is what my professor gave us as an example in our lab session; I am basically copying and pasting it with the appropriate replacements for my data set.
This is my first question on this website so I apologize if it is incomplete or if I need to give more information. Thanks!