Home Replace NaN in any of specific columns to value in another column
Reply: 2

Replace NaN in any of specific columns to value in another column

pashute
1#
pashute Published in 2017-12-07 02:54:27Z

In MATLAB, I have a table like this:

date        foo   bar   mitzva  baz
1995-1-1    21    54    12      37
1995-1-2    NaN   52    11      36
1995-1-3    39    56    NaN     38
1995-1-4    25    NaN   NaN     50

How do I set any column with NaN to the value of its row's baz?

Or in other words:

  1. How do I set a filter to check row-by-row for an NaN in any of the three columns before baz?

  2. How can I set the three values without copying the good value 3 times?

What do I change in the following to get it to work? Or am I in the wrong direction?

colsToUpdate = {colidxFoo : colidxMitzva);
baddataFilter = any(isnan(data(:,colsToUpdate)); 
    % corrected from isnull. thanks @SardarUsama my mistake...

data(baddataFilter, colsToUpdate) = data(baddataFilter, colidxBaz);

I want to get:

date        foo   bar   mitzva  baz
1995-1-1    21    54    12      37
1995-1-2    36    52    11      36
1995-1-3    39    56    38      38
1995-1-4    25    50    50      50
Sardar Usama
2#
Sardar Usama Reply to 2017-12-07 16:08:27Z
Val = table2array(T(:,2:end));  %Extracting the elements of all columns except 1st one
IndBadV = isnan(Val);           %Finding NaNs
Val(IndBadV) = 0;               %Replacing NaNs with zeros
Val = Val+repmat(Val(:,end),1,size(Val,2)) .* IndBadV; %Updating the values
T = [T(:,1) array2table(Val, ...%Updating the table
    'VariableNames',T.Properties.VariableNames(2:end))];
pashute
3#
pashute Reply to 2017-12-12 00:20:38Z

Just to clarify Usama's answer above:

% problemCols = 2:end
Data = table2array(T(:,2:end));  % Without date column (which serves as index)
BadOnes = isnan(Data);           % Finding NaNs
Data(BadOnes) = 0;               % Replacing NaNs with zeros
FixcolsCount = size(Data,2);     % Number of columns to fix
GoodCol = end;
RepFixed = repmat(Data(:,GoodCol),1,FixcolsCount); % repeat 1 time down, for all columns
Data = Data + RepFixed .* BadOnes;       % Updating the values
T = [T(:,1) array2table(Val, ...         % Updating the table
    'VariableNames',T.Properties.VariableNames(2:end))];

Note: This works because zero is added when the data exists (badones is zero), but Repfix value is added when the data does not exist (badones in this case is one).

For example:

index  foo  bar mitzva baz     badones a   b   c      fixed  foo bar mitzva
1      11   na  13     14              0    1  0             0    14  0
2      na   22  23     24              1    0  0             24    0  0

result of addition and .* multiplication:

1      11  14   13     14
2      24  22   23     24

A colleague showed me another way:

BadOnes = isnan(T{:,2:end-1});          
GoodIndice = find(~BadOnes);
Fixed = BadOnes * repmat(T{:,end},1,size(T{:,2:end-1},2));;
Fixed(GoodIndice) = DataArray(GoodIndice); % adds GoodValues back into array
T{:,2:end-1} = Fixed;

Or explained in detail:

DataArray = T{:,2:end-1};
ProblemColsCount = size(DataArray,2); % Number of columns to fix

BadOnes = isnan(DataArray);           % Finding NaNs
% GoodCol = end;
RepFixed = repmat(T{:,end},1,ProblemColsCount); % repeat 1 time down, for all columns

GoodIndice = find(~BadOnes);
Fixed = BadOnes * RepFixed;
Fixed(GoodIndice) = DataArray(GoodIndice); % adds GoodValues back into array
T{:,2:end0-1} = Fixed;

In any case, I was hoping for something like Usama's original answer with fillmissing() or something short and similar.

You need to login account before you can post.

About| Privacy statement| Terms of Service| Advertising| Contact us| Help| Sitemap|
Processed in 0.506733 second(s) , Gzip On .

© 2016 Powered by mzan.com design MATCHINFO