How to return Column names and Summary Values from SQL Server R Stored Procedure?

user6794 Published on June 24, 2018, 11:52 pm

I have created a stored procedure that uses the dplyr library to group rows by StudyID and ProductNumber. For each group, I want to return the mean and the standard deviation of fields c1 through c8.

My stored procedure is as follows:

ALTER PROCEDURE [dbo].[spCodeMeans]
    @StudyID INT,
    @StudyID_outer INT OUT
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @inquery NVARCHAR(MAX) = N'Select
        c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue, 
        c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
        c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
        c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1, 
        c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
        from ClosedStudyResponses c
        --Sensory Value Attributes only for mean and standard deviation analytics.
        where VariableAttributeID = 1
        and c.StudyID = 22'
        ;

    BEGIN TRY
        EXEC sp_execute_external_script
                @language = N'R',
                @script = N'
        library(dplyr)
            OutputDataSet <- InputDataSet %>%
                group_by (StudyID, ProductNumber) %>%
                summarise(c1_mean = mean(c1), c2_mean = mean(c2), c3_mean = mean(c3), c4_mean = mean(c4), c5_mean = mean(c5), c6_mean = mean(c6), 
                c7_mean = mean(c7), c8_mean = mean(c8), c1_sd = sd(c1), c2_sd = sd(c2), c3_sd = sd(c3), c4_sd = sd(c4), c5_sd = sd(c5), c6_sd = sd(c6), 
                c7_sd = sd(c7), c8_sd = sd(c8)) %>%
            `colnames<-`(c("StudyID", "ProductNumber","c1_mean","c2_mean","c3_mean","c4_mean","c5_mean","c6_mean","c7_mean",
            "c8_mean","c1_sd","c2_sd","c3_sd","c4_sd","c5_sd","c6_sd","c7_sd","c8_sd"))',
                @input_data_1 = @inquery
    END TRY
    BEGIN CATCH
        THROW;
    END CATCH
END

The results do not include column names and do not include the mean and standard deviation for fields c1 through c8. How do I adjust my syntax in order to accomplish this?
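One likely cause of the missing column names: sp_execute_external_script returns its result set with unnamed columns by default, whatever names the R data frame carries. Names and types are supplied on the T-SQL side with a WITH RESULT SETS clause. The following is a minimal sketch, not a confirmed fix. It assumes StudyID and ProductNumber are integers and that the summaries can be returned as FLOAT (the table definition is not shown, so both are assumptions), and it narrows the input query to the columns the summary actually uses.

```sql
-- Sketch only. Column types in WITH RESULT SETS are assumptions; adjust to the real schema.
DECLARE @inquery NVARCHAR(MAX) = N'
    SELECT c.StudyID, c.ProductNumber,
           c.[1] AS c1, c.[2] AS c2, c.[3] AS c3, c.[4] AS c4,
           c.[5] AS c5, c.[6] AS c6, c.[7] AS c7, c.[8] AS c8
    FROM ClosedStudyResponses c
    WHERE c.VariableAttributeID = 1
      AND c.StudyID = 22';

EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
        library(dplyr)
        res <- InputDataSet %>%
            group_by(StudyID, ProductNumber) %>%
            summarise(
                c1_mean = mean(c1), c2_mean = mean(c2), c3_mean = mean(c3), c4_mean = mean(c4),
                c5_mean = mean(c5), c6_mean = mean(c6), c7_mean = mean(c7), c8_mean = mean(c8),
                c1_sd = sd(c1), c2_sd = sd(c2), c3_sd = sd(c3), c4_sd = sd(c4),
                c5_sd = sd(c5), c6_sd = sd(c6), c7_sd = sd(c7), c8_sd = sd(c8))
        # Hand back a plain data.frame instead of a grouped tibble.
        OutputDataSet <- as.data.frame(res)',
    @input_data_1 = @inquery
-- Columns are matched by position, so this list follows the order used in summarise().
WITH RESULT SETS ((
    StudyID INT, ProductNumber INT,
    c1_mean FLOAT, c2_mean FLOAT, c3_mean FLOAT, c4_mean FLOAT,
    c5_mean FLOAT, c6_mean FLOAT, c7_mean FLOAT, c8_mean FLOAT,
    c1_sd FLOAT, c2_sd FLOAT, c3_sd FLOAT, c4_sd FLOAT,
    c5_sd FLOAT, c6_sd FLOAT, c7_sd FLOAT, c8_sd FLOAT
));
```

If any of c1 through c8 can be NULL, mean() and sd() return NA for that group unless na.rm = TRUE is passed.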

Update: As suggested, I've revised the stored procedure to insert the results into a temp table. The revised procedure is as follows:

ALTER PROCEDURE [dbo].[spCodeMeans]
    -- Add the parameters for the stored procedure here
    @StudyID int,
    @StudyID_outer int OUT
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

--Create temptable to store the outputdataset
Create table #temp_table (
    StudyID int,
    ProductNumber int,
    c1_mean decimal,
    c2_mean decimal,
    c3_mean decimal,
    c4_mean decimal,
    c5_mean decimal,
    c6_mean decimal,
    c7_mean decimal,
    c8_mean decimal,

    c1_sd decimal,
    c2_sd decimal,
    c3_sd decimal,
    c4_sd decimal,
    c5_sd decimal,
    c6_sd decimal,
    c7_sd decimal,
    c8_sd decimal
);

-- Insert statements for procedure here
Declare @inquery nvarchar(max) = N'Select
        c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue, 
        c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
        c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
        c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1, 
        c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
        from ClosedStudyResponses c
        --Sensory Value Attributes only for mean and standard deviation analytics.
        where VariableAttributeID = 1
        and c.StudyID = 22'
        ;
 BEGIN TRY
        Insert into #temp_table
        exec sp_execute_external_script
        @language = N'R',
        @script = N'
        library(dplyr)
            OutputDataSet <- InputDataSet %>%
                group_by (StudyID, ProductNumber) %>%
                summarise_all(.funs=c(mean, sd)) %>%
                setNames(c("StudyID","ProductNumber",
                paste0("c",1:8, "_mean"),
                paste0("c",1:8, "_sd")))
            ',
@input_data_1 = @inquery,
@output_data_1 = N'OutputDataSet';

END TRY

BEGIN CATCH
    THROW;
END CATCH

Select * from #temp_table;
END

When I attempt to run the procedure, I receive the following error: "Procedure expects parameter '@params' of type 'ntext/nchar/nvarchar'." Note that I've already declared @inquery as NVARCHAR(MAX). Is there another step I've overlooked?
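A likely explanation, hedged because only the message is shown: sp_execute_external_script has no parameter called @output_data_1. The name of the output variable is passed as @output_data_1_name, and a named argument the procedure does not recognize is treated as a script parameter that must be declared in @params. That would explain why the complaint is about @params rather than @inquery. The sketch below shows the renamed parameter together with INSERT ... EXEC. It is deliberately reduced to one measure column, and #demo_means is a hypothetical table introduced only for illustration.

```sql
-- Reduced sketch: one measure column (c1) only; #demo_means is a hypothetical table.
CREATE TABLE #demo_means (
    StudyID INT,
    ProductNumber INT,
    c1_mean FLOAT,   -- a bare DECIMAL means DECIMAL(18,0) and loses the fractional part
    c1_sd FLOAT
);

DECLARE @inquery NVARCHAR(MAX) = N'
    SELECT c.StudyID, c.ProductNumber, c.[1] AS c1
    FROM ClosedStudyResponses c
    WHERE c.VariableAttributeID = 1
      AND c.StudyID = 22';

INSERT INTO #demo_means
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
        library(dplyr)
        res <- InputDataSet %>%
            group_by(StudyID, ProductNumber) %>%
            summarise(c1_mean = mean(c1), c1_sd = sd(c1))
        OutputDataSet <- as.data.frame(res)',
    @input_data_1 = @inquery,
    @output_data_1_name = N'OutputDataSet';  -- the name parameter, not @output_data_1

SELECT * FROM #demo_means;
```

The number and order of columns the script returns must match the target table, so the input query here selects only the columns that are summarised.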

Update #2: I've spent some time reworking the stored procedure and discovered that the output needs to be a data.frame, so I modified the R portion accordingly. Now my column names appear, but no data is returned (neither means nor standard deviations). The current stored procedure is as follows:

ALTER PROCEDURE [dbo].[spCodeMeans]
    -- Add the parameters for the stored procedure here
    @StudyID int
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

-- Insert statements for procedure here
Declare @sStudy varchar(50)
Set @sStudy = Convert(Varchar(50),@StudyID)
Declare @inquery nvarchar(max) = N'Select
        c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue, 
        c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
        c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
        c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1, 
        c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
        from ClosedStudyResponses c
        --Sensory Value Attributes only for mean and standard deviation analytics.
        where VariableAttributeID = 1
        and c.StudyID =' +@sStudy ;

BEGIN TRY
        --Insert into CodeMeans
        exec sp_execute_external_script
        @language = N'R',
        @script = N'
        library(dplyr)
        codemeans <- function(StudyID){
            res <- InputDataSet %>%
                group_by (StudyID, ProductNumber) %>%
                summarise_all(.funs=c(mean, sd)) %>%
                setNames(c("StudyID","ProductNumber",
                paste0("c",1:8, "_mean"),
                paste0("c",1:8, "_sd")))
            df <- data.frame(res)
            }
            ',
@input_data_1 = @inquery,
@output_data_1_name = N'df',
@params = N'@StudyID int',
@StudyID = @StudyID

END TRY

BEGIN CATCH
    THROW;
END CATCH

Select * from CodeMeans;
END

So, at this point, I am seeking guidance on how to return output from the base query as well as the means and standard deviations.
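Some observations on this last version, offered as a sketch rather than a confirmed diagnosis. The R script only defines codemeans() and never calls it, and df is assigned inside the function, so no top-level object named df exists for @output_data_1_name to pick up. In addition, summarise_all() is applied to every non-grouping column of InputDataSet (RespID, StudyDate, VarAttributeName and so on), not just c1 through c8, so the 18 names passed to setNames() would not line up. The column names that do appear presumably come from the final SELECT on the CodeMeans table, which stays empty while the INSERT is commented out. The R portion could look roughly like the following. The toy InputDataSet is made up for illustration and stands in for what SQL Server would supply, and the named-list form of summarise_all() assumes dplyr 0.8 or later.

```r
library(dplyr)

# Made-up stand-in for the data frame SQL Server passes in as InputDataSet.
InputDataSet <- data.frame(
  StudyID = c(22L, 22L, 22L, 22L),
  ProductNumber = c(1L, 1L, 2L, 2L),
  RespID = 1:4,
  VarAttributeName = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)
for (i in 1:8) {
  InputDataSet[[paste0("c", i)]] <- c(1, 3, 5, 7) + i
}

# Keep only the grouping columns and c1..c8 before summarising, so that
# mean() and sd() are never applied to IDs, dates, or text columns.
res <- InputDataSet %>%
  select(StudyID, ProductNumber, c1:c8) %>%
  group_by(StudyID, ProductNumber) %>%
  summarise_all(list(mean = mean, sd = sd))

# Fix the column order explicitly, then assign at the top level of the script
# (not inside a function) so that an output variable named df actually exists.
wanted <- c("StudyID", "ProductNumber",
            paste0("c", 1:8, "_mean"), paste0("c", 1:8, "_sd"))
df <- as.data.frame(res)[, wanted]
```

Inside the stored procedure, the same script (without the toy data) would pair with @output_data_1_name = N'df', and the rows reach a table or the caller only if the EXEC is combined with INSERT INTO or WITH RESULT SETS.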
