Home MYSQL select top 5 of a count from table that is grouped by user
Reply: 3

MYSQL select top 5 of a count from table that is grouped by user

klye_g
1#
klye_g Published in 2018-02-14 16:55:06Z

I apologize in advance as I may not be describing my problem properly. I am trying to write a query that takes the top 5 most popular chosen_user_items per user and concatenating the top rows per user into a comma separated string via group_concat and then grouping by user_id's.

For example, if user_id of 1 has five rows for item_id of 1, two rows for item_id of 2, three rows for item_id of 3, and a single row for 4 5 and 6, then the result would be 1, 3, 2, 4, 5.

Here is my example table structure.

Name: chosen_user_items

id | user_id | item_id
------------------------
1  | 1       | 1
2  | 1       | 4
3  | 1       | 19
4  | 1       | 10
5  | 1       | 13
.  | 1       | 1
.  | 1       | 11
.  | 1       | 18
.  | 1       | 212
.  | 1       | 654
.  | 2       | 1
.  | 2       | 28
.  | 2       | 568
.  | 2       | 112
.  | 2       | 354
.  | 3       | 4
.  | 3       | 4
.  | 3       | 19
.  | 3       | 212
.  | 3       | 654
.  | 3       | 4
.  | 3       | 4
.  | 3       | 253
.  | 3       | 187
.  | 3       | 212

And here is an example of my desired output:

user_id | group_concat_results
------------------------------
1       | 1, 4, 19, 13, 212
2       | 1, 28, 568, 212, 354
3       | 4, 212, 19, 654, 253

Here is the query that I have so far

SELECT `chosen_user_items`.`item_id`, COUNT(`chosen_user_items`.`item_id`) AS 'item_count' 
FROM `chosen_user_items` 
WHERE `chosen_user_items`.`user_id` = 1
GROUP BY `chosen_user_items`.`item_id` 
ORDER BY `item_count` DESC 
LIMIT 5

While this works great for a single user, I want to be able to run this query just once for all users (in order to avoid doing hundreds or thousands of database queries), and having to manually concat the results in a language such as PHP.

Thanks in advance.

davejal
2#
davejal Reply to 2018-02-14 18:06:34Z

With some sort of ranking it is possible to do it with 1 query.

select user_id, group_concat(item_id) from 
(
select 
  user_id
  ,item_id
  ,@item_rank := if(@current_item = user_id, @item_rank+1,1) as item_rank 
  ,@current_item:=user_id
  from
      (      
      select 
        user_id
        ,item_id
        ,count(*) aantal
      from chosen_user_items
      group by user_id,item_id
      order by user_id,count(*) desc
      ) a )b
      where item_rank <6
      group by user_id

Here's an sqlfiddle to test it.

I only have a problem with the order of the group_concat, it isn't ordered accordingly.

try the following query to see the result before the group_concat maybe you can concat it better.

select 
  user_id
  ,item_id
  ,@item_rank := if(@current_item = user_id, @item_rank+1,1) as item_rank 
  ,@current_item:=user_id
  from
      (      
      select 
        user_id
        ,item_id
        ,count(*) aantal
      from chosen_user_items
      group by user_id,item_id
      order by user_id,count(*) desc
      ) a 

this query sorts the concat correctly according to the data in your question:

select user_id, group_concat(item_id) from 
(
select 
  user_id
  ,item_id
  ,@item_rank := if(@current_item = user_id, @item_rank+1,1) as item_rank 
  ,@current_item:=user_id
  from
      (      
      select 
        user_id
        ,item_id
        ,count(*) aantal
      from chosen_user_items
      group by user_id,item_id
      order by user_id,count(*) desc
      ) a )b
      where item_rank <6
      group by user_id
      order by user_id,item_rank asc
João Gonçalves
3#
João Gonçalves Reply to 2018-02-14 17:39:32Z

To solve this issue, I think you need to go for 4 different steps.

First, you need to choose/select/order the rows that will be shown. This can be done using a row_number and partition by (this can't be used in MYSQL, but in this guide, they show you the MYSQL equivalente solution https://blog.sqlauthority.com/2014/03/09/mysql-reset-row-number-for-each-group-partition-by-row-number/)

Secondly, you will need to filter the rows that have a row_number lower than 5, this will behave like the "limit 5" of each of the queries.

Third step, you need to have this 5 records for each user transfor to 5 columns. This can be done using pivot table. Here you can find an example of what you have to do: MySQL pivot table

Final step: all you need to do is concat each of the 5 columns, and you will have the information you need for each user.

I hope this clarifies

Edit: Using function GROUP_CONCAT will allow you to replace the 2 last steps

amanda
4#
amanda Reply to 2018-02-15 05:22:03Z

A solution using R.

The dbplyr package will allow you to run this script directly against the database without pulling the data into memory. If you don't want to use R, you can render the SQL query that dbplyr generates from your R statement.

library(tidyverse)
library(stringr)

# --- Set Up ---
dat <- tribble(
  ~user_id, ~item_id,
  1, 1,
  1, 4,
  1, 19,
  1, 10,
  1, 13,
  1, 1,
  1, 11,
  1, 18,
  1, 212,
  1, 654,
  2, 1,
  2, 28,
  2, 568,
  2, 112,
  2, 354,
  3, 4,
  3, 4,
  3, 19,
  3, 212,
  3, 654,
  3, 4,
  3, 4,
  3, 253,
  3, 187,
  3, 212
)

# --- Prep --- 
pre <- dat %>% 
  group_by(user_id) %>% 
  arrange(user_id, item_id) %>% 
  add_count(item_id) %>% 
  rename(
    n_items = n
  ) %>% 
  distinct(user_id, item_id, .keep_all = TRUE) %>% 
  top_n(5, n_items) %>% 
  slice(1:5) %>% 
  arrange(user_id, desc(n_items)) 

# --- Solve ---
# Hacky
solution_one <- pre %>% 
  mutate(collapsed = str_c(item_id, collapse = ", ")) %>% 
  slice(1) %>% 
  select(user_id, collapsed)

# Ideal
solution_two <- pre %>%
  nest() %>% 
  mutate(
    collapsed = data %>% 
      map("item_id") %>% 
      map_chr(str_c, collapse = ", "))

Output:

solution_two
#> # A tibble: 3 x 3
#>   user_id             data            collapsed
#>     <dbl>           <list>                <chr>
#> 1       1 <tibble [5 x 2]>     1, 4, 10, 11, 13
#> 2       2 <tibble [5 x 2]> 1, 28, 112, 354, 568
#> 3       3 <tibble [5 x 2]> 4, 212, 19, 187, 253

This is the best solution because you retain the item_id and its count in the nested list-column data.

You need to login account before you can post.

About| Privacy statement| Terms of Service| Advertising| Contact us| Help| Sitemap|
Processed in 0.465361 second(s) , Gzip On .

© 2016 Powered by mzan.com design MATCHINFO