Quiz: SQL query
Julien Dephix

Julien Dephix @joolsmcfly

About: Been programming for over 20 years and loving it!

Location:
France
Joined:
Jun 24, 2022

Quiz: SQL query

Publish Date: Jun 30 '22
8 7

Hello, coders! 💻

Today's quiz taken from a real world task is SQL related.

Database structure

There are two tables (very simplified for the sake of this quiz):

  1. companies with two columns: id and name
  2. users with two columns: company_id and is_pro (boolean)

Task

write a query to find all companies that only have pro users (is_pro=1).

My take on this

Click to see my solution using MySQL
SELECT
    c.name as company
FROM
    users u
JOIN companies c ON
    c.id = u.company_id
GROUP BY
    c.id
HAVING
    SUM(IF(is_pro, 0, 1)) = 0;
Enter fullscreen mode Exit fullscreen mode

IF(is_pro, 0, 1) means that I give 0 points to users that are pros (counter-intuitive but bear with me!). If all users are pros then the sum of points should be 0, right? That's what HAVING IF(is_pro, 0, 1)) = 0 does.

There you have it!

How would you have done it?

📝 Edit
As pointed out by @geraldew using SUM to COUNT is hacky and as such I'd better use COUNT and CASE:

SELECT
    c.name as company,
    COUNT(
        CASE WHEN can_use_app = 1 THEN 'x'
    END
    ) isPro,
    COUNT(
        CASE WHEN can_use_app = 0 OR can_use_app IS NULL THEN 'x'
    END
    ) isNotPro
FROM
    users u
JOIN companies c ON
    c.id = u.company_id
GROUP BY
    c.id
HAVING
    isPro > 0 and isNotPro = 0;
Enter fullscreen mode Exit fullscreen mode

Happy coding! ⌨️

Comments 7 total

  • geraldew
    geraldewJul 1, 2022

    I don't know which SQL dialects have such an IF function (none that I use do), so my question is: what happens when is_pro is null?

    • Julien Dephix
      Julien DephixJul 1, 2022

      What dialect do you use? IF has been available in MySQL since at least v5.1(released in 2008!).
      I should have mentioned MySQL in my solution, I'll edit.

      in SQL Server you have IIF for instance.

      If is_pro is null then it's not considered truthy so IF will return 1, which is what we want.

      mysql> select IF(null, 0, 1);
      +----------------+
      | IF(null, 0, 1) |
      +----------------+
      |              1 |
      +----------------+
      
      Enter fullscreen mode Exit fullscreen mode

      Official MySQL documentation on IF

      • geraldew
        geraldewJul 2, 2022

        Well my question comes more from the fact that there is no IF in the SQL standard, so any article not clearly saying it is about a particular dialect perhaps should not use it.

        The standard has the CASE expression for that purpose (note: expression, not function).

        But to answer your question, most of my work has been with Teradata, HiveQL, Impala and occasionally PostgreSQL and SQLite. In some of these, there is an "IF" construct but it is used outside of SQL statements to provide a kind of conditional execution that standard SQL itself lacks.

        An obvious caveat is to say that all vendors vary the dialect and that's probably a good thing overall and has been part of the evolution of SQL.

        The issue of what functions should do when encountering a Null is .. debatable - and really is a whole other topic. My personal preference is for Nulls to cascade except where a construct is explicit about handling them.

        Similarly I don't ever write SQL code that counts by summing the number 1 - even though I see that done a lot. When I'm investigating problems, code that does this is an immediate red flag.

        • geraldew
          geraldewJul 2, 2022

          And having said all that, I probably should back that up with code. A first cut of how I'd tackle it (i.e. just air code as I'm now at home) would be:

          SELECT
              c.id ,
              c.name AS company ,
              COUNT( CASE WHEN u.is_pro THEN 'x' END ) AS IsPro_Cnt ,
              COUNT( CASE WHEN NOT u.is_pro THEN 'x' END ) AS NotPro_Cnt ,
              COUNT( CASE WHEN u.is_pro IS NULL THEN 'x' END ) AS NullPro_Cnt
          FROM
              companies AS c 
              INNER JOIN 
              users AS u ON
                  u.company_id = c.id
          GROUP BY
              c.id
          HAVING
              IsPro_Cnt > 0
              AND
              NotPro_Cnt = 0
              /* depending on what we want to do about Nulls 
                  for is_pro we might also make use of ..
                  AND
                  NullPro_Cnt = 0 */
          ;
          
          Enter fullscreen mode Exit fullscreen mode

          Where for clarity I've shown the subcounts as columns that are then quoted by name in the HAVING clause. As each is only used once, they could just be expressed in the HAVING clause and thus not output as columns.

          • Depending on context, such re-arrangements aren't needed - e.g. if used in defining a View it then makes no difference to a query planner when you don't ever use those named columns outside of the view definition.

          Also, partly to make it clear - both to myself and a later reader of the code - I've put in the parts that would help deal with Nulls. This particularly helps when other requirements might prompt a change from using an INNER JOIN to using a LEFT OUTER JOIN, in which case you really do get a Null for the is_pro column via non-matches of the outer join. All the more reason to always be explicit about Null handling with SQL.

          p.s. FWIW I don't personally find my THEN 'x' to be elegant but any non-null value will do there. I don't use the number 1 precisely to avoid confusions with places where people might use the SUM of 1 approach.

          • Julien Dephix
            Julien DephixJul 2, 2022

            Thanks for your valuable input.

            It's indeed more portable to use COUNT in conjunction with CASE. I wrote that query as a one-off, for stats, after wondering how many companies only had pro users. It's not production code. That being said, I'll keep COUNT and CASE in mind!

            In my use case, NULL should be treated the same as 0 so I would write it as:

            COUNT(
                CASE WHEN can_use_app = 0 OR can_use_app IS NULL THEN 'x'
            END
            ) isNotPro
            
            Enter fullscreen mode Exit fullscreen mode
            • geraldew
              geraldewJul 2, 2022

              Ah yes, because you said is_pro was Boolean I just used that. Which is ironic as many data engines and their SQL dialects don't have a Boolean type at all, including the ones I mostly use. The joys of a language standard that is largely observed in the breach.

              • Julien Dephix
                Julien DephixJul 2, 2022

                Yeah by Boolean I meant 0 or 1 (up to you to decide if null is allowed or not).
                Boolean type in MySQL is sugar syntax for TINYINT(1).

                I’ll be more precise next time!

Add comment