Select every month even if the month doesn't exist in a MySQL table

Let's say I have these two tables in MySQL.

table1:

date         staff_no
2016-06-10   1
2016-06-09   1
2016-05-09   1
2016-04-09   1

      

table2:

staff_no    name
1           David

      

Then I have this query to get an analysis for staff for each month:

SELECT DATE_FORMAT(table1.date,'%b %Y') as month,COUNT(table1.date) as total_records,table2.name as name
FROM table1 as table1 
LEFT JOIN table2 as table2 on table2.staff_no = table1.staff_no
WHERE table1.staff_no = "1" and date(table1.date) between "2016-04-01" and "2016-06-30" 
GROUP BY table2.name,DATE_FORMAT(table1.date,'%Y-%m')
ORDER BY DATE_FORMAT(table1.date,'%Y-%m-%d')

      

This query outputs:

month      total_records  name
Apr 2016               1  David
May 2016               1  David
Jun 2016               2  David

      

But if I change the date range in the query to between "2016-04-01" and "2016-07-31", it doesn't show a row for July, because July doesn't exist in table1. That is not what I want; I still want a result like this:

month      total_records  name
Apr 2016               1  David
May 2016               1  David
Jun 2016               2  David
Jul 2016               0  David   

      

Can any of the experts help me with this? Thanks!



3 answers


Consider the following schema, in which the third table is a year/month helper table. Helper tables are very common and can naturally be reused throughout your code. I'll leave it to you to load it with a substantial range of dates. Note, however, how the end date of each month is filled in: it saves work for those of us who want to do less, and it lets the db engine figure out leap years for us.

The helper table could have just one column. But that would require function calls to compute end dates in some of your queries, which means slower performance. We like it fast.

Schema

create table workerRecords
(   id int auto_increment primary key,
    the_date date not null,
    staff_no int not null
);
-- truncate workerRecords;
insert workerRecords(the_date,staff_no) values
('2016-06-10',1),
('2016-06-09',1),
('2016-05-09',1),
('2016-04-09',1),
('2016-03-02',2),
('2016-07-02',2);

create table workers
(   staff_no int primary key,
    full_name varchar(100) not null
);
-- truncate workers;
insert workers(staff_no,full_name) values
(1,'David Higgins'),(2,"Sally O'Riordan");

      

Helper table

create table ymHelper
(   -- Year Month helper table. Used for left joins to pick up all dates.
    -- PK is programmer choice.
    dtBegin date primary key,   -- by definition not null
    dtEnd date null
);
-- truncate ymHelper;
insert ymHelper (dtBegin,dtEnd) values
('2015-01-01',null),('2015-02-01',null),('2015-03-01',null),('2015-04-01',null),('2015-05-01',null),('2015-06-01',null),('2015-07-01',null),('2015-08-01',null),('2015-09-01',null),('2015-10-01',null),('2015-11-01',null),('2015-12-01',null),
('2016-01-01',null),('2016-02-01',null),('2016-03-01',null),('2016-04-01',null),('2016-05-01',null),('2016-06-01',null),('2016-07-01',null),('2016-08-01',null),('2016-09-01',null),('2016-10-01',null),('2016-11-01',null),('2016-12-01',null),
('2017-01-01',null),('2017-02-01',null),('2017-03-01',null),('2017-04-01',null),('2017-05-01',null),('2017-06-01',null),('2017-07-01',null),('2017-08-01',null),('2017-09-01',null),('2017-10-01',null),('2017-11-01',null),('2017-12-01',null),
('2018-01-01',null),('2018-02-01',null),('2018-03-01',null),('2018-04-01',null),('2018-05-01',null),('2018-06-01',null),('2018-07-01',null),('2018-08-01',null),('2018-09-01',null),('2018-10-01',null),('2018-11-01',null),('2018-12-01',null),
('2019-01-01',null),('2019-02-01',null),('2019-03-01',null),('2019-04-01',null),('2019-05-01',null),('2019-06-01',null),('2019-07-01',null),('2019-08-01',null),('2019-09-01',null),('2019-10-01',null),('2019-11-01',null),('2019-12-01',null);
-- will leave as an exercise for you to add more years. Good idea to start, 10 in either direction, at least.
update ymHelper set dtEnd=LAST_DAY(dtBegin);    -- data patch. Confirmed leap years.
alter table ymHelper modify dtEnd date not null;    -- there, ugly patch above worked fine. Can forget it ever happened (until you add rows)
-- show create table ymHelper; -- this confirms that dtEnd is not null

      

So this is a helper table. Set it up once and forget about it for a few years.

Note: don't forget to run the update statement above.
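
Typing the month rows by hand gets tedious. As a rough sketch of how they could be generated instead, the snippet below uses Python's standard sqlite3 module (SQLite, not MySQL, so the date arithmetic stands in for LAST_DAY()) and a recursive CTE. The function name build_ym_helper and the in-memory database are assumptions made for the illustration, not part of the answer.

```python
import sqlite3

def build_ym_helper(conn, first_month, months):
    """Create and fill a ymHelper-style table with one row per month.

    first_month must be the first day of a month, e.g. '2015-01-01'.
    """
    conn.execute(
        "create table ymHelper (dtBegin text primary key, dtEnd text not null)"
    )
    conn.execute(
        """
        with recursive m(dtBegin, n) as (
            select date(?), 1
            union all
            select date(dtBegin, '+1 month'), n + 1 from m where n < ?
        )
        insert into ymHelper (dtBegin, dtEnd)
        -- last day of the month = first day of the next month minus one day
        select dtBegin, date(dtBegin, '+1 month', '-1 day') from m
        """,
        (first_month, months),
    )

conn = sqlite3.connect(":memory:")
build_ym_helper(conn, "2015-01-01", 60)  # 2015 through 2019, as in the answer
leap = conn.execute(
    "select dtEnd from ymHelper where dtBegin = '2016-02-01'"
).fetchone()[0]
print(leap)  # 2016-02-29
```

Because the end date is derived from the start date in the same statement, there is no separate patch step to forget when more years are added.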

Quick test for your query

SELECT DATE_FORMAT(ymH.dtBegin,'%b %Y') as month,
ifnull(COUNT(wr.the_date),0) as total_records,@soloName as full_name 
FROM ymHelper ymH 
left join workerRecords wr 
on wr.the_date between ymH.dtBegin and ymH.dtEnd 
and wr.staff_no = 1 and wr.the_date between '2016-04-01' and '2016-07-31' 
LEFT JOIN workers w on w.staff_no = wr.staff_no 
cross join (select @soloName:=full_name from workers where staff_no=1) xDerived 
WHERE ymH.dtBegin between '2016-04-01' and '2016-07-31' 
GROUP BY ymH.dtBegin 
order by ymH.dtBegin; 

+----------+---------------+---------------+
| month    | total_records | full_name     |
+----------+---------------+---------------+
| Apr 2016 |             1 | David Higgins |
| May 2016 |             1 | David Higgins |
| Jun 2016 |             2 | David Higgins |
| Jul 2016 |             0 | David Higgins |
+----------+---------------+---------------+

      

It works great. The first table in the query is the helper table. The left join brings in the worker records, allowing for null when a month has none. Let me pause here, because that was the point of your question: missing data. Finally, the workers table comes in through a cross join.

The cross join initializes a variable (@soloName), which is the worker's name. A month with no records still gets a total of 0, because COUNT(wr.the_date) counts only non-null values (the ifnull() around it is just a safeguard), but we don't have that luxury for the worker's name: in those rows w.full_name would be null. That is what forces the cross join.

The cross join is a Cartesian product. But since the derived table yields a single row, we don't suffer the usual problem of a Cartesian product multiplying the rows in the result set. Anyway, it works.
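
Why the staff filter sits in the ON clause rather than in a WHERE clause is easy to miss. The sketch below compares the two placements on a cut-down copy of the tables above. It runs in Python with the standard sqlite3 module rather than MySQL, and the helper name counts_per_month is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table ymHelper (dtBegin text primary key, dtEnd text not null);
    insert into ymHelper values
        ('2016-04-01','2016-04-30'), ('2016-05-01','2016-05-31'),
        ('2016-06-01','2016-06-30'), ('2016-07-01','2016-07-31');
    create table workerRecords (the_date text not null, staff_no int not null);
    insert into workerRecords values
        ('2016-06-10',1), ('2016-06-09',1), ('2016-05-09',1),
        ('2016-04-09',1), ('2016-07-02',2);
    """
)

def counts_per_month(filter_keyword):
    # filter_keyword is 'and' (the filter stays inside the ON clause)
    # or 'where' (the filter runs after the left join has been built).
    sql = f"""
        select ymH.dtBegin, count(wr.the_date)
        from ymHelper ymH
        left join workerRecords wr
          on wr.the_date between ymH.dtBegin and ymH.dtEnd
        {filter_keyword} wr.staff_no = 1
        group by ymH.dtBegin
        order by ymH.dtBegin
    """
    return conn.execute(sql).fetchall()

in_on = counts_per_month("and")
in_where = counts_per_month("where")
print(in_on)     # four rows; July is kept with a count of 0
print(in_where)  # three rows; July is gone
```

With the filter in WHERE, the July row is discarded after the join because its staff_no is not 1, which is the same effect that hid July in the original query.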



But here's one problem: the query is hard to maintain, because values have to be plugged in at 6 places, as you can see. So consider the stored procedure below.

Stored Proc

drop procedure if exists getOneWorkersRecCount;
DELIMITER $$
create procedure getOneWorkersRecCount
(pStaffNo int, pBeginDt date, pEndDt  date)
BEGIN
    SELECT DATE_FORMAT(ymH.dtBegin,'%b %Y') as month,ifnull(COUNT(wr.the_date),0) as total_records,@soloName as full_name
    FROM ymHelper ymH 
    left join workerRecords wr 
    on wr.the_date between ymH.dtBegin and ymH.dtEnd 
    and wr.staff_no = pStaffNo and wr.the_date between pBeginDt and pEndDt
    LEFT JOIN workers w on w.staff_no = wr.staff_no 
    cross join (select @soloName:=full_name from workers where staff_no=pStaffNo) xDerived
    WHERE ymH.dtBegin between pBeginDt and pEndDt 
    GROUP BY ymH.dtBegin
    order by ymH.dtBegin;
END$$
DELIMITER ;

      

Test the stored proc a few times

call getOneWorkersRecCount(1,'2016-04-01','2016-06-09');
call getOneWorkersRecCount(1,'2016-04-01','2016-06-10');
call getOneWorkersRecCount(1,'2016-04-01','2016-07-01');
call getOneWorkersRecCount(2,'2016-02-01','2016-11-01');

      

Ah, much easier to work with (in PHP, C#, Java, you name it). Whether to use a stored proc or not is your choice.

Bonus Stored Proc

drop procedure if exists getAllWorkersRecCount;
DELIMITER $$
create procedure getAllWorkersRecCount
(pBeginDt date, pEndDt  date)
BEGIN
    SELECT DATE_FORMAT(ymH.dtBegin,'%b %Y') as month,ifnull(COUNT(wr.the_date),0) as total_records,w.staff_no,w.full_name
    FROM ymHelper ymH 
    cross join workers w 
    left join workerRecords wr 
    on wr.the_date between ymH.dtBegin and ymH.dtEnd 
    and wr.staff_no = w.staff_no and wr.the_date between pBeginDt and pEndDt
    -- LEFT JOIN workers w on w.staff_no = wr.staff_no 
    -- cross join (select @soloName:=full_name from workers ) xDerived
    WHERE ymH.dtBegin between pBeginDt and pEndDt 
    GROUP BY ymH.dtBegin,w.staff_no,w.full_name
    order by ymH.dtBegin,w.staff_no;
END$$
DELIMITER ;

      

Quick test

call getAllWorkersRecCount('2016-03-01','2016-08-01');
+----------+---------------+----------+-----------------+
| month    | total_records | staff_no | full_name       |
+----------+---------------+----------+-----------------+
| Mar 2016 |             0 |        1 | David Higgins   |
| Mar 2016 |             1 |        2 | Sally O'Riordan |
| Apr 2016 |             1 |        1 | David Higgins   |
| Apr 2016 |             0 |        2 | Sally O'Riordan |
| May 2016 |             1 |        1 | David Higgins   |
| May 2016 |             0 |        2 | Sally O'Riordan |
| Jun 2016 |             2 |        1 | David Higgins   |
| Jun 2016 |             0 |        2 | Sally O'Riordan |
| Jul 2016 |             0 |        1 | David Higgins   |
| Jul 2016 |             1 |        2 | Sally O'Riordan |
| Aug 2016 |             0 |        1 | David Higgins   |
| Aug 2016 |             0 |        2 | Sally O'Riordan |
+----------+---------------+----------+-----------------+

      

The takeaway

Helper tables have been in use for decades. Don't be afraid or hesitate to use them. In fact, some kinds of specialized work are nearly impossible to get done without them.



You can create an inline set of rows representing all the dates you want by using any other table in your system that has at least as many records as the number of months you are trying to represent. The data in that table does not have to contain dates; it just needs records that you can put a limit on.

Try the following statement, which uses MySQL variables. The FROM clause declares an inline variable, @Date1, for the SQL statement. I start it at March 1, 2016. The select field list then takes this variable and keeps adding 1 month to it, so the first StartDate is 2016-04-01. Because it is joined to "AnyTableWithAtLeast12Records" (literally any table in your system with at least X records), it produces a result showing the dates. This is one way to build a calendar-type list.

But note that the SECOND column in this select does not change @Date1 with the := assignment. It takes the date as it now stands and adds one more month to it for the end date. If you need a smaller or larger date range, just change the record limit to adjust the calendar spread.

select
     @Date1 := date_add( @Date1, interval 1 month ) StartDate,
      date_add( @Date1, interval 1 month ) EndDate
    from
      AnyTableWithAtLeast12Records,
      ( select @Date1 := '2016-03-01' ) sqlvars
   limit 12;

      



The result is something like ...

StartDate   EndDate
2016-04-01  2016-05-01
2016-05-01  2016-06-01
2016-06-01  2016-07-01
2016-07-01  2016-08-01
2016-08-01  2016-09-01
2016-09-01  2016-10-01
2016-10-01  2016-11-01
2016-11-01  2016-12-01
2016-12-01  2017-01-01
2017-01-01  2017-02-01
2017-02-01  2017-03-01
2017-03-01  2017-04-01

      

Now you have your dynamic calendar in one simple query. Use it as the basis for all the records you need counts for, formatted the way you had them. So take the whole query above as the source of a JOIN to find records within those date ranges. No other queries or stored procedures are required. A simple LEFT JOIN keeps all the dates but only counts staff records that fall WITHIN the start/end range. For example, a record counts for April when its date is greater than or equal to 2016-04-01 but LESS THAN 2016-05-01, which includes 2016-04-30 at 11:59:59 pm.

SELECT 
      DATE_FORMAT(MyCalendar.StartDate,'%b %Y') as month,
      COALESCE(COUNT(T1.Staff_no),0) as total_records,
      COALESCE(T2.name,"") as name
   FROM 
      ( select @Date1 := date_add( @Date1, interval 1 month ) StartDate,
               date_add( @Date1, interval 1 month ) EndDate
           from
              AnyTableWithAtLeast12Records,
              ( select @Date1 := '2016-03-01' ) sqlvars
           limit 12 ) MyCalendar
        LEFT JOIN table1 T1
           ON T1.Date >= MyCalendar.StartDate
           AND T1.Date < MyCalendar.EndDate
           AND T1.Staff_No = 1
           LEFT JOIN table2 T2
              ON T1.staff_no = T2.staff_no
   GROUP BY
      T2.name,
      DATE_FORMAT(MyCalendar.StartDate,'%Y-%m')
   ORDER BY 
      DATE_FORMAT(MyCalendar.StartDate,'%Y-%m-%d')
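
The half-open range (>= StartDate and < EndDate) matters most when the column carries a time of day. A small sketch of the difference follows, in Python with the standard sqlite3 module rather than MySQL. The events table and its rows are hypothetical, made up only to show the boundary behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table events (ts text not null);
    insert into events values
        ('2016-04-01 00:00:00'),
        ('2016-04-30 23:59:59'),
        ('2016-05-01 00:00:00');
    """
)

# Half-open range: start inclusive, end exclusive.
half_open = conn.execute(
    "select count(*) from events where ts >= '2016-04-01' and ts < '2016-05-01'"
).fetchone()[0]

# Closed range ending on the last calendar day. SQLite compares these as
# text and MySQL would read the upper bound as midnight at the start of
# April 30; either way the late-evening row falls outside the range.
closed = conn.execute(
    "select count(*) from events where ts between '2016-04-01' and '2016-04-30'"
).fetchone()[0]

print(half_open, closed)  # 2 1
```

The half-open form picks up both April rows and leaves the May 1 row for the next month, so no record is counted twice or dropped at a month boundary.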

      



I would say that you need to have a RIGHT JOIN here to include employees from the second table
