Let's say I have an 'employees' table with employee start and end dates, like so:

employees

employee_id   start_date   end_date
53            '19901117'   '99991231'
54            '19910208'   '20010512'
55            '19910415'   '20120130'
.             .            .
.             .            .
.             .            .

And let's say I want to get the monthly count of employees who were employed at the end of the month. So the resulting data set I'm after would look like:

month        count of employees
'20150131'   120
'20150228'   118
'20150331'   122
.            .
.            .
.            .

The best way I currently know how to do this is to create a "helper" table to join onto, such as:

helper_tbl

month
'20150131'
'20150228'
'20150331'
.
.
.

And then do a query like so:

SELECT t0b.month,
        count(t0a.employee_id)
FROM employees t0a
JOIN helper_tbl t0b
ON t0b.month BETWEEN t0a.start_date AND t0a.end_date
GROUP BY t0b.month

However, this is a somewhat annoying solution to me, because it means I have to create these little helper tables all the time and they clutter up my schema. I feel like other people must run into the same need for "helper" tables, but I'm guessing people have figured out a better, less manual way to go about this. Or do you all really just keep creating "helper" tables like I do to get around these situations?

I understand this question is a bit open-ended for Stack Overflow, so let me offer a more closed-ended version of it: "Given just the 'employees' table, what would YOU do to get the resulting data set that I showed above?"

2 Answers

You can use a CTE to generate all the month values, either from a fixed starting point or based on the earliest date in your table:

with months (month) as (
  select add_months(first_month, level - 1)
  from (
    select trunc(min(start_date), 'MM') as first_month from employees
  )
  connect by level <= ceil(months_between(sysdate, first_month))
)
select * from months;

With data that has an earliest start date of 1990-11-17, as in your example, that generates 333 rows:

MONTH              
-------------------
1990-11-01 00:00:00
1990-12-01 00:00:00
1991-01-01 00:00:00
1991-02-01 00:00:00
1991-03-01 00:00:00
...
2018-06-01 00:00:00
2018-07-01 00:00:00
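
The fixed-starting-point variant mentioned above is not shown, so here is a minimal sketch of what it might look like. The starting date (2015-01-01) and the row count (12) are arbitrary values chosen for illustration, not taken from the question's data:

```sql
-- Sketch: generate month-start dates from a fixed starting point.
-- Assumptions: the start date and the number of months are hypothetical;
-- adjust both to suit your reporting range.
with months (month) as (
  select add_months(date '2015-01-01', level - 1)
  from dual
  connect by level <= 12
)
select * from months;
```

This should return one row per month from 2015-01-01 through 2015-12-01, without needing to read the employees table at all.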

You can then use that in a query that joins to your table, something like:

with months (month) as (
  select add_months(first_month, level - 1)
  from (
    select trunc(min(start_date), 'MM') as first_month from employees
  )
  connect by level <= ceil(months_between(sysdate, first_month))
)
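-- count(e.employee_id) ignores the null row that the left join produces
-- for a month with no matching employees, so that month reports 0
select m.month, count(e.employee_id) as employees
from months m
left join employees e
-- started before the first day of the following month
on e.start_date < add_months(m.month, 1)
-- and either still employed or left after the month ended
and (e.end_date is null or e.end_date >= add_months(m.month, 1))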
group by m.month
order by m.month;

Presumably you want to include people who are still employed, so you need to allow for the end date being null (unless you're using a magic end-date value for people who are still employed...)

With dates stored as strings it's a bit more complicated, but you can generate the month information in a similar way:

with months (month, start_date, end_date) as (
  select add_months(first_month, level - 1),
    to_char(add_months(first_month, level - 1), 'YYYYMMDD'),
    to_char(last_day(add_months(first_month, level - 1)), 'YYYYMMDD')
  from (
    select trunc(min(to_date(start_date, 'YYYYMMDD')), 'MM') as first_month from employees
  )
  connect by level <= ceil(months_between(sysdate, first_month))
)
select m.month, m.start_date, m.end_date, count(e.employee_id) as employees
from months m
left join employees e
on e.start_date <= m.end_date
and (e.end_date is null or e.end_date > m.end_date)
group by m.month, m.start_date, m.end_date
order by m.month;

Very lightly tested with a small amount of made-up data and both seem to work.
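
To get output shaped like the result set in the question (month-end labels as 'YYYYMMDD' strings), something along these lines might work. This is only a sketch under several assumptions: the dates are stored as 'YYYYMMDD' strings, current employees carry the '99991231' end date shown in the question rather than null, the reporting range (twelve months from January 2015) is hypothetical, and the end date is treated as inclusive, matching the BETWEEN in the question rather than the stricter comparison above:

```sql
-- Sketch: month-end employee counts with month labels as 'YYYYMMDD' strings.
-- Assumptions: start_date/end_date are 'YYYYMMDD' strings, '99991231' marks
-- people who are still employed, and the 2015 range is illustrative only.
with months (month_end) as (
  select to_char(last_day(add_months(date '2015-01-01', level - 1)), 'YYYYMMDD')
  from dual
  connect by level <= 12
)
select m.month_end as month,
       count(e.employee_id) as employees  -- 0 for months with no matches
from months m
left join employees e
  on m.month_end between e.start_date and e.end_date
group by m.month_end
order by m.month_end;
```

Comparing the strings directly works here only because the fixed-width 'YYYYMMDD' format sorts in the same order as the dates it represents.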

answered 2018-07-31T17:15:46.770

If you want to get the employees who were employed at the end of the month, you can use the LAST_DAY function in the WHERE clause of your query, and in the GROUP BY clause as well. Note that the WHERE clause here matches only employees whose start_date is itself the last day of a month. Your query would look like this:

SELECT LAST_DAY(start_date), COUNT(1)
  FROM employees
 WHERE start_date = LAST_DAY(start_date)
 GROUP BY LAST_DAY(start_date)

Or, if you just want to count the employees who started in each month, use the query below:

SELECT LAST_DAY(start_date), COUNT(1)
  FROM employees
 GROUP BY LAST_DAY(start_date)
answered 2018-08-01T05:30:17.423