How to count people in difficult groups by age / gender / etc?

I got the following table Patients

.

HospitalId INT,
GenderId BIT,
Age TINYINT,
DiseaseId SMALLINT

      

GenderId

= 0 is Male

GenderId

= 1 is feminine

HospitalA has HospitalId

0

HospitalB has HospitalId

1

Here's the output I want to generate:

DiseaseId | HospitalA_Male_18-30 | HospitalA_Male_31-40 |
---------------------------------------------------------
0         |   (count here)       |   (count here)       |
1         |   (count here)       |   (count here)       |
2         |   (count here)       |   (count here)       |
3         |   (count here)       |   (count here)       |

      

(columns continue)

HospitalA_Female_18-30 | HospitalA_Female_31-40 |
-------------------------------------------------
    (count here)       |     (count here)       |
    (count here)       |     (count here)       |
    (count here)       |     (count here)       |
    (count here)       |     (count here)       |

      

(columns continue)

HospitalB_Male_18-30 | HospitalB_Male_31-40 |
---------------------------------------------
    (count here)     |     (count here)     |
    (count here)     |     (count here)     |
    (count here)     |     (count here)     |
    (count here)     |     (count here)     |

      

(columns continue)

HospitalB_Female_18-30 | HospitalB_Female_31-40 |
-------------------------------------------------
    (count here)       |     (count here)       |
    (count here)       |     (count here)       |
    (count here)       |     (count here)       |
    (count here)       |     (count here)       |

      

(9 columns in the result set)

So, as you can see, I really need to calculate, for each disease, how many patients have the disease in each specific group (by hospital, by gender and by age).

How can such grouping be done (most efficiently) in T-SQL?

+3


source to share


2 answers


You can do this with a pivot query:

select * from 
(
  select diseaseid, 
         'Hospital'
         + case hospitalid when 0 then 'A' when 1 then 'B' end
         + '_'
         + case genderid when 1 then 'Female' else 'Male' end
         + '_'
         + case when age between 18 and 30 
              then '18-30' 
              else (case when age between 31 and 40 then '31-40' end) 
              end Title,
         1 Cnt
  from Patients
  where age between 18 and 40
) t
pivot (
  count (Cnt) for Title in (
    [HospitalA_Male_18-30],   [HospitalA_Male_31-40],
    [HospitalA_Female_18-30], [HospitalA_Female_31-40],
    [HospitalB_Male_18-30],   [HospitalB_Male_31-40],
    [HospitalB_Female_18-30], [HospitalB_Female_31-40]
  )
) as Q

      

UPDATE

As a development of the above solution, you can also transfer parts of the names from CASE expressions to your own virtual tables and join the table Patients

to them:



;with
hospital (hospitalid, hospitalname) as (
  select 0, 'HospitalA' union all
  select 1, 'HospitalB'
),
gender (genderid, gendername) as (
  select 0, 'Male' union all
  select 1, 'Female'
),
agerange (agefrom, ageto) as (
  select 18, 30 union all
  select 31, 40
)
select * from 
(
  select p.diseaseid, 
         h.hospitalname + '_' + g.gendername + '_'
         + rtrim(a.agefrom) + '-' + rtrim(a.ageto) as Title,
         1 Cnt
  from Patients p
    inner join hospital h on p.hospitalid = h.hospitalid
    inner join gender   g on p.genderid   = g.genderid
    inner join agerange a on p.age between a.agefrom and a.ageto
  where p.age between 18 and 40
) t
pivot (
  count (Cnt) for Title in (
    [HospitalA_Male_18-30],   [HospitalA_Male_31-40],
    [HospitalA_Female_18-30], [HospitalA_Female_31-40],
    [HospitalB_Male_18-30],   [HospitalB_Male_31-40],
    [HospitalB_Female_18-30], [HospitalB_Female_31-40]
  )
) as Q

      

The overhead of adding subqueries and joins is offset by the greater ease of maintenance:

  • part of the data (meta) is separated from the logical part;

  • name part lists are more convenient to expand as needed;

  • the concatenation expression is easier to change if you need to change the format of the target column names.

+3


source


Please, try



SELECT  
    DiseaseId,
    SUM(CASE WHEN HospitalId = 0 AND GenderId=0 AND (Age BETWEEN 18 AND 30)  THEN 1 ELSE 0 END) AS [HospitalA_Male_18-30],
    SUM(CASE WHEN HospitalId = 0 AND GenderId=0 AND (Age BETWEEN 31 AND 40)  THEN 1 ELSE 0 END) AS [HospitalA_Male_31-40],  
    SUM(CASE WHEN HospitalId = 0 AND GenderId=1 AND (Age BETWEEN 18 AND 30)  THEN 1 ELSE 0 END) AS [HospitalA_Female_18-30],    
    ......  
FROM   Patients 
GROUP BY DiseaseId
ORDER BY DiseaseId  

      

+3


source







All Articles