Looking at your SQL statement, I can see why the cte.cntNpi = cte.cntUnique
condition might not be working as expected. Here's the issue:
In your CTE, you're calculating two counts:
- cntNpi - counts records per taxid
- cntUnique - counts records per combination of npi, taxid, and provider details
The condition cte.cntNpi = cte.cntUnique is meant to ensure that, for a given taxid, all records with that taxid have the same npi and provider details. However, your WHERE clause has additional conditions that might be interfering.
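To make the comparison concrete, here is a minimal sketch on made-up rows. The sample values, the sample_rows name, and the trimmed column list are assumptions for illustration only; your real table has more provider columns in the second partition.

```sql
-- Hypothetical sample data: T1 has one provider (two rows), T2 has two providers.
WITH sample_rows AS (
    SELECT *
    FROM (VALUES
        ('1111111111', 'T1', 'SMITH'),
        ('1111111111', 'T1', 'SMITH'),
        ('2222222222', 'T2', 'JONES'),
        ('3333333333', 'T2', 'LEE')
    ) AS v (npi, taxid, providerlastname)
)
SELECT
    npi,
    taxid,
    providerlastname,
    -- rows sharing this taxid
    COUNT(*) OVER (PARTITION BY taxid) AS cntNpi,
    -- rows sharing this exact npi / taxid / name combination
    COUNT(*) OVER (PARTITION BY npi, taxid, providerlastname) AS cntUnique
FROM sample_rows;
-- T1 rows: cntNpi = 2, cntUnique = 2 -> equal, a single provider under the taxid
-- T2 rows: cntNpi = 2, cntUnique = 1 -> not equal, two providers share the taxid
```

Only the T1 rows would pass cntNpi = cntUnique, which is the "solo" case the condition is trying to isolate.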
The main problem is that the EXISTS subquery at the end checks for other records with the same taxid but a different provider name, and that check is separate from your count comparison.
Key changes in the revised query:
- Moved the COALESCE and primarycontractedrelationship conditions into the initial CTE to reduce the dataset early
- Created a second CTE to apply the count comparison and the NOT EXISTS condition
- Then selected from this filtered CTE for the final insert
This makes the logic clearer: the row filters run first, the counts are computed over the filtered rows, and the count comparison is applied as a separate step. Here is the revised query:
DELETE FROM stg.stg_facets_solo;

WITH cte AS (
    SELECT
        *,
        COUNT(*) OVER (PARTITION BY taxid) AS cntNpi,
        COUNT(*) OVER (PARTITION BY npi, taxid, facilitygroupname, providerlastname, providerfirstname, providermi) AS cntUnique
    FROM stg.facets_full
    WHERE COALESCE(providerlastname, '') <> ''
      AND UPPER(primarycontractedrelationship) = 'CLINICIAN'
),
filtered_cte AS (
    SELECT *
    FROM cte
    WHERE cntNpi = cntUnique
      AND NOT EXISTS (
          SELECT 1
          FROM stg.facets_full b
          WHERE b.taxid = cte.taxid
            AND (b.providerlastname = '' OR cte.providerlastname <> b.providerlastname)
      )
)
INSERT INTO stg.stg_facets_solo (load_date, source, hash_diff_status, npi_hk, npi, status)
SELECT DISTINCT
    GETDATE(),
    'FACETS',
    CONVERT(CHAR(32), HASHBYTES('SHA2_256', 'A'), 2),
    CONVERT(CHAR(32), HASHBYTES('SHA2_256', npi), 2),
    npi,
    'A'
FROM filtered_cte;
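One side effect worth checking: within a single query level, WHERE is applied before window functions are evaluated, so with the filters inside the first CTE the counts only see the filtered rows. The sketch below compares both orderings on made-up data. The sample values (including 'FACILITY'), the trimmed partition columns, and the CTE names are assumptions for illustration.

```sql
-- Hypothetical sample data: one taxid shared by a clinician row and a non-clinician row.
WITH sample_rows AS (
    SELECT *
    FROM (VALUES
        ('1111111111', 'T3', 'SMITH',  'CLINICIAN'),
        ('4444444444', 'T3', 'CLINIC', 'FACILITY')
    ) AS v (npi, taxid, providerlastname, primarycontractedrelationship)
),
-- Counts computed over all rows; the filter is applied afterwards.
count_then_filter AS (
    SELECT
        npi, taxid, primarycontractedrelationship,
        COUNT(*) OVER (PARTITION BY taxid) AS cntNpi,
        COUNT(*) OVER (PARTITION BY npi, taxid, providerlastname) AS cntUnique
    FROM sample_rows
),
-- Filter applied first; counts only see the surviving rows.
filter_then_count AS (
    SELECT
        npi, taxid,
        COUNT(*) OVER (PARTITION BY taxid) AS cntNpi,
        COUNT(*) OVER (PARTITION BY npi, taxid, providerlastname) AS cntUnique
    FROM sample_rows
    WHERE UPPER(primarycontractedrelationship) = 'CLINICIAN'
)
SELECT 'count_then_filter' AS variant, npi, taxid, cntNpi, cntUnique
FROM count_then_filter
WHERE UPPER(primarycontractedrelationship) = 'CLINICIAN'
UNION ALL
SELECT 'filter_then_count', npi, taxid, cntNpi, cntUnique
FROM filter_then_count;
-- count_then_filter: cntNpi = 2, cntUnique = 1 (the FACILITY row is still counted)
-- filter_then_count: cntNpi = 1, cntUnique = 1 (only the CLINICIAN row is counted)
```

In the revised query the NOT EXISTS check still reads the unfiltered stg.facets_full, so a taxid like this one would likely still be excluded there. If that is not what you want, the subquery may need the same filters.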