Search for the Closest Matching Record within the Group — From SQL to SPL #9

Problem description & analysis: The table mytable in the MS SQL database has one ConfirmationStarted and multiple Closed statuses for each ID. Task: Now we need to find the record closest to ConfirmationStarted among all the Closed records before ConfirmationStarted in each ID, and retrieve the ID and time fields of the record. Code comparisons: SQL solution: WITH cte AS ( SELECT ID, CreatedAt, NewStatus, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CreatedAt DESC) AS rn FROM mytable WHERE NewStatus = 'Closed' AND CreatedAt < ( SELECT CreatedAt FROM mytable AS sub WHERE sub.ID = mytable.ID AND sub.NewStatus = 'ConfirmationStarted' ) ) SELECT ID, CreatedAt as xdate FROM cte WHERE rn = 1 ORDER BY ID; With cte AS ( SELECT ID, CreatedAt, NewStatus, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CreatedAt DESC) AS rn FROM mytable WHERE NewStatus = 'Closed' AND CreatedAt < ( SELECT CreatedAt FROM mytable AS sub WHERE sub.ID = mytable.ID AND sub.NewStatus = 'ConfirmationStarted' ) ) SELECT ID, CreatedAt as xdate FROM cte WHERE rn = 1 ORDER BY ID; SQL does not have natural sequence numbers, so it needs to generate sequence numbers using window functions first. After SQL grouping, it must aggregate immediately and records within the group cannot be filtered. It can only be solved in a roundabout way by filtering repeatedly using multi-level subqueries. The overall code is a bit cumbersome and difficult to understand. SPL solution: SPL has natural sequence numbers and provides rich position related calculations. SPL grouping can retain subsets after grouping, making it easier to process data within the group.

Mar 12, 2025 - 08:58
 0
Search for the Closest Matching Record within the Group — From SQL to SPL #9

Problem description & analysis:

The table mytable in the MS SQL database has one ConfirmationStarted and multiple Closed statuses for each ID.

source table

Task: Now we need to find the record closest to ConfirmationStarted among all the Closed records before ConfirmationStarted in each ID, and retrieve the ID and time fields of the record.

expected table

Code comparisons:

SQL solution

WITH cte AS (
    SELECT ID, CreatedAt, NewStatus,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CreatedAt DESC) AS rn
    FROM mytable
    WHERE NewStatus = 'Closed'
    AND CreatedAt < (
        SELECT CreatedAt FROM mytable AS sub
        WHERE sub.ID = mytable.ID AND sub.NewStatus = 'ConfirmationStarted'
    )
)
SELECT ID, CreatedAt as xdate
FROM cte
WHERE rn = 1
ORDER BY ID;
With cte AS (
    SELECT ID, CreatedAt, NewStatus,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CreatedAt DESC) AS rn
    FROM mytable
    WHERE NewStatus = 'Closed'
    AND CreatedAt < (
        SELECT CreatedAt FROM mytable AS sub
        WHERE sub.ID = mytable.ID AND sub.NewStatus = 'ConfirmationStarted'
    )
)
SELECT ID, CreatedAt as xdate
FROM cte
WHERE rn = 1
ORDER BY ID;

SQL does not have natural sequence numbers, so it needs to generate sequence numbers using window functions first. After SQL grouping, it must aggregate immediately and records within the group cannot be filtered. It can only be solved in a roundabout way by filtering repeatedly using multi-level subqueries. The overall code is a bit cumbersome and difficult to understand.

SPL solution:

SPL has natural sequence numbers and provides rich position related calculations. SPL grouping can retain subsets after grouping, making it easier to process data within the group.