Skip to content

Commit

Permalink
fix clicks <-> master_campaign join logic.
Browse files Browse the repository at this point in the history
  • Loading branch information
doryokujin committed Feb 19, 2024
1 parent 30864b1 commit 126235f
Show file tree
Hide file tree
Showing 2 changed files with 74 additions and 13 deletions.
17 changes: 10 additions & 7 deletions scenarios/cdp_campaign_management/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -549,21 +549,24 @@ Required options depend on the value of `is_audience_table`.

Specify the source of the master_campaigns table that the user will be uploading. By setting this table, the utm parameter of the clicks table can be retrofitted.

The motivation for setting up this table is when you want to retrofit the utm parameter, for example in the following cases:

- Complement misconfigurations or omissions in the utm parameter of existing campaign links
- Assign `cv_name` to campaigns outside of TD so that they can be measured for effectiveness as well
- Organize utm parameters for past campaigns so that they can be measured for effectiveness
> [!IMPORTANT]
> The goal for setting up this table is when you want to retrofit the `cv_name` or `activation_step_id` ( or `utm_term`, `utm_content`, `utm_connector`). These values can be set for each combination of `utm_source`, `utm_medium`, and `utm_campaign`. SO you can't retrofit the `utm_source`, `utm_medium`, and `utm_campaign`.

The master_campaigns table will be based on the [existing_campaigns](#existing_campaigns) output from this WF, with the utm parameter values of the campaigns you wish to edit or add edited.

However, since the following 3 parameter values are used to match records in the original `clicks` table, it is not possible to link records in the original `clicks` table that do not have these 3 values set:
If the `utm_id` of a record in the `clicks` table is not `NULL`, then
it is joined using a combination of the following 4 parameters:

- utm_id
- utm_source
- utm_medium
- utm_campaign

Basically, the aim of having master_campaign table can be considered to have utm parameters other than these 3 (`cv_name`, `activation_step_id`, etc...) later. And the value of the utm parameter set here takes precedence over the value of the utm parameter in the original `clicks` table.
Otherwise it is joined using a combination of the following 3 parameters:

- utm_source
- utm_medium
- utm_campaign

##### Required columns for table

Expand Down
70 changes: 64 additions & 6 deletions scenarios/cdp_campaign_management/queries/ingest_clicks.sql
Original file line number Diff line number Diff line change
Expand Up @@ -32,26 +32,84 @@ FROM
AND url_extract_parameter(${url_column}, '${td.utm_names.utm_source}') IS NOT NULL
AND TD_TIME_RANGE(${time_column}, ${time_from}, ${time_to})
AND ${filter}
AND url_extract_parameter(${url_column}, '${td.utm_names.utm_id}') IS NULL
) t1
${join_part}
LEFT OUTER JOIN (
SELECT
CAST(activation_step_id AS VARCHAR) AS activation_step_id
,utm_source
utm_source
,utm_medium
,utm_campaign
,cv_name
,MAX_BY(CAST(activation_step_id AS VARCHAR), time) AS activation_step_id
,MAX_BY(cv_name, time) AS cv_name
,MAX_BY(utm_connector, time) AS utm_connector
,MAX_BY(utm_content,time) AS utm_content
,MAX_BY(utm_term, time) AS utm_term
FROM ${campaign_db}.${master_campaigns_table}
WHERE ${td.utm_names.utm_source} IS NOT NULL AND ${td.utm_names.utm_medium} IS NOT NULL AND ${td.utm_names.utm_campaign} IS NOT NULL
GROUP BY 1,2,3,4,5
GROUP BY 1,2,3
) t3
ON (
t1.utm_campaign = t3.utm_campaign
AND t1.utm_medium = t3.utm_medium
AND t1.utm_source = t3.utm_source
)

UNION ALL
SELECT ${distinct}
t1.time
,db_name
,table_name
,${user_column}
,COALESCE(CAST(t3.activation_step_id AS VARCHAR), CAST(t1.activation_step_id AS VARCHAR)) AS activation_step_id
,COALESCE(t3.cv_name, t1.cv_name) AS cv_name
,COALESCE(t3.utm_campaign, t1.utm_campaign) AS utm_campaign
,COALESCE(t3.utm_medium, t1.utm_medium) AS utm_medium
,COALESCE(t3.utm_source, t1.utm_source) AS utm_source
,COALESCE(t3.utm_content, t1.utm_content) AS utm_content
,COALESCE(t3.utm_term, t1.utm_term) AS utm_term
,COALESCE(t3.utm_connector, t1.utm_connector) AS utm_connector
FROM
(
SELECT
${time_column} AS time
,'${input_db}' AS db_name
,'${input_table}' AS table_name
,${user_column_inner}
,url_extract_parameter(${url_column}, '${td.utm_names.utm_id}') as activation_step_id
,url_extract_parameter(${url_column}, '${td.utm_names.utm_campaign}') as utm_campaign
,url_extract_parameter(${url_column}, '${td.utm_names.utm_medium}') as utm_medium
,url_extract_parameter(${url_column}, '${td.utm_names.utm_source}') as utm_source
,url_extract_parameter(${url_column}, '${td.utm_names.utm_content}') as utm_content
,url_extract_parameter(${url_column}, '${td.utm_names.utm_term}') as utm_term
,url_extract_parameter(${url_column}, '${td.utm_names.utm_connector}') as utm_connector
,url_extract_parameter(${url_column}, '${td.utm_names.utm_cv}') as cv_name
FROM ${input_db}.${input_table}
WHERE url_extract_parameter(${url_column}, '${td.utm_names.utm_campaign}') IS NOT NULL
AND url_extract_parameter(${url_column}, '${td.utm_names.utm_medium}') IS NOT NULL
AND url_extract_parameter(${url_column}, '${td.utm_names.utm_source}') IS NOT NULL
AND TD_TIME_RANGE(${time_column}, ${time_from}, ${time_to})
AND ${filter}
AND url_extract_parameter(${url_column}, '${td.utm_names.utm_id}') IS NOT NULL
) t1
${join_part}
LEFT OUTER JOIN (
SELECT
utm_source
,utm_medium
,utm_campaign
,CAST(activation_step_id AS VARCHAR) AS activation_step_id
,MAX_BY(cv_name, time) AS cv_name
,MAX_BY(utm_connector, time) AS utm_connector
,MAX_BY(utm_content,time) AS utm_content
,MAX_BY(utm_term, time) AS utm_term
FROM ${campaign_db}.${master_campaigns_table}
WHERE ${td.utm_names.utm_source} IS NOT NULL AND ${td.utm_names.utm_medium} IS NOT NULL AND ${td.utm_names.utm_campaign} IS NOT NULL
GROUP BY 1,2,3,4
) t3
ON (
t1.utm_campaign = t3.utm_campaign
AND t1.utm_medium = t3.utm_medium
AND t1.utm_source = t3.utm_source
AND t1.cv_name = t3.cv_name
AND t1.activation_step_id = t3.activation_step_id
)
)

0 comments on commit 126235f

Please sign in to comment.