Assume that we have typical session log from the website with cookies and user id (in case he/she provided credentials):
cookie id timestamp af312b452c . 31-03-11 1:44 PM af312b452c . 31-03-11 1:46 PM af312b452c . 31-03-11 1:48 PM af312b452c . 31-03-11 1:51 PM af312b452c . 31-03-11 1:53 PM af312b452c . 31-03-11 1:56 PM af312b452c 1 31-03-11 1:58 PM af312b452c 1 31-03-11 2:01 PM
31-03-11 1:58 PM - user logged-in, so since then we are tracking his/her activity alongside with user_id.
I want to enrich such data by filling missing fields backwards to have complete information what particular user has seen on the webpage:
cookie id timestamp af312b452c 1 31-03-11 1:44 PM af312b452c 1 31-03-11 1:46 PM af312b452c 1 31-03-11 1:48 PM af312b452c 1 31-03-11 1:51 PM af312b452c 1 31-03-11 1:53 PM af312b452c 1 31-03-11 1:56 PM af312b452c 1 31-03-11 1:58 PM af312b452c 1 31-03-11 1:58 PM
What will be the best way to do it in SQL?
- whole table has many rows,
- users can have multiple cookies.
- a particular cookie value may have been used for a different user at some point in the past
Thank you all in advance.
I would start by creating a function that would get me the most recent logged in ID of a cookie given that cookie and the timestamp you logged that cookie (psuedocode off the top of my head, apologies if something is off):
CREATE FUNCTION getFutureLogin(in_cookie VARCHAR, in_timestamp DATETIME) RETURNS INT AS
DECLARE logged_in_id IN
SELECT logged_in_id = TOP 1 id
WHERE cookie = in_cookie
AND timestamp > in_timestamp
AND id IS NOT NULL
ORDER BY timestamp ASC
From there, it's a simple update that you'd probably schedule a job to do sometime.
UPDATE T_NAME SET id = getFutureLogin(cookie, timestamp) WHERE id IS NULL
May not be the most straightforward or fastest way, but it would work.