Add the option to regenerate the page view ID with each page view event #436

Closed
bogaert opened this issue Dec 2, 2015 · 18 comments
Labels
category:browser About the browser-specific code. priority:high To fix as soon as possible. type:enhancement New features or improvements to existing features.

@bogaert

bogaert commented Dec 2, 2015

Related to this topic: https://groups.google.com/forum/#!topic/snowplow-user/VdMVqBdoXPE

The page view ID is fixed when the JavaScript Tracker is loaded on a page. It only changes when the page is refreshed or a new page is loaded.

It'd be good to add the option to regenerate the page view ID for single page web apps and infinite scrolls.

@fblundun fblundun self-assigned this Dec 2, 2015
@fblundun fblundun added type:enhancement New features or improvements to existing features. category:browser About the browser-specific code. labels Dec 2, 2015
@fblundun
Contributor

fblundun commented Dec 2, 2015

The API call would probably look something like

snowplow("regeneratePageViewId");

@alexanderdean
Member

I'm hazy on why we would not want to generate a new page view ID each time if the user hits trackPageView multiple times on the same page. Is there a use case for multiple page views (potentially with different URIs thanks to setCustomUrl) having the same web_page ID?

@bogaert
Author

bogaert commented Dec 2, 2015

I can imagine having an infinite scroll where the page view event is used to track what content is shown, but you still want to roll it up into a single page view.

@alexanderdean
Member

Thanks @bogaert - the use case makes sense. I think part of the confusion here is that we are referring to this context in this ticket as a "page view ID", but the actual schema is more open-ended than this:

com.snowplowanalytics.snowplow/web_page/jsonschema/1-0-0

So I'd suggest the option to regenerate should be very explicitly named to prevent confusion. Something like:

snowplow("newWebPageForVirtualPageViews", true);
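For readers unfamiliar with the web_page context, here is a minimal sketch of the entity being discussed. It assumes the schema carries a single `id` field and that the entity is wrapped in the usual self-describing `schema`/`data` envelope; the helper name `webPageContext` is hypothetical and not part of the tracker API.

```javascript
// Schema URI as named in the comment above, with the iglu: prefix assumed.
const WEB_PAGE_SCHEMA =
  "iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-0-0";

// Hypothetical helper: wraps an ID in a self-describing context entity.
function webPageContext(id) {
  return { schema: WEB_PAGE_SCHEMA, data: { id: id } };
}

// Every event fired while this ID is current would carry the same entity,
// which is what lets events be rolled up per "web page" downstream.
const exampleContext = webPageContext("00000000-0000-4000-8000-000000000000");
console.log(JSON.stringify(exampleContext));
```

Because the schema speaks of a "web page" rather than a "page view", whether the ID changes on each trackPageView call is a tracker decision, not something the schema dictates.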

@fblundun
Contributor

fblundun commented Dec 2, 2015

What does the true argument do?

@alexanderdean
Member

Nothing - we can remove!

@bogaert
Author

bogaert commented May 11, 2016

This has come up a couple more times – it'd be good if this were included in the next major release.

@fblundun fblundun added this to the Version 2.7.0 milestone May 11, 2016
@alexanderdean
Member

See #500

@alexanderdean
Member

Before implementing this ticket we need to decide if the ID should always be reset per @yalisassoon's #508, or if it should be a disable-able option...

@bogaert
Author

bogaert commented Sep 4, 2016

I would make it an option, enabled by default.
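A rough sketch of what "an option, enabled by default" could mean for the two use cases raised earlier (single page apps versus an infinite scroll rolled up into one page view). Everything here is an assumption for illustration: the option name `regenerateWebPageId`, the `createTracker` function, and the counter-based IDs standing in for UUIDs are all hypothetical.

```javascript
// Hypothetical tracker factory; the option defaults to true as suggested above.
function createTracker({ regenerateWebPageId = true } = {}) {
  let counter = 1;
  let webPageId = "page-" + counter; // stand-in for a UUID set at page load
  let firstPageView = true;

  return {
    trackPageView(url) {
      // Keep the load-time ID for the first page view, then rotate it
      // on later page views only if the option is enabled.
      if (!firstPageView && regenerateWebPageId) {
        counter += 1;
        webPageId = "page-" + counter;
      }
      firstPageView = false;
      return { event: "page_view", url: url, webPageId: webPageId };
    },
  };
}

// Single page app: each virtual page view gets its own ID.
const spa = createTracker();
const spaIds = [spa.trackPageView("/inbox"), spa.trackPageView("/settings")]
  .map((e) => e.webPageId);

// Infinite scroll: option disabled, so all page views share one ID.
const feed = createTracker({ regenerateWebPageId: false });
const feedIds = [feed.trackPageView("/feed?item=1"), feed.trackPageView("/feed?item=2")]
  .map((e) => e.webPageId);

console.log(spaIds, feedIds);
```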

@yalisassoon yalisassoon added the priority:high To fix as soon as possible. label Sep 22, 2016
@ryanrozich

One comment here: rather than having an ID that normalizes either to the page load or to a virtual page view, it might be useful to do both by having IDs for two different contexts, one for the page load and the other for the page view (virtual or otherwise).

The single page app use case makes sense: there you always want to normalize events to an ID based on the virtual page view. But there are other cases where virtual page views get fired outside of single page apps.

For instance, take a content site whose pages are subdivided into smaller content segments, such as 'the top 50 movies on netflix right now'. This is a really long page; it loads once but has a writeup on 50 different movies. I may want to measure the amount of time spent on (or clicks from) the overall page 'top 50 movies..', and also the amount of time spent on (or clicks from) each of the sub-sections, for example which movie writeups people stop to read for a significant amount of time rather than just skimming past.

(this is a real use case by the way, we have a client that has content just like this)

In this use case, if I had to choose between the pageview ID (that's stored in Redshift in atomic.com_snowplowanalytics_snowplow_web_page_1) being based on either the page load or the virtual page view, I'd have to decide up front which use case is more important to me, rather than having the flexibility to query in different ways depending on the question I'm trying to answer.
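The two-ID idea above could be sketched roughly as follows. This is only an illustration of the proposal, not anything the tracker implements: `makeDualIdTracker`, `pageLoadId`, `pageViewId`, and the injected `newId` generator are all hypothetical names, and the counter-based IDs stand in for UUIDs.

```javascript
// Hypothetical tracker that keeps one ID per physical page load and a
// separate ID per page view (virtual or otherwise).
function makeDualIdTracker(newId) {
  const pageLoadId = newId(); // fixed for the lifetime of the page
  let pageViewId = null;      // rotated on every trackPageView call

  function stamp(name) {
    return { name: name, pageLoadId: pageLoadId, pageViewId: pageViewId };
  }

  return {
    trackPageView(title) {
      pageViewId = newId();
      return stamp("page_view:" + title);
    },
    trackEvent(name) {
      return stamp(name);
    },
  };
}

// Deterministic ID generator for the example.
let n = 0;
const tracker = makeDualIdTracker(() => "id-" + (++n));

const events = [
  tracker.trackPageView("top 50 movies"), // whole page
  tracker.trackPageView("movie 1"),       // sub-section as a virtual page view
  tracker.trackEvent("link_click"),
  tracker.trackPageView("movie 2"),
  tracker.trackEvent("link_click"),
];
console.log(events);
```

With both IDs attached, events can be grouped by `pageLoadId` to analyze the overall page or by `pageViewId` to analyze each sub-section, without choosing one up front.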

@chuwy
Contributor

chuwy commented Nov 2, 2016

Thanks @ryanrozich! That makes sense.

@yalisassoon @bogaert what do you think about it? I think a "physical" page load can have value as well as a "virtual" page reload, and we probably don't want to lose it.

By "physical page load" I understand a GET request for /someUri which initializes the tracker code, whereas a "virtual page load" is application-defined: it can be a tab switch which doesn't perform GET /someUri, but rather something like an Ajax request for /api/data/forSomeUri.

@yalisassoon
Member

Hi @ryanrozich - in your example it feels to me like you'd want a subsection context that's sent when a particular div on a page is in view?

For me a page view should be defined based on the user experience - if the whole page changes and the URL changes with it, then it makes sense to track that as a page view - whether it's a virtual page view on a single page webapp or a new page physically loaded in the browser.

I'm struggling to imagine when I'd want to distinguish physical from virtual page views: the only time I can imagine that being useful is if I'm checking page load performance. (I'd expect performance to differ depending on whether the page loads are virtual or real.) But I suspect I'm not being imaginative enough?


@bogaert
Author

bogaert commented Nov 3, 2016

Thanks for the comments @ryanrozich - we appreciate all input!

It's an interesting discussion. I agree with @yalisassoon that we shouldn't overload the page view:

"For me a page view should be defined based on the user experience - if the whole page changes and the URL changes with it, then it makes sense to track that as a page view - whether it's a virtual page view on a single page webapp or a new page physically loaded in the browser."

To track what content a user has “seen” (i.e. what divs have scrolled into view), I'd create a separate event, perhaps with a second context. I think there's a ticket for that, but can't seem to find it - do you know where it is @yalisassoon?

@alexanderdean
Member

It's more about the mechanics than the schema design but there is a content viewability ticket, Add item in-view events #98

@yalisassoon
Member

Hi @chuwy I've been testing this and it's not working as I'd expect.

Let's take a specific sequence of track methods firing on a single web page with the web page context enabled

  1. trackPageView
  2. trackSelfDescribingEvent
  3. trackStructEvent
  4. trackSelfDescribingEvent
  5. trackPageView
  6. trackSelfDescribingEvent
  7. trackSelfDescribingEvent

I'd expect to get the same web_page ID for the first 4 events, and then a second web_page ID for events 5-7.

Instead, I get one web_page ID for the first event.

Another web page ID for events 2-5.

A third web page ID for events 6-7.

So it looks like the ID is regenerating after the trackPageView event is called, rather than before.

Can you please double check?

Thanks!
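The difference between regenerating the ID after versus before trackPageView can be reproduced with a small model. This is a hypothetical mini-tracker, not the Snowplow implementation; `makeTracker`, the `"before"`/`"after"` modes, and the counter-based IDs (standing in for UUIDs) are assumptions made for illustration. It also assumes the first page view keeps the ID assigned at page load.

```javascript
// Hypothetical model of the ordering issue described above.
function makeTracker(mode) {
  let counter = 1;
  let webPageId = "id-" + counter; // assigned when the tracker loads
  let pageViewsSeen = 0;
  const sent = [];

  function rotate() {
    counter += 1;
    webPageId = "id-" + counter;
  }

  function send(name) {
    sent.push({ name: name, webPageId: webPageId });
  }

  return {
    trackPageView() {
      if (mode === "before") {
        // Expected: rotate before sending, except on the first page view.
        if (pageViewsSeen > 0) rotate();
        send("page_view");
      } else {
        // Observed: the page view goes out with the old ID, then it rotates.
        send("page_view");
        rotate();
      }
      pageViewsSeen += 1;
    },
    trackOther(name) {
      send(name);
    },
    ids() {
      return sent.map((e) => e.webPageId);
    },
  };
}

// Replays the seven-event sequence from the list above.
function runSequence(mode) {
  const t = makeTracker(mode);
  t.trackPageView();               // 1
  t.trackOther("self_describing"); // 2
  t.trackOther("struct");          // 3
  t.trackOther("self_describing"); // 4
  t.trackPageView();               // 5
  t.trackOther("self_describing"); // 6
  t.trackOther("self_describing"); // 7
  return t.ids();
}

console.log(runSequence("after"));  // one ID for event 1, another for 2-5, a third for 6-7
console.log(runSequence("before")); // one ID for events 1-4, a second for 5-7
```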

@chuwy
Contributor

chuwy commented Dec 26, 2016

Thanks for the detailed info, @yalisassoon. I think you're right. This should be fixed in rc7.

@yalisassoon
Member

Thanks @chuwy - the fix is working great!


6 participants