
Talk:Analytics/Kraken/Data Formats

From mediawiki.org

Otto's Comments


Avro Schema


uid

     { "name": "uid",

Is there a reason we are calling this uid rather than user_id? Do we intend for this to mean 'unique id' rather than 'user id'?

ip

 "name": "ip", "type": ["int", "string"],

How does this union work? We could store IPv6 addresses as 4 ints (an Avro int is 32 bits, and an IPv6 address is 128 bits).
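To make the "4 ints" idea concrete, here is a minimal sketch in Python using only the standard library. The function names `ipv6_to_ints` and `ints_to_ipv6` are made up for illustration and are not part of any Kraken code; the sketch assumes the ints are signed 32-bit values in network byte order, since an Avro int is a signed 32-bit integer.

```python
import ipaddress
import struct


def ipv6_to_ints(addr):
    """Split a 128-bit IPv6 address into four signed 32-bit ints.

    Hypothetical helper: signed values are used because an Avro "int"
    is a signed 32-bit integer.
    """
    packed = ipaddress.IPv6Address(addr).packed  # 16 bytes, network byte order
    return struct.unpack("!4i", packed)


def ints_to_ipv6(parts):
    """Rebuild the compressed textual IPv6 address from four signed 32-bit ints."""
    return str(ipaddress.IPv6Address(struct.pack("!4i", *parts)))


parts = ipv6_to_ints("2001:db8::1")
print(parts)
print(ints_to_ipv6(parts))
```

Note that words with the high bit set come out negative (for example, the first word of `ffff::`), so anything that sorts or compares on these ints would need to account for that.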

carrier

 { "name": "carrier", "type": ["null", "string"], "default":null, "order": "ignore",
           "doc": "Mobile carrier for Zero project; from X-CARRIER header" },

I'm cool with this being stored, but let's not make our schema reference a single product/project. Can we remove the mention of 'Zero project' from the doc, and maybe also rename this to 'mobile_carrier' so it is clear what it means?
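As a rough sketch of what the renamed field could look like, here it is built as a Python dict and dumped to JSON. The new name and the trimmed doc string are only the proposal above, not an agreed change. The `aliases` entry is an extra assumption on my part: Avro lets a field list alternate names, which might let readers using the new name still resolve data written with the old `carrier` name.

```python
import json

# Proposed replacement for the "carrier" field (illustrative only).
mobile_carrier_field = {
    "name": "mobile_carrier",
    "type": ["null", "string"],
    "default": None,  # serialized as JSON null
    "order": "ignore",
    "doc": "Mobile carrier; from X-CARRIER header",
    "aliases": ["carrier"],  # assumption: keep the old name resolvable
}

field_json = json.dumps(mobile_carrier_field, indent=2)
print(field_json)
```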

method

 { "name": "method", "type":

Can we call this either 'http_method' or 'request_method'? 'request_method' might be better, since we are calling the status 'response_status'.

response_mime

 { "name": "response_mime",

This is an inaccurate name. MIME is an email standard, and the MIME type is a field taken from that standard. The HTTP header that this actually comes from is called 'Content-Type', so we should probably name this accordingly. Perhaps 'response_content_type'?
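For illustration, a Content-Type header value usually carries a media type plus optional parameters such as charset, which is part of why the header name is the more accurate label. Below is a minimal Python sketch of splitting such a value. `split_content_type` is a hypothetical helper, and the parsing is deliberately naive: it assumes no quoted parameter value contains a semicolon.

```python
def split_content_type(value):
    """Split a Content-Type header value into (media_type, params).

    Naive sketch: splits on ';' and '=' and does not handle quoted
    strings that contain those characters.
    """
    media_type, *raw_params = [part.strip() for part in value.split(";")]
    params = {}
    for raw in raw_params:
        if "=" in raw:
            key, _, val = raw.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
    return media_type.lower(), params


print(split_content_type("text/html; charset=UTF-8"))
```

Whether the stored field would hold the full header value or only the media type part is a separate question that the name alone does not settle.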

visit_id, pageload_id


Hmm, I'm not sure I like the names of these either. Do events always correspond to page loads / groups of page visits?