Talk:Analytics/Kraken/Data Formats
Otto's Comments
Avro Schema
uid
{ "name": "uid",
Is there a reason we are calling this uid rather than user_id? Do we intend for this to mean 'unique id' rather than 'user id'?
ip
"name": "ip", "type": ["int", "string"],
How does this work? We could store IPv6 addresses as 4 ints.
carrier
{ "name": "carrier", "type": ["null", "string"], "default":null, "order": "ignore", "doc": "Mobile carrier for Zero project; from X-CARRIER header" },
I'm cool with this being stored, but let's not make our schema reference a single product/project. Can we remove the mention of 'Zero project' from the doc string, and also maybe rename this to 'mobile_carrier' so it is clear what it means?
method
{ "name": "method", "type":
Can we call this either 'http_method' or 'request_method'? 'request_method' might be better, since we are calling the status 'response_status'.
response_mime
{ "name": "response_mime",
This is an inaccurate name. MIME is originally an email standard, and the MIME type is a field defined by that standard. The HTTP header that this value actually comes from is called 'Content-Type', so we should probably name the field accordingly. Perhaps 'response_content_type'?
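To illustrate the distinction, here is a small sketch using Python's standard email package to pick apart a Content-Type header value. The sample header value and the helper name split_content_type are made up for illustration; whether the schema should keep the full header or only the media type is still an open question.

```python
from email.message import Message


def split_content_type(header_value):
    """Return (media_type, charset) parsed from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() returns the lowercased type/subtype without parameters
    media_type = msg.get_content_type()
    # get_param() returns None when the parameter is absent
    charset = msg.get_param("charset")
    return media_type, charset


if __name__ == "__main__":
    print(split_content_type("text/html; charset=UTF-8"))
```

The header carries the media type plus optional parameters such as charset, which is one more reason a name based on 'Content-Type' describes the stored value better than 'mime'.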
visit_id, pageload_id
Hmm, I'm not sure I like the names of these either. Do events always correspond to page loads / groups of page visits?