When the Bootstrap Breaks - ODSC 2019

I'm excited to announce that I'll be presenting at the Open Data Science Conference in Boston next week. My colleague Saptarshi and I will be talking about When the Bootstrap Breaks.

I've included the abstract below, but the high-level goal of this talk is to strip some ...

Slow to respond through 2018

I'm working on an urgent, high-priority request for the next few weeks. To make sure I can finish this work in 2018, I'm limiting my meetings and communications for the remainder of the year.

Slack is good for getting my immediate attention, but if your request ...

If you can't do it in a day, you can't do it

I was talking with Mark Reid about some of the problems with Coding in a GUI. He nailed part of the problem with a soundbite too good not to share:

"If you can't do it in a day, you can't do it."

This is a persistent problem with tools ...

Planning Data Science is hard: EDA

Data science is weird. It looks a lot like software engineering but in practice the two are very different. I've been trying to pin down where these differences come from.

Michael Kaminsky hit on a couple of key points in his series on Agile Management for Data Science on ...

You can't do data science in a GUI

I came across You can't do data science in a GUI by Hadley Wickham a little while ago. He hits on a lot of the same problems I mentioned in Don't make me code in your text box. Take a look if you have some time. In the ...

Why bootstrap?

Over the next few quarters, I'm going to focus my attention on Mozilla's experimentation platform. One of the first questions we need to answer is how we're going to calculate and report the necessary measures of variance. Any experimentation platform needs to be able to compare metrics ...

SQL Style Guide

I'm happy to announce that we now have a SQL style guide. Check it out!

If you have any suggestions, feel free to file a PR or issue in the docs repository.

Many thanks to all who participated in the St. Mocli conversation and @mreid for the review!

PSA: Don't use approximate counts for trends

I got caught giving some bad advice this week, so I decided to share here as penance. TL;DR: Probabilistic counts are great, but they shouldn't be used everywhere.

Counting stuff is hard. We use probabilistic algorithms pretty frequently at Mozilla. For example, when trying to get user counts ...

Don't make me code in your text box!

Whenever I start a new data project, my first step is rooting out any false assumptions I have about the data.

The key here is iterating quickly. My workflow looks like this: Code a little, plot the data, what do you see? Ah, outliers. Code a little, plot the data ...

The 5 Stages of Experiment Analysis

I've been thinking about experimentation a lot recently. Our team is spending a lot of effort trying to make Firefox experimentation feel easy. But what happens after the experiment's been run? There's not a clear process for taking experimental data and turning it into a decision.

I ...

© Ryan T. Harter. Built using Pelican. Theme by Giulio Fidente on github.