00:03:09
I'll start early on at the Times. I joined in 2013 and arrived at a time when the New York Times was navigating a shift away from being predominantly an advertising and print business, into the early stages of becoming the digital subscription business that it is today.
And today it's the leading digital news subscription business on the planet, so it's been a significant shift away from a history of print and advertising products. I came in with a background, as you mentioned Memetrics at the start, in experimentation software and running large-scale experiments online, and I identified early on that the Times couldn't run these types of experiments or even analyze user-level data. And when you're driving a subscription business and you have mechanisms like a paywall, a balance between what's freely available to visitors to the New York Times and what they need to pay for, you need to do a lot of experimentation to drive that. We had a legacy data stack at the time: a lot of Oracle, some on-premise Hadoop we were playing around with, and a lot of vendor point solutions for product analytics. None of these allowed us to connect the data and provide a unified view of the customer, and they weren't very product-centric. They were designed to support an ad sales business, so we really didn't have a lot of usable user data, and we weren't interacting with the newsroom, which is largely the driving force behind the subscription business. It was then, around 2015 or 2016, that I was one of the leaders of the move to the Google Cloud platform.
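The kind of paywall experimentation described here can be sketched as deterministic A/B bucketing over meter limits. This is a minimal illustration, not the Times' actual system; the variant names and free-article quotas are hypothetical:

```python
import hashlib

# Hypothetical meter test: each variant allows a different number of
# free articles per month before the paywall appears.
VARIANTS = {"control": 10, "tighter_meter": 5, "looser_meter": 20}

def assign_variant(user_id: str, experiment: str = "meter-test-1") -> str:
    """Deterministically bucket a user into a variant by hashing, so the
    same visitor always sees the same meter limit across sessions."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(VARIANTS)
    return sorted(VARIANTS)[bucket]

def is_paywalled(user_id: str, articles_read_this_month: int) -> bool:
    """A visitor hits the paywall once they reach their variant's quota."""
    limit = VARIANTS[assign_variant(user_id)]
    return articles_read_this_month >= limit
```

With assignment stable per user, conversion rates can then be compared across variants to tune where the free/paid balance sits.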
And obviously BigQuery is the central piece of that, but also the streaming and batch capabilities around Google and some of the tools there. This was a huge unlock for us in modernizing our stack, throwing away a lot of that legacy technology, and centralizing around a cloud data warehouse that put different domains of data effectively a join away, and let us take on analytics and data science projects for the first time that started to unlock value for the business. Building on that cloud warehouse foundation, we were able to do much stronger experimentation for the company across product analytics and marketing teams. We were able to move data into the newsroom. I had an early experience where an investigative journalist from the newsroom worked with me to produce an insight report on the Times audience and coverage, which we then toured around the newsroom. That was a foray into building analytics and data science in the newsroom.
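The "a join away" idea, once-siloed domains combinable in a single warehouse query, can be sketched with an in-memory database standing in for the warehouse. The table and column names here are illustrative assumptions, not the Times' actual schema:

```python
import sqlite3

# In-memory stand-in for a cloud warehouse: two data domains that were
# previously siloed in separate systems, now queryable together.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscriptions (user_id TEXT, plan TEXT);
    CREATE TABLE article_events (user_id TEXT, article_id TEXT);
    INSERT INTO subscriptions VALUES ('u1', 'digital'), ('u2', 'print');
    INSERT INTO article_events VALUES ('u1', 'a1'), ('u1', 'a2'), ('u2', 'a3');
""")

# Engagement by plan: a cross-domain question that vendor point
# solutions couldn't answer without manual data movement.
rows = conn.execute("""
    SELECT s.plan, COUNT(*) AS reads
    FROM subscriptions s
    JOIN article_events e ON e.user_id = s.user_id
    GROUP BY s.plan
    ORDER BY s.plan
""").fetchall()
```

The point is structural: once domains share one warehouse, a cross-domain question is one query rather than an integration project.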
And we could invest in first-party, ML-driven data products that drove both the way we approached subscription marketing and how we built segmented, tailored ad products for our advertisers. And then finally, operating with the newsroom, we were able to apply machine learning in ways that started to scale our journalism and draw on judgment from the newsroom.
And build a product experience that was much more tailored to the user. I'd say throughout this journey, it felt like a bit of a high-wire act to maintain quality and trust in data. As you've probably heard from many people you've spoken to, as a data leader, whether it's with the executive suite, with your consumers, or with internal users of data, sometimes one false move and you lose a lot of trust, and then you have to slowly build that back with internal and external customers. For me, some of these challenges were complex commerce data that was transformed into financial reporting data, which had to be accurate to the penny but also available for operational analytics. We had high-velocity, high-volume event data used in machine learning that had to be highly available, on time, and fresh.
And we had content data: the newsroom at the New York Times manually tags articles with all this rich metadata, and that becomes part of how we feed algorithmic recommendations and other machine learning. All in all, these data environments evolve much faster than you can keep up with manual quality checks. We had looked into and used some solutions that tested for data quality, but you're tackling this small percentage of what you know at the time, and the data is evolving around you at a pace that requires a different solution. And so at that time, I was focused on how we establish quality and trust across the breadth and the depth of Times data.
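The contrast between hand-written checks and monitoring that scales with the data can be sketched as a volume-anomaly detector that learns its baseline from history instead of relying on fixed thresholds. This is a simplified illustration, not Monte Carlo's or the Times' actual implementation; the z-score threshold is an assumption:

```python
import statistics

def volume_anomaly(daily_row_counts: list[int], z_threshold: float = 3.0) -> bool:
    """Flag the most recent day's row count if it deviates from the
    historical mean by more than z_threshold standard deviations."""
    history, latest = daily_row_counts[:-1], daily_row_counts[-1]
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # No historical variance: any change at all is suspicious.
        return latest != mean
    return abs(latest - mean) / stdev > z_threshold
```

Because the baseline is computed from the data itself, a check like this catches "unknown unknowns", such as a pipeline that silently starts dropping rows, without anyone having written a rule for that specific failure.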
And so I launched an initiative and we brought Monte Carlo in as a partner; this is probably two and a half years ago now. What we were really looking at was how we identify more of those unknown issues, how we get a solution that scales to our environment, and how we gain the ability to look upstream and downstream to resolve data issues. I was struck by Monte Carlo's team and their ability to deliver on that promise. How I got to Monte Carlo was really that, after leaving the Times and taking a nine-month break to do some travel, I started talking to Barr about ways I could come into Monte Carlo to support customers on that same journey I had taken at the Times.
And I guess the rest is kind of recent history.