Black Friday is a great time of the year for businesses that sell online!
Until it’s not. Until your shopping cart breaks…in the middle of the night due to some sneaky bug.
Here is the tale of John, who lived through an ecommerce horror story. I have heard so many similar stories by now that it’s crazy.
Storytime!
John is in charge of marketing for an ecommerce store. It’s 12:01 AM. Black Friday sales are online.
John is confident that everything will work, as it always has since he joined the company a couple of years ago.
After browsing the site a bit and adding a discounted item to his shopping cart, he concludes that everything works. He goes to bed so he will be in good shape for the crazy day to come.
3:00 AM. Things go south. A silent bug starts preventing people from buying. No one receives an alert. The customer support team is asleep too, while the helpdesk software quickly fills up with requests for help.
John and all his coworkers are sleeping.
6:00 AM. John wakes up and looks at the sales so far. What? Only that? There must be a problem. Wait, no sales since 3:00 AM! That’s terrible!
John calls the CTO, who in turn calls some more people. The dev team struggles to understand what the issue is. They end up finding it after 20 minutes or so. They fix it. It’s about 6:40 AM.
Over 3 hours of lost sales during the first few hours of Black Friday. John runs some numbers and finds out that they lost about $80,000 in sales. Luckily, it was during the night, but still. That’s an expensive bug.
John starts to ask himself and the CTO what they could have done differently.
They decide to add a few layers of monitoring. One is adding better monitoring of the infrastructure and the code.
John is mostly satisfied, but wants to add one more layer. What John needs is confirmation that sales are coming in, non-stop.
He could monitor the payment gateway, but payment gateways typically don’t have automated monitoring built in, or even real-time data. At least, not the one they use.
What’s next? They already have Uptime Robot (it could have been Pingdom, too) set up to test whether some key pages are working. But it doesn’t test the whole checkout, and that’s the core thing that needs to work.
They could write a process that poses as a customer and buys something every hour or every few minutes. That said, it would either cost some money on every run or require hacks that are likely to make things more brittle.
What else? They use Google Analytics and have transactions, events, and goals in there. Some of that data is available in real time, but not the transactions themselves. What if they monitored completed checkout events instead?
John is convinced that’s a crucial thing to monitor. After all, he doesn’t want to lose tens of thousands of dollars ever again.
What are the options for John and his team?
It depends on the layers you want to monitor. Here are a few options that I can recommend.
The first thing to monitor is errors in your production environment. I highly suggest using a tool like Honeybadger or Bugsnag that will notify the developers when there is an error. That said, some bugs will not raise errors per se. We call those silent bugs or silent errors.
You should also make sure your dev team has automated tests running before every push to production (i.e., a new version of your site). The tools to use for that vary a lot based on what your site is built with. One tool that works with any technology is Rainforest QA.
Most of those tools are for your dev team. What if you want an additional layer that listens for the silence of having no successful payments? Ideally, you would get an alert whenever your payment gateway processes less than $X per hour, or something like that. While I could not find such a tool, you could probably get one built by your team.
A good workaround is to monitor your online conversions or conversion rate with Google Analytics. Some of its data is available in real time, and we can use that to make sure that at least X conversions happened among the active users, for example, or to check how many conversions happened in the last hour. For that, you can use us, Metrics Watch. That’s typically the closest you can get to real-time monitoring of the silence caused by a lack of purchases.
The lowest-hanging fruit is typically to start with an error monitoring tool and Metrics Watch. Metrics Watch can be set up in minutes by just connecting your Google Analytics account, and error monitoring tools can also be deployed in very little time. Everything else will require more of your time but is worth doing, too.
Don’t be the next John: start using some or all of these tools.
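If your team does end up building its own "listen for silence" check, the core logic can be very small. Here is a minimal sketch in Python, not a finished tool. It assumes you can already get the timestamps of recent completed orders from somewhere (your database, your payment gateway, an analytics export). The function names, the one-hour window, the one-order minimum, and the dates in the example are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def count_recent_orders(order_times, now, window):
    """Count orders whose timestamp falls in the window (now - window, now]."""
    cutoff = now - window
    return sum(1 for t in order_times if cutoff < t <= now)


def sales_are_silent(order_times, now, window=timedelta(hours=1), min_orders=1):
    """Return True when fewer than `min_orders` orders arrived within `window`.

    `order_times` is any iterable of datetimes for completed orders.
    Where they come from is up to you (hypothetical data source).
    """
    return count_recent_orders(order_times, now, window) < min_orders


# Illustrative data only: the last two orders before a 3:00 AM outage.
# The date is arbitrary.
orders = [
    datetime(2024, 11, 29, 2, 30, tzinfo=timezone.utc),
    datetime(2024, 11, 29, 2, 55, tzinfo=timezone.utc),
]

# A check at 3:15 AM still sees recent orders; a check at 4:00 AM does not.
check_at_315 = sales_are_silent(orders, datetime(2024, 11, 29, 3, 15, tzinfo=timezone.utc))
check_at_400 = sales_are_silent(orders, datetime(2024, 11, 29, 4, 0, tzinfo=timezone.utc))
```

You would run a check like this on a schedule and send an email, text, or chat alert whenever it returns True. Pick the window and the minimum based on your normal traffic: a store that sells every few minutes can use a much shorter window than one with a handful of orders per day, and too tight a threshold will wake people up for nothing on a quiet night.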
"I am using a hosted solution, like Shopify. Does this apply to me?"
It depends™. Take the Shopify example: it’s a solid business that hosts a LOT of ecommerce stores and has been doing so very reliably for a long time. That said, if you use an app from their marketplace, you might want to monitor your online store sales. While I assume most third-party Shopify apps are solid, they might not be as robust, or have issues fixed as quickly, as Shopify itself.
If you were to monitor a vanilla Shopify store with no third-party apps, what could you do if it went down? Contact Shopify and wait. There is not much else to do in most cases. On the other hand, if you use a third-party app and your store starts to have issues, you can most likely disable the app and try to fix the situation yourself.
It’s really a matter of trust AND how actionable such an alert would be for you. If all you can do is say "oh well, it sucks. All I can do is talk to my audience via email and social media, maybe put up a popup or something…" then it’s up to you to decide whether it’s worth it. But if you can fix the issue yourself by changing or removing some apps, then I think it’s very useful to monitor your online business closely.