Baking Bread
Honestly, these generated images are the funniest thing I've seen in a while.

Happy Friday everyone! It’s been an interesting week, to be sure.

I’ve got a story about networking, automation, process improvement, and my insistence that automation is only a 90% tool.

It starts with a regular group of friends who, after working together and going their separate ways, decided that they liked each other enough to keep talking. Side note: that’s one of the ways networking works in the industry. As much as I like industry events, for the most part you remember the people you want to work with again. And like other industries, it’s really small, so you’ll run into the same people over and over. But I digress.

So one of the guys that we play with, Bread, has been involved with the Extra Life charity for a long time. He’s the one who got us into doing Extra Life, and it’s one of our big get-together events that we like to do near the end of the year. A ton of fun, and it goes to a good cause. Bread goes a step further and streams regularly on Twitch under the moniker ‘bread4kids’, and donates EVERY dollar that he gets from subscriptions and bits to Extra Life.

Sounds like a fine time, right? What could possibly go wrong? Well, the other day, Bread messaged us with a note that his account had been suspended for violating the ToS, which was utterly dumbfounding to all of us who know him. Granted, all of us troll him a little bit with risque humor, but none of us would go out of our way to actively sabotage him. After some more digging, our conclusion was that it was either a false positive from a malicious report, or an overaggressive filter mistaking the name ‘bread4kids’ for something salacious.

So Bread took the steps to notify customer service that there was a mistake and that his account shouldn’t be suspended. On top of that, we had several friends approach people internally about the incident to help escalate it. The last thing that happened was that Bread went to Twitter (the only deadname I’ll use) to let people know what happened. Fortunately, it was picked up by the masses and started trending, which led to the CEO of Twitch communicating with Bread directly. At that point, I’m pretty sure there was some serious yelling and gnashing of teeth within Twitch; I can only speculate, because I don’t work there. The final resolution was that Bread got his account back in short order, and Twitch wanted to make amends in a whole bunch of ways.

At this point, I’d like to take a step back and look at the way things happened, because I think it’s a really good lesson about process, tooling, and automation. Like Bread, I don’t actually fault the program that flagged his account. We’ve both been in dev long enough to have set up tooling that was overly aggressive, either by design or by accident. I’ve written enough bad regexes in my career to know that we’re all capable of making mistakes.

So where’s the breakdown? In many processes and systems, it’s usually a human component. Humans get tired, make mistakes, have to do those annoying things like take breaks and pee. But I don’t think that was the case here. Basically, a process was started to evaluate Bread’s account, either because of a report or because a regular sweep occurred. I think that’s fine; both are reasonable triggers. The problem lies in the next part, when the program evaluated the account to be racy. This isn’t the first time that false positives have happened, on any platform. My own Instagram photos have been flagged for being too sexual. The images in question? Some pictures of food. (And no, not anything like a peach or a cucumber.) The failing in this case is the fact that it went from making an assessment to outright suspension. I don’t know if there was a human involved in this step, but with the way that this all went down, it feels like there wasn’t.

Okay, some back-of-the-envelope math here. There are about 92,000 concurrent streamers on Twitch, as of Jan 2024. Let’s say that 5% of the streamers have some sort of content that someone finds offensive and generates a report, whether accurate or not. That’s 4600 times the script gets activated. The thing with filtering scripts is that, for the most part, the programmer needs to know what infractions are to be caught and how to deal with them. They either have a comprehensive list (which will get out of date very quickly), or in this day and age, have AI that can interpret apparent infractions as it comes across them. Let’s say that of the 4600 times the script is called, maybe half are spurious. That’s still 2300 real positives, and we start to get to a very interesting part of the problem. Because if it’s fully automated, you have 4600 suspended accounts, 2300 of which are false positives and potential situations like Bread’s. I’m sure Daniel Clancy doesn’t want to be bothered every time there’s a false positive. But at the same time, I don’t think it’s cost effective to have a human monitoring every suspended account and dedicating time to confirming each ban. Even if the person only spent 2 minutes evaluating each report, that’s 4600 × 2 = 9200 minutes ≈ 153 hours. That’s not even remotely efficient.
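For what it’s worth, here’s that napkin math as a tiny script. Every number in it is one of my assumptions from above, not real Twitch data:

```python
# Back-of-the-envelope moderation math. All inputs are assumptions
# from the paragraph above, not actual Twitch figures.
concurrent_streamers = 92_000   # rough Twitch concurrents, Jan 2024
report_rate = 0.05              # assumed: 5% of streams draw a report

flagged = int(concurrent_streamers * report_rate)   # script activations
false_positives = flagged // 2                      # assumed: half are spurious

minutes_per_review = 2
total_minutes = flagged * minutes_per_review
total_hours = total_minutes / 60

print(f"{flagged} flagged, {false_positives} false positives")
print(f"{total_minutes} minutes ≈ {total_hours:.0f} hours of human review")
# 4600 flagged, 2300 false positives, 9200 minutes ≈ 153 hours
```

The point of writing it out is that every knob (report rate, spurious fraction, review time) swings the human cost wildly, which is exactly why the middle of the pipeline needs smarter gating.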

The goal is to automate as much of the process as possible without needing a person to be part of it, while still having a human at the end of the process so that there’s verification and accountability.

So what to do? There are a few approaches, depending on how the company is positioned in terms of workforce. Granted, I don’t work at Twitch, so they very well may have many of these ideas implemented.

Starting from the beginning of the process, make it more stringent before the script gets called. Maybe require several reports from distinct IPs, to avoid one person spamming reports. Reports could accumulate toward a threshold. To add granularity, reports could time out over the course of a few days, and only if the count gets over the threshold would the script run or CS be alerted; that way, reach can be calculated as well. Alternatively, a report can be weighed against the number of concurrent viewers. If someone has 10 viewers and 5 of them are offended and reporting it, maybe that’s worth looking at. Granted, there’s something to be said that even if one person is offended, it’s worth looking at, but I do think that some threshold needs to be established, otherwise you risk empowering some bad actors. (Every system in place will have someone who tries to abuse it.)
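A minimal sketch of that front-end gating, to make the idea concrete. The class name and every threshold here (the report TTL, the absolute count, the viewer ratio) are values I made up for illustration, not anything Twitch uses:

```python
from dataclasses import dataclass, field
from collections import deque

# Assumed tuning knobs -- all invented for this sketch.
REPORT_TTL = 3 * 24 * 3600       # reports expire after ~3 days
COUNT_THRESHOLD = 25             # absolute report count that triggers review
VIEWER_RATIO_THRESHOLD = 0.30    # or: 30% of current viewers reporting

@dataclass
class ChannelReports:
    """Accumulates reports for one channel with dedup and decay."""
    timestamps: deque = field(default_factory=deque)  # oldest first
    reporter_ips: set = field(default_factory=set)

    def add_report(self, ip: str, now: float) -> None:
        # Count each reporting IP once, so one person can't spam
        # a channel into a suspension.
        if ip in self.reporter_ips:
            return
        self.reporter_ips.add(ip)
        self.timestamps.append(now)

    def _expire(self, now: float) -> None:
        # Drop reports older than the TTL so stale outrage decays away.
        while self.timestamps and now - self.timestamps[0] > REPORT_TTL:
            self.timestamps.popleft()

    def should_escalate(self, concurrent_viewers: int, now: float) -> bool:
        self._expire(now)
        count = len(self.timestamps)
        if count >= COUNT_THRESHOLD:
            return True
        # Small channels: weigh reports against audience size instead,
        # so 5 reports out of 10 viewers still surfaces.
        if concurrent_viewers > 0 and count / concurrent_viewers >= VIEWER_RATIO_THRESHOLD:
            return True
        return False
```

A production version would also expire entries from the IP set and probably hash the IPs, but the shape of the gate (dedup, decay, dual threshold) is the point.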

On the tail end of the automation, the filter script could flag reports by calculated severity, allowing someone in CS to review requests in order of severity and manage the volume of reports that occur. As we generate more and more content in the world, we need automation to help us process all that information. The script could also merge reports, so that CS sees a total count for a specific issue and knows what to look for. It could even have a feedback loop, like the old days of Slashdot, where moderation had meta-moderation to ensure the system was working well: CS could tell the program that it had rated something as too severe, or that the severity needed to be escalated. Additionally, a feedback system that also lets users indicate what they find offensive is a great way to crowdsource information (another topic for another day).
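That tail end might look something like this sketch: a severity-ordered review queue with a crude meta-moderation nudge. The class, the 0.1 learning rate, and the global bias correction are all my own stand-ins, not a real moderation API:

```python
import heapq

class ReviewQueue:
    """Flagged channels, popped highest-severity first, with a CS feedback nudge."""

    def __init__(self):
        self._heap = []    # min-heap of (-severity, channel, report_count)
        self._bias = 0.0   # running correction learned from CS feedback

    def flag(self, channel: str, severity: float, report_count: int) -> None:
        # Merged reports arrive as one entry with a total count,
        # so CS sees "one issue, N reporters" instead of N rows.
        adjusted = severity + self._bias
        # heapq is a min-heap, so negate severity to pop the worst case first.
        heapq.heappush(self._heap, (-adjusted, channel, report_count))

    def next_case(self) -> tuple:
        neg_sev, channel, count = heapq.heappop(self._heap)
        return channel, -neg_sev, count

    def feedback(self, model_severity: float, human_severity: float) -> None:
        # Meta-moderation, Slashdot-style: when CS disagrees with the model,
        # nudge future scores toward human judgment.
        self._bias += 0.1 * (human_severity - model_severity)
```

A real system would learn per-category corrections rather than one global bias, but even this toy shows the loop: automation ranks, a human decides, and the decision flows back into the ranking.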

Ultimately, this is a great opportunity for Twitch to improve their process and create tooling that empowers CS and keeps the Breads of the world baking. (Sorry, kneaded to get a bread pun in there.)
