Gmail spam filtering, a reality check.
News article: RNC loses complaint claiming Gmail spam filter is biased against Republicans
FEC found "no reason to believe" that Google used spam filter to help Democrats.
My thoughts:
1) The demagogy must stop.
2) This complaint proves that the partisan politicos [read: hacks] do not understand the #technology of #email-at-scale or the complex #workflows used to mitigate bad or unwanted messages. Like, at all. It is nearly a zero level of understanding. But that may be on purpose (see item 1); it may be another case of willful ignorance of what I call The Left Right Orthodoxy. Remember, these are the people setting technical policies for the US Govt and businesses across the country. That should frighten anyone with any lick of common sense and a basic grasp of the technology landscape.
3) Technology should be a tool, not a weapon. Especially not a weapon used as a crutch for a wholly bad argument in a seemingly unsupported ambush against your perceived enemies.
And now, the rest of the story...
I started working with #messaging over 20 years ago with Microsoft Exchange Server 4.0. I think this was the version that was MAPI-only and required a DOS-based #SMTP gateway to send and receive Internet emails. Yeah, old school and then some. I have used every version since, including helping to administer, secure, and maintain a large Exchange cluster for a national lab.
Later, I dived very deep into #messagingsecurity, including writing and putting into production an #antispam / #antimalware active #datapipeline between Splunk and #ExchangeServer. The pipelines have matured into a detailed workflow that manages a variety of #countermeasures, including, but not limited to, automated IP blocks, domain and domain-and-subdomain blocking, bad username detection, known bad IP/CIDR detection (WannaCry, Sunburst, known nefarious data centers/hosting providers, et al), geo-tracking #botnets with similar attributes (e.g., same TLD, especially if it tracks back to Russia or the old Soviet satellite states), plus UT Levenshtein distance evals, Shannon entropy tests (way, way too long domain names, there is a clue! See graphic header) and #sentimentanalysis. Yeah, there is a lot going on there... There is even a direct-to-block-by-CIDR/24 routine for sex ads/sex trafficking [possible criminal] activities. The ones dead-ending the SMTP returns, resulting in #tarpitting, are most likely the criminal element in this vector. Those are NOT dating sites. Not at all. [insert mad and disgusted emoji face here]
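To make a few of those checks concrete, here is a minimal standalone Python sketch of a Shannon entropy test, a Levenshtein distance eval, and a block-by-/24 helper. It is not the production Splunk/Exchange pipeline described above. The function names, the thresholds (30 characters, 4.0 bits per character, edit distance of 2), and the sample domains are all illustrative assumptions.

```python
import ipaddress
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(text).values())


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def block_range(ip: str) -> str:
    """Return the enclosing /24 network for an IPv4 address."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))


def looks_suspicious(domain, known_brands, max_len=30, entropy_cutoff=4.0):
    """Hypothetical triage rule: the thresholds are assumptions, not tuned values."""
    too_long = len(domain) > max_len
    too_random = shannon_entropy(domain) > entropy_cutoff
    # A near miss (distance 1 or 2) to a known brand suggests a lookalike domain.
    lookalike = any(0 < levenshtein(domain, brand) <= 2 for brand in known_brands)
    return too_long or too_random or lookalike
```

For example, `looks_suspicious("paypa1.com", ["paypal.com"])` flags the lookalike domain, while the genuine `paypal.com` passes. Real traffic would need thresholds tuned against observed false positives, which is exactly where the hard work is.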
I use Google for my home office domain including hosting my email at Gmail. I know the good, the bad, and the ugly of putting my MX receiver on Google. The good outweighs the bad and the ugly (good equals not having to worry about Exchange patches and the myriad of threats, from the Autodiscover.json #zeroday to #IIS and auth bugs). The bad: You lose personal and tactile control, especially on #filtering. The ugly: Wow, the false positives are EVERYWHERE!! The politicos arguing that Google is "out to get them" simply don't understand the problem space and lash out with drama and uninformed, if not baseless, opinions on the state of tech.
Here is the very SHORT list of Gmail false positive filters on my domain:
Wayfair - a home furnishing online store
SiriusXM - the sat-based radio service for my cars
AT&T - My gigabit business fiber provider, including billing messages.
East Bay Times - my local paper
California State University - East Bay - a local university.
HP Enterprise Services - a little SV company you may have heard of...
Hulu - #RickandMorty is reason alone to subscribe. I love that show! I digress...
President Joe Biden via ACT dot org (yeah, see my point already?)
And nearly every candidate's email, regardless of political affiliation, that was sent from a mass email (blaster) service. As an Independent and a wonk, I have had both Katie Porter's and! Christine Todd Whitman's messages incorrectly filtered. Oh, Gary Johnson, the Independent-ish Libertarian, got filtered too. So, what is your argument again?
I keep going to the G-based admin console, approving those filtered senders/domains, and many get re-filtered later as just another false positive. The Internet Headers from the blasters are not always clear/clean/understandable/proper per RFC, and the custom headers cause their own set of outlier headaches. Some custom headers act as a fingerprint (I have tracked #botnets using a sore thumb header), while others screw up the whole detection process. It is basically a work in progress, as the blasters continuously mess with the structure and syntax of the email headers. It is a very well-known issue in message security.
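As a rough illustration of how custom headers can act as a fingerprint, the sketch below hashes the set of `X-` header names in a raw message using Python's standard `email` module. The function name, the choice to hash header names only (ignoring values and order), and the 16-character truncation are assumptions made for this example, not the method used in the workflows described here.

```python
import hashlib
from email import message_from_string


def custom_header_fingerprint(raw_message: str) -> str:
    """Hash the sorted set of custom (X-) header names in a raw message.

    Hypothetical approach: values and ordering are ignored, so two messages
    built by the same sending tool tend to share a fingerprint even when
    campaign IDs or recipients differ.
    """
    msg = message_from_string(raw_message)
    names = sorted({name.lower() for name in msg.keys()
                    if name.lower().startswith("x-")})
    digest = hashlib.sha256("\n".join(names).encode("utf-8")).hexdigest()
    return digest[:16]  # shortened for readability in logs
```

Messages sharing a fingerprint can then be grouped and reviewed together. A sender that keeps changing its header structure will keep changing its fingerprint, which is one reason approved senders can get re-filtered later.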
So, the claim that "they are out to get us" holds exactly zero water. Why? Messaging security is hard, and seriously complex autonomous response engineering is harder to get running correctly at any level. Certain interests demanding workflow perfection (note: to their given bias) and 0% false-positive rates are frankly laughable [as it is an unsophisticated and, very possibly, a clueless argument].
I still get false positives and false negatives in my own personally-coded and years-matured workflows because edge cases are extremely difficult to address programmatically or statistically, or with #machinelearning in some cases. It is not like I have not worked to eliminate those false detections; again, it is difficult work!
But at the scale and velocity of Google's email and domain services, foolproof detections are just wishful thinking at best, and partisan hackery at worst. Baseless accusations do nothing to fix the underlying problem or build better workflows. The partisans have a bigger argument with their own bulk email sender firms than with Google. That's a reality I am sure they do not want to hear and possibly can't hear while lashing out at others.
Disclaimer: The information provided in this post is my opinion and my proprietary research. This is not a recommendation, warranty, surety, or guarantee in any form whatsoever. These are my own opinions and analysis, not my employer's.