RLFO#29: Voodoo Programming
This post originally premiered on Random Links Found Online.
If web servers are programs that map some HTTP input stream to another HTTP output stream, then programmers are primates that map some JIRA input stream to a programming language output stream. Most of what we do daily is not as impressive as we make it sound. Though once in a while, some of the programming primates cook up something so intricate and capable, it may even be wizardry. Voodoo Programming, if you will. But to quote Gilfoyle, “it’s not magic, it’s talent and sweat”. It’s just that what we don’t comprehend, we claim to be magic.
I was very pleased a few years back, when Google rolled out a feature where you could say to your phone “Hey, what song is this?” and it would — just like Shazam — find out the song playing in the background, in an instant and a half. Magical! And it’s the kind of magic we’ll debunk in this RLFO.
abracadabra: How does Shazam work?
[link] A long time ago, in a thought experiment most likely conducted while drinking in a bar, some friends and I were discussing “how the hell does Shazam work” and “how would we build Shazam”. We didn’t reach a conclusion, due to obvious skill issues and/or outrageous alcohol levels, but we all agreed that picking up a short sequence of sounds and comparing that sequence against all the songs in a database would not be a viable approach: the sequence could be from any part of a song, and the search space would be enormous. It would be like comparing a 10th of a fingerprint against all the fingers in the world. It could work in some cases, but it’s very impractical.
Aaaand… that’s exactly how it works… with a sprinkle of dimensionality reduction on top and a side of ingenious hashing.
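To make the "ingenious hashing" part a bit less magical, here is a toy sketch of the general idea, not Shazam's actual code. It assumes the audio has already been reduced to a handful of (time, frequency) peaks, which is the dimensionality reduction step and is skipped here. Pairs of nearby peaks are hashed, and a clip matches a song when many hashes agree on the same time offset. The names `fingerprint`, `build_index`, `identify` and the `FAN_OUT` value are made up for this illustration.

```python
from collections import Counter, defaultdict

FAN_OUT = 3  # hypothetical: how many later peaks each anchor peak is paired with


def fingerprint(peaks):
    """Yield (hash, anchor_time) pairs from a list of (time, frequency) peaks."""
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + FAN_OUT]:
            # The hash keeps only the two frequencies and the time gap between
            # them, so it is the same wherever the pair occurs in the song.
            yield (f1, f2, t2 - t1), t1


def build_index(songs):
    """Map every hash to the (song name, anchor time) places where it occurs."""
    index = defaultdict(list)
    for name, peaks in songs.items():
        for h, t in fingerprint(peaks):
            index[h].append((name, t))
    return index


def identify(index, clip_peaks):
    """Return the song whose hashes line up with the clip at one consistent offset."""
    votes = Counter()
    for h, clip_t in fingerprint(clip_peaks):
        for name, song_t in index.get(h, []):
            votes[(name, song_t - clip_t)] += 1
    if not votes:
        return None
    (name, _offset), _count = votes.most_common(1)[0]
    return name


# Made-up peak data: two "songs" and a clip taken from the middle of song_a.
songs = {
    "song_a": [(0, 440), (1, 523), (2, 659), (3, 440), (4, 784), (5, 587)],
    "song_b": [(0, 330), (1, 392), (2, 494), (3, 330), (4, 659), (5, 294)],
}
index = build_index(songs)
clip = [(0, 659), (1, 440), (2, 784), (3, 587)]  # song_a from t=2, re-timed to 0
match = identify(index, clip)
```

The point of the offset vote is that the clip never has to be slid along every song: each hash is a direct lookup, and the "which part of the song is this" question falls out of the offsets that agree.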
Pokémon Sprite Decompression Explained
[link] As a kid, I remember playing one of those Mario games on a Game Boy. As an adult, I don’t remember suddenly realizing that everything you see and hear in the game is no more than a few kilobytes: the music, the sprites, and the whole game logic have to fit in a Game Pak cartridge that is not really that big. I remember having PlayStation Memory cards of 2 and 4 MB, and those were a big deal back then.
Turns out, Nintendo programmers had a few compression tricks up their sleeves, one of which is Run-Length Encoding, which I’ve mentioned in one of my old blog posts.
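For anyone who hasn’t met Run-Length Encoding before, here is a minimal, generic sketch of the idea: runs of identical bytes collapse into (count, value) pairs. This is not the format the Game Boy games actually use, and the function names are just for illustration.

```python
def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse runs of identical bytes into (count, value) pairs."""
    runs: list[tuple[int, int]] = []
    for value in data:
        if runs and runs[-1][1] == value:
            # Same byte as the current run: bump its count.
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            # A different byte starts a new run.
            runs.append((1, value))
    return runs


def rle_decode(runs: list[tuple[int, int]]) -> bytes:
    """Expand (count, value) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for count, value in runs)


# A made-up sprite row: mostly background (0) with two pixels of colour 3.
row = bytes([0, 0, 0, 0, 0, 3, 3, 0, 0, 0])
encoded = rle_encode(row)   # [(5, 0), (2, 3), (3, 0)]
restored = rle_decode(encoded)
```

It only pays off when the data has long runs, which sprites with large blank areas tend to have.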
How we built Pingora, the proxy that connects Cloudflare to the Internet
[link] When you’re Cloudflare and your traffic is global in nature, most off-the-shelf software doesn’t cut it for you. Nginx as a proxy has always been “good enough” for most of what I’ve done so far, but in Cloudflare’s case, they faced roadblocks and limitations on CPU usage and resource consumption. So they went full Bender style and made their own reverse proxy, with blackjack and… Rust.
The future of Reverse Engineering Dynamic Binary Visualization
[link] A 12-year-old video, but as far as I know, this isn’t the present of Reverse Engineering. So it’s gotta still be the future… Mapping any file to a 256 by 256 pixel image seems to reveal patterns that are shared by other files of a similar type. Some files get transformed to pixels on the upper-left part of the image, some others into diagonals, others into squares within squares like Hubble’s Deep Field photo. By analyzing these patterns, one can tell whether a random file missing its extension is an image, or an audio file, or perhaps an executable virus program.
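One simple way to get a 256 by 256 image out of any file is a byte-pair plot, and it is my assumption that this is roughly what the video does: every pair of consecutive bytes (a, b) lights up the pixel at row a, column b. A minimal sketch, with `digraph_image` being a name made up for this example:

```python
def digraph_image(data: bytes) -> list[list[int]]:
    """Count how often each (byte, next byte) pair occurs, as a 256x256 grid."""
    grid = [[0] * 256 for _ in range(256)]
    for first, second in zip(data, data[1:]):
        grid[first][second] += 1
    return grid


# ASCII text only uses byte values below 128, so every hit lands in the
# upper-left quarter of the image.
text = b"Hello, voodoo programming! " * 20
text_grid = digraph_image(text)

# A steadily increasing byte ramp lights up pixels (i, i + 1): a diagonal.
ramp = bytes(range(256))
ramp_grid = digraph_image(ramp)
```

Feeding the grid to any image library as pixel intensities gives the picture; the two toy inputs already show why text and ramp-like data look so different.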
One process programming notes
[link] “Modern” web services require a ton of resources: you need a machine to run the application server, and you need another machine to run the database, and you need a CDN or a reverse proxy for your single app server because “muh scale”… Now imagine the total opposite of that: instead of running multiple servers for everything, you run everything on a single server. Imagine being a single person developing a business, and you want to focus on cranking out features rather than provisioning yet another server. Bundling everything in a single process doesn’t sound that bad now, does it?
But then again, voodooing the shit out of every project might not be the best idea ever.