
How I Accidentally Built an API Business

Kevin zhou

Sr. Java, PHP, Solidity Engineer

Published: December 14, 2020

In this article, I'll share my journey of building an API business, the technology behind it, and how to build your own API business in the future.

First, a little bit about the business I've built: Listen Notes is a podcast search engine that allows people to search nearly two million podcasts and more than 89 million episodes by people or topics. We also provide a podcast API for developers to use, which is called Listen API. It has become a core part of our business.

An accidental API business

I left my previous failed startup in September 2017. After a few days of tinkering, I picked up one of my fledgling side projects to polish the UI a bit.

That side project was Listen Notes, a podcast search engine website, which was just a single page React JS app running on three $10/month DigitalOcean droplets.

Little did I know a few years ago that my small, neglected side project would turn into the helpful business it has blossomed into.

An early version of Listen Notes

I continued to work on Listen Notes full-time and incorporated Listen Notes as a Delaware C-Corp in October 2017. One of my goals was to experience as many facets of business as possible, rather than just writing code behind the scenes.

My initial plan was as follows: (Don't laugh at me!)

Build a podcast search engine website and make some money from advertising, just like Google. Simple!
If this Listen Notes thing doesn't work in two or three months, then I'll run out of cash, and I'll go into credit card debt to keep going for one more month or so. If it still doesn't work, then I'll have to find a full-time job. Although Jeff Bezos' parents invested $300,000 in early Amazon and Mark Zuckerberg's parents loaned $100,000 to early Facebook, not every family is able to casually toss six figures of cash at web projects.

Then something happened.

On November 20, 2017, I got an email from the developer of a new podcast app, who asked if Listen Notes provided an API. He wanted to be able to search episodes in his app, but he didn't want to build the entire backend.

I asked a few questions (for example, how would the endpoints look, what data fields did he need, how much was he willing to pay...). I got his answers. Everything was settled in a single email thread within a couple of days.

On November 30, 2017, I quickly implemented three endpoints (GET /search, GET /podcasts/{id}, and GET /episodes/{id}), which were basically three Django views.

I Googled "API gateway" or something like that and found a service called Mashape, which was an API marketplace that handled payment, user management, and API documentation.

So I put my three endpoints on Mashape and created two plans there: FREE and PRO. I emailed the developer back to tell him the API was ready to use.

The email thread that prompted me to build Listen API

Then nothing happened. The podcast app developer didn't use our API and instead phased out their project.

Eventually, I moved on to primarily focus on the development of listennotes.com. The API was basically in self-driving mode on the open web. Anyone who happened to discover our API could sign up, without talking to any human beings.

On January 14, 2018, I got my first paying user. A few more paying users arrived that same year.

The email notification I received for my first paying user

Wait, what is RapidAPI? Well, Mashape was acquired by a startup named RapidAPI. They didn't rebrand Mashape to RapidAPI completely until mid-2018. Startups typically don't do things in a clean and methodical way, which is totally understandable.

Then something happened.

There was an outage on the RapidAPI end on November 29, 2018.

The email I sent to people in RapidAPI when the outage happened

RapidAPI had performed a big backend upgrade around that time. As an engineer, I totally understand that outages happen, especially when making huge changes in the backend. But I felt helpless because their customer support didn't reply to my email. Phone calls didn't work either, as expected.

Usually their customer support was very responsive. Perhaps it was the holiday season and people were on vacation.

So I used hunter.io to find work emails of individual RapidAPI employees, the CEO, as well as the CTO. The issue was finally resolved, many hours later. In other words, our API was completely unusable during those down hours. I felt very sorry for our paying users.

Then around mid-February 2019, RapidAPI had billing problems and failed to pay us a few thousand bucks. Our paying users paid RapidAPI first. RapidAPI took a 20% cut. Then they paid the remaining 80% (minus PayPal fees) to us.

After several back-and-forth emails and phone calls, we finally got our payment. It's understandable. Again, startups make mistakes.

In late February 2019, I decided to build our own RapidAPI replacement, for a few reasons:

Our API revenue became nontrivial. The 20% cut from RapidAPI was a bit too much for us.
We wanted API requests to hit our own servers directly, thus lowering latency for our users.
I didn't want to feel helpless when RapidAPI had outages. Overall they did a good job running the service. But I wanted to control my own destiny.
I wanted to contact my API users directly. Using RapidAPI, API providers like me didn't have access to our users' email addresses. It's understandable. It's like the "Uber for X" companies that don't want workers and customers to bypass them and strike deals under the table. Marketplaces don't want users to skip the middleman's commission fees.

In addition, I vowed to do two things really well for our new API system:

We must provide great customer service to our paying users.
We will give customers a very stable & reliable backend service.

After 30 days of hard work, we launched Listen API v2 on March 27, 2019. The legacy API hosted on RapidAPI became Listen API v1, a version we won't add new features to but don't want to shut down because some apps are still using it as of December 2020!

We continue to improve our new Listen API v2 by adding new endpoints, new data fields, improving operational efficiency, as well as spiffing up the user dashboard and our internal tools.

Things are picking up speed gradually. I've been happy since then.

So, that's the journey of Listen API so far.

Note: Although we decided to move on from RapidAPI, I still think it's a great service. Startups all make mistakes in the early stage. They fix things and continue to improve their service, which is great!

The technology behind Listen API

Developers can use our API to search podcasts and fetch detailed podcast-episode metadata. To make this whole thing work, we need to make sure a few core components are in place.

Listen API's main components and the technologies used

Datastore and search engine

This is a shared component with our website. Therefore, I didn't need to change anything in the datastore and search engine when building our API infrastructure.

We use Postgres as our main data store (for example, for podcast metadata, user accounts, and so on), and Elasticsearch as the search engine.

I wrote an old blog post with the details of the entire tech stack.

Internal tools and processes

If you've worked at any web companies, you probably know what I'm referring to here.

It's rare for an Internet business to be 100% automatic. A company always needs to build tons of internal tools and set up manual processes to keep the service functional. That's why companies like Retool have such a high valuation nowadays.

Companies are investing big money in internal tools that are invisible to end users:

Percentage of team's time spent on internal tools. Credits: Retool

To start our API business, we needed to build (at least) two types of internal tools:

For data operations: We needed the ability to keep the podcast metadata up-to-date, fix corrupted metadata, plus review and approve any changes made by users.
Additionally, we required a framework that handled new, rare edge cases of corrupted podcast data along the way. To some degree, building a software product means handling tons of edge cases for a very long period of time (like, years), rather than launching new features every day.
For user operations: We required the ability to suspend a bad user's account, as well as immediately look up all information related to a specific user who contacted us for a specific issue.
Plus, we had to be able to quickly evaluate if "it's our fault" (server-side errors) or "it's their fault" (client-side errors) when users complained.

Internal tools are used by employees inside the company. Some of those tools are fully automated, such as cron jobs that perform scheduled tasks. But many tools are operated manually by employees, for example by entering a user's ID number and clicking a button.

Most of our internal tools have ugly web UIs, with default Bootstrap styling :)

A portion of our internal tool's UI that allows us to suspend an API user's account.

Fortunately, our API shares many internal tools with the website. So we didn't need to build too many new things here.

The analytics and billing system

The pricing model of an API is typically usage-based. Check out some real world examples:

https://www.twilio.com/pricing
https://sendgrid.com/pricing/
https://cloud.google.com/maps-platform/pricing/
https://www.microsoft.com/en-us/bing/apis/pricing

We must track how many requests each user makes in real time. We use Redis to keep track of these stats and periodically dump them into Postgres for persistent storage.
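The counting pattern can be sketched roughly as follows. This is a minimal illustration, not Listen Notes' actual code: a plain in-memory `Counter` stands in for Redis, a dictionary stands in for the Postgres table, and the names `UsageTracker`, `record_request`, `flush`, and `total` are hypothetical.

```python
from collections import Counter
from datetime import date


class UsageTracker:
    """Counts requests per (API key, day) in fast storage, then flushes them
    to durable storage.

    Assumption: `hot` stands in for Redis counters and `durable` stands in
    for a Postgres table keyed by (api_key, day).
    """

    def __init__(self):
        self.hot = Counter()
        self.durable = {}

    def record_request(self, api_key: str, day: date) -> int:
        """Increment the hot counter for this key and day; return the new value."""
        self.hot[(api_key, day)] += 1
        return self.hot[(api_key, day)]

    def flush(self) -> int:
        """Add hot counts to durable storage, then reset them.

        Returns the number of (api_key, day) rows that were updated.
        """
        touched = 0
        for key, count in self.hot.items():
            self.durable[key] = self.durable.get(key, 0) + count
            touched += 1
        self.hot.clear()
        return touched

    def total(self, api_key: str, day: date) -> int:
        """Real-time usage: what was already flushed plus what is still hot."""
        return self.durable.get((api_key, day), 0) + self.hot[(api_key, day)]


tracker = UsageTracker()
today = date(2020, 12, 14)
for _ in range(3):
    tracker.record_request("key_abc", today)
tracker.flush()                           # the periodic dump
tracker.record_request("key_abc", today)  # arrives after the dump
print(tracker.total("key_abc", today))    # 3 flushed + 1 hot
```

Reading the total from both stores is what keeps the number current between dumps.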

What happens if our Redis has an outage? We might temporarily lose some tracking stats. In this case, we have an internal tool to sync stats from raw Nginx logs.
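A recovery tool of this kind could look roughly like the sketch below. It assumes a custom Nginx `log_format` that appends `api_key="<key>"` to every line. That layout, the sample lines, and the function name `count_requests` are hypothetical, since the article does not describe the real log format.

```python
import re
from collections import Counter

# Assumption: each log line carries the timestamp in [dd/Mon/yyyy:...] form,
# the quoted request, the status code, and a trailing api_key="..." field.
LOG_LINE = re.compile(
    r'\[(?P<day>\d{2}/\w{3}/\d{4}):[^\]]*\] '
    r'"[^"]*" \d{3} .*api_key="(?P<api_key>[^"]*)"'
)


def count_requests(log_lines):
    """Rebuild per-key, per-day request counts from raw access log lines.

    Lines that do not match the assumed format, or that carry an empty
    API key, are skipped.
    """
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match is None or not match.group("api_key"):
            continue
        counts[(match.group("api_key"), match.group("day"))] += 1
    return counts


sample = [
    '203.0.113.7 - - [29/Nov/2018:10:00:01 +0000] "GET /search?q=python HTTP/1.1" 200 512 api_key="key_abc"',
    '203.0.113.7 - - [29/Nov/2018:10:00:02 +0000] "GET /podcasts/abc HTTP/1.1" 200 900 api_key="key_abc"',
    '198.51.100.2 - - [29/Nov/2018:10:00:03 +0000] "GET /health HTTP/1.1" 200 2',
    '198.51.100.9 - - [29/Nov/2018:10:00:04 +0000] "GET /search?q=go HTTP/1.1" 200 300 api_key="key_xyz"',
]
rebuilt = count_requests(sample)
```

The rebuilt counts can then overwrite or patch whatever the counters lost during the outage.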

We have to change billing plans without affecting existing users. For example, if we raise prices, existing users should still enjoy the benefit of the old plans. If it's not done right, it's easy to have inconsistent states across the board, and angry users getting charged the wrong billing plan!
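One common way to get this behavior is to treat each plan version as immutable and have every subscription point at the exact version it signed up for. The sketch below illustrates that idea only. The article does not say how Listen API models plans, and `PlanVersion`, `PlanCatalog`, the plan ids, and the prices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanVersion:
    plan_id: str             # unique per version, e.g. "pro-v1"
    name: str                # the plan family shown to users, e.g. "PRO"
    monthly_fee_cents: int
    included_requests: int


class PlanCatalog:
    """Stores immutable plan versions and tracks which one new signups get."""

    def __init__(self):
        self._versions = {}  # plan_id -> PlanVersion
        self._current = {}   # plan name -> plan_id offered to new users

    def publish(self, version: PlanVersion) -> None:
        """Add a new version; existing versions can never be overwritten."""
        if version.plan_id in self._versions:
            raise ValueError("plan versions are immutable; publish a new plan_id")
        self._versions[version.plan_id] = version
        self._current[version.name] = version.plan_id

    def current(self, name: str) -> PlanVersion:
        """The version a brand-new subscriber would receive."""
        return self._versions[self._current[name]]

    def get(self, plan_id: str) -> PlanVersion:
        """The version an existing subscription is pinned to."""
        return self._versions[plan_id]


catalog = PlanCatalog()
catalog.publish(PlanVersion("pro-v1", "PRO", 2000, 10_000))
early_user_plan_id = catalog.current("PRO").plan_id  # stored on the subscription

# Raising the price means publishing a new version, not editing the old one.
catalog.publish(PlanVersion("pro-v2", "PRO", 3000, 10_000))

old_fee = catalog.get(early_user_plan_id).monthly_fee_cents
new_fee = catalog.current("PRO").monthly_fee_cents
```

Because billing always resolves the fee through the stored `plan_id`, a price change cannot silently leak into existing subscriptions.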

Payment failures, a very common occurrence, must be handled gracefully. We can't just suspend users right away. We need to be able to notify ourselves that "this user failed to pay" and notify the user that "you failed to pay."

After a few retries, we suspend users manually. We could've automated this last step, but we don't suspend users often nowadays, so it's okay to do it manually. There's no need to make everything perfect (at least for now).
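The failure flow described above (notify both sides, retry, then hand off to a human) can be sketched as a small state machine. This is an illustration under assumptions: the retry limit of 3, the status names, and the `charge` and `notify` callables are hypothetical stand-ins for a real payment processor and email system.

```python
from dataclasses import dataclass

MAX_RETRIES = 3  # assumption: the article only says "a few retries"


@dataclass
class Invoice:
    user_email: str
    amount_cents: int
    failed_attempts: int = 0
    status: str = "due"  # "due" -> "paid" or "needs_manual_review"


def attempt_charge(invoice, charge, notify):
    """Try to charge once; on failure notify both sides and count the attempt.

    `charge(invoice)` returns True on success. `notify(recipient, message)`
    delivers a message. After MAX_RETRIES failures the invoice is flagged
    for a human to review; nothing is suspended automatically.
    """
    if charge(invoice):
        invoice.status = "paid"
        return invoice.status

    invoice.failed_attempts += 1
    notify("billing-team",
           f"{invoice.user_email} failed to pay (attempt {invoice.failed_attempts})")
    notify(invoice.user_email,
           "Your payment failed; please update your payment method.")
    if invoice.failed_attempts >= MAX_RETRIES:
        invoice.status = "needs_manual_review"
    return invoice.status


sent = []
invoice = Invoice("dev@example.com", 2000)
for _ in range(MAX_RETRIES):
    attempt_charge(invoice, charge=lambda inv: False,
                   notify=lambda to, msg: sent.append((to, msg)))
```

Keeping suspension out of the function mirrors the manual last step: the code only surfaces the account, and a person decides what happens next.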

We have a dashboard (God's view) to see how many requests each individual user has made in the current billing cycle. And we are able to review raw logs for each user from a web UI, without manually pulling log files from S3.

Stripe and PayPal (via Braintree) are our payment processors. Most of our international users use PayPal.

Finally, putting all of these factors together, we can calculate the actual amount of money that a user should pay us in real-time, based on their usage. We run async tasks via Celery to charge due bills.

What happens if a user unsubscribes in the middle of a billing cycle? We charge them prorated rates, based on time and usage. Users don't need to pay a full month's fee in those instances.
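The article does not give the exact proration formula, so the sketch below shows just one plausible reading of "based on time and usage": the base fee is scaled by the share of the cycle that elapsed, and requests beyond the included quota are billed per request. The function name, the parameters, and the numbers in the example are all assumptions.

```python
def prorated_charge_cents(monthly_fee_cents, days_used, days_in_cycle,
                          requests_used, included_requests,
                          overage_cents_per_request):
    """Amount owed when a user cancels partway through a billing cycle.

    Assumed formula (not confirmed by the source):
      time part  = monthly fee * (days used / days in cycle), rounded down
      usage part = requests beyond the included quota * overage price
    """
    if days_in_cycle <= 0 or not 0 <= days_used <= days_in_cycle:
        raise ValueError("days_used must be between 0 and days_in_cycle")
    time_part = monthly_fee_cents * days_used // days_in_cycle
    extra_requests = max(0, requests_used - included_requests)
    return time_part + extra_requests * overage_cents_per_request


# A $30.00 plan cancelled after 10 of 30 days, 200 requests over quota
# at 1 cent each: 1000 cents for time + 200 cents for usage.
charge = prorated_charge_cents(3000, 10, 30, 1200, 1000, 1)
```

Working in integer cents avoids floating-point rounding surprises on invoices.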

API Servers

We run Django apps to serve API requests. Each endpoint is a simple Django view. A Django middleware verifies if a request is legit, then generates a log or rejects the request right away.
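The verify-then-log-or-reject step can be sketched as below. The class follows the shape of a Django middleware (it wraps a `get_response` callable and is itself callable), but it uses plain dictionaries for requests and responses so that it runs without Django installed. The `X-Api-Key` header name, the `KEY_STATES` store, the extra `log` argument, and the 401 response body are hypothetical.

```python
import time

# Hypothetical key store; a real system would look this up in a database.
KEY_STATES = {"key_abc": "active", "key_bad": "suspended"}


class ApiKeyMiddleware:
    """Rejects requests without an active API key, and logs the rest."""

    def __init__(self, get_response, log):
        self.get_response = get_response  # the wrapped view
        self.log = log                    # any list-like sink for log entries

    def __call__(self, request):
        api_key = request.get("headers", {}).get("X-Api-Key")
        if KEY_STATES.get(api_key) != "active":
            # Missing, unknown, or suspended key: reject right away.
            return {"status": 401,
                    "body": {"error": "invalid or suspended API key"}}

        response = self.get_response(request)
        self.log.append({
            "api_key": api_key,
            "path": request["path"],
            "status": response["status"],
            "ts": time.time(),
        })
        return response


def search_view(request):
    """Stand-in for a real endpoint."""
    return {"status": 200, "body": {"results": []}}


request_log = []
handler = ApiKeyMiddleware(search_view, request_log)
ok = handler({"path": "/search", "headers": {"X-Api-Key": "key_abc"}})
denied = handler({"path": "/search", "headers": {"X-Api-Key": "key_bad"}})
```

Rejecting before the view runs means a bad request never touches the database or the search engine.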

We cache response data per API key + unique URL in Redis. In general, our API performance is pretty good.
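Caching per API key plus URL can be illustrated like this. It is a minimal sketch, assuming a small in-memory cache with expiry in place of Redis. The key prefix, the 60-second TTL, and the names `cache_key`, `TtlCache`, and `cached_fetch` are hypothetical, and the source does not say whether or how cached entries expire.

```python
import hashlib
import time


def cache_key(api_key: str, url: str) -> str:
    """One cache entry per (API key, full URL including query string)."""
    digest = hashlib.sha256(f"{api_key}|{url}".encode("utf-8")).hexdigest()
    return f"api-cache:{digest}"


class TtlCache:
    """In-memory stand-in for a Redis cache with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._items[key] = (value, self.clock() + self.ttl)


def cached_fetch(cache, api_key, url, compute):
    """Return the cached response if present; otherwise compute and store it."""
    key = cache_key(api_key, url)
    hit = cache.get(key)
    if hit is not None:
        return hit
    value = compute()
    cache.set(key, value)
    return value


cache = TtlCache(ttl_seconds=60)
backend_calls = []


def run_search():
    backend_calls.append("search")
    return {"results": ["episode-1"]}


cached_fetch(cache, "key_abc", "/search?q=python", run_search)
cached_fetch(cache, "key_abc", "/search?q=python", run_search)  # served from cache
cached_fetch(cache, "key_xyz", "/search?q=python", run_search)  # different key, new entry
```

Including the API key in the cache key keeps one customer's cached responses separate from another's.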

We use Nginx as a load balancer and provision multiple API servers. It's straightforward to do rolling deployment here, with a bunch of sanity checks to ensure the API is functioning.
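The control flow of a rolling deployment with sanity checks can be sketched in a few lines. This shows only the general technique. The `deploy` and `health_check` callables and the server names are hypothetical, and details such as draining a server from the Nginx upstream before updating it are left out.

```python
def rolling_deploy(servers, deploy, health_check):
    """Update one server at a time and stop at the first failed sanity check.

    `deploy(server)` installs the new code on a server, and
    `health_check(server)` returns True if that server answers correctly
    afterwards. Servers not yet reached keep running the old code.
    """
    updated = []
    for server in servers:
        deploy(server)
        if not health_check(server):
            return {"ok": False, "updated": updated, "failed": server}
        updated.append(server)
    return {"ok": True, "updated": updated, "failed": None}


deployed = []
result = rolling_deploy(
    ["api-1", "api-2", "api-3"],
    deploy=deployed.append,
    health_check=lambda server: server != "api-2",  # simulate a bad rollout
)
```

Because the loop halts on the first failure, at most one server is running a broken build while the others continue serving traffic.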

Generally speaking, the easy and robust deployment process increases my confidence to make incremental code changes often and to deploy frequently.

Each API endpoint is RESTful and returns a JSON response, which is pretty standard nowadays.

User Dashboard and API Docs

Each API user can access a dashboard on our website to see the number of requests they've used in the current billing cycle and view recent raw logs. They can also update payment methods, create or reset API keys, set up webhooks, and add coworkers to the same API account.

Listen API's user dashboard

API Docs is probably the most important UI for an API business. Therefore, many API companies employ a whole team of full-time engineers to build and maintain "merely" the API Docs page(s).

An API Docs page is not simply a full page of English words. It must show code snippets for different programming languages.

Users have to be able to run your code example directly from the page. You are required to design a repeatable process (no matter if it's automatic or manual) to keep the documentation in sync with your code. There are plenty of nuances.

We spent a lot of time and energy building and iterating multiple versions of our API Docs page. Following is the end result:

Listen API Docs Page

Initially, we tried a few open source solutions for the API documentation. It's quite time-consuming to understand an open source project well enough to customize it. Ultimately, we decided that it would be faster to build the page from scratch rather than customizing an open source solution built by others.

Our API Docs page is basically a React JS single page app.

We codify all endpoints, response data schema, and example response in an OpenAPI spec. The React JS app of the API Docs page reads from our OpenAPI spec directly.
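To illustrate why a single spec keeps docs and code in sync, here is a small sketch that walks an OpenAPI-style document and derives one curl snippet per endpoint. It is written in Python rather than React, and it is not the actual docs app. The spec fragment, the `https://api.example.com/v2` base URL, the example values, and the `curl_snippets` function are hypothetical. Only the `/search` and `/podcasts/{id}` paths come from the article.

```python
from urllib.parse import quote, urlencode

# A tiny OpenAPI 3 style fragment (hypothetical content).
SPEC = {
    "openapi": "3.0.0",
    "servers": [{"url": "https://api.example.com/v2"}],
    "paths": {
        "/search": {
            "get": {
                "summary": "Search podcasts and episodes",
                "parameters": [
                    {"name": "q", "in": "query", "required": True,
                     "example": "startup"},
                ],
            }
        },
        "/podcasts/{id}": {
            "get": {
                "summary": "Fetch a podcast by id",
                "parameters": [
                    {"name": "id", "in": "path", "required": True,
                     "example": "abc123"},
                ],
            }
        },
    },
}


def curl_snippets(spec):
    """Build one curl command per operation, using each parameter's example."""
    base = spec["servers"][0]["url"]
    snippets = []
    for path, operations in spec["paths"].items():
        for method, operation in operations.items():
            url = base + path
            query = []
            for param in operation.get("parameters", []):
                value = str(param.get("example", ""))
                if param["in"] == "path":
                    # Fill the {name} placeholder in the path template.
                    url = url.replace("{" + param["name"] + "}",
                                      quote(value, safe=""))
                elif param["in"] == "query":
                    query.append((param["name"], value))
            if query:
                url += "?" + urlencode(query)
            snippets.append(f"curl -X {method.upper()} '{url}'")
    return snippets


snippets = curl_snippets(SPEC)
```

Adding an endpoint to the spec is then enough for its snippet to appear, with no separate documentation edit.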

The side effect of using OpenAPI is that we can easily integrate with tools like Postman, because OpenAPI is a (relatively) widely adopted standard for API documentation nowadays.

Why Listen API works

Listen API has been a nice business for me so far.

But don't expect me to share revenue numbers publicly :)

Some companies are doing this open startup thing, sharing every single business metric to the public, which is great.

But we shouldn't blame the majority of companies (including my small company Listen Notes, Inc.) who don't want to share business metrics publicly.

Not everyone is comfortable being naked in public, literally or figuratively.

Similarly, there's lots of business advice (or cliches) that you don't have to follow.

You don't have to find a cofounder - having a horrible cofounder is way worse than not having one.
You don't have to reveal your revenue to the public or do any "open startup" thing. No pressure. Don't feel guilty if you are not doing what other cool kids are doing. You run your own company. You make your own decisions.
You don't have to do XYZ that a Twitter VC philosopher urges you to do in a fortune-cookie-like tweet.
You don't have to be 100% bootstrapped or 100% VC-backed. Many things are not completely one way or the other. Usually, there's middle ground.
...and the list goes on.

The bottom line is, no one is absolutely wrong or absolutely correct. Each individual's vision/knowledge is limited. Each person's preferences might be very different.

An API business may be too obscure to most people in the world, but I like my API business very much. People from big companies (like Apple, Amazon, or Microsoft) may examine my business and deem it "cute". But I would consider it a success for me personally.

And success is relative. The key is to bring happiness to customers (by saving them time and money and helping them solve problems), myself (a professional achievement), and my family (by keeping the fridge full).

So why does the Listen API work?

Demand and MVP

I didn't build a solution to find problems. It was the problem (a podcast app that wanted to add search functionality) that found us, and we built a very simple solution at first.

We didn't spend months launching the API. We spent a couple of hours. It costs at least $100 per hour to hire a not-so-bad engineer in San Francisco, so the cost of launching this API MVP was approximately $200. Even if it were $2,000, I'd still think it was worthwhile.

Two reasons why we were able to launch an MVP quickly:

The heavy lifting part of building a podcast database, search engine, and data operations tool was already done, because of our podcast search engine website.
Mashape / RapidAPI existed to provide a plug-and-play solution for us to manage users and create paid plans without writing code on our end.

However, in hindsight, it's actually very common for a commercial search engine to license their tech (via API or other ways). Some examples:

Yahoo Search was powered by Google circa 2000, and is powered by Bing today.
In the early days, Baidu's only business model was to put a web search on some Chinese portal sites.
Today, Bing provides a bunch of search APIs.

By launching an MVP fast, we were able to get feedback early, especially after getting the first paying user only a month or so after launch.

Good documentation

User feedback proves that our API Docs page plays an important role in customers' decisions to use our API. There must be a reason for API companies to employ a whole team of engineers "only" to maintain their documentation pages.

Great documentation builds trust.

Stable backend service

Stability is the base layer of an API business's version of Maslow's hierarchy of needs. If an API is not stable at all (for example, it has frequent outages or runs very slowly), it can't be used.

However, it's boring to perform work to improve backend stability. Most tasks to stabilize backend services are preventive, including extensive monitoring and alerting, the process to deploy code with confidence, end-to-end regression tests, and so on.

No news is good news.

No outages are great news.
