Data science for the 99%
Image by Tim Mossholder on Unsplash (https://unsplash.com/photos/qvWnGmoTbik)

June 2023 Issue


I lived for a while in New York City, a place that I love dearly. I am, however, originally from Los Angeles, California, and, for the last 25 years, I have lived in Salt Lake City, Utah. What’s curious is that many of the people I met in NYC knew that I was from “someplace on the other side of the country” and seemed to have a mental map that resembled this picture:

New Yorker Magazine cover showing distorted perception of world outside of Manhattan
"View of the World from 9th Avenue" by Saul Steinberg (https://en.wikipedia.org/wiki/View_of_the_World_from_9th_Avenue)

We all have our own forms of myopia, where we see one small part of the world and its possibilities in great detail, but the rest is just a vague haze. This can happen even when the part outside our awareness is much, much larger than our familiar, comfortable area. One peculiar place that can happen is in data science careers.

As an example, Springboard published a list of the "22 Best Data Science Companies Hiring in 2023." Not surprisingly, the top of the list is dominated by giant tech companies. Here are the top five companies on their list, along with their employee counts:

  1. Microsoft (220,000 employees)
  2. Amazon (1.5 million employees)
  3. EY (365,000 employees)
  4. Google (Alphabet; 150,000 employees)
  5. VMware (38,300 employees)

It almost makes VMware look small by comparison, but, at this exact moment, they have a market capitalization of over $60 billion, so they're definitely big.

Another way to look at data science is with the list "Highest Paying Data Science Jobs in 2023" by Simplilearn. They list familiar job titles like data scientist, machine learning engineer, data architect, and so on. While all of these are important jobs and they pay well, they also represent a narrow view of the career possibilities for people interested in working with data. (And they're definitely not the only well-paying jobs that are potentially available to people with training in data work.) Really, it starts to feel like a gold-plated hall of mirrors. You see a lot of very shiny things, but you don’t necessarily see very far.

Hall of Mirrors at the Palace of Versailles
The Hall of Mirrors at the Palace of Versailles in France (https://www.historylines.net/img/versailles/La_Galerie_des_Glaces.jpg)

So, it may be time to update your map of the data science career landscape, and see what else is out there for you. To help with this, I'll share five recommendations.

1. See the 99%

Public viewfinder looking over field and sunset
Image by Matt Noble on Unsplash (https://unsplash.com/photos/BpTMNN9JSmQ)

This is an exercise that involves seeing what's right in front of you. According to data from the US Small Business Administration (and additional data from the US Census, the US Chamber of Commerce, and Forbes), there are over 33 million businesses in the United States, but fewer than 21,000 have 500 or more employees, the threshold above which a firm no longer counts as a "small business" in the US. That's just 0.06% of all businesses – it's not even visible in the chart below. On the other hand, small businesses, or those that have fewer than 500 employees, account for over six million business establishments in the US, or 18%. And, finally, solo businesses with no employees beyond the owner are by far the most common, accounting for over 27 million businesses, or nearly 82%.

Bar chart of number of businesses in US by business size
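The percentages above can be checked with a few lines of arithmetic. This is a minimal sketch that uses the rounded counts quoted in the text ("over 27 million," "over six million," "fewer than 21,000"), so the inputs are approximations rather than official figures, and the category labels are just illustrative names.

```python
# Approximate firm counts quoted in the text (rounded; illustrative only).
firms = {
    "solo": 27_000_000,   # no employees beyond the owner
    "small": 6_000_000,   # fewer than 500 employees
    "large": 21_000,      # 500 or more employees
}

total_firms = sum(firms.values())


def share_of_firms(category: str) -> float:
    """Return the category's percentage of all firms."""
    return 100 * firms[category] / total_firms


for name, count in firms.items():
    print(f"{name:<6} {count:>12,} {share_of_firms(name):6.2f}%")
```

Because the inputs are rounded, the results only match the quoted percentages to about the nearest whole number (or to two decimals for the large-business share).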

Then again, large businesses employ more people per firm, so it's also helpful to look at the total number of employees in each category. Large businesses collectively have 30 million employees, which is a little more than the solo businesses (19% and 17%, respectively). But the small businesses employ more than both of those categories put together: nearly 100 million people in America work for small businesses, or almost two-thirds of the total. This is a massive group, and a place where an aspiring data scientist can make a meaningful contribution.

Bar chart of number of employees in US by business size
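The employment comparison works the same way, and dividing workers by firms shows why the two charts look so different. In this sketch the figures are in millions and come from the rounded numbers in the text. The 27 million figure for solo businesses is an assumption: it treats each solo business as one worker, which is consistent with the 17% share quoted above.

```python
# Approximate workers per category, in millions (rounded; illustrative only).
# The solo figure is an assumption: one owner per solo business.
workers_m = {"large": 30, "small": 100, "solo": 27}

# Approximate number of firms per category, in millions.
firms_m = {"large": 0.021, "small": 6, "solo": 27}

total_workers = sum(workers_m.values())

# Share of all workers in each category, as a percentage.
employment_share = {k: 100 * v / total_workers for k, v in workers_m.items()}

# Average number of workers per firm in each category.
workers_per_firm = {k: workers_m[k] / firms_m[k] for k in workers_m}

for name in workers_m:
    print(f"{name:<6} {employment_share[name]:5.1f}% of workers, "
          f"{workers_per_firm[name]:,.1f} workers per firm")
```

The averages are rough, but they make the point: a tiny number of very large firms, and a very large number of small ones that together employ most people.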

[It should also be clear by this point that I wasn't quite accurate when I called this newsletter "Data Science for the 99%." Really, it should be "Data Science for the 99.94%."]

2. Learn about the goals of the 99%

Image by Maranda Vandergriff on Unsplash (https://unsplash.com/photos/fZBwUGlKbO8)

The work that makes headlines in data science – the extraordinary advances in machine learning and artificial intelligence, for example – is associated with tech giants like Google, Amazon, and Microsoft. (Paradoxically, the company that produced the world-changing ChatGPT, OpenAI, is apparently still a small business; Wikipedia and other sources report that OpenAI has 375 employees.) But data work is still critical to the six million small businesses and 27 million solo businesses in the United States. Think of some of the small businesses around you:

  • Local bakeries, breweries, cafés, and restaurants
  • Doctors, lawyers, and accountants with their own practices
  • Regional architectural firms, construction companies, and subcontractors
  • Social media marketing and event planning specialists
  • Professional dance companies, music ensembles, and theater companies
  • Landscapers, plumbers, electricians, and HVAC technicians

(For a more complete list see "United States Small Business Economic Profile" by the US Small Business Administration, which ranks various categories by number of firms and employment, including the percent of firms in a category that are small businesses. For example, 86% of firms in "Agriculture, Forestry, Fishing and Hunting" are small businesses.)

These businesses are designed to provide a sustainable living for their owners and employees. That goal may sound painfully obvious, as though there were no other options, but it stands in stark contrast to the number of companies that are oriented towards rapid growth and splashy IPOs for their investors. (Curiously, many of the "growth-oriented" companies that are in the news have never been profitable, but rely on continued rounds of investor funding.) Both kinds of businesses need to track their progress, although they will use different metrics. They both need to calculate their ROI, or return on investment, but growth-oriented businesses are likely to focus on investor-facing actions, while small businesses will focus more on customers. In particular, small businesses need to know how well they are serving their current customers and clients, as well as how they can reach out to new ones in a sustainable way, which means without the dramatic expansions in investment or headcount that growth-oriented startups rely on.

In my own work with small businesses and nonprofits, their major concerns included questions like:

  • What days and times should they be open?
  • How can they simplify record-keeping for their employees and volunteers?
  • How did they connect with their best donors and how can they find more?
  • What products and services were their clients most interested in?
  • How well did they help their audience meet a wide range of challenges in their lives?

These are not necessarily complicated questions in need of high-powered machine learning. They are fundamentally simple, but they are also the most important questions that these organizations had. The questions all had a direct impact on how they operated and, by extension, how well they could sustain their work. They are data-driven questions, but not everyone knows how to work with data to answer them. Learning how to address these goals, as opposed to, say, developing a new data product, can make your work crucial to the 99% of businesses around you.
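To show how simple the analysis behind such questions can be, here is a minimal sketch for the first one, "What days and times should they be open?" It counts transactions by day of the week. The data, the column names, and the `transactions_by_weekday` helper are all hypothetical. A pivot table in a spreadsheet would give the same answer.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical point-of-sale export: one row per transaction.
SALES_CSV = """timestamp,amount
2023-06-05 09:15,12.50
2023-06-05 12:40,8.00
2023-06-06 12:05,15.25
2023-06-10 10:30,22.00
2023-06-10 11:10,9.75
2023-06-10 12:20,14.00
"""

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def transactions_by_weekday(csv_text: str) -> Counter:
    """Count transactions for each day of the week."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        when = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M")
        counts[DAYS[when.weekday()]] += 1
    return counts


by_day = transactions_by_weekday(SALES_CSV)
busiest_day, busiest_count = by_day.most_common(1)[0]
print(f"Busiest day: {busiest_day} ({busiest_count} transactions)")
```

With real data you would want many weeks of records before drawing conclusions, but the logic would not need to get more complicated than this.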

3. Adapt your methods to the 99%

Image by Markus Winkler on Unsplash (https://unsplash.com/photos/IrRbSND5EUc)

In 1954, Kitty Kallen sang "little things mean a lot." In working with data for small businesses, this same principle is true. The datasets from small businesses nearly always fit neatly into a single spreadsheet file. In fact, I can only think of one client I worked with who used anything else. In that case, it was a simple matter of taking their two Salesforce databases, doing an inner join, and then saving the data in a spreadsheet file. In an earlier edition of this newsletter entitled "Minimalism in data work," I encouraged people to start with simple tools and only move on as needed. Specifically, I said:

  • First, spreadsheets
  • Second, apps
  • Third, languages

The idea is to use the "minimally-sufficient tool," or the simplest thing that gets the job done properly and with minimal difficulty. With data, that typically means spreadsheets, unless and until it is no longer easy to do what you need to do. Then move to apps like SPSS or jamovi, unless and until it is no longer easy to do what you need to do there. Only at that point would you need to move on to a data-focused programming language like R or Python.
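For the rare case that does reach the third step, such as the two-database inner join mentioned above, the code can still stay small. The following is a minimal sketch using only Python's standard library. The two exports, the `contact_id` key, and the `inner_join` helper are hypothetical stand-ins, not the actual client data or the method used on that project.

```python
import csv
import io

# Hypothetical exports from two systems that share a "contact_id" column.
CONTACTS_CSV = """contact_id,name
1,Ada
2,Ben
3,Cy
"""

DONATIONS_CSV = """contact_id,amount
1,50
3,20
3,35
4,10
"""


def inner_join(left_rows, right_rows, key):
    """Combine rows from both inputs, keeping only keys present in both."""
    index = {}
    for row in left_rows:
        index.setdefault(row[key], []).append(row)
    joined = []
    for right in right_rows:
        for left in index.get(right[key], []):
            joined.append({**left, **right})
    return joined


contacts = list(csv.DictReader(io.StringIO(CONTACTS_CSV)))
donations = list(csv.DictReader(io.StringIO(DONATIONS_CSV)))
rows = inner_join(contacts, donations, "contact_id")

# Write the result as CSV text, which any spreadsheet can open.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["contact_id", "name", "amount"])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Contact 2 has no donations and donation 4 has no matching contact, so an inner join drops both. Once the joined file is saved, everything else can go back to happening in a spreadsheet.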

There are a few important reasons for this recommendation when it comes to working with the 99%:

  • Spreadsheets are universally available. This is a tool that your clients will already have access to and will be familiar with. Also, when the data is properly prepared (for example, in the "tidy data" style advocated by Hadley Wickham) and saved in CSV files (the "comma-separated values" format that serves as a generic standard for spreadsheets), then nearly any data app or program in the world can read it with minimal effort.
  • Spreadsheets require you to keep your work simple. Spreadsheets are great for organizing data, for sorting variables, for computing descriptive statistics, and for creating bar charts and line charts. More complicated work may be possible, but it can be difficult, and so the basic approaches are reinforced. In my experience, this small set of operations will answer at least 90% of the questions that small businesses may have.
  • Spreadsheets facilitate communication. By using a common format on a universally-accessible tool, your clients will be able to see everything you did. In addition, by relying on spreadsheets to create your charts, your choices are limited to the options that are typically most effective. This enforced simplicity promotes clear, concise communication, and that is always good. (As an example, I used Google Sheets to make the two bar charts that appeared earlier in this newsletter. It was a quick process, and the charts are easy to understand.)
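As a rough illustration of the first two points, here is what tidy data in CSV form looks like, along with the kind of descriptive statistics a spreadsheet would compute. The products, numbers, and the `describe_by_group` helper are hypothetical. The sketch uses only Python's standard library, and the same summary could be produced with AVERAGE and MEDIAN formulas in a spreadsheet.

```python
import csv
import io
import statistics

# Hypothetical tidy data: one row per observation, one column per variable.
TIDY_CSV = """product,week,units_sold
bread,1,40
bread,2,44
bread,3,36
coffee,1,90
coffee,2,110
coffee,3,100
"""


def describe_by_group(csv_text, group_col, value_col):
    """Return basic descriptive statistics for each group in the data."""
    groups = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups.setdefault(row[group_col], []).append(float(row[value_col]))
    return {
        name: {
            "n": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            "min": min(values),
            "max": max(values),
        }
        for name, values in groups.items()
    }


summary = describe_by_group(TIDY_CSV, "product", "units_sold")
for product, stats in summary.items():
    print(product, stats)
```

Because each row is a single observation, the same file can be sorted, filtered, or charted in a spreadsheet without any restructuring.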

Businesses are best served when you can give them actionable insights. In small businesses, that can often be just a "yes" or "no" to a critical question. In most cases, you can provide a data-driven answer to those questions using simple tools and procedures.

(FREE COURSE: For more information on actionable insights, see my new LinkedIn Learning course "Actionable Insights and Business Data in Practice." This course is a hands-on approach to making your data work directly useful to your clients. Use this link and it will be free to you for 24 hours.)

4. Connect with the 99%

Image by Antenna on Unsplash (https://unsplash.com/photos/ohNCIiKVT1g)

Fortunately, finding the 99% is easy because they're everywhere. Every city and every town has small businesses that could benefit from thoughtful data work. An easy first step is to connect with your local Small Business Development Center or your local Chamber of Commerce to describe your skills and ways that you think you could help. They can give you great ideas to connect with people who would value your work.

It also helps to connect with groups that focus on particular topics. For example, I live in Utah, and I have connected with the Utah Nonprofits Association and the Utah Cultural Alliance. I have also presented on data topics at the Mountain West Arts Conference, where I formed some wonderful, lasting connections.

Connecting through networking events, such as Meetup groups in your area, can also be a productive way of finding the 99%. And, finally, there is the opportunity to connect with local nonprofits, most of which are small businesses, through service events. Here in Utah, I organized several editions of our "Data Charrette," where we connected data-savvy volunteers with local nonprofits in two-day, hackathon-like events. These events were great experiences where volunteers made important connections and honed their data skills. We'll launch a modified one-day version of the event again later this fall, with the goal of doing local and remote versions at least twice a year.

5. Enable the 99%

Image by Amy Hirschi on Unsplash (https://unsplash.com/photos/K0c8ko3e6AA)

Finally, one of the best things you can do for the small businesses you work with is help them develop their own data skills for ongoing work. You can do this by clearly organizing your work in an easy-to-follow format, and providing reusable templates, such as spreadsheets and presentations. You can share training materials with them that guide them in their own work. (For example, I have a range of courses at LinkedIn Learning, which is a subscription service, and free courses through my own company, datalab.cc, but many other options are available.)

It's also worth considering working as a full-time member of one of the small businesses near you. They may not need a full-time data scientist, but your ability to think with data, as well as your ability to learn new, vital skills, can make you a valuable addition to any company. (As a personal note, I have four degrees in research psychology – BS, MA, MPhil, and PhD – and I am employed as a full-time faculty member at a university. However, when it comes to the work that I am actually paid for, which is teaching statistics, I rely on the empirical and interpretative mindset that I developed in graduate school, while the technical skills that I use are ones I have learned on my own since then.)

Your data skills and insights are valuable, not just to the tech giants but to nearly every business and organization around you. Once you learn to better understand those organizations, what their goals are, and how you can adapt your data expertise to meet those goals, then you can fulfill the promise of your training and your passion in new and unexpected ways.


Thanks for joining me here. And remember, sharing is caring! Follow me on LinkedIn and share this newsletter with a friend who you think would benefit from it.

More articles by Barton Poulson, PhD

  • The symbol vs. the thing (February 2025 issue)
  • In praise of DIY data work (January 2025 issue)
  • 3 themes for data work worth doing (December 2024 issue)
  • Unoptimized and fine (November 2024 issue)
  • The data project pathway: A video newsletter (October 2024 issue)
  • Easy is hard is easy (September 2024 issue)
  • Pudding, focus, and data work (August 2024 issue)
  • On not reinventing the wheel (July 2024 issue)
  • The (conspicuously absent) consolation of art (June 2024 issue)
  • Numbers (still) don't speak for themselves (May 2024 issue)