Data Mining for Marketing – Simple K-Means Clustering Algorithm

Alexandra Cote

I help SaaS companies score investments through content | Writer | Growth Consultant | Building a consumer app

Published: July 31, 2018

Data mining is not just for technical people.

And you might have to cluster your data even if you’re just segmenting your clients for your next marketing campaign. Or maybe you’re just a student who’d like to find out the basics of Weka (data mining software).

Here’s a brief data mining tutorial for non-techies to help you get started with clustering:

Where can you get Weka?

The safest option is its official website. Download Weka there (it doesn’t work without Java).

And it’s free.

Where do you find the right database?

Weka doesn’t work with just any database. And the algorithms you’re going to choose won’t fit all datasets.

So, if you want to use a specific algorithm, it’s best to just create your own dataset over which you have full control. Aim for more than 1,000 rows so the results are more reliable.

But here are three sources where you could find some decent datasets:

data.imf.org

catalog.data.gov

tomslee.net/airbnb-data-collection-get-the-data

(Drop me a line if you know more.)

And if you’re looking for a case study (in plain English) with few technical elements so you can get an idea of how clustering really works:

Case study – Bank clients segmentation through clustering

Disclaimer: part of the case study is missing because I did it for a college project and the results can’t be disclosed.

Study objectives

Highlight the use of Weka for basic data mining processes
Discover the most representative segment of a bank’s (fictional) clients
Find out how a bank’s (fictional) services can be improved starting with the data regarding clients’ age, job, marital status, education, account balance, housing, and loans through an online marketing campaign that could bring new clients

Introduction

Data mining is the process through which valid and previously unknown information is extracted from a specific set of data and is then used to make an important business decision.

Briefly put, data mining is a method that allows YOU to find similar behavioral patterns, trends, or tendencies in an existing dataset.

The main goal of the entire process is DISCOVERY.

With this in mind, I chose to identify the most significant clients of a (fictional) bank through clustering.

For this study, I picked a type of application often used in marketing and retail: identifying significant client profiles and behavior patterns.

As a field of applicability, I’ve chosen banking. In this case, the main goal was to identify relevant clients (who are also loyal) and use their profile to create new digital marketing campaigns.

Typically, data mining can be used to identify loyal clients or errors in the use of banking services, to discover new behavior, to predict how a service will be used, or to estimate possible client administration costs.

The main target (and result) was to attract new clients based on analyzed profiles and behavior patterns. Thus, the desired profile of the bank’s possible clients will be created from the data on existing loyal clients.

As a result, we’ll be able to create a digital marketing campaign that will target exactly this market segment. And you might be looking to create alternative campaigns for the other significant client segments as well.

The link between objectives and strategic marketing

Highlight the use of Weka for basic data mining processes

This objective brings an innovative method to a dataset owned by a marketing department and helps the team create new marketing campaigns faster and more efficiently than traditional methods usually allow. Using such data mining tools or methods for marketing operations can offer a competitive advantage.

Discover the most representative segment of a bank’s (fictional) clients

Using data mining software (like Weka) we can extract the profile of a significant or loyal client/customer. From this profile, we’ll build the online marketing campaigns.

Starting with the information offered by clients, personalized campaigns can be created. Clients are likely to respond positively to these, and people will be more interested in them than they would be in a general, non-personalized campaign.

The success rate of a campaign should thus be higher than if we had used a traditional method of segmentation.

Consequently, the marketing strategy chosen for this case study also relies on an innovative method to reduce costs (since Weka is a free tool) and the time spent on segmentation, and to increase the success rate of marketing campaigns built with this method.

Find out how a bank’s (fictional) services can be improved starting with the data regarding clients’ age, job, marital status, education, account balance, housing, and loans through an online marketing campaign that could bring new clients

In the case of companies or marketing departments that are using data mining or the client/market segmentation strategy for the first time, a reorientation of the general marketing strategy is needed.

Once that’s in place, data mining is an easy way of determining which of a client’s attributes can be used to create and start a new digital marketing campaign.

You’ll also find out through which of these attributes you’ll get more success and a better response from your audience.

For example, through the dataset chosen in this case, you can test whether a campaign based on the clients’ job is more efficient than one that targets their age (or the other way around).

There are multiple opportunities and they can be diversified and tested until the right campaign model is found.

Work methodology

Dataset

Undisclosable .csv database.

Criteria for selecting a set of data

Any Weka project must start with a correctly built and error-free dataset.

Missing information would cause serious mistakes in the final results and thus jeopardize the marketing campaign we want to create.

For a closer data analysis, you can sort and check all the information in Excel (or any other editor) before you add it to Weka.

For instance, you can sort the data by age to verify how diverse your list is in terms of age. This helps keep the Weka analysis objective, so the final campaigns are less likely to be skewed toward one group.
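If you’d rather not do these checks by hand, here’s a minimal Python sketch of the same idea: it flags empty cells and reports the age range. The sample rows and the column names are made up for illustration (the real dataset can’t be shared), and `find_missing` and `age_range` are hypothetical helper names.

```python
import csv
import io

# Hypothetical sample standing in for the undisclosed bank dataset.
SAMPLE_CSV = """age,job,marital,education,balance,housing,loan
34,technician,married,secondary,1200,yes,no
58,retired,divorced,primary,,no,no
23,student,single,unknown,150,no,no
41,,married,tertiary,-300,yes,yes
"""

def find_missing(rows):
    """Return (row_number, column) pairs for every empty cell."""
    missing = []
    for number, row in enumerate(rows, start=1):
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing.append((number, column))
    return missing

def age_range(rows):
    """Return the (youngest, oldest) age among rows that have an age."""
    ages = [int(row["age"]) for row in rows if row["age"].strip()]
    return min(ages), max(ages)

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
missing_cells = find_missing(rows)
youngest, oldest = age_range(rows)

print("Missing cells:", missing_cells)
print("Age range:", youngest, "to", oldest)
```

With a real file you would swap `io.StringIO(SAMPLE_CSV)` for an opened .csv file. Note that the sketch treats "unknown" as a valid education level, not as a missing value.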

After choosing a database, analyze it to see if it matches your project’s requirements and your objectives.

Accordingly, the right database for this study had to contain a large number of people and relevant data on them that could be used for a marketing campaign. The necessary data included demographic characteristics, personal interests, and the relationship between the client and the seller (in this case the fictional bank).

The profile of the chosen dataset

The database I used contains attributes such as age, job, marital status, education level, account balance, and other info regarding their housing and bank loans.

This way, I ensured that the people in the database have diverse profiles/characteristics. Their ages range from 18 to 95; they go from students to retired people; they are single, married, or divorced; their education is primary, secondary, tertiary, or unknown; and their account balances vary, including clients with debts or with no money in their accounts.

Process and algorithm

The process

Data mining is the process of extracting, transforming, and analyzing the data in a dataset, regardless of its size.

For this case study, the data mining process was used to gather info regarding a fictional bank’s clients. This type of analysis will then be used to plan a digital marketing campaign and facilitate other general business decisions.

Data mining can help identify errors, patterns, and data correlations to predict approximate but useful results. This information can then be used to generate profit and other benefits, to reduce costs and risks, or to improve the seller-client relationship.

Using exact client data we can customize campaigns that will allow us to increase our profit, satisfy our clients, and avoid losing large sums of money on useless marketing campaigns that don’t target a specific buyer persona.

The data mining algorithm

I used Simple K-Means Clustering as an unsupervised learning algorithm that allows us to discover new data correlations. (Note: It does so much more than just that. But I’ll stick to the basics for now.)

After choosing the algorithm, I selected the number of clusters I wanted (3; use whatever you need if you have a specific target in mind), the maximum number of iterations (500), and the distance metric (EuclideanDistance).
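To make those three settings concrete, here’s a minimal Python sketch of a k-means loop with 3 clusters, at most 500 iterations, and Euclidean distance. It is not Weka’s SimpleKMeans, which also handles attributes like job or marital status and differs in its details. The client numbers, the `seed` parameter, and the function names are assumptions made up for illustration. I also rescale each column to the 0..1 range so that account balance doesn’t swamp age when distances are computed.

```python
import math
import random

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def min_max_scale(points):
    """Rescale every column to the 0..1 range so no attribute dominates."""
    columns = list(zip(*points))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(p, lows, highs))
        for p in points
    ]

def k_means(points, k=3, max_iterations=500, seed=42):
    """Return (centroids, assignments) for a list of numeric tuples."""
    rng = random.Random(seed)           # hypothetical seed, for repeatability
    centroids = rng.sample(points, k)   # start from k randomly picked points
    assignments = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break                       # nothing moved, so we are done
        assignments = new_assignments
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, assignments

# Made-up clients as (age, account balance) pairs.
clients = [(22, 300), (25, 500), (27, 450),
           (40, 2500), (43, 2700), (45, 2600),
           (63, 8000), (66, 8500), (70, 9000)]

scaled = min_max_scale(clients)
centroids, assignments = k_means(scaled, k=3, max_iterations=500)

for client, cluster in zip(clients, assignments):
    print(client, "-> cluster", cluster)
```

Keep in mind that the starting centroids are random, so a different seed can give different clusters. That is one reason to run the algorithm more than once before trusting a segment.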

Note: Again, clustering is so much more than just these metrics. And this is a good thing. If you’re looking into learning data mining on an advanced level you’ll see how these functions, classifiers (etc.) can help you get more accurate results.

The clustering results were then shown in a table whose rows are the attributes and whose columns correspond to the final cluster centroids.
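For a feel of what such a table holds, here’s a small Python sketch that summarizes clients who already have a cluster label. It uses the average for numeric attributes and the most common value for text attributes, which, as far as I know, is also how Weka describes its centroids. The clients, their cluster labels, and the `centroid_table` helper are all made up for illustration. They are not the results of the original study.

```python
from collections import Counter
from statistics import mean

# Hypothetical clients with a cluster label already assigned (0, 1, or 2).
clients = [
    {"age": 24, "job": "student", "balance": 300, "cluster": 0},
    {"age": 27, "job": "student", "balance": 500, "cluster": 0},
    {"age": 41, "job": "technician", "balance": 2500, "cluster": 1},
    {"age": 45, "job": "technician", "balance": 2700, "cluster": 1},
    {"age": 39, "job": "management", "balance": 2300, "cluster": 1},
    {"age": 66, "job": "retired", "balance": 8000, "cluster": 2},
    {"age": 70, "job": "retired", "balance": 9000, "cluster": 2},
]

NUMERIC = ("age", "balance")
NOMINAL = ("job",)

def centroid_table(rows):
    """Summarize each cluster: mean for numeric, most common for nominal."""
    table = {}
    for cluster in sorted({r["cluster"] for r in rows}):
        members = [r for r in rows if r["cluster"] == cluster]
        summary = {"size": len(members)}
        for name in NUMERIC:
            summary[name] = round(mean(r[name] for r in members), 1)
        for name in NOMINAL:
            counts = Counter(r[name] for r in members)
            summary[name] = counts.most_common(1)[0][0]
        table[cluster] = summary
    return table

table = centroid_table(clients)
attributes = ("size",) + NUMERIC + NOMINAL

# One row per attribute, one column per cluster.
print("Attribute".ljust(10) + "".join(f"cluster {c}".rjust(14) for c in table))
for name in attributes:
    print(name.ljust(10) + "".join(str(table[c][name]).rjust(14) for c in table))
```

Reading the table column by column gives you one client profile per cluster, which is the starting point for a segment-specific campaign.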

Read the full post on my blog.
