Ask an Analyst | What is a Model?
John P. Gough
Assistant Vice President, Editor of the Journal of Advancement Analytics
Do you work in fundraising? Have you ever wondered what all those people in development/advancement services are doing at your organization? Have you ever sat in a meeting with your data people and thought, "What on earth are they talking about?" If so, this mini-series is for you! Over the course of the next 12 months I’ll be answering questions frequently asked by non-technical fundraising professionals. If you have a question you would like answered, feel free to leave it in the comments section below.
What is a Model?
'Model' is a term that gets thrown around quite a lot in our world of big data: model this, model that; models seem to be the magical solution for everything. But what is a model, really? Surely your data analyst doesn’t have the figure for professional modeling (at least the analyst writing this doesn’t), so what could they possibly be doing?
Put simply, a model is a depiction or description of behaviors, relationships, and/or processes found in the real world. These models are built using either mathematical or graphical representations of real-world relationships.
When data people start talking about models there are two broad categories of model they are likely referring to:
- Data Model
- Predictive Model
Data Models
A data model is a representation of the relationships that exist within or among the data themselves, and it is created to help design information storage systems, like databases, that store facts about real-world processes. Within this category there are three different levels of model: conceptual, logical, and physical.
Conceptual models are high-level representations of relationships that exist between entities. An entity is simply an object, person, process, or concept that interacts with other objects, people, processes, or concepts. For example, a conceptual model describing the relationship between a major gift officer and a donor would look something like this:
This model is constructed using something called Chen notation. The rectangles represent entities, the diamonds represent relationships, and the 'M:N' describes the cardinality of the relationship. In this case we have what is called a many-to-many relationship, meaning many donors can be solicited by the same gift officer, and a donor can be solicited by many different gift officers. These models can become very complex as they are built to describe real-world processes:
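For the programmatically inclined, the many-to-many idea can be sketched in a few lines of Python; the officer and donor names here are made up purely for illustration:

```python
# Each row records one "solicits" relationship between an officer and a donor.
solicits = [
    ("Officer A", "Donor 1"),
    ("Officer A", "Donor 2"),
    ("Officer B", "Donor 2"),  # Donor 2 is solicited by two different officers
]

# Build both directions of the many-to-many relationship.
donors_by_officer = {}
officers_by_donor = {}
for officer, donor in solicits:
    donors_by_officer.setdefault(officer, set()).add(donor)
    officers_by_donor.setdefault(donor, set()).add(officer)

print(sorted(donors_by_officer["Officer A"]))  # ['Donor 1', 'Donor 2']
print(sorted(officers_by_donor["Donor 2"]))    # ['Officer A', 'Officer B']
```

One officer maps to many donors and one donor maps to many officers, which is exactly what the M:N on the diagram asserts.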
The next level of data model is called a logical model. Derived from the conceptual model, it shows the actual data structures, their attributes, and their relationships to each other. A simple logical model looks something like this:
Finally, the physical model maps out the location of each data point in a specific database system, with the system's table and column names and something called a domain, which, among other things, specifies an attribute’s data type (e.g. birthdays are dates, ages are integers, and names are text).
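To make this concrete, here is a rough sketch, using Python's built-in SQLite, of what a physical model for the donor/gift-officer example might look like. The table and column names are hypothetical, each column is assigned a domain (its data type), and notice how the many-to-many 'solicits' relationship becomes its own junction table:

```python
import sqlite3

# A hypothetical physical model: every attribute gets a domain (data type),
# and the M:N "solicits" relationship is stored in a junction table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE donor (
        donor_id   INTEGER PRIMARY KEY,
        full_name  TEXT NOT NULL,   -- domain: text
        birth_date TEXT             -- domain: date (SQLite stores dates as ISO text)
    );
    CREATE TABLE gift_officer (
        officer_id INTEGER PRIMARY KEY,
        full_name  TEXT NOT NULL    -- domain: text
    );
    CREATE TABLE solicitation (     -- junction table for the M:N relationship
        officer_id INTEGER REFERENCES gift_officer(officer_id),
        donor_id   INTEGER REFERENCES donor(donor_id),
        PRIMARY KEY (officer_id, donor_id)
    );
""")
```

A real CRM schema is far larger, of course, but every table in it is answering the same question: where, physically, does each fact about the real world live, and what type of value is it?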
Having good data models that accurately reflect the reality of your processes and business rules is key to building efficient systems that make data both easy to enter and extract.
Predictive Models
Statistical Models
There are many different types of predictive model out there. Perhaps the most well known are statistical models, the simplest of which belong to the general linear model family. Most of us will have gone through at least one statistics course in our lives and will hopefully have encountered linear regression. Put simply, linear regression is a method of fitting a line to a set of data that best describes the relationships between the variables in that data, so that we can predict an outcome. Once the relationships have been described, we can use those descriptions to make predictions about the future. To illustrate this, I’m going to leave the world of fundraising and enter the world of diamonds, with no justification other than to say that diamonds are shiny and fun to talk about.
Suppose I was looking to get engaged and wanted to buy a solitaire diamond ring. I don’t want to overspend, but I also want to get the best bang for my buck. Further, suppose I also had at my disposal a large data set describing attributes of many individual diamonds and their sale price in dollars.
I could use this information to build a model that would predict the price I should be prepared to pay given any combination of characteristics. There are many assumptions associated with linear regression which I won’t go into here, but let’s assume that for now I’m only interested in the relationship between the size of a diamond in carats and its price, and want to build a model that will predict a diamond’s price based only on its carat weight. Here’s where a data scientist (which is what we like to call ourselves) would turn to a software tool like SPSS or R to fit (fit is a fancy term for build) the model so we don’t have to do all of the math by hand. In R, I would write a piece of code that would take my two variables and produce a model:
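For readers who want to see the machinery rather than trust the software, the same least-squares arithmetic can be written out by hand. Here is a minimal sketch in Python on a handful of made-up diamonds (the real fit above uses R on a much larger data set, so the numbers this toy example produces are illustrative only):

```python
# Simple linear regression (ordinary least squares) computed by hand
# on a few made-up diamonds. Illustrative only.
carat = [0.3, 0.5, 0.7, 1.0, 1.2, 1.5]
price = [500, 1500, 2500, 5500, 7000, 9500]

n = len(carat)
mean_x = sum(carat) / n
mean_y = sum(price) / n

# Slope b = covariance(x, y) / variance(x); intercept a = mean_y - b * mean_x
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(carat, price)) / \
    sum((x - mean_x) ** 2 for x in carat)
a = mean_y - b * mean_x

# R-squared: the share of price variability the fitted line explains
predicted = [a + b * x for x in carat]
ss_res = sum((y - p) ** 2 for y, p in zip(price, predicted))
ss_tot = sum((y - mean_y) ** 2 for y in price)
r_squared = 1 - ss_res / ss_tot
```

R's `lm()` function is doing essentially this (plus the standard errors and significance tests) when it fits the model.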
Out of all this confusing output, I’m really only looking for a few key things. First, I want to see that the 'Pr(>|t|)' values are all below .05; R will flag this by placing a '*' next to the value. This simply means that the estimated relationships are very unlikely to be the result of random chance. Next I want to see the 'Adjusted R-squared'; here it is .84, which means that the model explains 84% of the variability in the price. This is really good: by carat weight alone the model can account for the majority of the variability in price between the different diamonds. (I guess size really does matter.) Finally, I want to look at the 'Estimate' column, which gives us what we refer to as the model coefficients. For carat the coefficient is 7756.43, which means that for every increase of one carat in weight, we should expect the price of the diamond to increase by $7,756.43.
All of this can be reduced to a simple equation:
y = a + bx
or
Diamond Price = -2256.36 + (7756.43 * Carat)
So, if I wanted to predict the price of a one carat diamond I would enter the number into the equation as follows:
5500.07 = -2256.36 + (7756.43 * 1)
According to the model, for a one carat diamond I should expect to pay about $5,500. A quick Google search at the time of writing found that the average price of a loose one carat diamond on Google Shopping was $5,523. Not too bad!
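For the curious, the fitted equation is trivial to wrap in a small prediction function using the coefficients from the model above:

```python
def predict_price(carat):
    """Predicted diamond price in dollars, from the fitted model above."""
    return -2256.36 + 7756.43 * carat

print(round(predict_price(1.0), 2))  # 5500.07, the one-carat prediction above
print(round(predict_price(2.0), 2))  # about 13256.50 for a two-carat stone
```

Note that predictions only make sense within the range of carat weights the model was trained on; the negative intercept, for instance, would "predict" a negative price for a vanishingly small diamond.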
Regression isn't just good for pricing diamonds; I could use it to predict the number of gifts a gift officer will close in a given year based on several key performance indicators (KPIs), like the number of prospects they have in a given stage, how long those prospects have been in that stage, the number of in-person visits they've made this fiscal year, the number of written solicitations they have delivered, and so on.
Machine Learning
Another branch of predictive modeling is called machine learning. This type of modeling is often referred to as black-box modeling because it is left to the computer to find relationships and build models that predict behavior, but these models often do not explain behavior. One such technique is the neural network. Virtual nodes, much like the synapses in our brain, are created, trained, and then used to predict an outcome. By training we mean that we provide the computer an example where the outcome is known; for instance, we show it an image of an apple and tell it that it is looking at an apple. We then show it another image and let it determine for itself whether it is looking at an apple. If it answers correctly, the nodes that predicted the apple are reinforced; if it answers incorrectly, the nodes are readjusted and the computer moves on to the next image. At the end of the exercise we are left with a series of nodes with numerical weights that don’t really explain what makes a picture an image of an apple (they won’t tell us that apples are round and red, for example), but they will allow the computer to correctly identify one.
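The train-and-adjust loop described above can be sketched with the simplest possible "network": a single node, known as a perceptron. The apple-classification task and all of the numbers below are made up purely for illustration:

```python
# A single artificial "node" (perceptron) trained on labeled examples.
# Each example is (features, label): the features might be crude image
# measurements (say, redness and roundness); label 1 = apple, 0 = not apple.
examples = [
    ((0.9, 0.8), 1),  # red and round -> apple
    ((0.8, 0.9), 1),
    ((0.2, 0.3), 0),  # neither red nor round -> not apple
    ((0.1, 0.4), 0),
]

weights = [0.0, 0.0]
bias = 0.0
rate = 0.1  # how strongly each mistake adjusts the weights

for _ in range(20):  # several passes over the training examples
    for features, label in examples:
        activation = sum(w * f for w, f in zip(weights, features)) + bias
        guess = 1 if activation > 0 else 0
        error = label - guess  # 0 if correct, +1 or -1 if wrong
        # Wrong answers nudge the weights; right answers leave them alone.
        weights = [w + rate * error * f for w, f in zip(weights, features)]
        bias += rate * error
```

After training, the node classifies these examples correctly, yet the final weights are just a few numbers: they tell us nothing human-readable about what an apple is, which is exactly the black-box quality described above.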
All of these techniques can be applied to fundraising to predict donor behavior and provide insight into the relationships that exist between donors, institutions, gift officers, and philanthropic behaviors.
In Conclusion
Data is all around us, and the way it is structured and then analyzed is important for providing insight into the work that we do as fundraisers. Structuring our day-to-day business data in a way that supports our processes, and then ensuring the quality of that data, allows us to ask robust and sometimes complex questions with confidence. The tools and methods we use to answer those questions are rapidly evolving, and the frontier for analytics in every industry, not just fundraising, is expanding and full of potential. So the next time you are entering your contact reports, remember: you are contributing to the models that will predict the donors of tomorrow.
About the Author
John Gough is the Director of Reporting and Analytics in the Office of Development at the University of Texas at Austin. He is also on faculty at the University of Illinois' School of Information Science where he teaches graduate courses as an adjunct lecturer in database design and business analytics.