Major Industry Classification Systems around the World – Part 2
This is part of my series on industry/vertical classification. In my previous article, Major Industry Classification Systems around the World – Part 1, I discussed how major classification systems around the world group industries. In this article, let’s get to the hard part – how to classify an entity.
Who Classifies Companies
Perhaps the biggest myth out there is that there is a "standard official" assignment of industry codes to organizations (a belief held even by some market researchers who focus on vertical/industry research). This is simply not true!
See the excerpt below from the US Census Bureau:
"There is no central government agency with the role of assigning, monitoring, or approving NAICS codes for establishments. Different agencies maintain their own lists of business establishments to meet their own programmatic needs. These different agencies use their own methods for assigning NAICS codes to the establishments on their lists.
Statistical agencies assign one NAICS code to each establishment based on its primary activity. For example, the Social Security Administration assigns a NAICS code to new businesses based on information provided on their application for an Employer Identification Number. The Census Bureau generally assigns NAICS codes to businesses on its list of establishments based on information provided by the business on a survey or census report form. The Bureau of Labor Statistics initially assigns NAICS codes based on business activity information provided on an application for unemployment insurance."
(From census.gov/NAICS website & cited in 2017 NAICS Manual, page 77)
I have also contacted various US government statistical agencies personally, and their official responses all pointed to the same paragraph in the NAICS manual.
Guiding Principles of Industry Classification
Major classification systems provide rules, or principles, to help guide the classification process.
The common themes in sensibly classifying an organization's industry assignment are:
Mutual exclusivity means that a single entity cannot be classified under more than one industry code.
Specificity
Classification should be equally specific across the board. If one classifies firms at the 4-digit level, then all firms should be classified at the 4-digit level, not some at 2 digits, some at 3 digits, and some at 4 digits.
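This consistency rule can be checked mechanically. Below is a minimal sketch in Python, assuming the codes are stored as digit strings. The function names and the sample codes are hypothetical and are not taken from any classification manual.

```python
from collections import Counter

def level_counts(codes):
    """Count how many codes appear at each level (number of digits)."""
    return Counter(len(code) for code in codes)

def is_uniform_level(codes):
    """Return True when every code has the same number of digits."""
    return len(level_counts(codes)) <= 1

# Hypothetical samples: the second list mixes 2-, 3-, and 4-digit codes.
consistent = ["2621", "3711", "5812"]
mixed = ["26", "371", "5812"]

print(is_uniform_level(consistent))
print(is_uniform_level(mixed))
```

A dataset that fails this check would need its shorter codes refined (or its longer codes rolled up) before firms can be compared like for like.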
The older system, SIC, technically does not assign codes ending in "0", such as 1000, 2000, 2030, 5900, 7990, etc. These codes sometimes appear in SIC lists, but they are there mainly to help organize the industry hierarchy or for computer programming purposes, not to be assigned to individual firms. This is why external data providers like DnB do not have records with SICs ending with zero in the 4th digit.
Other systems do allow the last digit to be zero, but usually only because there are no sub-levels below them. See the excerpt below:
"Whenever a given level of the classification is not divided into categories of the next more detailed level of classification, “0” is used in the code position for the next more detailed level. For example, the code for the group “Other personal service activities” is 960 since the division “Other personal service activities” (code 96) is not divided into groups. Again, the code for the class “Manufacture of furniture” is 3100 because the division “Manufacture of furniture” (code 31) is divided neither into groups nor into classes. The class “Manufacture of pulp, paper and paperboard” is coded as 1701 since the division “Manufacture of paper and paper products” (code 17) is not divided into groups but the group “Manufacture of paper and paper products” (code 170) is divided into classes." (ISIC Revision 4, pg. 12)
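The padding convention in this excerpt can be illustrated with a few lines of Python. This is only a sketch of the rule as quoted: the function name pad_to_level is hypothetical, and it assumes codes are written as digit strings with 2 digits for divisions, 3 for groups, and 4 for classes. It applies only when a level is not subdivided further; a group such as 170 that is divided into classes gets real class codes (1701, and so on) rather than padding.

```python
# Number of digits at each ISIC level (assumed representation for this sketch).
LEVEL_DIGITS = {"division": 2, "group": 3, "class": 4}

def pad_to_level(code, level):
    """Pad an undivided code with trailing zeros up to the requested level."""
    width = LEVEL_DIGITS[level]
    if len(code) > width:
        raise ValueError(f"{code!r} is already more detailed than {level}")
    return code.ljust(width, "0")

# Division 96 is not divided into groups, so its group code is 960.
print(pad_to_level("96", "group"))
# Division 31 is divided neither into groups nor classes, so its class code is 3100.
print(pad_to_level("31", "class"))
```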
Statistical Unit
Statistical unit refers to one member of a set of entities being studied. In this case, it is the immediate organization being classified. It can be the physical establishment of the studied firm's or agency's headquarters location, or it can be all establishments owned or controlled by the entity. Either way, it needs to be predefined and consistent across all entities being classified: one should not classify one entity using one method and another entity using a different method. See below for an example of how the NACE manual defines statistical unit.
(Source: NACE Rev. 2 – Statistical Classification of Economic Activities in the European Community)
How to Determine Primary Economic Activity
Due to mutual exclusivity and specificity, the largest challenge to using production-oriented industry classification systems is how to determine a firm's primary economic activity. Market-oriented systems, such as GICS, sometimes have additional industry codes for "diversified activities", but SIC, NAICS, ISIC, and NACE systems require making independent judgements on what the primary economic activity of a firm is.
Principal vs Secondary vs Ancillary Activities
Most large firms have combined activities; therefore, not all their activities can fit into just a single industry code.
There are two types of combined activities:
Most classification systems offer some guidelines to help define these activities and determine the "principal activity", which is the primary activity on which the primary industry assignment should be based. For example, the NACE Rev. 2 manual clearly states that:
"The principal activity of a statistical unit is the activity which contributes most to the total value added of that unit. The principal activity is identified according to the top-down method (see section 3.1) and does not necessarily account for 50% or more of the unit’s total value added.
A secondary activity is any other activity of the unit, whose outputs are goods or services which are suitable for delivery to third parties. The value added of a secondary activity must be less than that of the principal activity.
An activity is ancillary if it fulfills all the following conditions:
(Source: NACE Rev. 2 – Statistical Classification of Economic Activities in the European Community)
NACE and ISIC are mostly identical in defining these terms. The American system, NAICS, purposely stays vaguer to leave "classifiers" more room for independent judgement. But the underlying principle is the same:
"In most cases, if an establishment is engaged in more than one activity, the industry code is assigned based on the establishment's principal product or group of products produced or distributed, or services rendered. Ideally, the principal good or service should be determined by its relative share of current production costs and capital investment at the establishment. In practice, however, it is often necessary to use other variables such as revenue, shipments, or employment as proxies for measuring significance." (2017 NAICS Manual, page 23)
However, none of the classification systems advocates "simply adding up" revenues, assets, or headcount from different activities across all business units. For example, if a firm makes and distributes car parts as well as electronic components, one cannot simply add up the manufacturing and wholesale/logistics activities of both divisions and determine the principal activity from those totals, because the firm may then be wrongly classified as a wholesaler or logistics company.
"One approach to classifying these activities would be to use the primary activity rule, that is, whichever Activity is largest. However, the fundamental principle of NAICS is that establishments that employ the same production process should be classified in the same industry." (2017 NAICS Manual, page 23)
The NAICS manual further offers an example:
"For example, paper may be produced either by establishments that first produce pulp and then consume that pulp to produce paper or by those establishments producing paper from purchased pulp. NAICS explicitly specifies that both of these types of paper-producing processes should be classified in NAICS 32212, Paper Mills, the final step in paper manufacturing, rather than in NAICS 32211, Pulp Mills." (2017 NAICS Manual, page 23)
In Appendix A: How to Determine Primary SIC of Kimberly-Clark, we have provided an actual example of how to apply this principle properly instead of relying blindly on the firm's self-reporting (to the SEC), which clearly violates the principal/primary activity principle.
To address this, NACE and ISIC provide more prescriptive guidance, which is discussed below.
The Top-Down Approach
Most systems use the top-down approach to determine the primary economic activity of an entity with combined activities. While the NAICS manual gives the "classifier" more room to make sensible judgement calls than ISIC and NACE do, the top-down principle is the same across all.
The top-down method looks at all the value-added activities. Value added is defined as the difference between output and intermediate consumption; it is an additive measure of each economic unit's contribution to GDP (value added is one method of computing GDP), and it helps to differentiate retail from wholesale or manufacturing, for example.
The method uses each activity's share in the firm's total activity to determine the very top level of industry groups (i.e., is it more in manufacturing, services, or resources), and then works its way down to the next level (is it more in agriculture or mining) until the most granular level is determined.
Here is an example from the NACE manual:
"The correct class is 28.93 – manufacturing of machinery for food, beverage, and tobacco processing, although the class with the biggest share of value added is class: 46.61 Wholesale of agricultural machinery, equipment, and supplies."
Source: NACE Rev. 2 – Statistical Classification of Economic Activities in the European Community
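To make the top-down walk concrete, here is a minimal sketch in Python. It assumes NACE-style codes (a section letter plus a class code such as "28.93") and uses value-added shares that are illustrative figures assumed for this sketch, chosen only to be consistent with the outcome quoted above. The function name top_down_class and the data layout are hypothetical.

```python
from collections import defaultdict

# (section, class code, share of value added in %) -- illustrative figures only.
activities = [
    ("C", "25.91", 10), ("C", "28.11", 6), ("C", "28.93", 23),
    ("C", "28.95", 8), ("C", "28.24", 5),
    ("G", "46.14", 7), ("G", "46.61", 28),
    ("M", "71.12", 13),
]

def top_down_class(activities):
    """Pick the class by choosing the largest share at each level in turn."""
    levels = [
        lambda a: a[0],       # section, e.g. "C"
        lambda a: a[1][:2],   # division, e.g. "28"
        lambda a: a[1][:4],   # group, e.g. "28.9"
        lambda a: a[1],       # class, e.g. "28.93"
    ]
    candidates = list(activities)
    for key in levels:
        totals = defaultdict(float)
        for activity in candidates:
            totals[key(activity)] += activity[2]
        winner = max(totals, key=totals.get)
        # Keep only the activities under the winning branch, then go one level down.
        candidates = [a for a in candidates if key(a) == winner]
    return candidates[0][1]

largest_single = max(activities, key=lambda a: a[2])[1]
print(top_down_class(activities))  # 28.93
print(largest_single)              # 46.61
```

With these figures, manufacturing (section C) holds 52% of value added in total, so the walk stays inside manufacturing even though the single largest class, 46.61, sits in wholesale.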
Classifying Industry in Practice
Of course, in practice, it is unrealistic to go through this thorough exercise for every entity. It is impossible, even for organizations as vast as the US Census Bureau or Eurostat, to access company data at that level of granularity for all companies. Therefore, practical industry assignment for a large company dataset, especially at smaller research firms, relies largely on multiple sources:
On Using Third-Party Data Source
External data providers usually have large data teams and well-documented processes.
For example, DnB's industry team has hundreds if not thousands of professionals who are well trained to comb through the data and determine primary activities by applying the primary activity principle consistently. Their data dictionary is large and comprehensive, with over 13 thousand records describing the fields in their API. However, given the vast number of firms (and their subsidiaries), no external data provider has a "perfect dataset." It is still on a ‘best effort’ basis.
Additionally, different providers have custom methodologies (such as how to define statistical units), which may not be appropriate for your purpose. Integrating data from different providers is also difficult. To make the best use of third-party industry data, you need to fully understand the providers' methodologies.
On Using Self-Reporting
Self-reporting can be a valuable data source because the filing firms are familiar with their own financials. However, it also has major challenges.
First of all, while professional "industry classifiers" apply the principal activity methodology more consistently, individual firms' finance departments may not know how to do it properly. Secondly, there are no real incentives to report accurately or to update in a timely manner. Thirdly, the agencies to which firms report have different objectives and priorities.
For example, there are no real legal consequences for filing a wrong industry classification with the SEC: the agency does not publish industry reports, and its scrutiny focuses largely on misrepresentation of financial data such as revenue and expenses.
For example, if we look at the five companies listed below (all publicly traded), the following firms' self-reported SICs to the SEC are correct:
But the following firms' self-reported SICs are not:
The US Census Bureau and the Labor Department use a more voluntary, survey-based self-reporting approach, but because their economists spend more time developing long-term relationships with large firms and thoroughly explaining the nuances of the methodology to them, their industry data are likely to be more accurate. Unfortunately, these agencies do not disclose such data at the firm level (there is a site affiliated with the US Census Bureau, www.naics.com, but it is not considered an official government site).
Other statistical agencies, for example in Europe and China, do have the power to require firms to "self-report" industry-related data; however, they also run into data reliability issues: when firms are "required" to self-report but have no real incentive to report accurately and honestly, data quality becomes highly questionable.
For example, IT firms in China are often required to self-report to different government agencies' statistical offices (at different local and provincial levels) about their economic activities in software/services and sometimes their revenues in hardware; many just end up reporting their entire revenue on every survey. When the numbers are tallied and published blindly without due diligence, there can be massive double counting and triple counting (not all economic data published in China go through the national statistics bureau, which has a more standardized methodology). Therefore, when looking at economic data, source matters.
In summary, good industry classification always requires some level of vetting and the exercise of sound judgment. There is no silver bullet. At best, it is a "best effort" guided by sound principles.