This briefing paper recommends introducing a fifth classification level in the Australian Standard Classification of Languages (ASCL) to account for regional and dialectal language variants. The proposed expansion aims to enhance the ASCL’s capacity to capture linguistic diversity within languages, in line with user needs for more granular data on language use. It would improve the ASCL’s ability to reflect cultural identity and regional linguistic nuances, offering more precise data for policymakers, educators, and researchers. The appendix provides an expanded classification structure for several languages, including Ukrainian, Hungarian, Greek, Italian, German, French, Arabic, Spanish, English, and Chinese varieties (e.g., Teochew, Hokkien, Mandarin, Yue/Cantonese), and details regional variants that may be mutually exclusive.
The ASCL currently uses a four-level classification system that adequately categorises most languages but lacks the granularity to capture regional and dialectal variation within widely spoken languages. This limitation affects not only major languages with extensive global dispersion but also Indigenous and local languages whose variants are significant to cultural identity and usage patterns.
As language use in Australia continues to diversify, the demand for detailed, region-specific language data has grown. Service providers, educational institutions, and cultural organisations require this data to design targeted programs, allocate resources, and foster inclusivity for various linguistic communities. Expanding the ASCL to include a fifth level focused on regional variants would offer the Australian Bureau of Statistics (ABS) a more nuanced approach, aligning with international standards while addressing Australia’s unique multilingual landscape.
The objectives of adding a fifth level to the ASCL are to:
- Enhance Granularity: Provide a structured, detailed classification of regional variants and dialects within major and minor language groups, improving data accuracy and usability.
- Promote Inclusivity: Recognise diverse linguistic identities within broader language groups, particularly among diaspora communities and Indigenous populations.
- Support Policy and Resource Allocation: Acknowledge the linguistic needs of specific communities to enable policymakers to make data-driven decisions, particularly in education, healthcare, and community services.
- Maintain International Consistency: Ensure compatibility with international classification standards, notably ISO 639, while accommodating Australia-specific classifications.
Proposed Structure for the Fifth Level
The fifth level, termed “Variant”, would be applied selectively within the ASCL’s existing structure to capture significant regional, dialectal, or sociolectal distinctions. It would extend the current four levels to a 10-digit code in which each level adds two digits to the code of the level above it:
- Language Family Group (2-digit): Groups languages by genetic affinity (e.g., Indo-European Languages).
- Sub-Family Group (4-digit): Differentiates major branches within families (e.g., Romance Languages).
- Narrow Group (6-digit): Further divides languages into narrower groupings, often based on location.
- Language (8-digit): Represents individual languages (e.g., Spanish).
- Variant (10-digit): Specifies unique regional, dialectal, or sociolectal variants within a language (e.g., Mexican Spanish, Hong Kong Cantonese).
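Because each level extends its parent's code by two digits, the code of every ancestor can be read directly from a variant code. The following is a minimal sketch of that decomposition, assuming the 2/4/6/8/10-digit scheme listed above; the names `LEVELS`, `split_code`, and `level_of` are hypothetical and not part of any ABS system.

```python
from typing import Dict

# Level names and cumulative code lengths, taken from the list above.
LEVELS = [
    ("family", 2),
    ("sub_family", 4),
    ("narrow_group", 6),
    ("language", 8),
    ("variant", 10),
]


def split_code(code: str) -> Dict[str, str]:
    """Return the code prefix for every level that `code` is long enough to cover."""
    valid_lengths = {n for _, n in LEVELS}
    if not (code.isascii() and code.isdigit()) or len(code) not in valid_lengths:
        raise ValueError(f"not a valid 2/4/6/8/10-digit code: {code!r}")
    return {name: code[:n] for name, n in LEVELS if n <= len(code)}


def level_of(code: str) -> str:
    """Name of the deepest level that `code` identifies."""
    return list(split_code(code))[-1]
```

For example, `split_code("1313110103")` (Canadian Ukrainian in the appendix) yields the family code `13`, sub-family `1313`, narrow group `131311`, language `13131101`, and the variant code itself, while `level_of("13131101")` returns `"language"`.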
Implementation Considerations
Introducing a fifth level requires careful design to ensure clarity and consistency across ASCL users. Implementation considerations include:
- User Education: Providing guidance for ABS staff, government agencies, and other users on identifying and coding language variants.
- Data Collection: Modifying data collection instruments (e.g., Census forms, surveys) to capture language variants accurately without overwhelming respondents.
- International Compatibility: Aligning with ISO 639 classifications to allow international comparison while offering granularity unique to Australian needs.
- Database and System Updates: Ensuring that ABS and related systems can accommodate an additional classification level.
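One way to approach the international compatibility point is a concordance table from ASCL codes to ISO 639-3 identifiers, with a fallback to the parent code when a variant has no identifier of its own. The sketch below is illustrative only: the table `ISO_639_3` and the function `to_iso` are hypothetical, the ASCL codes are the draft codes from the appendix, and the ISO 639-3 identifiers shown (`ukr`, `ara`, `arb`, `arz`, `yue`) are the ones assumed to apply to those entries.

```python
from typing import Optional

# Hypothetical concordance: draft ASCL code -> ISO 639-3 identifier.
ISO_639_3 = {
    "13131101": "ukr",    # Ukrainian
    "221101": "ara",      # Arabic (ISO 639-3 macrolanguage)
    "2211010101": "arb",  # Modern Standard Arabic
    "2211010102": "arz",  # Egyptian Arabic
    "331102": "yue",      # Yue (Cantonese)
}


def to_iso(code: str) -> Optional[str]:
    """Find the ISO 639-3 identifier for `code`, falling back to its ancestors."""
    while code:
        if code in ISO_639_3:
            return ISO_639_3[code]
        code = code[:-2]  # drop the last two digits to move one level up
    return None
```

With this table, `to_iso("2211010102")` resolves directly to `arz`, whereas `to_iso("3311020102")` (Hong Kong Cantonese) has no entry of its own and falls back to `yue`. A fallback of this kind would let variant-level data be reported against international codes without requiring a one-to-one match for every variant.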
Benefits of a Fifth Level for Regional Language Variants
- Enhanced Data Accuracy: Differentiating language variants allows more precise capture of linguistic diversity, especially within multicultural and multilingual communities.
- Improved Service Delivery: Detailed language data enables tailored service provision, such as language-specific education materials, interpreter services, and healthcare communication.
- Cultural Recognition and Inclusivity: Recognising language variants acknowledges the distinct identities and cultural affiliations within linguistic communities, fostering inclusivity.
Challenges and Limitations
- Complexity and Overhead: Introducing a fifth level increases classification complexity, which could require additional training and adjustment for users.
- Data Consistency: Ensuring consistent data collection and reporting across regions may be challenging due to the variability in dialect recognition.
- Respondent Burden: Differentiating language variants may require additional guidance or clarification to avoid response fatigue or confusion.
Conclusion and Recommendations
The proposed fifth level for regional language variants aligns the ASCL with Australia’s evolving linguistic landscape and addresses the demand for more granular language data. It is recommended that the ABS adopt this structure as an optional level within the ASCL, allowing users to apply it selectively based on need. This approach maintains alignment with international standards while catering to local requirements. A pilot study or consultation with key stakeholders (e.g., linguists and community groups) is recommended to refine the classification before full implementation.
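Treating the fifth level as optional implies that variant-level data must still aggregate cleanly to the existing language level. The following is a minimal sketch of such a roll-up, assuming the 10-digit scheme proposed above; `roll_up` is a hypothetical helper and the respondent counts are invented purely for illustration.

```python
from collections import Counter
from typing import Dict, Mapping


def roll_up(counts: Mapping[str, int], digits: int = 8) -> Dict[str, int]:
    """Aggregate counts keyed by ASCL code to the level with `digits` digits.

    Codes already shorter than `digits` (responses coded at a broader level)
    are kept under their own code rather than dropped.
    """
    totals: Counter = Counter()
    for code, n in counts.items():
        totals[code[:digits]] += n
    return dict(totals)


# Invented respondent counts, for illustration only.
responses = {
    "1313110101": 120,  # Standard Ukrainian
    "1313110103": 15,   # Canadian Ukrainian
    "13131101": 40,     # Ukrainian, variant not stated
}
print(roll_up(responses))  # {'13131101': 175}
```

Users who do not need variant detail would see the same language-level totals as under the current structure, while those who do can work with the unaggregated codes.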
Appendix: Proposed Expanded Examples with Regions of Primary Use and Explanatory Notes
- Language Family Group: 13 Indo-European Languages
- Sub-Family Group: 1313 East Slavic
- Narrow Group: 131311 Ukrainian-Russian
- Language: 13131101 Ukrainian
- Variant Level:
  - 1313110101 Standard Ukrainian – Used in Ukraine and by the Ukrainian diaspora worldwide.
  - 1313110102 Carpathian Ukrainian – Used in Western Ukraine, especially the Zakarpattia region, and among the Eastern European diaspora.
  - 1313110103 Canadian Ukrainian – Spoken in Canada among the Ukrainian diaspora; mutually exclusive with other Ukrainian variants.
Notes: Canadian Ukrainian has developed distinctive lexical features over time, reflecting the distinct culture of the Ukrainian-Canadian community. This variant is mutually exclusive with the other forms.
- Language Family Group: 21 Uralic Languages
- Sub-Family Group: 211 Finno-Ugric
- Narrow Group: 2111 Ugric
- Language: 211101 Hungarian
- Variant Level:
  - 2111010101 Standard Hungarian – Predominantly spoken in Hungary and by Hungarian speakers globally.
  - 2111010102 Székely (Transylvanian Hungarian) – Spoken in Transylvania, Romania.
  - 2111010103 Csángó – Used in the Moldavia region of Romania and among the Hungarian diaspora; mutually exclusive.
Notes: Csángó is linguistically distinct from Standard Hungarian and represents a cultural heritage specific to certain Romanian regions.
- Language Family Group: 13 Indo-European Languages
- Sub-Family Group: 1311 Hellenic
- Narrow Group: 131111 Greek
- Language: 13111101 Greek
- Variant Level:
  - 1311110101 Modern Greek – Spoken in Greece, Cyprus, and by Greek-speaking diaspora globally.
  - 1311110102 Cypriot Greek – Primarily used in Cyprus, mutually exclusive with Modern Greek.
  - 1311110103 Pontic Greek – Used in Northern Greece and by diaspora in Russia and Turkey.
Notes: Cypriot Greek has unique phonological features, and Pontic Greek is distinct in vocabulary, reflecting a historical heritage within the Greek community.
- Language Family Group: 13 Indo-European Languages
- Sub-Family Group: 1311 Romance Languages
- Narrow Group: 131111 Italic
- Language: 13111101 Italian
- Variant Level:
  - 1311110101 Standard Italian – Spoken throughout Italy and among the global Italian diaspora.
  - 1311110102 Sicilian – Spoken primarily in Sicily and in Italian communities in Australia and North America.
  - 1311110103 Venetian – Used in Veneto, northern Italy.
  - 1311110104 Calabrese – Spoken in Calabria, southern Italy.
Notes: Italian dialects like Sicilian and Calabrese have unique linguistic identities within Italian communities abroad. Each variant is mutually exclusive because of its distinct vocabulary and grammar.
- Language Family Group: 13 Indo-European Languages
- Sub-Family Group: 1311 Germanic
- Narrow Group: 131111 West Germanic
- Language: 13111101 German
- Variant Level:
  - 1311110101 Standard German – Spoken in Germany, Austria, Switzerland, and globally.
  - 1311110102 Bavarian – Used in Bavaria (Germany) and Austria.
  - 1311110103 Swiss German – Primarily in Switzerland, mutually exclusive with other German forms.
  - 1311110104 Low German – Northern Germany and parts of the Netherlands.
Notes: Swiss German, a dialect continuum distinct from Standard German, is mutually exclusive and widely recognised as culturally significant within Switzerland.
- Language Family Group: 13 Indo-European Languages
- Sub-Family Group: 1311 Romance Languages
- Narrow Group: 131111 Gallo-Romance
- Language: 13111101 French
- Variant Level:
  - 1311110101 Standard French – Used in France and worldwide by French-speaking communities.
  - 1311110102 Canadian French (Québécois) – Mainly spoken in Quebec, Canada; mutually exclusive with Standard French.
  - 1311110103 Caribbean French – Used in French-speaking parts of the Caribbean such as Haiti and Guadeloupe.
  - 1311110104 African French – Common in West and Central Africa, particularly in former French colonies.
Notes: Canadian French differs from Standard French in phonetic features and vocabulary, while the African and Caribbean variants are shaped by local languages and distinct cultural contexts.
- Language Family Group: 22 Afro-Asiatic Languages
- Sub-Family Group: 221 Semitic Languages
- Narrow Group: 2211 Arabic
- Language: 221101 Arabic
- Variant Level:
  - 2211010101 Modern Standard Arabic – Used universally in formal and media contexts across the Arab world.
  - 2211010102 Egyptian Arabic – Predominantly spoken in Egypt.
  - 2211010103 Levantine Arabic – Used in Lebanon, Syria, Jordan, and Palestine.
  - 2211010104 Gulf Arabic – Common in Saudi Arabia, UAE, Kuwait, and Oman.
  - 2211010105 Maghrebi Arabic – Spoken in North African countries such as Morocco, Algeria, and Tunisia; mutually exclusive with other Arabic dialects.
Notes: Each variant is mutually exclusive and varies significantly in vocabulary and syntax. Modern Standard Arabic serves as a formal lingua franca but differs from colloquial dialects.
- Language Family Group: 13 Indo-European Languages
- Sub-Family Group: 1311 Romance Languages
- Narrow Group: 131111 Iberian Romance
- Language: 13111101 Spanish
- Variant Level:
  - 1311110101 Castilian Spanish – Used primarily in Spain; mutually exclusive with other variants.
  - 1311110102 Mexican Spanish – Dominant Spanish variant in Mexico and the U.S.
  - 1311110103 Caribbean Spanish – Spoken in Cuba, Puerto Rico, and the Dominican Republic.
  - 1311110104 Rioplatense Spanish – Common in Argentina and Uruguay.
Notes: Variants like Mexican and Caribbean Spanish show phonological and lexical distinctions, and Rioplatense is notable for its voseo conjugation. Each variant is treated as mutually exclusive.
- Language Family Group: 13 Indo-European Languages
- Sub-Family Group: 1311 Germanic
- Narrow Group: 131111 West Germanic
- Language: 13111101 English
- Variant Level:
  - 1311110101 British English – United Kingdom and some Commonwealth countries, mutually exclusive.
  - 1311110102 American English – Primarily in the United States and parts of Canada.
  - 1311110103 Australian English – Used in Australia.
  - 1311110104 Canadian English – Canada, with shared features from British and American English.
  - 1311110105 New Zealand English – Used in New Zealand.
  - 1311110106 Indian English – India, South Asia.
  - 1311110107 Irish English – Used in Ireland; mutually exclusive.
  - 1311110108 South African English – South Africa.
  - 1311110109 Singapore English (Singlish) – Common in Singapore.
  - 1311110110 Philippine English – Spoken in the Philippines.
  - 1311110111 Caribbean English – Jamaica, Trinidad and Tobago.
  - 1311110112 Malaysian English (Manglish) – Used in Malaysia.
  - 1311110113 Nigerian English – Nigeria.
Notes: Variants such as British and American English are mutually exclusive due to lexical and phonological differences. Varieties like Singlish and Manglish incorporate local influences.
- Language Family Group: 33 Sino-Tibetan Languages
- Sub-Family Group: 331 Chinese Languages
- Narrow Group: 3311 Chinese Varieties
- Language: 331101 Mandarin
- Variant Level:
  - 3311010101 Standard Mandarin (Putonghua) – China, Singapore, Taiwan.
  - 3311010102 Taiwanese Mandarin – Primarily in Taiwan.
- Language: 331102 Yue (Cantonese)
- Variant Level:
  - 3311020101 Standard Cantonese (Guangzhou) – Guangdong Province, China.
  - 3311020102 Hong Kong Cantonese – Hong Kong; mutually exclusive.
  - 3311020103 Macau Cantonese – Macau.
  - 3311020104 Overseas Cantonese – In Southeast Asia (Malaysia, Singapore, Thailand) and Western diaspora communities.
- Language: 331103 Min
- Variant Level:
  - 3311030101 Hokkien – Taiwan, Fujian, Singapore, Malaysia.
  - 3311030102 Teochew – Chaoshan, Thailand, Cambodia, Malaysia.
  - 3311030103 Fuzhou – Fuzhou, Fujian.
  - 3311030104 Hainanese – Hainan Island.
Notes: Cantonese variants like Hong Kong Cantonese are mutually exclusive, reflecting unique regional pronunciations. Hokkien and Teochew are the predominant varieties among Chinese communities in Southeast Asia.
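Draft entries of this kind could be checked mechanically before a pilot or consultation. The following is a minimal sketch, assuming the 2/4/6/8/10-digit scheme described under Proposed Structure for the Fifth Level; `check_entries` and its messages are hypothetical, and the check only covers code length, reuse of one code for different labels, and missing parent codes.

```python
from typing import Dict, Iterable, List, Tuple


def check_entries(entries: Iterable[Tuple[str, str]]) -> List[str]:
    """Return human-readable problems found in (code, label) pairs."""
    problems: List[str] = []
    seen: Dict[str, str] = {}
    for code, label in entries:
        well_formed = code.isascii() and code.isdigit()
        if not well_formed or len(code) not in (2, 4, 6, 8, 10):
            problems.append(f"{code} {label}: code must have 2, 4, 6, 8 or 10 digits")
            continue
        if code in seen and seen[code] != label:
            problems.append(f"{code} is assigned to both {seen[code]} and {label}")
        seen.setdefault(code, label)
    # Every code below the family level needs a defined parent two digits shorter.
    for code, label in seen.items():
        if len(code) > 2 and code[:-2] not in seen:
            problems.append(f"{code} {label}: parent {code[:-2]} is not defined")
    return problems


# The Ukrainian entry from this appendix, used as sample input.
ukrainian = [
    ("13", "Indo-European Languages"),
    ("1313", "East Slavic"),
    ("131311", "Ukrainian-Russian"),
    ("13131101", "Ukrainian"),
    ("1313110101", "Standard Ukrainian"),
    ("1313110102", "Carpathian Ukrainian"),
    ("1313110103", "Canadian Ukrainian"),
]
print(check_entries(ukrainian))  # []
```

Running the same check across all entries together, rather than one language at a time, would also reveal any code that has been allocated to more than one language or variant.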