The Great Game: Create an Arabic Study App with Power Platform and SQL Server
Overview
Latin and Greek word roots: the elixir for English vocabulary tests in high school. With their help, I avoided the parental gaze. Those same roots proved useful while studying Romance languages; familiar pieces fit, and words came alive with these tiny clues to their meaning.
Word roots are also present in the Arabic language. The differences between English and Arabic are clear, sure, but the concept is more or less the same in both systems: a grouping of letters express a general meaning or concept. These grammatical arrangements are the foundation of some languages and can act as an aid for new learners to understand a new written system.
Yet all these years after my first Arabic class, it remains a challenge to find a searchable *and* interactive list of word roots and their meanings. Latin and Greek learners can eat their heart out but there are not many tools for searching and learning this fundamental information.
Still, countless resources for learning Arabic are available via traditional means and on the web. A searchable list and vocabulary companion must be among them. Not necessarily, and therein awaits an opportunity.
Further, the tools for managing data and building applications are much different now. Far more is possible in a fraction of the time with advents such as Power Query, SQL Server and a Power App. A lot has happened since, oh, the mid-2000s ??
This leap in capability lets us deliver a product in less time, effort, and cost. That product could, perhaps, help a student learn faster, thus improving their education. Superior value.
“When there is wind,” the saying goes, “Ride it.” It’s about that time.
/**/
This article will demonstrate how to create a Power App using SQL Server data for use as an aid in studying a foreign language.
Caveats
A few things to know before you go.
Not so bad.
Prerequisites
Basic “system requirements:”
Not too terrible.
Steps to Implement
Building this app follows a general process.
Power Query transforms and cleans source data. Excel and SQL Server are also options.
SQL Server stores the transformed data. If a study activity is desired, SQL scripts will be necessary for logging activity.
A Power App loads the data from SQL Server. Ideally, Power Fx will handle most UI functions, with some interaction with the SQL database.
The app handles search criteria based on user input, thus returning filtered results.
Power BI connects to SQL Server to display the scores collected by users. A simple report to start at first but the sky is the limit here.
The Arabic Alphabet 101
Arabic serves as the official language in 23 countries. Modern Standard Arabic functions as the standardized variant, commonly referred to as "book Arabic," prevalent in academic literature, legal documents, mass media, and textbooks. Complementing this, Classical Arabic stands as a literary expression and holds sacred significance. Divine revelations were communicated to Prophet Muhammad in Arabic, rendering the Qur'an, the cornerstone of Islam, as God's message in written form.
All things considered the number of speakers and readers exceeds 1 billion people around the world.
A Semitic language, written Arabic reads from right to left.
Numbers, however, read left to right.
Numerals exist, of course.
The Arabic alphabet contains 28 characters total, and some count this character to make it 29.
Characters are synonymous with Western letters, its' Romanized counterparts. In Arabic script, they connect to each other to form words.
The shape of a character changes based on its’ position in the word.
This means there are four different shapes for each character.
Take ? (baa) for example, the equivalent to the letter b in the Roman alphabet. This character enjoys a full range of connecting ability.
Only 6 characters in the alphabet have constraints on connecting to other letters.
The rules for these characters are few in number.
1.)??They can stand separately from other characters in the word (aka isolated).
2.)??They can connect to another character using their final position.
3.)??They cannot, however, connect to the next character in the word. The character must take its’ final shape (without needing to be the last letter).
Examples of these rules in action, in the above order:
alif (?)
???
???
???
dal (?)
??
???
????
dhal (?)
???
???
????
raa (?)
???
???
???
zay (?)
???
???
???
waw (?)
???
?????
???
Many alphabets are available detailing how letters connect. Use one that best reinforces the concept for you.
Of course, these rules are general, and any sort of special cases or rules will become apparent as you learn.
Word Roots
Words in Arabic are based on 3- or 4-letter groupings, or roots. Like Latin and Greek roots, these tri- and quadrilateral groupings hold a general meaning or concept. Unlike Latin and Greek roots, however, Arabic word roots are consonants only.
These little building blocks of words in the language.
Forms
If a word root bears a general meaning, a form gives it exact meaning.
Think of a root as a tree and forms are the branches. Each form/branch has its own spelling pattern. A change to the spelling pattern creates a variation on a roots’ meaning; this creates a related word.
There are ten commonly used forms, with Form I as the basic meaning. The verb ??? has a root of three base letters – f-ae-l – and generally means to do.
Broken out in numbered format:
? F = First letter
? M = Medial letter
? L = Final letter
Let’s take a different word. In its’ purest form – Form I – the verb ??? (j-m-ae) means to collect, gather. In a different form – Form III – ???? (j-mm-ae), the meaning morphs: to amass, to accumulate.
One root takes many forms, and each form provides a different shade of meaning.
For a student, the root system provides an efficient way to acquire vocabulary. A searchable collection of word roots would be fast and support fundamentals, making it a valuable reference service.
Vocalization and Diacritical Marks
Vocalization and diacritical markings really tie things together.
Officially, the markings in Arabic words are known as tashkīl – dashes, dots, loops, and squiggly lines that affect pronunciation. When a subset of markings known as ?arakāt, or short vowels, accompany a word, this is known as vocalization.
Vowels are the equivalent to the Western letters a, i, and u. Two kinds exist, short and long. Short vowels take the form of “dashes and loops,” with proper names of course.
??= fatHa (a, as in car)
? ?= kasra (i, as in ship)
? ?= damma (u, as in broom)
The short vowels can be elongated as well, where the a, i, and u are simply doubled.
It is worth nothing that most Arabic is unvoweled. Vowels are training wheels and will be replaced by context and knowledge of grammar.
Vocalization affects more than vowels. Certain marks indicate pronunciation and stress yet are not vowels but rather consonants.
A tiny “w” is a shadda. This represents a doubled consonant: array, yell, apple, shaddah.
The little “o” is a sukūn, which signifies two consonants in a row with no vowel: heart, ticket, jump.
Full bird vocalized script:
Unvocalized script:
Vocalized script serves as another aid in becoming familiar with the written language, leading to comprehension at a faster pace.
These are the very basics to the written language; broad strokes make us functional, though it does get more involved.
Word Root Data Sources
Three major data sources have been consulted to form a list of Arabic word roots.
Wiktionary has two sources, Arabic Roots and Quranic Verbs (sic). A third source, GitHub Arabic Roots, contains a word root list that is an output of the SAMA analyzer, which is the work of Tim Buckwalter from the University of Maryland.
Only one source was chosen for now, Wiktionary Arabic Roots, as it is the most straightforward and complete. Some findings:
The additional sources will be added later. We have our base data set with a direct route to a meaning/definition.
Power Automate
Extraction data from Wiktionary is not an easy task, apparently. Power Automate helps us immensely but there are still some challenges based on page structure and consistency.
When a root has one or many definitions, certain terms are used. Same goes for roots that have a page but no translation.
Root with definition:
Root with no definition:
Our flow looks for particular phrases and parses the content for display in the app.
The most important part of our flow:
FYI: This does not work 100% of the time. An action serving as an “exit ramp” has been added as a parallel branch to set a default message when a particular phrase is not found.
Power Query
Power Query made quick work of the data source to prepare it for migration to SQL Server. This app seemed to be the best bet as a transformation tool, given its’ variety of parsing functions.
A table containing characters was created along with space for detail.
Once the transformations were complete, the finished data set was migrated to a SQL database.
SQL Server
All app data is stored in SQL Server. Transformed data was imported into a few database tables, whose structure is outlined below. Considerations around character representation needed to be made.
Data Model
The start of our data model:
This is a simple enough approach that will serve different purposes.
The SQL is also simple enough.
--Arabic characters
--Borrowed from MARC Standards at the Library of Congress:
--https://www.loc.gov/marc/specifications/codetables/BasicArabic.html
CREATE TABLE [dbo].[ArabicCharacters](
[ArabicCharacterID] [int] IDENTITY(1,1) NOT NULL,
[MARC8_G0] [nvarchar](16) NOT NULL,
[MARC8_G1] [nvarchar](16) NOT NULL,
[UCS_Unicode16] [nvarchar](16) NOT NULL,
[UCS_UTF8] [nvarchar](16) NOT NULL,
[ArabicCharacter] [nvarchar](16) NOT NULL,
[CharacterDefinition] [nvarchar](100) NULL,
[EnglishEquivalent] [nvarchar](10) NULL,
[Pronunciation] [nvarchar](25) NULL,
[LetterNameAudioLink] [nvarchar](255) NULL,
[PronunciationAudioLink] [nvarchar](255) NULL,
[CharacterOrder] [int] NULL,
[IPA_Pronunciation] [nvarchar](25) NULL,
[IsBaseCharSet] [bit] DEFAULT ((1)) NULL,
[InitialPosition] [nvarchar](10) NULL,
[MedialPosition] [nvarchar](10) NULL,
[FinalPosition] [nvarchar](10) NULL,
[IsolatedPosition] [nvarchar](10) NULL,
[IsActive] [bit] DEFAULT ((1)) NULL,
[WhoCreated] [nvarchar](50) NULL,
[WhenCreated] [datetime] DEFAULT (getdate()) NULL,
[WhoUpdated] [nvarchar](50) NULL,
[WhenUpdated] [datetime] DEFAULT (getdate()) NULL,
?CONSTRAINT [PK_ArabicCharacters] PRIMARY KEY CLUSTERED?
(
[ArabicCharacterID] ASC
)
)
A simple table of Arabic word roots is our main data source. Note the collation specified inline. It is not needed, as we are using the nvarchar data type but demonstrates that collation may be applied at this level.
--Arabic word roots
CREATE TABLE [dbo].[WikiArabicRoots](
[RootID] [int] IDENTITY(1,1) NOT NULL,
[Source] [nvarchar](255) NULL,
[SourcePage] [int] NULL,
[CharacterOrder] [int] NULL,
[RootGroup] [nvarchar](255) NULL,
[InitialLetter] [nvarchar](100) NULL,
[MedianLetter] [nvarchar](255) NULL,
[MedianLetter2] [nvarchar](255) NULL,
[FinalLetter] [nvarchar](255) NULL,
[Root] [nvarchar](255) NULL,
[Definition] [nvarchar](max) NULL,
[RootType] [nvarchar](255) NULL,
[SourceLink] [nvarchar](255) NULL,
[IsActive] [bit] DEFAULT ((1)) NULL,
[WhoCreated] [nvarchar](50) NULL,
[WhenCreated] [datetime] DEFAULT (getdate()) NULL,
[WhoUpdated] [nvarchar](50) NULL,
[WhenUpdated] [datetime] DEFAULT (getdate()) NULL,
?CONSTRAINT [PK_WikiArabicRoots] PRIMARY KEY CLUSTERED?
(
[RootID] ASC
)
)
Additional tables accommodate a simple multiple-choice drill game.
--User table
CREATE TABLE [dbo].[UserDetails](
[UserID] [int] IDENTITY(1,1) NOT NULL,
[Username] [nvarchar](25) NOT NULL,
[UserEmail] [nvarchar](50) NOT NULL,
[UserFirstName] [nvarchar](50) NULL,
[UserLastName] [nvarchar](50) NULL,
[IsLockedOut] [bit] DEFAULT 0 NULL,
[IsConfirmed] [bit] DEFAULT 0 NULL,
[IsActive] [bit] DEFAULT ((1)) NULL,
[WhoCreated] [nvarchar](50) NULL,
[WhenCreated] [datetime] DEFAULT (getdate()) NULL,
[WhoUpdated] [nvarchar](50) NULL,
[WhenUpdated] [datetime] DEFAULT (getdate()) NULL,
?CONSTRAINT [PK_UserDetails] PRIMARY KEY CLUSTERED?
(
[UserID] ASC
)
)
CREATE TABLE [dbo].[UserRootList](
[UserRootListID] [int] IDENTITY(1,1) NOT NULL,
[UserID] [int] NOT NULL,
[RootListLabel] [nvarchar](25) NOT NULL,
[IsDefaultList] [bit] DEFAULT 0 NULL,
[IsActive] [bit] DEFAULT ((1)) NULL,
[WhoCreated] [nvarchar](50) NULL,
[WhenCreated] [datetime] DEFAULT (getdate()) NULL,
[WhoUpdated] [nvarchar](50) NULL,
[WhenUpdated] [datetime] DEFAULT (getdate()) NULL,
?CONSTRAINT [PK_UserRootList] PRIMARY KEY CLUSTERED?
(
[UserRootListID] ASC
)
)
CREATE TABLE [dbo].[UserRootListItem](
[UserRootListItemID] [int] IDENTITY(1,1) NOT NULL,
[UserRootListID] [int] NOT NULL,
[RootID] [int] NOT NULL,
[IsActive] [bit] DEFAULT ((1)) NULL,
[WhoCreated] [nvarchar](50) NULL,
[WhenCreated] [datetime] DEFAULT (getdate()) NULL,
[WhoUpdated] [nvarchar](50) NULL,
[WhenUpdated] [datetime] DEFAULT (getdate()) NULL,
?CONSTRAINT [PK_UserRootListItem] PRIMARY KEY CLUSTERED?
(
[UserRootListItemID] ASC
)
)
CREATE TABLE [dbo].[StudySession](
[StudySessionID] [int] IDENTITY(1,1) NOT NULL,
[UserID] [int] NOT NULL,
[OptionID] [int] NOT NULL,
[OptionCategory] [nvarchar](15) NOT NULL,
[RootID] [int] NOT NULL,
[IsCorrect] [bit] NULL,
[StudySessionGUID] [nvarchar](50) NULL,
[IsSkipped] [bit] DEFAULT ((0)) NULL,
[UserAnswer] [int] NULL,
[IsActive] [bit] DEFAULT ((1)) NULL,
[WhoCreated] [nvarchar](50) NULL,
[WhenCreated] [datetime] DEFAULT (getdate()) NULL,
[WhoUpdated] [nvarchar](50) NULL,
[WhenUpdated] [datetime] DEFAULT (getdate()) NULL,
?CONSTRAINT [PK_StudySession] PRIMARY KEY CLUSTERED?
(
[StudySessionID] ASC
)
)
May ways exist to carry this out, but we are building for economy and speed. This seemed to be the most straightforward way to move towards the goal.
Collation and Data Types
One would think that database collation plays a role in storing Arabic characters in a database. Based on research and experience, the data type is a determining factor.
1.)??If nvarchar data type, no collation required. This data type supports Unicode characters, so any character from any language, script, symbols, etc. will appear as intended so long as data is preceded with the N’ string literal.
INSERT INTO ArabicCharacters (ArabicCharacter) VALUES (N'?'
INSERT INTO ArabicCharacters (ArabicCharacter) VALUES ('?'))
2.)??If varchar data type, collation required. For varchar columns, set the collation to an Arabic_CI_AI_KS_WS, or something similar, as varchar does not provide the same support as nvarchar.
Collation can be set at the server, database, or field level.
Collation can even be specified within DML and DDL statements on the fly.
--DML:
SELECT
ArabicCharacter
FROM ArabicCharacters
INNER JOIN
WikiArabicRoots ON
ArabicCharacter COLLATE SQL_Latin1_General_CP1_CI_AS = RootGroup COLLATE SQL_Latin1_General_CP1_CI_AS
--DDL:
ALTER TABLE ArabicCharacters ALTER COLUMN Definition NVARCHAR(100) COLLATE SQL_Latin1_General_CP1_CI_AS --or Arabic_CI_AI_KS_WST
ALTER TABLE WikiArabicRoots ALTER COLUMN InitialLetter NVARCHAR(100) COLLATE SQL_Latin1_General_CP1_CI_AS --or Arabic_CI_AI_KS_WST
It is obvious when the collation or data types require attention. The data will show it. Question marks appear in the data, meaning a character is not available to represent the data.
In happier times:
UTF encoding works with international audiences, including URLs. These URLs will then return the same page:
A mapping is available for UTF-8 codes to Arabic characters for building a hyperlink in both Arabic and English characters.
A valid URL is generated, but it is important to note that Wiktionary may not have an entry for all roots.
!????, they say. Patience, as an entry for this root may have not been created yet.
Onto the app.
Creating the App
A basic canvas app was created with a few basic screens.
Each screen contributes to building up reading comprehension and recall.
Search page
This screen provides an interactive search for finding Arabic root words. Filters are compound and the results based on search criteria. The text search supports both Arabic and English terms.
Details:
Some items of note:
List Manager
The app supports creating multiple lists and adding roots to them, so we will need a mechanism to manage that saved information.
Here, you can create and deactivate lists as well as edit list properties and their contents.
From right:
Duplicate list name detection is built into the Save procedure when editing a list.
Lists allow for “bucketing” roots into categories and groups, which may be conducive to study. A method of management keeps your content organized as well as relevant.
Menu
Last but not least, a menu.
An Arabic/English toggle feature. Best attempt at an Arabic translation of English terms!
Alphabet/Character Reference
As helpful reference, a screen is dedicated to the alphabet and the connecting letters.
Audio is available for letter name pronunciation and its' sound.
This is especially useful when first learning how each letter connects – or does not connect - in words.
Drill/Study Page
A study buddy feature helps a user drill on the words saved to their lists. It is a simple multiple-choice question - a “study session” - or a mini-drill activity to test knowledge and help retain the information.
An activity log reports on the drilling activity and feeds a Power BI report.
Tracking Scores in Power BI
Over time, scores will add up in the StudySession table. Power BI can report on individual progress or at the class/group level.
The Power BI data model mirrors the SQL db model.
If we take advantage of the Power BI cloud service, we can set up alerts and subscriptions based on study session data.
Results can be monitored at “the teacher level" via these notifications.
Power Automate is also an option for these types of alerts.
This introduces measurement and may lead to accountability for study goals.
Next Steps/Suggestions
I can appreciate that some readers do not intend to pick up Arabic, may already know the language, or are just curious. Come as you are. A few thoughts/suggestions on what to do next.
Suggestions only, your call ultimately.
/**/
And that is one way to approach building a foreign language study utility using Power Query, SQL Server, and a canvas Power App. Some fun stuff. Questions and feedback welcome.
References
Historical
Arabic Language
Apps and Technical
Data and Characters
Additional links ad references inline.
Microsoft Business Applications MVP | Power Platform | Dynamics 365 CE | Azure - Developer | Technical Consultant | Technical Architect | Community Super User | User Group Co-Organizer | Blogger
1 年Imtiaz Hussain