The Great Game: Create an Arabic Study App with Power Platform and SQL Server
"Never a failure, always a lesson."

The Great Game: Create an Arabic Study App with Power Platform and SQL Server

Overview

Latin and Greek word roots: the elixir for English vocabulary tests in high school. With their help, I avoided the parental gaze. Those same roots proved useful while studying Romance languages; familiar pieces fit, and words came alive with these tiny clues to their meaning.

Word roots are also present in the Arabic language. The differences between English and Arabic are clear, sure, but the concept is more or less the same in both systems: a grouping of letters express a general meaning or concept. These grammatical arrangements are the foundation of some languages and can act as an aid for new learners to understand a new written system.

Yet all these years after my first Arabic class, it remains a challenge to find a searchable *and* interactive list of word roots and their meanings. Latin and Greek learners can eat their heart out but there are not many tools for searching and learning this fundamental information.

Still, countless resources for learning Arabic are available via traditional means and on the web. A searchable list and vocabulary companion must be among them. Not necessarily, and therein awaits an opportunity.

Further, the tools for managing data and building applications are much different now. Far more is possible in a fraction of the time with advents such as Power Query, SQL Server and a Power App. A lot has happened since, oh, the mid-2000s ??

This leap in capability lets us deliver a product in less time, effort, and cost. That product could, perhaps, help a student learn faster, thus improving their education. Superior value.

“When there is wind,” the saying goes, “Ride it.” It’s about that time.

/**/

This article will demonstrate how to create a Power App using SQL Server data for use as an aid in studying a foreign language.

Caveats

A few things to know before you go.

  • Disclosure. My educational background is Politics and Arabic (dual concentration). This guide comes from a familiar place though open to feedback on how some concepts of the language are conveyed.
  • Fundamentals. Part of this guide includes a basic introduction to Arabic as a written language. Although it is a different alphabet, the details are not exhaustive and the concepts are easy to grasp, as many languages have a semblance of a word root system. A modicum of background helps.
  • Potential gaps in research. There may be a searchable Arabic root word list out there; I may have simply not found it during my research. Sources, apps, etc. are a welcome gotcha. I may be missing something.
  • Similar tools, different purposes. The Qur'anic Arabic Corpus and ElixirFM are some of the best tools that I have used to find and study word roots. Rooty is a root lookup engine, but also a study companion with its' ability to manage root lists, create a quiz and learn the alphabet.
  • Premium connectors. SQL Server is a premium connector.
  • Data quality. Definitions from the chosen data source come from Wiktionary, a crowd-sourced free dictionary. The data quality is as good as it is factual and maintained. The “garbage in, garbage out” rule applies, so consistency and accuracy are not guaranteed.
  • Updates. This does not factor in updates as the information changes at the source. Another solution is available for this aspect upon request.

Not so bad.

Prerequisites

Basic “system requirements:”

  • Power Automate for extracting data from Wiktionary.
  • Power Query for transforming source data. It is important to note that the same transformations featured in Power Query are possible in SQL Server. Power Query is therefore optional.
  • SQL Server to store transformed data. An independent instance of SQL Server is not required, but the ability to create and manage a database is necessary.
  • Power App for displaying content in a canvas-style app.
  • Power BI to display scores.

Not too terrible.

Steps to Implement

Building this app follows a general process.

No alt text provided for this image

Power Query transforms and cleans source data. Excel and SQL Server are also options.

No alt text provided for this image

SQL Server stores the transformed data. If a study activity is desired, SQL scripts will be necessary for logging activity.

No alt text provided for this image

A Power App loads the data from SQL Server. Ideally, Power Fx will handle most UI functions, with some interaction with the SQL database.

The app handles search criteria based on user input, thus returning filtered results.

No alt text provided for this image

Power BI connects to SQL Server to display the scores collected by users. A simple report to start at first but the sky is the limit here.

The Arabic Alphabet 101

Arabic serves as the official language in 23 countries. Modern Standard Arabic functions as the standardized variant, commonly referred to as "book Arabic," prevalent in academic literature, legal documents, mass media, and textbooks. Complementing this, Classical Arabic stands as a literary expression and holds sacred significance. Divine revelations were communicated to Prophet Muhammad in Arabic, rendering the Qur'an, the cornerstone of Islam, as God's message in written form.

All things considered the number of speakers and readers exceeds 1 billion people around the world.

A Semitic language, written Arabic reads from right to left.

No alt text provided for this image

Numbers, however, read left to right.

No alt text provided for this image

Numerals exist, of course.

No alt text provided for this image

The Arabic alphabet contains 28 characters total, and some count this character to make it 29.

No alt text provided for this image

Characters are synonymous with Western letters, its' Romanized counterparts. In Arabic script, they connect to each other to form words.

The shape of a character changes based on its’ position in the word.

No alt text provided for this image

  • Isolated. The character by itself.
  • Initial. First position in the word.
  • Medial. Between the first and last position.
  • Final. Last position in the word.

This means there are four different shapes for each character.

Take ? (baa) for example, the equivalent to the letter b in the Roman alphabet. This character enjoys a full range of connecting ability.

No alt text provided for this image

Only 6 characters in the alphabet have constraints on connecting to other letters.

  • ? (alif, a)
  • ? (dal, d)
  • ? (dhal, dh)
  • ? (raa, r)
  • ? (zay, z)
  • ? ?(waw, w)

The rules for these characters are few in number.

1.)??They can stand separately from other characters in the word (aka isolated).

No alt text provided for this image

2.)??They can connect to another character using their final position.

No alt text provided for this image

3.)??They cannot, however, connect to the next character in the word. The character must take its’ final shape (without needing to be the last letter).

No alt text provided for this image

Examples of these rules in action, in the above order:

alif (?)

???

???

???

dal (?)

??

???

????

dhal (?)

???

???

????

raa (?)

???

???

???

zay (?)

???

???

???

waw (?)

???

?????

???

Many alphabets are available detailing how letters connect. Use one that best reinforces the concept for you.

Of course, these rules are general, and any sort of special cases or rules will become apparent as you learn.

Word Roots

Words in Arabic are based on 3- or 4-letter groupings, or roots. Like Latin and Greek roots, these tri- and quadrilateral groupings hold a general meaning or concept. Unlike Latin and Greek roots, however, Arabic word roots are consonants only.

No alt text provided for this image

These little building blocks of words in the language.

Forms

If a word root bears a general meaning, a form gives it exact meaning.

Think of a root as a tree and forms are the branches. Each form/branch has its own spelling pattern. A change to the spelling pattern creates a variation on a roots’ meaning; this creates a related word.

There are ten commonly used forms, with Form I as the basic meaning. The verb ??? has a root of three base letters – f-ae-l – and generally means to do.

No alt text provided for this image

Broken out in numbered format:

? F = First letter

? M = Medial letter

? L = Final letter

  1. Form IFaMaLa – ?????? (Base meaning – to do)
  2. Form IIFaMMaLa – ??????? (Causative – to act)
  3. Form IIIFaaMaLa – ??????? (Reciprocal – to work)
  4. Form IV – aFMaLa – ???????? (Causative or reflexive causative – to make o.s./s.o./sth. do)
  5. Form V – taFaMMaLa –????????? ?(Form II reflexive)
  6. Form VI – taFaaMaLa – ?????????? (Form III reflexive)
  7. Form VII – inFaMaLa –?????????? ?(Form I passive)
  8. Form VIII – iFtaMaLa – ?????????? (Form I reflexive)
  9. Form IX – iFMaLLa – ?????????? (related to colors and defects)
  10. Form X – istaFMaLa – ???????????? (Causative reflexive – to ask/think sth. be done)

Let’s take a different word. In its’ purest form – Form I – the verb ??? (j-m-ae) means to collect, gather. In a different form – Form III – ???? (j-mm-ae), the meaning morphs: to amass, to accumulate.

One root takes many forms, and each form provides a different shade of meaning.

For a student, the root system provides an efficient way to acquire vocabulary. A searchable collection of word roots would be fast and support fundamentals, making it a valuable reference service.

Vocalization and Diacritical Marks

Vocalization and diacritical markings really tie things together.

Officially, the markings in Arabic words are known as tashkīl – dashes, dots, loops, and squiggly lines that affect pronunciation. When a subset of markings known as ?arakāt, or short vowels, accompany a word, this is known as vocalization.

No alt text provided for this image

Vowels are the equivalent to the Western letters a, i, and u. Two kinds exist, short and long. Short vowels take the form of “dashes and loops,” with proper names of course.

??= fatHa (a, as in car)

? ?= kasra (i, as in ship)

? ?= damma (u, as in broom)

No alt text provided for this image

The short vowels can be elongated as well, where the a, i, and u are simply doubled.

  • a -> aa -> ?????? (kaatib, writer | ????
  • i -> ii -> ?????? (saeiid, happy)
  • u -> uu -> ?????? (kharoof, sheep)

It is worth nothing that most Arabic is unvoweled. Vowels are training wheels and will be replaced by context and knowledge of grammar.

Vocalization affects more than vowels. Certain marks indicate pronunciation and stress yet are not vowels but rather consonants.

A tiny “w” is a shadda. This represents a doubled consonant: array, yell, apple, shaddah.

No alt text provided for this image

The little “o” is a sukūn, which signifies two consonants in a row with no vowel: heart, ticket, jump.

No alt text provided for this image

Full bird vocalized script:

No alt text provided for this image

Unvocalized script:

No alt text provided for this image

Vocalized script serves as another aid in becoming familiar with the written language, leading to comprehension at a faster pace.

These are the very basics to the written language; broad strokes make us functional, though it does get more involved.

Word Root Data Sources

Three major data sources have been consulted to form a list of Arabic word roots.

Wiktionary has two sources, Arabic Roots and Quranic Verbs (sic). A third source, GitHub Arabic Roots, contains a word root list that is an output of the SAMA analyzer, which is the work of Tim Buckwalter from the University of Maryland.

Only one source was chosen for now, Wiktionary Arabic Roots, as it is the most straightforward and complete. Some findings:

  • The Wiktionary page contains roots words for 3-, 4- and 5-letter root words.
  • Characters in a root are expressed as isolated letters (? ? ? not ???).
  • Meanings are available in separate pages. The URLs to each root are expressed in UTF8 character codes as equivalents to Arabic script, not Arabic script itself.
  • Not all words have a value for meaning.

The additional sources will be added later. We have our base data set with a direct route to a meaning/definition.

No alt text provided for this image

Power Automate

Extraction data from Wiktionary is not an easy task, apparently. Power Automate helps us immensely but there are still some challenges based on page structure and consistency.

When a root has one or many definitions, certain terms are used. Same goes for roots that have a page but no translation.

Root with definition:

No alt text provided for this image

Root with no definition:

No alt text provided for this image

Our flow looks for particular phrases and parses the content for display in the app.

The most important part of our flow:

No alt text provided for this image

FYI: This does not work 100% of the time. An action serving as an “exit ramp” has been added as a parallel branch to set a default message when a particular phrase is not found.

No alt text provided for this image

Power Query

Power Query made quick work of the data source to prepare it for migration to SQL Server. This app seemed to be the best bet as a transformation tool, given its’ variety of parsing functions.

No alt text provided for this image

A table containing characters was created along with space for detail.

No alt text provided for this image

Once the transformations were complete, the finished data set was migrated to a SQL database.

No alt text provided for this image

SQL Server

All app data is stored in SQL Server. Transformed data was imported into a few database tables, whose structure is outlined below. Considerations around character representation needed to be made.

Data Model

The start of our data model:

No alt text provided for this image

This is a simple enough approach that will serve different purposes.

  1. Provide alphabet/character reference.
  2. Search and find available root words for review and bookmarking.
  3. Root word drill/study feature.

The SQL is also simple enough.

--Arabic characters
--Borrowed from MARC Standards at the Library of Congress:
--https://www.loc.gov/marc/specifications/codetables/BasicArabic.html
CREATE TABLE [dbo].[ArabicCharacters](
	[ArabicCharacterID] [int] IDENTITY(1,1) NOT NULL,
	[MARC8_G0] [nvarchar](16) NOT NULL,
	[MARC8_G1] [nvarchar](16) NOT NULL,
	[UCS_Unicode16] [nvarchar](16) NOT NULL,
	[UCS_UTF8] [nvarchar](16) NOT NULL,
	[ArabicCharacter] [nvarchar](16) NOT NULL,
	[CharacterDefinition] [nvarchar](100) NULL,
	[EnglishEquivalent] [nvarchar](10) NULL,
	[Pronunciation] [nvarchar](25) NULL,
	[LetterNameAudioLink] [nvarchar](255) NULL,
	[PronunciationAudioLink] [nvarchar](255) NULL,
	[CharacterOrder] [int] NULL,
	[IPA_Pronunciation] [nvarchar](25) NULL,
	[IsBaseCharSet] [bit] DEFAULT ((1)) NULL,
	[InitialPosition] [nvarchar](10) NULL,
	[MedialPosition] [nvarchar](10) NULL,
	[FinalPosition] [nvarchar](10) NULL,
	[IsolatedPosition] [nvarchar](10) NULL,
	[IsActive] [bit] DEFAULT ((1)) NULL,
	[WhoCreated] [nvarchar](50) NULL,
	[WhenCreated] [datetime] DEFAULT (getdate()) NULL,
	[WhoUpdated] [nvarchar](50) NULL,
	[WhenUpdated] [datetime] DEFAULT (getdate()) NULL,
?CONSTRAINT [PK_ArabicCharacters] PRIMARY KEY CLUSTERED?
(
	[ArabicCharacterID] ASC
)
)        

A simple table of Arabic word roots is our main data source. Note the collation specified inline. It is not needed, as we are using the nvarchar data type but demonstrates that collation may be applied at this level.

--Arabic word roots
CREATE TABLE [dbo].[WikiArabicRoots](
	[RootID] [int] IDENTITY(1,1) NOT NULL,
	[Source] [nvarchar](255) NULL,
	[SourcePage] [int] NULL,
	[CharacterOrder] [int] NULL,
	[RootGroup] [nvarchar](255) NULL,
	[InitialLetter] [nvarchar](100) NULL,
	[MedianLetter] [nvarchar](255) NULL,
	[MedianLetter2] [nvarchar](255) NULL,
	[FinalLetter] [nvarchar](255) NULL,
	[Root] [nvarchar](255) NULL,
	[Definition] [nvarchar](max) NULL,
	[RootType] [nvarchar](255) NULL,
	[SourceLink] [nvarchar](255) NULL,
	[IsActive] [bit] DEFAULT ((1)) NULL,
	[WhoCreated] [nvarchar](50) NULL,
	[WhenCreated] [datetime] DEFAULT (getdate()) NULL,
	[WhoUpdated] [nvarchar](50) NULL,
	[WhenUpdated] [datetime] DEFAULT (getdate()) NULL,
?CONSTRAINT [PK_WikiArabicRoots] PRIMARY KEY CLUSTERED?
(
	[RootID] ASC
)
)        

Additional tables accommodate a simple multiple-choice drill game.

--User table
CREATE TABLE [dbo].[UserDetails](
	[UserID] [int] IDENTITY(1,1) NOT NULL,
	[Username] [nvarchar](25) NOT NULL,
	[UserEmail] [nvarchar](50) NOT NULL,
	[UserFirstName] [nvarchar](50) NULL,
	[UserLastName] [nvarchar](50) NULL,
	[IsLockedOut] [bit] DEFAULT 0 NULL,
	[IsConfirmed] [bit] DEFAULT 0 NULL,
	[IsActive] [bit] DEFAULT ((1)) NULL,
	[WhoCreated] [nvarchar](50) NULL,
	[WhenCreated] [datetime] DEFAULT (getdate()) NULL,
	[WhoUpdated] [nvarchar](50) NULL,
	[WhenUpdated] [datetime] DEFAULT (getdate()) NULL,
?CONSTRAINT [PK_UserDetails] PRIMARY KEY CLUSTERED?
(
	[UserID] ASC
)
)


CREATE TABLE [dbo].[UserRootList](
	[UserRootListID] [int] IDENTITY(1,1) NOT NULL,
	[UserID] [int] NOT NULL,
	[RootListLabel] [nvarchar](25) NOT NULL,
	[IsDefaultList] [bit] DEFAULT 0 NULL,
	[IsActive] [bit] DEFAULT ((1)) NULL,
	[WhoCreated] [nvarchar](50) NULL,
	[WhenCreated] [datetime] DEFAULT (getdate()) NULL,
	[WhoUpdated] [nvarchar](50) NULL,
	[WhenUpdated] [datetime] DEFAULT (getdate()) NULL,
?CONSTRAINT [PK_UserRootList] PRIMARY KEY CLUSTERED?
(
	[UserRootListID] ASC
)
)


CREATE TABLE [dbo].[UserRootListItem](
	[UserRootListItemID] [int] IDENTITY(1,1) NOT NULL,
	[UserRootListID] [int] NOT NULL,
	[RootID] [int] NOT NULL,
	[IsActive] [bit] DEFAULT ((1)) NULL,
	[WhoCreated] [nvarchar](50) NULL,
	[WhenCreated] [datetime] DEFAULT (getdate()) NULL,
	[WhoUpdated] [nvarchar](50) NULL,
	[WhenUpdated] [datetime] DEFAULT (getdate()) NULL,
?CONSTRAINT [PK_UserRootListItem] PRIMARY KEY CLUSTERED?
(
	[UserRootListItemID] ASC
)
)

CREATE TABLE [dbo].[StudySession](
	[StudySessionID] [int] IDENTITY(1,1) NOT NULL,
	[UserID] [int] NOT NULL,
	[OptionID] [int] NOT NULL,
	[OptionCategory] [nvarchar](15) NOT NULL,
	[RootID] [int] NOT NULL,
	[IsCorrect] [bit] NULL,
	[StudySessionGUID] [nvarchar](50) NULL,
	[IsSkipped] [bit] DEFAULT ((0)) NULL,
	[UserAnswer] [int] NULL,
	[IsActive] [bit] DEFAULT ((1)) NULL,
	[WhoCreated] [nvarchar](50) NULL,
	[WhenCreated] [datetime] DEFAULT (getdate()) NULL,
	[WhoUpdated] [nvarchar](50) NULL,
	[WhenUpdated] [datetime] DEFAULT (getdate()) NULL,
?CONSTRAINT [PK_StudySession] PRIMARY KEY CLUSTERED?
(
	[StudySessionID] ASC
)
)
        

May ways exist to carry this out, but we are building for economy and speed. This seemed to be the most straightforward way to move towards the goal.

Collation and Data Types

One would think that database collation plays a role in storing Arabic characters in a database. Based on research and experience, the data type is a determining factor.

1.)??If nvarchar data type, no collation required. This data type supports Unicode characters, so any character from any language, script, symbols, etc. will appear as intended so long as data is preceded with the N’ string literal.

INSERT INTO ArabicCharacters (ArabicCharacter) VALUES (N'?'

INSERT INTO ArabicCharacters (ArabicCharacter) VALUES ('?'))        

2.)??If varchar data type, collation required. For varchar columns, set the collation to an Arabic_CI_AI_KS_WS, or something similar, as varchar does not provide the same support as nvarchar.

Collation can be set at the server, database, or field level.

No alt text provided for this image

Collation can even be specified within DML and DDL statements on the fly.

--DML:
SELECT
ArabicCharacter
FROM ArabicCharacters
INNER JOIN
WikiArabicRoots ON
ArabicCharacter COLLATE SQL_Latin1_General_CP1_CI_AS = RootGroup COLLATE SQL_Latin1_General_CP1_CI_AS

--DDL:
ALTER TABLE ArabicCharacters ALTER COLUMN Definition NVARCHAR(100) COLLATE SQL_Latin1_General_CP1_CI_AS --or Arabic_CI_AI_KS_WST
ALTER TABLE WikiArabicRoots ALTER COLUMN InitialLetter NVARCHAR(100) COLLATE SQL_Latin1_General_CP1_CI_AS --or Arabic_CI_AI_KS_WST        

It is obvious when the collation or data types require attention. The data will show it. Question marks appear in the data, meaning a character is not available to represent the data.

No alt text provided for this image

In happier times:

No alt text provided for this image

UTF encoding works with international audiences, including URLs. These URLs will then return the same page:

A mapping is available for UTF-8 codes to Arabic characters for building a hyperlink in both Arabic and English characters.

  • D8A1= ?
  • D8A8 = ?
  • D8AF = ?

A valid URL is generated, but it is important to note that Wiktionary may not have an entry for all roots.

No alt text provided for this image

!????, they say. Patience, as an entry for this root may have not been created yet.

Onto the app.

Creating the App

A basic canvas app was created with a few basic screens.

  • Search page
  • List manager
  • Drill/Study page
  • Alphabet/Character reference

Each screen contributes to building up reading comprehension and recall.

Search page

This screen provides an interactive search for finding Arabic root words. Filters are compound and the results based on search criteria. The text search supports both Arabic and English terms.

No alt text provided for this image

Details:

  • Search for roots by characters or text
  • Browse root details in a click
  • Create lists to save groups of roots
  • Add roots to lists
  • Remove roots from lists

Some items of note:

  • An IPA pronunciation guide is also included in the root detail panel on the left-hand side.
  • The Definition is pulled dynamically from Wiktionary via Power Automate instant cloud flow.

List Manager

The app supports creating multiple lists and adding roots to them, so we will need a mechanism to manage that saved information.

Here, you can create and deactivate lists as well as edit list properties and their contents.

No alt text provided for this image

From right:

  • Remove roots from a list
  • Look up roots in Wiktionary
  • Add roots to another list
  • Move roots from one list to another
  • Get help on roots, report a bug, etc.
  • Drill/Quiz on selected root

Duplicate list name detection is built into the Save procedure when editing a list.

No alt text provided for this image

Lists allow for “bucketing” roots into categories and groups, which may be conducive to study. A method of management keeps your content organized as well as relevant.

Menu

Last but not least, a menu.

No alt text provided for this image

An Arabic/English toggle feature. Best attempt at an Arabic translation of English terms!

No alt text provided for this image

Alphabet/Character Reference

As helpful reference, a screen is dedicated to the alphabet and the connecting letters.

Audio is available for letter name pronunciation and its' sound.

No alt text provided for this image

This is especially useful when first learning how each letter connects – or does not connect - in words.

Drill/Study Page

A study buddy feature helps a user drill on the words saved to their lists. It is a simple multiple-choice question - a “study session” - or a mini-drill activity to test knowledge and help retain the information.

No alt text provided for this image

An activity log reports on the drilling activity and feeds a Power BI report.

No alt text provided for this image

Tracking Scores in Power BI

Over time, scores will add up in the StudySession table. Power BI can report on individual progress or at the class/group level.

No alt text provided for this image

The Power BI data model mirrors the SQL db model.

No alt text provided for this image

If we take advantage of the Power BI cloud service, we can set up alerts and subscriptions based on study session data.

No alt text provided for this image

Results can be monitored at “the teacher level" via these notifications.

No alt text provided for this image

Power Automate is also an option for these types of alerts.

No alt text provided for this image

This introduces measurement and may lead to accountability for study goals.

Next Steps/Suggestions

I can appreciate that some readers do not intend to pick up Arabic, may already know the language, or are just curious. Come as you are. A few thoughts/suggestions on what to do next.

  • Learn about another language. It does not need to be Arabic. Big world out there and a lot of folks speak these languages. Even a “hello” and “thank you” gets you places and keeps you in good graces.
  • Dive into data types and collations. If you see yourself working with a product that has an international user base, then knowing the nuances around supporting different character sets is useful.
  • Consider sending feedback. There may be some areas that require clarity or something that may not work. Get in touch with any questions or suggestions, either about the language or the technical approach.

Suggestions only, your call ultimately.

/**/

And that is one way to approach building a foreign language study utility using Power Query, SQL Server, and a canvas Power App. Some fun stuff. Questions and feedback welcome.

References

Historical

Arabic Language

Apps and Technical

Data and Characters

Additional links ad references inline.

Parvez Ghumra

Microsoft Business Applications MVP | Power Platform | Dynamics 365 CE | Azure - Developer | Technical Consultant | Technical Architect | Community Super User | User Group Co-Organizer | Blogger

1 年

要查看或添加评论,请登录

Dan Romano的更多文章

社区洞察

其他会员也浏览了