Substitution Ciphers

Ahlem Marzouk

Big Data Developer | Data Consultant || Certified PL-300: Power BI: Data Analyst Associate

Published: May 7, 2022

To encrypt a text sequence, we can use a substitution cipher, which replaces each character with another one. The technique relies on a map that relates each letter to a specific code; applying that map to a text gives the encoded text.

There are many ways to perform this encryption. In what follows I detail algorithms for three methods of encrypting:

Reverse alphabet.
Caesar cipher.
Bacon's code.

For each method we follow three steps. First, we create a dictionary that maps each letter of the alphabet to its code (or value) using one of the three methods listed above.

Second, I write an algorithm that encrypts a text using that dictionary. Third, we get a text by scraping a web site and test the code on it.

1. Reverse Alphabet:

The substitution in this method uses the alphabet in reverse order, so that "a" becomes "z", "b" becomes "y", and so on.

To encrypt whole sentences, the dictionary must also contain the space character, mapped to itself.
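A minimal sketch of how such a dictionary could be built, assuming Python's string.ascii_lowercase as the alphabet. The name alphabet_reverse matches the call used later in this article; the helper name build_reverse_alphabet is hypothetical.

```python
import string
from pprint import pprint


def build_reverse_alphabet():
    """Map each lowercase letter to its mirror image in the alphabet."""
    letters = string.ascii_lowercase
    # zip pairs 'a' with 'z', 'b' with 'y', and so on
    mapping = dict(zip(letters, reversed(letters)))
    # the space maps to itself so that word boundaries survive encryption
    mapping[" "] = " "
    return mapping


alphabet_reverse = build_reverse_alphabet()
pprint(alphabet_reverse)
```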

Output: 

{' ': ' ',
 'a': 'z',
 'b': 'y',
 'c': 'x',
 'd': 'w',
 'e': 'v',
 'f': 'u',
 'g': 't',
 'h': 's',
 'i': 'r',
 'j': 'q',
 'k': 'p',
 'l': 'o',
 'm': 'n',
 'n': 'm',
 'o': 'l',
 'p': 'k',
 'q': 'j',
 'r': 'i',
 's': 'h',
 't': 'g',
 'u': 'f',
 'v': 'e',
 'w': 'd',
 'x': 'c',
 'y': 'b',
 'z': 'a'}

Now we will create an algorithm that changes a plain text to an encrypted text using this technique.

Before applying this function to a text, we first remove trailing whitespace and convert the text to lowercase using "rstrip().lower()". We also need a regular expression (regex), a sequence of characters that specifies a search pattern in the text. In the encrypted output further down, digits and punctuation appear as spaces, so the pattern evidently matches every character that is not a letter.

Output:

'zsovn'

Now we need a text to test our code. I will get it by scraping a web site, Wikipedia. For this we need requests to fetch the URL and BeautifulSoup to parse the HTML code.
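A possible sketch of this scraping step, assuming the third-party packages requests and beautifulsoup4 are installed. The URL is an assumption based on the text shown in the output, and the helper names extract_paragraphs and fetch_text are hypothetical.

```python
import requests
from bs4 import BeautifulSoup

# assumption: the scraped page is the Wikipedia article on natural language processing
URL = "https://en.wikipedia.org/wiki/Natural_language_processing"


def extract_paragraphs(html):
    """Return the text of all <p> elements, one paragraph per line."""
    soup = BeautifulSoup(html, "html.parser")
    return "\n".join(p.get_text().strip() for p in soup.find_all("p"))


def fetch_text(url):
    """Download a page and return the text of its paragraphs."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return extract_paragraphs(response.text)


# Usage (needs network access):
# text = fetch_text(URL)
```

Keeping the parsing separate from the download makes it possible to try the parser on a small HTML string before touching the network.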


Output:

'Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data.  The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves.\nChallenges in natural language processing frequently involve speech recognition, natural-language understanding, and natural-language generation.\nNatural language processing has its roots in the 1950s. Already in 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence, though at the time that was not articulated as a problem separate from artificial intelligence. The proposed test includes a task that involves the automated interpretation and generation of natural language.\nThe premise of symbolic NLP is well-summarized by John Searle\'s Chinese room experiment: Given a collection of rules (e.g., a Chinese phrasebook, with questions and matching answers), the computer emulates natural language understanding (or other NLP tasks) by applying those rules to the data it confronts.\nUp to the 1980s, most natural language processing systems were based on complex sets of hand-written rules.  Starting in the late 1980s, however, there was a revolution in natural language processing with the introduction of machine learning algorithms for language processing.  This was due to both the steady increase in computational power (see Moore\'s law) and the gradual lessening of the dominance of Chomskyan theories of linguistics (e.g. 
transformational grammar), whose theoretical underpinnings discouraged the sort of corpus linguistics that underlies the machine-learning approach to language processing.[6]\nIn the 2010s, representation learning and deep neural network-style machine learning methods became widespread in natural language processing. That popularity was due partly to a flurry of results showing that such techniques[7][8] can achieve state-of-the-art results in many natural language tasks, e.g., in language modeling[9] and parsing.[10][11] This is increasingly important in medicine and healthcare, where NLP helps analyze notes and text in electronic health records that would otherwise be inaccessible for study when seeking to improve care.[12]\nIn the early days, many language-processing systems were designed by symbolic methods, i.e., the hand-coding of a set of rules, coupled with a dictionary lookup:[13][14] such as by writing grammars or devising heuristic rules for stemming....

Above is the text that we will use. Now we encrypt it using our code.

encryptPlaintext(text,alphabet_reverse,_letters)

Output: 

'mzgfizo ozmtfztv kilxvhhrmt  mok  rh z hfyurvow lu ormtfrhgrxh  xlnkfgvi hxrvmxv  zmw zigrurxrzo rmgvoortvmxv xlmxvimvw drgs gsv rmgvizxgrlmh yvgdvvm xlnkfgvih zmw sfnzm ozmtfztv  rm kzigrxfozi sld gl kiltizn xlnkfgvih gl kilxvhh zmw zmzobav ozitv znlfmgh lu mzgfizo ozmtfztv wzgz   gsv tlzo rh z xlnkfgvi xzkzyov lu  fmwvihgzmwrmt  gsv xlmgvmgh lu wlxfnvmgh  rmxofwrmt gsv xlmgvcgfzo mfzmxvh lu gsv ozmtfztv drgsrm gsvn  gsv gvxsmloltb xzm gsvm zxxfizgvob vcgizxg rmulinzgrlm zmw rmhrtsgh xlmgzrmvw rm gsv wlxfnvmgh zh dvoo zh xzgvtlirav zmw litzmrav gsv wlxfnvmgh gsvnhvoevh  xszoovmtvh rm mzgfizo ozmtfztv kilxvhhrmt uivjfvmgob rmeloev hkvvxs ivxltmrgrlm  mzgfizo ozmtfztv fmwvihgzmwrmt  zmw mzgfizo ozmtfztv tvmvizgrlm  mzgfizo ozmtfztv kilxvhhrmt szh rgh illgh rm gsv     h  zoivzwb rm       zozm gfirmt kfyorhsvw zm zigrxov grgovw  xlnkfgrmt nzxsrmvib zmw rmgvoortvmxv  dsrxs kilklhvw dszg rh mld xzoovw gsv gfirmt gvhg zh z xirgvirlm lu rmgvoortvmxv  gslfts zg gsv grnv gszg dzh mlg zigrxfozgvw zh z kilyovn hvkzizgv uiln zigrurxrzo rmgvoortvmxv  gsv kilklhvw gvhg rmxofwvh z gzhp gszg rmeloevh gsv zfglnzgvw rmgvikivgzgrlm zmw tvmvizgrlm lu mzgfizo ozmtfztv  gsv kivnrhv lu hbnylorx mok rh dvoo hfnnziravw yb qlsm hvziov h xsrmvhv illn vckvirnvmg  trevm z xloovxgrlm lu ifovh  v t   z xsrmvhv ksizhvyllp  drgs jfvhgrlmh zmw nzgxsrmt zmhdvih   gsv xlnkfgvi vnfozgvh mzgfizo ozmtfztv fmwvihgzmwrmt  li lgsvi mok gzhph  yb zkkobrmt gslhv ifovh gl gsv wzgz rg xlmuilmgh  fk gl gsv     h  nlhg mzgfizo ozmtfztv kilxvhhrmt hbhgvnh dviv yzhvw lm xlnkovc hvgh lu szmw dirggvm ifovh   hgzigrmt rm gsv ozgv     h  sldvevi  gsviv dzh z ivelofgrlm rm mzgfizo ozmtfztv kilxvhhrmt drgs gsv rmgilwfxgrlm lu nzxsrmv ovzimrmt zotlirgsnh uli ozmtfztv kilxvhhrmt   gsrh dzh wfv gl ylgs gsv hgvzwb rmxivzhv rm xlnkfgzgrlmzo kldvi  hvv nlliv h ozd  zmw gsv tizwfzo ovhhvmrmt lu gsv wlnrmzmxv lu xslnhpbzm gsvlirvh lu ormtfrhgrxh  v t  gizmhulinzgrlmzo tiznnzi   dslhv gsvlivgrxzo fmwvikrmmrmth wrhxlfiztvw gsv 
hlig lu xlikfh ormtfrhgrxh gszg fmwviorvh gsv nzxsrmv ovzimrmt zkkilzxs gl ozmtfztv kilxvhhrmt     rm gsv     h  ivkivhvmgzgrlm ovzimrmt zmw wvvk mvfizo mvgdlip hgbov nzxsrmv ovzimrmt nvgslwh yvxznv drwvhkivzw rm mzgfizo ozmtfztv kilxvhhrmt  gszg klkfozirgb dzh wfv kzigob gl z uofiib lu ivhfogh hsldrmt gszg hfxs gvxsmrjfvh       xzm zxsrvev hgzgv lu gsv zig ivhfogh rm nzmb mzgfizo ozmtfztv gzhph  v t   rm ozmtfztv nlwvormt    zmw kzihrmt          gsrh rh rmxivzhrmtob rnkligzmg rm nvwrxrmv zmw svzogsxziv  dsviv mok svokh zmzobav mlgvh zmw gvcg rm vovxgilmrx svzogs ivxliwh gszg dlfow lgsvidrhv yv rmzxxvhhryov uli hgfwb dsvm hvvprmt gl rnkilev xziv      rm gsv vziob wzbh  nzmb ozmtfztv kilxvhhrmt hbhgvnh dviv wvhrtmvw yb hbnylorx nvgslwh  r v   gsv szmw xlwrmt lu z hvg lu ifovh  xlfkovw drgs z wrxgrlmzib ollpfk          hfxs zh yb dirgrmt tiznnzih li wverhrmt svfirhgrx ifovh uli hgvnnrmt  nliv ivxvmg hbhgvnh yzhvw lm nzxsrmv ovzimrmt zotlirgsnh szev nzmb zwezmgztvh levi szmw kilwfxvw ifovh   wvhkrgv gsv klkfozirgb lu nzxsrmv ovzimrmt rm mok ivhvzixs  hbnylorx nvgslwh ziv hgroo        xlnnlmob fhvw  hrmxv gsv hl xzoovw  hgzgrhgrxzo ivelofgrlm          rm gsv ozgv     h zmw nrw     h  nfxs mzgfizo ozmtfztv kilxvhhrmt ivhvzixs szh ivorvw svzerob lm nzxsrmv ovzimrmt  gsv nzxsrmv ovzimrmt kzizwrtn xzooh rmhgvzw uli fhrmt hgzgrhgrxzo rmuvivmxv gl zfglnzgrxzoob ovzim hfxs ifovh gsilfts gsv zmzobhrh lu ozitv xlikliz  gsv kofizo ulin lu xlikfh  rh z hvg lu wlxfnvmgh  klhhryob drgs sfnzm li xlnkfgvi zmmlgzgrlmh  lu gbkrxzo ivzo dliow vcznkovh  nzmb wruuvivmg xozhhvh lu nzxsrmv ovzimrmt zotlirgsnh szev yvvm zkkorvw gl mzgfizo ozmtfztv kilxvhhrmt gzhph  gsvhv zotlirgsnh gzpv zh rmkfg z ozitv hvg lu  uvzgfivh  gszg ziv tvmvizgvw uiln gsv rmkfg wzgz  rmxivzhrmtob  sldvevi  ivhvzixs szh ulxfhvw lm hgzgrhgrxzo nlwvoh  dsrxs nzpv hlug  kilyzyrorhgrx wvxrhrlmh yzhvw lm zggzxsrmt ivzo ezofvw dvrtsgh gl vzxs rmkfg uvzgfiv  xlnkovc ezofvw vnyvwwrmth      zmw mvfizo mvgdliph rm tvmvizo 
szev zohl yvvm kilklhvw  uli v t  hkvvxs ...

To get back our plain text, we decrypt the cipher text with the following piece of code. First I will test it on the word that I encrypted above, then I will apply it to the cipher text just created.
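A minimal sketch of the decryption, assuming the same dictionary as for encryption. The function name decryptCiphertext is hypothetical. The idea is to invert the dictionary so that each code points back to its letter.

```python
import string

letters = string.ascii_lowercase
alphabet_reverse = dict(zip(letters, reversed(letters)))
alphabet_reverse[" "] = " "


def decryptCiphertext(ciphertext, mapping):
    """Invert the substitution dictionary and apply it to the cipher text."""
    inverse = {code: letter for letter, code in mapping.items()}
    return "".join(inverse[char] for char in ciphertext)


print(decryptCiphertext("zsovn", alphabet_reverse))  # prints ahlem
```

Uppercase letters, digits and punctuation were discarded before encryption, so decryption can only recover the lowercase letters and spaces.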

2. Caesar Cipher:

This method uses a shift "p": each letter of the alphabet is replaced by the letter that comes p positions before it, wrapping around from "a" back to "z". The dictionary shown below corresponds to p = 4.

First we create a dictionary containing each letter with its new value, and then we use it to encode a plain text.
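One way to build this dictionary is sketched below. The names build_caesar and alphabet_caesar are hypothetical, and the shift value 4 is chosen because it reproduces the mapping of "a" to "w" in the output.

```python
import string
from pprint import pprint


def build_caesar(p):
    """Map each letter to the letter p positions before it."""
    letters = string.ascii_lowercase
    # the modulo wraps around the start of the alphabet
    return {
        letter: letters[(index - p) % len(letters)]
        for index, letter in enumerate(letters)
    }


alphabet_caesar = build_caesar(4)
pprint(alphabet_caesar)
```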

Output: 


{'a': 'w',
 'b': 'x',
 'c': 'y',
 'd': 'z',
 'e': 'a',
 'f': 'b',
 'g': 'c',
 'h': 'd',
 'i': 'e',
 'j': 'f',
 'k': 'g',
 'l': 'h',
 'm': 'i',
 'n': 'j',
 'o': 'k',
 'p': 'l',
 'q': 'm',
 'r': 'n',
 's': 'o',
 't': 'p',
 'u': 'q',
 'v': 'r',
 'w': 's',
 'x': 't',
 'y': 'u',
 'z': 'v'}

Let's now use this dictionary to encrypt the same text we used before.
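A short sketch of that encryption step, under the assumption that characters missing from the Caesar dictionary (it has no entry for the space) are passed through unchanged. The function name encrypt_caesar is hypothetical.

```python
import string

letters = string.ascii_lowercase
p = 4  # shift that reproduces the dictionary shown above
alphabet_caesar = {c: letters[(i - p) % 26] for i, c in enumerate(letters)}


def encrypt_caesar(text, mapping):
    """Substitute each letter; leave any other character as it is."""
    # dict.get falls back to the character itself for spaces, digits, punctuation
    return "".join(mapping.get(char, char) for char in text.lower())


print(encrypt_caesar("hello world", alphabet_caesar))  # prints dahhk sknhz
```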

3. Bacon's code:

Bacon’s Code replaces each letter of the English alphabet with a 5-letter sequence of "A"s and "B"s. The sequences start at "AAAAA" and count upward in binary, with "A" standing for 0 and "B" for 1.

So, in Bacon’s Code, "A = AAAAA", "B = AAAAB", "C = AAABA", "D = AAABB" and so on. In the 24-letter version used here, "i" and "j" share one code, and so do "u" and "v". Let's start, then, with the dictionary that will help us encrypt the text.
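The dictionary could be generated as in the sketch below, assuming the 24-letter variant in which "j" reuses the code of "i" and "v" reuses the code of "u". The names build_bacon_code and bacon_code are hypothetical.

```python
import string
from pprint import pprint


def build_bacon_code():
    """Map each letter to a 5-character code made of 'A' and 'B'."""
    codes = {}
    index = 0
    for letter in string.ascii_lowercase:
        if letter in ("j", "v"):
            # 'j' and 'v' share the code of the letter just before them
            codes[letter] = codes[chr(ord(letter) - 1)]
            continue
        # write the index as 5 binary digits, then turn 0 into A and 1 into B
        bits = format(index, "05b")
        codes[letter] = bits.replace("0", "A").replace("1", "B")
        index += 1
    return codes


bacon_code = build_bacon_code()
pprint(bacon_code)
```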

Output:

{'a': 'AAAAA',
 'b': 'AAAAB',
 'c': 'AAABA',
 'd': 'AAABB',
 'e': 'AABAA',
 'f': 'AABAB',
 'g': 'AABBA',
 'h': 'AABBB',
 'i': 'ABAAA',
 'j': 'ABAAA',
 'k': 'ABAAB',
 'l': 'ABABA',
 'm': 'ABABB',
 'n': 'ABBAA',
 'o': 'ABBAB',
 'p': 'ABBBA',
 'q': 'ABBBB',
 'r': 'BAAAA',
 's': 'BAAAB',
 't': 'BAABA',
 'u': 'BAABB',
 'v': 'BAABB',
 'w': 'BABAA',
 'x': 'BABAB',
 'y': 'BABBA',
 'z': 'BABBB'}

This is the code that encrypts a text.
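A sketch of what this encryption could look like, assuming each code is followed by a "-" separator as in the output, and that characters without a code are skipped. The function name encrypt_bacon and the test word "ahlem" are assumptions.

```python
import string

# rebuild the 24-letter Bacon dictionary ('j' shares with 'i', 'v' with 'u')
to_ab = str.maketrans("01", "AB")
bacon_code = {}
index = 0
for letter in string.ascii_lowercase:
    if letter in "jv":
        bacon_code[letter] = bacon_code[chr(ord(letter) - 1)]
    else:
        bacon_code[letter] = format(index, "05b").translate(to_ab)
        index += 1


def encrypt_bacon(text, codes):
    """Replace each letter by its code followed by '-'; skip other characters."""
    return "".join(codes[char] + "-" for char in text.lower() if char in codes)


print(encrypt_bacon("ahlem", bacon_code))
```

Because two pairs of letters share a code, decoding is ambiguous for "i"/"j" and "u"/"v"; the reader has to pick the right letter from context.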

Output:


'AAAAA-AABBB-ABABA-AABAA-ABABB-'

Conclusion:

In this article we applied three substitution ciphers: the reverse alphabet, the Caesar cipher and Bacon's code.
