Developer Journey: Learn Character functions

A few days ago, I wrote about programming for things you think will never happen, because you are wrong.

Here is another one. And here is my TL;DR: if you are new to any programming language, the quickest way to get 50% better is to learn the most common character and numeric functions.

Learn character functions you think you won't use, because you're wrong

Decades ago, when I first saw SAS character functions, I could not imagine what use these would ever be. I scoffed.

A function that allows me to just read a variable from the second character to the end of the string? Why would I ever need that?

Quote attributable to me, an idiot.

Oddly enough, just last week, I had to merge two files: one had the population for each state and the other had the prevalence rates for various chronic diseases. In one, the value for state was written by normal human beings, like "North Dakota", and in the other file, for reasons known only to weirdos, it was ".North Dakota". Also, the variable state had a length of 20 in the first data set and 21 in the second. So ...

state_name = substr(state, 2) ;        

solved my problem. That was in SAS and it reads the variable from the second character to the end of the string. Your preferred language may have a different format but many languages (not all) have a substring function.
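For readers who do not use SAS, here is a minimal sketch of the same idea in Python. The function name drop_first_char is made up for illustration, and the one thing to watch is that Python counts positions from 0 where SAS counts from 1, so "from the second character" becomes a slice starting at index 1.

```python
def drop_first_char(state: str) -> str:
    """Return the value from the second character to the end of the string."""
    # Python indexes from 0, so the second character sits at index 1.
    return state[1:]


state_name = drop_first_char(".North Dakota")
print(state_name)  # North Dakota
```

Once the stray first character is gone, the state values in the two files can be compared directly.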

Other character functions you'll need are those that:

  • convert the value to upper case
  • trim leading blanks
  • trim white space
  • truncate a string
  • replace characters in a string
  • find if a value appears within a string
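As a rough illustration of that list, here is how each item might look in Python, using only built-in string methods. The sample value and variable names are invented for the example; other languages offer the same operations under different names.

```python
raw = "   Turtle Mountain Ojibwe  "

upper = raw.upper()            # convert the value to upper case
no_leading = raw.lstrip()      # trim leading blanks only
trimmed = raw.strip()          # trim white space from both ends
truncated = trimmed[:6]        # truncate to the first 6 characters
replaced = trimmed.replace("Ojibwe", "Chippewa")  # replace characters
found = "Mountain" in trimmed  # find if a value appears within the string

print(trimmed)    # Turtle Mountain Ojibwe
print(truncated)  # Turtle
print(replaced)   # Turtle Mountain Chippewa
print(found)      # True
```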

Let me give you an example of the sort of thing I run into all of the time. I want to get a frequency distribution of the tribal enrollment of children who use our games. (My day job is president of a company that makes educational games and the tools to make them. Many of our games have an Indigenous history story line.)

Unfortunately for my sanity, children will enter their tribe on a form like this:

  • Turtle Mountain Band of Chippewa Indians
  • Turtle Mountain Ojibwe
  • TMBCI
  • turtle mountain
  • turtle mountain Ojibwe
  • the turtle mountain band
  • turtle mountain tribe

... and 50 other variations. If I used just two functions to convert those responses to upper case and then search for TURTLE MOUNTAIN, I'd match six out of seven of those variations and only need to use an IF statement to recode the seventh, TMBCI.
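The two-function approach can be sketched in Python like this. It is an illustration, not the code used on the real data: the function name recode_tribe and the choice of "TURTLE MOUNTAIN" as the recoded label are assumptions, and the responses are the seven variations listed above.

```python
from collections import Counter


def recode_tribe(response: str) -> str:
    """Collapse spelling and capitalization variants into one label."""
    cleaned = response.strip().upper()   # function 1: upper case
    if "TURTLE MOUNTAIN" in cleaned:     # function 2: search within the string
        return "TURTLE MOUNTAIN"
    if cleaned == "TMBCI":               # the one abbreviation the search misses
        return "TURTLE MOUNTAIN"
    return cleaned                       # leave anything else as entered


responses = [
    "Turtle Mountain Band of Chippewa Indians",
    "Turtle Mountain Ojibwe",
    "TMBCI",
    "turtle mountain",
    "turtle mountain Ojibwe",
    "the turtle mountain band",
    "turtle mountain tribe",
]

# Frequency distribution of the recoded values.
counts = Counter(recode_tribe(r) for r in responses)
print(counts)  # Counter({'TURTLE MOUNTAIN': 7})
```

The search handles six of the seven responses, and the single equality check picks up the abbreviation.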

It's not just for tribal enrollment. When we ask for a teacher's name or city, the variation in spelling and capitalization is wild.

There were many functions that, when I was younger and dumber, I thought I would never use. The fact is, those functions exist for a reason.

One probable reason is that people are stupid. That is why, when the question is, "How many inches are in a foot?", they answer "12 inches" or "12 in." or 12", and now you, gentle programmer person, need a function that removes every character that is not a digit from their response. Otherwise, when your program checks whether their answer matches the correct answer, which is 12, they will be told they are incorrect and complain that your program sucks.
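One way to do that clean-up, sketched in Python, is to keep only the digit characters before comparing. The function name digits_only is hypothetical, and the sample answers are the three from the paragraph above.

```python
def digits_only(answer: str) -> str:
    """Keep only the digit characters in a response."""
    return "".join(ch for ch in answer if ch.isdigit())


correct = "12"
answers = ["12 inches", "12 in.", '12"']

for answer in answers:
    # Compare the cleaned response, not the raw text the user typed.
    print(repr(answer), digits_only(answer) == correct)
```

All three answers reduce to "12" and match. Note that this simple version also throws away decimal points and minus signs, so it only suits questions whose answer is a whole number.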

Learn the most common functions and then learn the less common ones. It will save you a ton of time.
