Points to ponder - Python - spaces vs tabs - unexpected issue

Python, as everyone knows, uses indentation to delimit blocks of code. In my early programming days, when writing code in languages like C, C++ and Java, I used tabs for indentation, because I found pressing the space bar multiple times to be a painful task. Over time, however, I started preferring spaces. One issue with tabs is that different people use different tab settings: the width of a tab varies from two spaces to four or eight. Later, after I started using vi (and appreciating it; I used to dislike, even hate, vi, but that is another story), I found that vi can be configured to replace tabs with spaces, so the headache of typing multiple spaces was taken away by the appropriate settings.

Back to our current situation.

These days, I am developing code using Databricks. When developing code in Python, I use Notepad++, with the setting turned on to display a space character as an orange dot and a tab character as an orange arrow. Such a feature is not available in Databricks (or if it is, I am not aware of it).

Recently this created a problem. I had code that performed an activity on a table, followed by a logging step. As it happens, I decided to replace the logging step, which was a notebook run command, with a function call. When I made the change and executed it, the code crashed. Upon investigation, I found that the cause was tab characters on the line with the logger statement. The unexpected side effect was that the table activity went through, after which Python threw an indentation error. Because of this error, control never reached the section that reverses the table activity, so the reversal was not performed. I had to add code for a one-time execution to reverse the table activity and then continue with standard execution.

Moral of the story 1 -- proper exception handling and a graceful exit are very important
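As a minimal sketch of this moral, the reversal step can be placed in a `finally` block so that it runs even when the step before it fails. The names `run_job`, `table_activity`, `broken_log_step` and `reverse_table_activity` are hypothetical stand-ins, not the actual job code, and the failure is simulated with a plain `RuntimeError`. Note that this only guards against errors raised while a step is running; it is an assumption that the failing step is loaded separately from the code that calls it.

```python
def run_job(table_activity, log_step, reverse_table_activity):
    """Run the table activity, then the logging step, and always reverse."""
    table_activity()
    try:
        log_step()
    finally:
        # Runs whether log_step succeeded or raised, so the table
        # is never left in its intermediate state.
        reverse_table_activity()


# Hypothetical stand-ins that record what happened, in order.
events = []


def table_activity():
    events.append("activity")


def broken_log_step():
    # Simulates the logging step crashing at run time.
    raise RuntimeError("logging failed")


def reverse_table_activity():
    events.append("reversed")


try:
    run_job(table_activity, broken_log_step, reverse_table_activity)
except RuntimeError as exc:
    # Graceful exit: the error is reported after the reversal has run.
    events.append(f"error: {exc}")

print(events)
```

The error still reaches the caller, so the failure is not hidden; the only difference is that the reversal has already happened by then.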

Moral of the story 2 -- there has to be a setting in the Databricks editor to show tabs and spaces (similar to what Notepad++ has).
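Until such a setting turns up, a small check can stand in for it. The sketch below is only an illustration: `find_tab_indented_lines` is a hypothetical helper, and it assumes the code to be checked is available as a plain string (for example, pasted from a cell). It reports every line whose leading whitespace contains a tab character.

```python
def find_tab_indented_lines(source):
    """Return (line_number, line) pairs whose indentation contains a tab."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Leading whitespace is whatever lstrip removes from the front.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            hits.append((number, line))
    return hits


# Hypothetical sample: line 2 is indented with spaces, line 3 with a tab.
sample = "def log(msg):\n    print(msg)\n\tprint('done')\n"

tab_lines = find_tab_indented_lines(sample)
for number, line in tab_lines:
    # repr() makes the tab visible as \t, much like the arrow in Notepad++.
    print(number, repr(line))
```

Running a check like this before executing a changed cell would have pointed at the logger line directly.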

#databricks #python #tabs #spaces #points_to_ponder
