Static Code Analysis

For a developer, knowing how to test and debug code is essential, because errors and runtime failures can make a program useless. Static code analysis is a technique for examining a program's source code, without executing it, to estimate how it will behave at run time. Before a computer can "understand" and execute a piece of code, the code must go through several complex transformations. These are the steps involved:

1. Scanning

To make sense of a piece of code, a compiler first splits it into many smaller pieces known as "tokens."
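As a rough illustration of the scanning step, the sketch below uses Python's standard tokenize module to split one line of code into tokens. The sample source line and the variable names are made up for this example.

```python
import io
import tokenize

# A made-up line of code to scan.
source = "total = price * 2\n"

# generate_tokens expects a readline-style callable that returns text lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    # tok_name maps the numeric token type to a readable name such as NAME or OP.
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

Each token carries its type, its text, and its start and end position, which is what a checker later uses to point at a specific line.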

2. Parsing

A parser takes these tokens, checks that the order in which they occur follows the rules of the language's grammar, and organizes them into a tree-like structure that represents the program's high-level structure. This structure is called an "Abstract Syntax Tree" (AST).
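A minimal sketch of the parsing step, assuming Python's standard ast module is the parser in question: ast.parse builds the tree, and a grammatically invalid input raises SyntaxError. The sample statements are invented for illustration.

```python
import ast

# Parsing valid code yields a Module node whose body holds the statements.
tree = ast.parse("total = price * 2")
print(ast.dump(tree))

# The assignment's right-hand side is a binary operation node.
assignment = tree.body[0]
print(type(assignment).__name__, type(assignment.value).__name__)

# Tokens in an order the grammar does not allow are rejected by the parser.
try:
    ast.parse("total = price *")
    parsed_ok = True
except SyntaxError:
    parsed_ok = False
print("second snippet parsed:", parsed_ok)
```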

3. AST analysis

To analyze a syntax tree, we need an AST "walker": an object that makes traversing the tree easier. Python's ast module provides two walkers: ast.NodeVisitor, which does not allow the input tree to be modified, and ast.NodeTransformer, which does. When traversing a syntax tree, we are usually interested in only a few node types.

To analyze a particular node type, the walker must implement a custom method for it, named visit_ followed by the node class name. This approach is often called the "visitor" pattern. A top-level visit method dispatches to these custom methods as the tree is walked recursively.
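The visitor idea can be sketched with a small ast.NodeVisitor subclass. The class name FunctionNameCollector and the sample source are hypothetical; only ast.NodeVisitor, visit, and generic_visit come from the standard library.

```python
import ast


class FunctionNameCollector(ast.NodeVisitor):
    """Collects the names of all function definitions in a tree."""

    def __init__(self):
        self.names = []

    def visit_FunctionDef(self, node):
        # Called by visit() for every ast.FunctionDef node it reaches.
        self.names.append(node.name)
        # Keep descending so nested functions are visited too.
        self.generic_visit(node)


source = """
def outer():
    def inner():
        pass
"""

collector = FunctionNameCollector()
collector.visit(ast.parse(source))
print(collector.names)
```

Node types without a matching visit_ method fall through to generic_visit, which simply walks their children.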

A. Detecting single quotations

The names of the input files are read as command-line arguments. Each file name is passed to the check function, which creates tokens for the file and sends them to the find_violations method. The find_violations method iterates through the list of tokens, looking for string tokens whose text begins with a single quote (' or '''). When it finds one, it marks the line by adding it to self.violations. The report function then reads everything in self.violations and prints each entry with a helpful error message.
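The checker described above might look roughly like the sketch below. The method names find_violations and report and the attribute violations follow the text. The class name, the check_source helper (which takes source text instead of reading a file from the command line), and the message wording are assumptions made to keep the example self-contained.

```python
import io
import tokenize


class SingleQuoteChecker:
    def __init__(self):
        self.violations = []  # (filename, line number) pairs

    def check_source(self, filename, source):
        # Hypothetical stand-in for the file-reading check function.
        tokens = tokenize.generate_tokens(io.StringIO(source).readline)
        self.find_violations(filename, tokens)

    def find_violations(self, filename, tokens):
        for tok in tokens:
            # Covers both '...' and '''...''' string literals.
            if tok.type == tokenize.STRING and tok.string.startswith("'"):
                self.violations.append((filename, tok.start[0]))

    def report(self):
        return [
            f"{filename}:{line}: single-quoted string found"
            for filename, line in self.violations
        ]


sample = "\n".join(['greeting = "hello"', "name = 'world'", ""])
checker = SingleQuoteChecker()
checker.check_source("demo.py", sample)
print("\n".join(checker.report()))
```

This sketch ignores prefixed literals such as r'...' or b'...', and on newer Python versions f-strings are tokenized differently, so a production checker would need extra cases.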

B. Detecting use of list()

When we visit a Call node, we first try to obtain the name of the function being called.

If it exists, we check whether it equals list.__name__.

If it does, we can be certain that a call to list(...) is being made.

We then confirm that no arguments are passed to list, which means the call is exactly list(). If so, we mark the line as a violation.
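These steps can be sketched as an ast.NodeVisitor with a visit_Call method. The class name EmptyListCallChecker and the sample source are hypothetical, and the sketch only recognizes calls through the plain name list, not aliases.

```python
import ast


class EmptyListCallChecker(ast.NodeVisitor):
    def __init__(self):
        self.violations = []  # line numbers of list() calls

    def visit_Call(self, node):
        # Only plain names (ast.Name) have an id; attribute calls such as
        # obj.method() do not, so name is None for them.
        name = getattr(node.func, "id", None)
        if name == list.__name__ and not node.args and not node.keywords:
            self.violations.append(node.lineno)
        # Continue into arguments, which may contain further calls.
        self.generic_visit(node)


source = "a = list()\nb = list(range(3))\nc = []\n"
checker = EmptyListCallChecker()
checker.visit(ast.parse(source))
print(checker.violations)
```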

C. Detecting an excessive number of nested loops

When the visit method (inherited from the BaseChecker class) is invoked, it looks for ast.For nodes. As soon as it finds one, it calls the visit_For method with the default keyword argument parent=True. The parent variable is a flag that marks the outermost loop: when it is set, we initialize the current loop depth (kept on self) to 1; otherwise we increment the depth by one. We then search the body of this loop recursively for child ast.For nodes, and for each one we call visit_For with parent=False. When the traversal finishes, we check whether the loop depth has risen above 3. If it has, we report a violation and reset the loop depth to 0.
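The sketch below is a simplified variant of this check, not the implementation described above: instead of a parent flag, it keeps a depth counter that is incremented on entering a for loop and decremented on leaving it, so sibling loops do not inflate the count. The class name, the MAX_DEPTH constant, and the sample source are assumptions; the threshold of 3 comes from the text.

```python
import ast

MAX_DEPTH = 3  # threshold taken from the description above


class NestedLoopChecker(ast.NodeVisitor):
    def __init__(self):
        self.depth = 0
        self.violations = []  # line numbers of loops nested too deeply

    def visit_For(self, node):
        self.depth += 1
        # Report the first loop on each path that crosses the threshold.
        if self.depth == MAX_DEPTH + 1:
            self.violations.append(node.lineno)
        self.generic_visit(node)  # descend into the loop body
        self.depth -= 1  # leaving this loop


source = """\
for a in x:
    for b in x:
        for c in x:
            for d in x:
                pass
for e in x:
    for f in x:
        pass
"""

checker = NestedLoopChecker()
checker.visit(ast.parse(source))
print(checker.violations)
```

Because generic_visit walks every child, loops hidden inside if blocks or with blocks are also counted, which a search limited to the direct loop body would miss.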

D. Identifying unused imports

In the first pass, we scan all the nodes where imports may be declared (ast.Import, ast.ImportFrom) and gather the names of all imported modules. In the same pass, we also populate a set with all the names used in the file by defining a visitor for ast.Name. The second pass determines which names were imported but never used. For each such name, we then print an error message.

When we see an Import or ImportFrom node, we save its name in a set.

To collect the names used in a file, we visit ast.Name nodes and check whether a value is being read from the name, which implies a reference to an existing name rather than the creation of a new one (an imported name must already exist). If so, we add the name to the set.

The report function iterates through all the import names in a file and checks whether each is present in the set of used names. If not, it prints an error message informing the user of the violation.
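A minimal sketch of this two-pass idea follows. The class name UnusedImportChecker, the attribute names, and the message format are hypothetical. It stores imports in a dict rather than a set so that the line number can be reported, which is a small departure from the description.

```python
import ast


class UnusedImportChecker(ast.NodeVisitor):
    def __init__(self):
        self.imported = {}  # bound name -> line number of the import
        self.used = set()   # names that are read somewhere in the file

    def visit_Import(self, node):
        for alias in node.names:
            # "import a.b" binds "a"; "import a as b" binds "b".
            bound = alias.asname or alias.name.split(".")[0]
            self.imported[bound] = node.lineno

    def visit_ImportFrom(self, node):
        for alias in node.names:
            self.imported[alias.asname or alias.name] = node.lineno

    def visit_Name(self, node):
        # ast.Load means the name is being read, not assigned.
        if isinstance(node.ctx, ast.Load):
            self.used.add(node.id)

    def report(self):
        ordered = sorted(self.imported.items(), key=lambda item: (item[1], item[0]))
        return [
            f"line {line}: '{name}' imported but unused"
            for name, line in ordered
            if name not in self.used
        ]


source = "import os\nimport sys\nfrom math import sqrt, pi\nprint(os.getcwd(), sqrt(4))\n"
checker = UnusedImportChecker()
checker.visit(ast.parse(source))  # first pass: collect imports and used names
print("\n".join(checker.report()))  # second pass: compare the two
```

The sketch does not account for names re-exported through __all__ or used only inside strings, so it can report false positives on real projects.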

This is how a basic static code analysis is done. It is not an exact analysis, but it helps catch common mistakes before the code is run, including some errors that a developer would otherwise encounter only at run time.
