Day 82 of #100DaysOfLearning
CodeQL

If you have been following my recent posts, you have probably noticed that I have been learning about CodeQL lately.

Today I learned about the CodeQL CLI, which lets you run CodeQL analysis from the command line.

What is CodeQL?

CodeQL is a query language and semantic code analysis engine developed by GitHub for analyzing code and identifying potential security vulnerabilities or coding errors.

  • It allows developers to write queries that can analyze codebases to find specific patterns, such as potential security flaws, bad coding practices, or logic errors.
  • The queries are written in QL, a declarative query language with SQL-like from/where/select syntax that can reason about code structure, data flow, control flow, and other semantic programming concepts.
  • CodeQL supports analyzing code written in many popular programming languages like JavaScript, Python, Java, C/C++, C#, Go and more.
  • It integrates into the development workflow, allowing queries to run as part of CI/CD pipelines or code editors.
  • CodeQL queries are shareable and can be combined into query suites to comprehensively analyze projects.
  • It powers the code scanning security feature in GitHub to help find vulnerabilities across repositories.

CodeQL CLI

github/codeql-cli-binaries: Binaries for the CodeQL CLI

The CodeQL CLI allows you to run CodeQL queries and analysis from the command line on your local machine or CI/CD environment.

  • It provides a way to create CodeQL databases from your source code to run queries against. These databases contain the data flows, control flows and other semantic information about the codebase.
  • You can execute custom CodeQL query files or query suites against these databases using the CLI.
  • For compiled languages, it observes the project's normal build (for example Make, MSBuild, or Gradle) so that the database is populated while the code compiles.
  • The results of the queries are written to an output file in a format such as SARIF or CSV, so you can grep, filter and analyze the findings.
  • You can run the CLI in a containerized environment to have a consistent CodeQL setup.
  • The CLI supports features like autobuilding databases, running multiple queries in parallel, diffing results between databases and more.
  • It provides a way to integrate CodeQL as part of your CI/CD pipelines for continuous code analysis.

Getting started with CodeQL CLI

1. Install the CodeQL CLI

  • Download the release for your platform from github/codeql-cli-binaries and extract it

2. Add CodeQL CLI to PATH

  • On Windows: Add the path to the extracted codeql binary to your System PATH
  • On Linux/macOS: Add the path to your .bashrc or .zshrc file: export PATH=$PATH:/path/to/codeql
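To confirm that the PATH change took effect, you can look the executable up the same way the shell would. The following is a minimal sketch in Python (my choice for illustration, not something the CodeQL tooling requires); find_codeql is a hypothetical helper name.

```python
import shutil
from typing import Optional


def find_codeql(search_path: Optional[str] = None) -> Optional[str]:
    """Return the full path of the codeql executable, or None if not found.

    search_path=None means "use the current PATH environment variable".
    """
    return shutil.which("codeql", path=search_path)


if __name__ == "__main__":
    location = find_codeql()
    if location is None:
        print("codeql is not on PATH yet; re-check step 2")
    else:
        print(f"codeql found at {location}")
```

If the lookup succeeds, running codeql version in a new terminal should also work.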

3. Create CodeQL Databases

  • Navigate to the root of your source code repository
  • Run codeql database create <database> --language=<language> --source-root . (e.g. codeql database create codeql-database --language=javascript --source-root .)
  • This will create a CodeQL database for your project in the directory you named (codeql-database in the example)
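The database creation step can be sketched as a small Python helper that assembles the command line. This assumes the database directory is passed as a positional argument and the language via --language; build_create_command and the directory name codeql-database are illustrative choices.

```python
import shlex
from typing import List


def build_create_command(database_dir: str, language: str,
                         source_root: str = ".") -> List[str]:
    """Assemble the argument list for `codeql database create`."""
    return [
        "codeql", "database", "create", database_dir,
        f"--language={language}",
        f"--source-root={source_root}",
    ]


if __name__ == "__main__":
    cmd = build_create_command("codeql-database", "javascript")
    # Print the command so it can be reviewed before running it.
    print(shlex.join(cmd))
    # To actually run it (requires the CodeQL CLI on PATH):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Building the arguments as a list rather than one string avoids quoting problems when a path contains spaces.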

4. Run CodeQL Queries

  • Visit https://github.com/github/codeql and find/write the query you want to run
  • Save the .ql query file locally
  • Run codeql database analyze <database> <query> --format=<format> --output=<results>, e.g. codeql database analyze codeql-database /path/to/query.ql --format=sarif-latest --output=results.sarif
  • This executes the query against the CodeQL database and writes the results to the output file

5. View Results

  • Open the output file (e.g. results.sarif, which is plain JSON) to see the query results
  • Interpret any findings based on the documentation of the query
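Steps 4 and 5 can be sketched in the same way. Note that database analyze needs a --format alongside --output; the sketch below uses sarif-latest. The helper names and the file name results.sarif are hypothetical, and run_analysis only works where the CodeQL CLI is installed.

```python
import subprocess
from typing import List


def build_analyze_command(database_dir: str, query: str, output_file: str,
                          result_format: str = "sarif-latest") -> List[str]:
    """Assemble the argument list for `codeql database analyze`."""
    return [
        "codeql", "database", "analyze", database_dir, query,
        f"--format={result_format}",
        f"--output={output_file}",
    ]


def run_analysis(database_dir: str, query: str, output_file: str) -> int:
    """Run the analysis and return the process exit code (0 means success)."""
    cmd = build_analyze_command(database_dir, query, output_file)
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Only prints the command; call run_analysis(...) to execute it.
    print(build_analyze_command("codeql-database", "/path/to/query.ql",
                                "results.sarif"))
```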

Subcommands / Options

Subcommands:

  • database create - Create a CodeQL database from source code for querying
  • database analyze - Run CodeQL queries against an existing database
  • database trace-command - Run a single build command under the tracer as part of a traced build
  • test run - Run unit tests for QL queries
  • pack init - Initialize a new CodeQL pack in a directory
  • pack install - Install the dependencies of a CodeQL pack
  • database bundle - Create a relocatable archive of a CodeQL database

Options:

  • --language=<lang> - Specify the language(s) for analysis
  • --source-root - Root directory of source files to extract into database
  • --threads=<n> - Number of threads to use for parallelization
  • --ram=<MB> - Amount of RAM to use, given in megabytes (e.g. --ram=4096)
  • --extractor-option=<name=value> - Pass options to the language extractor
  • --output=<file> - Where to save results
  • --format=<format> - Format of the results file (e.g. --format=sarif-latest or --format=csv)
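The resource options can be added to a command in the same list-building style. This is a small sketch under my understanding that --ram takes a number of megabytes and that --threads=0 asks for one thread per core; resource_flags is a hypothetical helper.

```python
from typing import List, Optional


def resource_flags(threads: Optional[int] = None,
                   ram_mb: Optional[int] = None) -> List[str]:
    """Return the --threads/--ram flags for the values that were supplied."""
    flags: List[str] = []
    if threads is not None:
        flags.append(f"--threads={threads}")
    if ram_mb is not None:
        if ram_mb <= 0:
            raise ValueError("ram_mb must be a positive number of megabytes")
        flags.append(f"--ram={ram_mb}")
    return flags


if __name__ == "__main__":
    base = ["codeql", "database", "create", "codeql-database",
            "--language=javascript"]
    print(base + resource_flags(threads=0, ram_mb=4096))
```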

SARIF: Static Analysis Results Interchange Format

SARIF stands for "Static Analysis Results Interchange Format". It is an open, standardized file format for representing static analysis results, published as an OASIS standard (version 2.1.0).

  • It is designed to make static analysis results shareable and interoperable between different tools and platforms.
  • SARIF files contain structured data about defects, metrics, code locations, suppressed alerts, and more output by static analysis tools.
  • The format is JSON-based and has a published JSON schema, making SARIF files readable by both humans and machines.
  • It supports comprehensive metadata about the analysis run, tool details, artifact locations, code flows, call stacks, and rich result information.
  • SARIF enables integrating static analysis findings into development workflows, IDEs, CI/CD pipelines, and reporting tools.
  • SARIF viewer extensions exist for editors such as Visual Studio Code and Visual Studio, and GitHub code scanning accepts SARIF uploads.
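To make the structure above concrete, here is a minimal sketch that walks a SARIF document and lists its findings. The sample document is hand-written for illustration (the tool name, rule ID, and file path are made up, not real CodeQL output); it only uses the runs, results, ruleId, message, and locations properties from SARIF 2.1.0.

```python
from typing import Any, Dict, List, Tuple

# Hand-written SARIF-shaped sample for illustration; not real CodeQL output.
SAMPLE_SARIF: Dict[str, Any] = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "ExampleTool"}},
            "results": [
                {
                    "ruleId": "example/hypothetical-rule",
                    "message": {"text": "Hypothetical finding."},
                    "locations": [
                        {
                            "physicalLocation": {
                                "artifactLocation": {"uri": "src/app.js"},
                                "region": {"startLine": 12},
                            }
                        }
                    ],
                }
            ],
        }
    ],
}


def summarize_sarif(sarif: Dict[str, Any]) -> List[Tuple[str, str, int, str]]:
    """Return (rule id, file, start line, message) for every result."""
    findings = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule_id = result.get("ruleId", "<no rule id>")
            message = result.get("message", {}).get("text", "")
            # Use the first location only; a result may have none.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "<unknown>")
            line = physical.get("region", {}).get("startLine", 0)
            findings.append((rule_id, uri, line, message))
    return findings


if __name__ == "__main__":
    for rule_id, uri, line, message in summarize_sarif(SAMPLE_SARIF):
        print(f"{rule_id} {uri}:{line} {message}")
```

For a real file, load it with json.load and pass the resulting dictionary to summarize_sarif.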

For the CodeQL CLI specifically, passing --format=sarif-latest together with --output to database analyze saves the query analysis results directly in the SARIF format. This makes it easy to:

  • Review results in IDEs/viewers with SARIF support
  • Integrate into engineering systems consuming SARIF data
  • Archive/compare results over time
  • Exchange results between different teams/tools
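As one illustration of comparing results over time, the sketch below reports findings that appear in a newer SARIF document but not in an older one. It is a deliberately naive approach of my own: it matches results on (rule ID, file, start line), so a finding whose line number shifts will look new. SARIF also defines partialFingerprints for more robust matching, which this sketch does not use. All helper names are hypothetical.

```python
from typing import Any, Dict, List, Set, Tuple

Key = Tuple[str, str, int]  # (rule id, file uri, start line)


def finding_keys(sarif: Dict[str, Any]) -> Set[Key]:
    """Collect a comparison key for every result in a SARIF document."""
    keys: Set[Key] = set()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            keys.add((
                result.get("ruleId", ""),
                physical.get("artifactLocation", {}).get("uri", ""),
                physical.get("region", {}).get("startLine", 0),
            ))
    return keys


def new_findings(old: Dict[str, Any], new: Dict[str, Any]) -> List[Key]:
    """Return the keys present in `new` but absent from `old`, sorted."""
    return sorted(finding_keys(new) - finding_keys(old))


def make_sarif(findings: List[Key]) -> Dict[str, Any]:
    """Build a tiny SARIF-shaped document for demonstration purposes."""
    results = [
        {
            "ruleId": rule_id,
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": uri},
                    "region": {"startLine": line},
                }
            }],
        }
        for rule_id, uri, line in findings
    ]
    return {"version": "2.1.0", "runs": [{"results": results}]}


if __name__ == "__main__":
    old = make_sarif([("rule-a", "a.js", 1)])
    new = make_sarif([("rule-a", "a.js", 1), ("rule-b", "b.js", 5)])
    print(new_findings(old, new))
```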

Key Takeaways

I am learning a lot about CodeQL, but I am not yet proficient in its use. Here is how I plan to continue learning:

  • Go through the official CodeQL tutorials: GitHub has an excellent set of interactive tutorial lessons at https://codeql.github.com/docs/codeql-overview/tutorials/. These cover writing basic queries, understanding CodeQL concepts like data/control flow, and more, with sample code to practice on.
  • Explore sample queries and query suites: GitHub maintains open source repositories, such as https://github.com/github/codeql, with many example CodeQL queries across languages. Go through these to understand real-world security queries.
  • Set up a local CodeQL environment: Install the CodeQL CLI on your machine as per the instructions earlier. Create CodeQL databases for an open source project you use/understand. Run various queries against those databases to see results.
  • Write queries for your own code: Once you understand the basics, try writing simple CodeQL queries for your own application codebases.
