Shifting Code Security to the Extreme Left with Semgrep and Git Hooks

 


Good developers are efficient. It’s an unspoken rule in software development: we constantly look for ways to offload repetitive tasks to tools and automation. Why waste mental bandwidth on things our machines can handle for us? One of the best examples of this principle is automating security checks in the development lifecycle.

Today, we’ll explore how to automatically run static analysis checks on your codebase using Semgrep and Git Hooks. With a simple script, you can enforce security checks every time code is pushed to your Git repository, ensuring vulnerabilities don’t slip through the cracks.

 

What is Shift-Left Security?

The Shift-Left approach means prioritizing security early in the software development lifecycle. Instead of waiting until deployment to identify vulnerabilities, Shift-Left moves security checks to the development phase.

Why does this matter? Fixing bugs earlier is faster, cheaper, and significantly reduces risks. By integrating tools like Semgrep into the workflow, we can empower developers to find and fix vulnerabilities before code is even pushed to a shared branch.

 

What are Git Hooks?

Git Hooks are scripts that Git runs automatically in response to specific events in your repository, such as committing, merging, or pushing code. These hooks live in the .git/hooks folder of your repository and can be tailored to enforce rules or automate workflows.

For example, you might use a pre-commit hook to run tests or lint your code. Similarly, you can use a pre-push hook to enforce static security scans, ensuring code meets quality and security standards before it is pushed to a remote repository.

To create a new hook, simply make an executable script in the appropriate .git/hooks folder, such as .git/hooks/pre-push. On Unix-based systems, run chmod +x on the script to make it executable.
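
As a minimal sketch of that step, the helper below writes a placeholder pre-push script into a repository's hooks folder and marks it executable. The function name install_pre_push_hook and the placeholder hook body are illustrative assumptions, not part of Git itself.

```shell
#!/bin/bash
# Hypothetical helper: writes a placeholder pre-push hook into a repository.
# Usage: install_pre_push_hook /path/to/repo
install_pre_push_hook() {
    local repo_dir="$1"
    local hook_path="$repo_dir/.git/hooks/pre-push"

    # The hooks folder normally exists already; create it if it does not.
    mkdir -p "$repo_dir/.git/hooks" || return 1

    # Placeholder body: a real hook would run checks and exit non-zero to block the push.
    cat > "$hook_path" <<'EOF'
#!/bin/bash
echo "pre-push hook running..."
exit 0
EOF

    # Git skips hook files that are not executable.
    chmod +x "$hook_path"
}
```

Calling install_pre_push_hook . from the root of a repository would create .git/hooks/pre-push; the placeholder body can then be swapped for real checks.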

 

What is Semgrep?

Semgrep is a powerful, lightweight static analysis tool that allows you to write and enforce customizable security and code quality rules.

Unlike traditional static analysis tools, Semgrep is fast, developer-friendly, and integrates easily into CI/CD pipelines or local workflows. It supports a variety of programming languages, including Python, JavaScript, Go, Java, and more.

With Semgrep, you can:

  • Detect security vulnerabilities in code.
  • Enforce code quality standards.
  • Customize rules to fit your team’s needs.
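
To make the idea of a custom rule concrete, here is a small sketch that writes an example rule file in Semgrep's YAML rule format. The helper name write_demo_rule, the rule id, the message text, and the file name are all illustrative assumptions; the rule simply flags calls to eval() in Python code.

```shell
#!/bin/bash
# Hypothetical helper: writes a small example Semgrep rule file.
# The rule id, message, and default file name are illustrative.
write_demo_rule() {
    local rule_file="${1:-demo-rule.yml}"
    cat > "$rule_file" <<'EOF'
rules:
  - id: demo-python-eval
    languages: [python]
    severity: ERROR
    message: Avoid eval(); it can execute attacker-controlled input.
    pattern: eval(...)
EOF
}
```

After running write_demo_rule demo-rule.yml, the rule could be tried against the current directory with semgrep scan --config demo-rule.yml .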

 

Using Semgrep Locally

Running Semgrep locally ensures you can identify and fix vulnerabilities before pushing code. Here's how to set up and run Semgrep for various platforms.

macOS

# Install through Homebrew  
brew install semgrep  

# Install through pip  
python3 -m pip install semgrep  

# Confirm installation succeeded by printing the currently installed version  
semgrep --version 

Linux and Windows Subsystem for Linux (WSL)

# Install through pip  
python3 -m pip install semgrep  

Docker

# Pull the latest Semgrep Docker image  
docker pull semgrep/semgrep 

# Confirm version  
docker run --rm semgrep/semgrep semgrep --version  

 

Note:

  • For Homebrew users, ensure you’ve added Homebrew to your PATH.
  • For Docker users, include the -v option in your commands to mount the project directory for scanning.

 

Log in to Semgrep

After installation, log in to your Semgrep account:

semgrep login  

 

  • This launches a browser window to authenticate your CLI.
  • Alternatively, use the link returned in the CLI to complete the process.
  • In the Semgrep CLI login, click Activate to proceed.

    For Docker users:

    docker run -it semgrep/semgrep semgrep login  

     

    Run a Scan

    Once logged in, navigate to the root of your repository and run your first scan:

    Local CLI

    semgrep ci

    OR

    semgrep scan

    Note: semgrep scan is usually somewhat faster than semgrep ci, but it does not upload results to the Semgrep dashboard.

    Docker
    For macOS/Linux:

    docker run -e SEMGREP_APP_TOKEN=YOUR_TOKEN --rm -v "${PWD}:/src" semgrep/semgrep semgrep ci  

    For Windows:

    docker run -e SEMGREP_APP_TOKEN=YOUR_TOKEN --rm -v "%cd%:/src" semgrep/semgrep semgrep ci  

     

    This ensures your code is scanned with Semgrep's ruleset, flagging vulnerabilities before pushing to shared branches. Refer to Semgrep's official documentation for more options.

     

    Pre-Commit vs. Pre-Push Hooks

    Git hooks allow developers to automate tasks during the software development lifecycle. Two commonly used hooks for running automated checks are pre-commit and pre-push. While both serve important purposes, there are key differences that make pre-push hooks more suitable for running tools like Semgrep.

    Pre-Commit Hooks

    • When It Runs: The pre-commit hook runs before a commit is recorded in your local repository.
    • Purpose: It's often used to check formatting, lint code, or run quick tests. (Commit message style is enforced by the separate commit-msg hook, which runs after the message is written.)
    • Advantages:
      • Provides instant feedback.
      • Prevents poorly formatted or non-compliant code from being committed.
    • Limitations:
      • It runs every time you commit, which can be intrusive for developers making frequent small commits.
      • Time-intensive processes like static analysis can slow down the local development workflow.

    Pre-Push Hooks

    • When It Runs: The pre-push hook executes right before code is pushed to a remote repository.
    • Purpose: It’s ideal for running tasks that enforce quality and security, such as running tests, validating dependencies, or performing static analysis scans.
    • Advantages:
      • Developers can commit freely and run extensive checks only when pushing code.
      • Integrates seamlessly with existing workflows without disrupting the local commit process.
      • Prevents issues from entering shared branches, ensuring higher code quality in remote repositories.

    Why Pre-Push is Better for Semgrep

    1. Performance: Semgrep scans can be resource-intensive, especially for large projects. Running them during the pre-commit phase could lead to unnecessary delays, especially if developers make frequent commits.
    2. Flexibility: Developers often make multiple local commits as they iterate on features. By using a pre-push hook, Semgrep scans are deferred until the final push, allowing developers to focus on coding.
    3. Shared Responsibility: A pre-push hook ensures that only secure and compliant code reaches the remote repository, maintaining high standards across the team.
    4. Avoiding Local Friction: Pre-commit hooks can be overly strict during the early stages of development. Pre-push hooks, by contrast, balance security with productivity.

    By using a pre-push hook for Semgrep, you ensure your code is thoroughly scanned for vulnerabilities without disrupting your development flow. The provided Git hook script demonstrates how to effectively enforce this workflow.

    Pseudo logic for Semgrep scan Git-Hook

    1. Define protected branches: Create an array with the names of branches (main and master) that are protected.

    2. Get the current branch: Get the current branch name using Git.

    3. Check if the current branch is protected:

      • Iterate over the list of protected branches.
      • If the current branch is one of the protected branches (main or master), output a message indicating direct pushes are not allowed and exit with a failure code.
    4. Proceed with Semgrep check:

      • If the branch check passes, print a message to indicate that Semgrep checks will proceed.
    5. Define a function to install Semgrep:

      • Check if pip3 is available.
      • If pip3 is found, attempt to install Semgrep using pip3. If installation fails, display an error and exit.
      • If pip3 is not found, display an error indicating pip3 needs to be installed.
    6. Check if Semgrep is installed:

      • If Semgrep is not installed, call the install_semgrep function to install it.
    7. Check if the user is logged into Semgrep:

      • Check the login status using semgrep login.
      • If the user is not logged in (based on the login status), display a message prompting the user to log in and exit.
      • If the user is logged in, proceed with the scan.
    8. Run the Semgrep CI scan:

      • Run the Semgrep CI scan using semgrep ci.
    9. Check the result of the Semgrep scan:

      • If the scan fails (exit status is non-zero), display a message indicating the scan failed and exit.
      • If the scan passes, print a message indicating success and allow the push to proceed.
    10. End the process:

      • Exit the script with a success code if the push is allowed after the scan.

    This logic ensures that direct pushes to the main or master branches are prevented, and that necessary security checks using Semgrep are performed before allowing a push.
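
One caveat on step 3: it looks at the currently checked-out branch, which is not always the branch being pushed. Git also passes the refs being pushed to a pre-push hook on standard input, one line per ref, in the form "<local ref> <local sha> <remote ref> <remote sha>". The sketch below is an alternative check based on that input; the function name pushes_to_protected_branch is a hypothetical helper, not something the script in this post uses.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) if any ref on stdin targets a protected branch.
# Input lines use the format Git feeds to pre-push hooks:
#   <local ref> <local sha> <remote ref> <remote sha>
pushes_to_protected_branch() {
    local local_ref local_sha remote_ref remote_sha

    while read -r local_ref local_sha remote_ref remote_sha; do
        case "$remote_ref" in
            refs/heads/main|refs/heads/master)
                return 0
                ;;
        esac
    done
    return 1
}

# Inside a pre-push hook this could be used as:
#   if pushes_to_protected_branch; then
#       echo "Pushing directly to a protected branch is not allowed."
#       exit 1
#   fi
```

Because the function reads the hook's standard input, it has to run before anything else in the hook consumes that input.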

    Git Hook for Semgrep Scans

    To automate Semgrep scans before code is pushed, you can use a Git hook script like the one below. This script:

    • Blocks direct pushes to protected branches (main or master).
    • Ensures Semgrep is installed on the developer’s machine.
    • Verifies the user is logged into Semgrep.
    • Runs a Semgrep CI scan, blocking the push if the scan exits with a non-zero status.

    Git Hook Script:

    #!/bin/bash
    
    # Prevent direct pushes to the main or master branch
    protected_branches=("main" "master")
    
    # Get the currently checked-out branch (not necessarily the ref being pushed)
    current_branch=$(git rev-parse --abbrev-ref HEAD)
    
    # Check if the branch is protected
    for branch in "${protected_branches[@]}"; do
        if [ "$current_branch" == "$branch" ]; then
            echo "Pushing directly to the '$current_branch' branch is not allowed."
            echo "Please create a pull request or use a different branch."
            exit 1
        fi
    done
    
    echo "Branch check passed. Proceeding with Semgrep checks..."
    
    # Function to check and install Semgrep if not present
    install_semgrep() {
        echo "Semgrep is not installed. Attempting to install..."
        if command -v pip3 &> /dev/null; then
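            # --break-system-packages overrides the externally-managed-environment
            # protection and requires a recent pip; drop it when using a virtual environment.
            pip3 install semgrep --break-system-packages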
            if [ $? -ne 0 ]; then
                echo "Error: Failed to install Semgrep. Please install it manually."
                exit 1
            fi
            echo "Semgrep installed successfully."
        else
            echo "Error: pip3 is not installed. Install pip3 and retry."
            exit 1
        fi
    }
    
    # Check if Semgrep is installed
    if ! command -v semgrep &> /dev/null; then
        install_semgrep
    fi
    
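    # Check if the user is logged into Semgrep.
    # Semgrep may append details (such as the settings file path) to this message,
    # so match it as a substring rather than comparing the whole output.
    login_status=$(semgrep login 2>&1)
    if [[ "$login_status" != *"API token already exists"* ]]; then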
        echo "Error: You are not logged into Semgrep."
        echo "Please log in using 'semgrep login' and try again."
        exit 1
    else
        echo "Semgrep is logged in. Proceeding with the scan..."
    fi
    
    # Run Semgrep CI scan
    echo "Running Semgrep scan before pushing..."
    semgrep ci
    
    # Check the exit status of Semgrep
    if [ $? -ne 0 ]; then
        echo "Semgrep scan failed. Fix the issues before pushing."
        exit 1
    fi
    
    echo "Semgrep scan passed. Proceeding with the push."
    exit 0
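
Since semgrep login is an interactive command (it opens a browser to authenticate), a hook may prefer a login check that never starts a login flow. The sketch below is one possible alternative, under two assumptions that should be verified against your Semgrep version: that a token supplied through the SEMGREP_APP_TOKEN environment variable is sufficient, and that the CLI stores a logged-in token as an api_token entry in ~/.semgrep/settings.yml. The function name has_semgrep_token is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if a Semgrep token appears to be configured.
# Assumptions: the token is supplied via SEMGREP_APP_TOKEN, or the CLI has stored
# an "api_token" entry in a settings file (the default path below is an assumption).
has_semgrep_token() {
    local settings_file="${1:-$HOME/.semgrep/settings.yml}"

    # A token in the environment counts as logged in.
    if [ -n "${SEMGREP_APP_TOKEN:-}" ]; then
        return 0
    fi

    # Otherwise look for a stored token without starting an interactive login.
    [ -f "$settings_file" ] && grep -q '^api_token:' "$settings_file"
}
```

In the hook, the login block could then become a simple test such as: if ! has_semgrep_token; then echo "Please run 'semgrep login' first."; exit 1; fi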


    How to Use the Git Pre-Push Script with Semgrep

    1. Locate the Git Hooks Directory

    Git hooks are stored in the .git/hooks directory of your repository. To access it:

    cd /path/to/your/repo/.git/hooks 
     

    2. Create the Pre-Push Hook

    Create a file named pre-push (or edit the existing one) in the .git/hooks directory:

    touch pre-push  
    chmod +x pre-push

    OR

    Run the following one-liner within your project directory to immediately set up and start using the Git pre-push hook:

    curl -o .git/hooks/pre-push https://raw.githubusercontent.com/amykr777/semgrep-git-hook/refs/heads/main/pre-push && \
    chmod +x .git/hooks/pre-push
    

     

    3. Add the Script to the Hook

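    Copy and paste the script from the previous section into the pre-push file, then save and close the file. If you used the curl one-liner above, the file already contains the script and you can skip this step.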

    4. Push Your Code

    With the pre-push hook in place, try pushing your code:

    git push origin <branch-name> 


     

    How It Works

    When you push code:

    1. Branch Protection: The script checks whether the currently checked-out branch is a protected branch (e.g., main). If so, it blocks the push.
    2. Semgrep Installation: The script ensures Semgrep is installed and available locally.
    3. Login Validation: It verifies that you’re logged into Semgrep to enable scanning.
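    4. Security Scan: The script runs semgrep ci to scan the codebase and blocks the push if the command exits with a non-zero status. Note that semgrep ci exits with status 1 only when findings come from rules set to blocking mode; if all findings come from non-blocking rules, a successful scan still exits with 0 and the push proceeds.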
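
The exit-status handling in the last step can be sketched as a small wrapper that runs a scan command and reports why a push would be allowed or blocked. The function name gate_push_on_scan is hypothetical, and the mapping (0 for no blocking findings, 1 for blocking findings, anything else for a scan that did not complete) is an assumption based on the behavior described above.

```shell
#!/bin/bash
# Hypothetical helper: runs the given scan command and maps its exit status
# to a push decision.
# Assumption: 0 = no blocking findings, 1 = blocking findings, other = scan error.
gate_push_on_scan() {
    "$@"
    local rc=$?

    case "$rc" in
        0) echo "Scan passed. Proceeding with the push." ;;
        1) echo "Blocking findings reported. Fix them before pushing." ;;
        *) echo "Scan did not complete (exit status $rc)." ;;
    esac

    # Pass the scan's status through so the hook can block the push on non-zero.
    return "$rc"
}
```

A hook could call it as gate_push_on_scan semgrep ci || exit 1, which keeps the blocking behavior while telling the developer whether the failure came from findings or from the scan itself.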

     

    Conclusion

    With this setup, you can enforce Shift-Left Security practices by integrating Semgrep into your Git workflow. The provided Git hook script ensures that every push is automatically scanned for vulnerabilities, preventing insecure code from entering your repository.

    This approach is a perfect example of working smarter, not harder—empowering developers to build secure applications without adding extra manual steps to their workflow.

    Try it out today, and take the first step toward seamless security automation!

     
