Other Tools Testing Guidance

Using scanning tools other than the CASA preconfigured tools:

Developers who wish to use SAST and DAST scanning tools (internally developed or commercial tools) must provide:

  • The scan output against OWASP Benchmark. The scanning tool is expected to detect all true positive vulnerabilities that map to CASA (see the instructions below).
  • The scan policy. This could be an export of the tool scanning configuration or a screenshot showing which CWEs the tool scanned against. 

Running Benchmark 

You must have the following installed to run Benchmark:

  1. Git: https://git-scm.com/ or https://github.com/
  2. Maven: https://maven.apache.org/ (version 3.2.3 or newer)
  3. Java: https://www.oracle.com/java/technologies/javase-downloads.html (Java 7 or 8, 64-bit)
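A quick way to confirm the prerequisites above are on your PATH is sketched below. This is a minimal, hypothetical helper (check_tool is not part of Benchmark); it only checks that each command exists and does not verify the version requirements, which you should still confirm with each tool's own version flag.

```shell
#!/bin/sh
# check_tool is a hypothetical helper, not part of Benchmark.
# It reports whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: MISSING"
        return 1
    fi
}

# Check the three prerequisites listed in this guide.
missing=0
for tool in git mvn java; do
    check_tool "$tool" || missing=1
done

if [ "$missing" -eq 1 ]; then
    echo "Install the missing tools before running Benchmark."
fi
```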

For CASA purposes, be sure to generate and submit a Benchmark Scorecard for your DAST or SAST scanning tool. Please note that the Benchmark contains a large number of tests and may take a while to scan. 

Run the following commands to download and run the Benchmark application:  

    $ git clone https://github.com/OWASP-Benchmark/BenchmarkJava
    $ cd BenchmarkJava
    $ mvn compile
    $ ./runBenchmark.sh

Benchmark will be running at https://localhost:8443/benchmark/, which can be used as the target for any DAST scans. You can now run your tool of choice against the Benchmark source code (SAST) and the running application (DAST). Be sure to save your results in the /results directory of the Benchmark repository.

Include the following in your output file name for the scorecard generator: 

  1. The Benchmark version number, to prevent the wrong expected outputs from being mapped
  2. The name of your scanning tool 
  3. The version of your scanning tool 

For example, the output file name Benchmark_1.2-toolname-v3.0.xml maps to a scan run using version 3.0 of “toolname” and compared against Benchmark version 1.2.
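The three components can be assembled with a small shell function, sketched here. The function name results_filename, the Benchmark_<version>-<tool>-v<tool version> pattern, and the default xml extension are assumptions for illustration; use whatever extension your tool's report format actually has.

```shell
# results_filename is a hypothetical helper, not part of Benchmark.
# Usage: results_filename BENCHMARK_VERSION TOOL_NAME TOOL_VERSION [EXTENSION]
# The naming pattern and the default "xml" extension are assumptions.
results_filename() {
    printf 'Benchmark_%s-%s-v%s.%s\n' "$1" "$2" "$3" "${4:-xml}"
}

# Benchmark 1.2, scanned with version 3.0 of "toolname".
results_filename 1.2 toolname 3.0
# prints: Benchmark_1.2-toolname-v3.0.xml
```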


Generating a Benchmark Score

Benchmark scores scanning tools against four metrics:

  1. True Positives – the tool correctly identifies a real vulnerability
  2. False Negatives – the tool fails to identify a real vulnerability
  3. True Negatives – the tool correctly ignores a false alarm
  4. False Positives – the tool fails to ignore a false alarm
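To see how these four counts combine, the sketch below computes the true positive rate, the false positive rate, and their difference. It assumes the score is the true positive rate minus the false positive rate; the function name benchmark_rates and the counts are made up for illustration, and the generated scorecard remains the authoritative result.

```shell
# benchmark_rates is a hypothetical helper, not part of Benchmark.
# Usage: benchmark_rates TP FN TN FP
# Assumes TP + FN and FP + TN are both greater than zero.
benchmark_rates() {
    awk -v tp="$1" -v fn="$2" -v tn="$3" -v fp="$4" 'BEGIN {
        tpr = tp / (tp + fn)   # share of real vulnerabilities detected
        fpr = fp / (fp + tn)   # share of false alarms wrongly reported
        printf "TPR=%.2f FPR=%.2f score=%.2f\n", tpr, fpr, tpr - fpr
    }'
}

# Illustrative counts only: 80 TP, 20 FN, 90 TN, 10 FP.
benchmark_rates 80 20 90 10
# prints: TPR=0.80 FPR=0.10 score=0.70
```

A tool that flags everything reaches a true positive rate of 1.00 but also a false positive rate of 1.00, so its difference is 0.00; that is why both rates matter.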

The Benchmark scorecard outlines how your scanning tool performs across these dimensions. When you download the Benchmark tool from GitHub, it includes a BenchmarkScore tool. Running this tool against your scan output will generate a PNG graph outlining your final score. Example results can be found here. To generate your own Benchmark scorecards, run the following command:

sh createScorecards.sh

This will create a scorecard for each output within the /results directory. The generated scorecard will be saved within the /scorecard directory. Submit the scorecard as part of your CASA assessment.