Metadata-Version: 2.1
Name: scarfer
Version: 0.5.7
Home-page: https://github.com/hesa/scarfer
Author: Henrik Sandklef
Author-email: hesa@sandklef.com
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Legal Industry
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Natural Language :: English
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Software Development :: Quality Assurance
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Provides-Extra: dev
License-File: LICENSES/GPL-3.0-or-later.txt

# scarfer

Source code scan report file reporter

# Introduction

Scarfer outputs compliance-related information from a scan report.

A scan report contains a lot of information; Scancode, for example,
has 37 top-level entries for each file. Opening such a report in an
editor to extract the information you want is sometimes cumbersome.
Scarfer provides quick command-line access to scan reports.
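
To illustrate the kind of manual extraction Scarfer is meant to save you from, here is a minimal Python sketch that lists the file paths in a Scancode-style JSON report. It is not Scarfer's implementation. The sample report is a hypothetical, heavily trimmed stand-in, and the sketch assumes each entry under `files` has a `path` and a `type` key (as in a report produced with file information enabled).

```python
import json

# Hypothetical, heavily trimmed stand-in for a Scancode report.
# A real report carries many more keys per file entry.
SAMPLE_REPORT = """
{
  "files": [
    {"path": "cairo/src/cairo.c", "type": "file"},
    {"path": "cairo/src/drm", "type": "directory"},
    {"path": "cairo/src/drm/cairo-drm.c", "type": "file"}
  ]
}
"""


def file_paths(report_text):
    """Return the paths of all entries of type "file" in a report."""
    report = json.loads(report_text)
    return [
        entry["path"]
        for entry in report.get("files", [])
        if entry.get("type") == "file"  # skip directories
    ]


if __name__ == "__main__":
    for path in file_paths(SAMPLE_REPORT):
        print(path)
```

Even this small task needs knowledge of the report layout, which is the part a command line tool can hide.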

# Features

Scarfer can output the following information per file:

* copyright (using `-c`)

* license (using `-l`)

* text that caused the license detection (`-m`)

Scarfer can output the following summaries:

* license summary (using `-ls`)

* copyright summary (using `-cs`)
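
As a rough idea of what a license summary contains, the following Python sketch counts how many files carry each license. The data, the function name `license_summary` and the result layout are all hypothetical; the actual output format of `-ls` may differ.

```python
from collections import Counter

# Hypothetical per-file results: (path, license identifier).
FILES = [
    ("src/a.c", "mpl-1.1"),
    ("src/b.c", "lgpl-2.1"),
    ("src/c.c", "mpl-1.1"),
]


def license_summary(files):
    """Count the number of files per license identifier."""
    return Counter(license_id for _path, license_id in files)


if __name__ == "__main__":
    for license_id, count in sorted(license_summary(FILES).items()):
        print(license_id, count)
```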

## Filter

Scarfer can filter files:

* include files with:

    * license name (`-il`) using Python's regular expressions

    * files (`-if`) using Python's regular expressions

    * files (`-iff`) by reading a file, containing file names, using Python's regular expressions

    * copyright (`-ic`) using Python's regular expressions

* exclude files with:

    * license name (`-el`) using Python's regular expressions

    * files (`-ef`) using Python's regular expressions

    * files (`-eff`) by reading a file, containing file names, using Python's regular expressions

    * copyright (`-ec`) using Python's regular expressions

*Note: if you use more than one filter, the filters are combined with a logical AND.*
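
The AND semantics can be sketched in Python as follows. This is an illustration under assumptions, not Scarfer's code: the entry layout (`path`, `license`) and the function `keep` are hypothetical, and the sketch assumes patterns are applied with `re.search`, so that a pattern may match anywhere in the string.

```python
import re

# Hypothetical per-file entries; Scarfer's internal data model may differ.
ENTRIES = [
    {"path": "cairo/src/drm/cairo-drm.c", "license": "mpl-1.1 OR lgpl-2.1"},
    {"path": "cairo/src/drm/cairo-drm-radeon.c", "license": "gpl-3.0"},
    {"path": "cairo/src/cairo.c", "license": "mpl-1.1 OR lgpl-2.1"},
]


def keep(entry, include_file=None, include_license=None, exclude_file=None):
    """Return True only if the entry passes every filter that is set."""
    if include_file and not re.search(include_file, entry["path"]):
        return False
    if include_license and not re.search(include_license, entry["license"]):
        return False
    if exclude_file and re.search(exclude_file, entry["path"]):
        return False
    return True


if __name__ == "__main__":
    # Roughly corresponds to combining `-if drm` with `-il mpl`.
    for entry in ENTRIES:
        if keep(entry, include_file="drm", include_license="mpl"):
            print(entry["path"])
```

A file is kept only when it satisfies all active filters, so adding a filter can never enlarge the result.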

## Curate

Scarfer can curate (fix, amend) license identifications:

* curate license (`-cml`) for all files with missing license

* curate license (`-cfl`) for all files matching a Python regular expression
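
Conceptually, the two curation modes could look like the Python sketch below. The entry layout and the functions `curate_missing` and `curate_matching` are hypothetical, and the exact argument syntax of `-cml` and `-cfl` is not shown here; treat this as an illustration of the idea only.

```python
import re

# Hypothetical per-file entries; a missing license is represented as None.
ENTRIES = [
    {"path": "src/a.c", "license": "mit"},
    {"path": "src/b.c", "license": None},
    {"path": "test/c.c", "license": "gpl-2.0"},
]


def curate_missing(entries, license_id):
    """Return copies of the entries, setting license_id where none was found."""
    return [
        dict(entry, license=license_id) if not entry.get("license") else dict(entry)
        for entry in entries
    ]


def curate_matching(entries, pattern, license_id):
    """Return copies of the entries, setting license_id where the path matches."""
    return [
        dict(entry, license=license_id) if re.search(pattern, entry["path"]) else dict(entry)
        for entry in entries
    ]


if __name__ == "__main__":
    for entry in curate_missing(ENTRIES, "bsd-3-clause"):
        print(entry["path"], entry["license"])
```

Both helpers return new dictionaries, so the scanned data stays untouched and the curated view can be compared against it.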

## Configuration file

Scarfer can write and read configuration files:

* output the current command line options as a configuration (`-oc`)

* read configuration file (`--config`)

# Example use

Output the file names (full path) of all the files in the Scancode report `example-data/cairo-1.16.0-scan.json`:
```
$ scarfer example-data/cairo-1.16.0-scan.json 
```

As above, but output only files with a path matching `drm`:
```
$ scarfer example-data/cairo-1.16.0-scan.json -if drm
```

Output the file names (full path) of all the files in the Scancode report `example-data/cairo-1.16.0-scan.json` with a license matching `gpl-3`:
```
$ scarfer example-data/cairo-1.16.0-scan.json -il gpl-3
```

Output the file names (full path) of all the files in the Scancode report `example-data/cairo-1.16.0-scan.json` with a license matching `mpl` and a path matching `drm`. The output should also contain information (per file) about license and copyright:
```
$ scarfer example-data/cairo-1.16.0-scan.json -il mpl -if drm -c -l 
```

To include only files whose path contains a "/" followed somewhere by "pdi" and ends with ".c":
```
$ scarfer example-data/cairo-1.16.0-scan.json -if "/.*pdi.*\.c$"
```

To exclude all files whose path contains a "/" followed somewhere by "pdi" and ends with ".c":
```
$ scarfer example-data/cairo-1.16.0-scan.json -ef "/.*pdi.*\.c$"
```
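
To see which paths a pattern such as `/.*pdi.*\.c$` selects before passing it to Scarfer, you can try it in Python. The sketch below assumes the pattern is applied with `re.search` (a match anywhere in the path); the paths are illustrative and are not taken from the example report.

```python
import re

# The pattern used with -if / -ef in the examples above.
PATTERN = r"/.*pdi.*\.c$"

# Illustrative paths only.
PATHS = [
    "cairo/test/pdiff/pdiff.c",
    "cairo/test/pdiff/pdiff.h",
    "cairo/src/cairo.c",
    "pdiff.c",
]


def matches(path, pattern=PATTERN):
    """Return True if the pattern matches somewhere in the path."""
    return re.search(pattern, path) is not None


if __name__ == "__main__":
    for path in PATHS:
        print(path, matches(path))
```

Note that the last path does not match even though it contains "pdi" and ends with ".c", because the pattern also requires a "/" before "pdi".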

# Supported scan report formats

* [Scancode](https://github.com/nexB/scancode-toolkit) Toolkit, version 21 and upwards

* [Scancode](https://github.com/nexB/scancode-toolkit) Output Format version 1.0.0, 2.0.0, 3.0, 3.2, 4.0

# Hints on source code scanners

## Scancode 32.0*

Assuming you want to scan a directory called `cairo` and store the output in `cairo-scan.json`:

```
scancode -clipe \
  --license-text   --license-text-diagnostics        \
  --classify       --license-clarity-score --summary \
  -n "$(nproc)"                                      \
  --json-pp cairo-scan.json cairo
```



