Metadata-Version: 2.4
Name: conc
Version: 0.1.5
Summary: A Python library for efficient corpus analysis, enabling corpus linguistic analysis in Jupyter notebooks.
Home-page: https://github.com/polsci/conc
Author: polsci
Author-email: geoffrey.ford@canterbury.ac.nz
License: MIT License
Keywords: corpus corpora nbdev jupyter notebook python
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastcore
Requires-Dist: numpy
Requires-Dist: polars
Requires-Dist: scipy
Requires-Dist: msgspec
Requires-Dist: great_tables
Requires-Dist: spacy
Requires-Dist: python-slugify
Requires-Dist: plotly
Requires-Dist: jupyterlab
Requires-Dist: ipywidgets
Requires-Dist: memory_profiler
Requires-Dist: requests
Provides-Extra: dev
Requires-Dist: nbdev; extra == "dev"
Requires-Dist: line_profiler; extra == "dev"
Requires-Dist: nltk; extra == "dev"
Requires-Dist: datasets; extra == "dev"
Requires-Dist: jupyterlab-quarto; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: pandas; extra == "dev"
Requires-Dist: pyarrow; extra == "dev"
Requires-Dist: matplotlib; extra == "dev"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Conc


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## Introduction to Conc

Conc is a Python library that brings tools for corpus linguistic
analysis to [Jupyter notebooks](https://docs.jupyter.org/en/latest/).
Conc aims to allow researchers to analyse large corpora in efficient
ways using standard hardware, with the ability to produce clear,
publication-ready reports and extend analysis where required using
standard Python libraries.

<img
src="https://raw.githubusercontent.com/polsci/conc/refs/heads/master/nbs/50_conc-5.png"
data-fig-align="left" alt="Example Concordance" />

A staple of data science, [Jupyter notebooks allow researchers to
present their analysis in an interactive form that combines code,
reporting and
discussion](https://docs.jupyter.org/en/latest/#what-is-a-notebook).
They are an ideal format for collaborating with other researchers during
research or for sharing analysis in a way others can reproduce and
interact with.

Conc uses [spaCy](https://spacy.io/) for tokenising texts. Support for
annotating texts with spaCy will be added soon.

Conc uses well-supported Python libraries for processing data and
prioritises fast libraries and data structures. By default, the library
produces clear reports that include the information needed to interpret
results. Conc makes it easy to extend analysis using other libraries or
software. [Conc’s corpus format is
well-documented](https://geoffford.nz/conc/explanations/anatomy.html)
and there are [code examples to help you work with Conc results and data
structures outside of
Conc](https://geoffford.nz/conc/tutorials/recipes.html) if you want to
extend your analysis.

Conc’s documentation site has more information on Conc, [why it was
developed and the principles guiding Conc’s
development](https://geoffford.nz/conc/explanations/why.html).

## Table of Contents

- [Acknowledgements](#acknowledgements)  
- [Development Status](#development-status)  
- [Installation](#installation)  
- [Using Conc](#using-conc)

### Direct links to Conc documentation

- [Getting Started](https://geoffford.nz/conc/tutorials/start.html)  
- [Tutorials](https://geoffford.nz/conc/tutorials) to get you started
  with Conc  
- [Documentation](https://geoffford.nz/conc/)
  ([Explanations](https://geoffford.nz/conc/explanations), [Conc API
  Reference](https://geoffford.nz/conc/api), information on
  [Development](https://geoffford.nz/conc/development))

## Acknowledgements

Conc is developed by [Dr Geoff Ford](https://geoffford.nz/).

Work to create this Python library has been made possible by
funding/support from:

- “Mapping LAWS: Issue Mapping and Analyzing the Lethal Autonomous
  Weapons Debate” (Royal Society of New Zealand’s Marsden Fund Grant
  19-UOC-068)  
- “Into the Deep: Analysing the Actors and Controversies Driving the
  Adoption of the World’s First Deep Sea Mining Governance” (Royal
  Society of New Zealand’s Marsden Fund Grant 22-UOC-059)
- Sabbatical, University of Canterbury, Semester 1 2025.

Thanks to the Mapping LAWS project team for their support and feedback
as first users of ConText (a web-based application built on an earlier
version of Conc).

Dr Ford is a researcher with [Te Pokapū Aronui ā-Matihiko \| UC Arts
Digital Lab (ADL)](https://artsdigitallab.canterbury.ac.nz/). Thanks to
the ADL team and to the University of Canterbury’s Faculty of Arts, whose
ongoing support makes work like this possible.

Thanks to Dr Chris Thomson and Karin Stahel for their feedback on early
versions of Conc.

## Development Status

Conc is in active development. It is currently
[released](https://pypi.org/project/conc) for beta testing. See the
[CHANGELOG](CHANGELOG.md) for notes on releases and the
[Roadmap](https://geoffford.nz/development/roadmap.html) for planned
updates.

Although this is a Beta release, I’m currently using Conc for research
and postgraduate teaching. I’m keen to support new users. If you have
any questions, encounter hurdles using Conc or have feature requests,
[create an issue](https://github.com/polsci/conc/issues/new).

## Installation

Installing Conc is simple. Below is the essential information if you
want to use Conc. The [installation
page](https://geoffford.nz/conc/tutorials/install.html) has more
information. You can also [install the development
version](https://geoffford.nz/conc/tutorials/install.html#install-the-development-version)
of Conc, which may include new functionality and bug fixes. If you want
to download sample corpora you will need to [install optional
dependencies](https://geoffford.nz/conc/tutorials/install.html#install-optional-dependencies).
If you have an older computer with a pre-2013 CPU, you will probably
need to install a version of Polars compiled for older machines; see the
[install page for
details](https://geoffford.nz/conc/tutorials/install.html#pre-2013-cpu-install-polars-with-support-for-older-machines).

### 1. Install via pip

Conc is tested with Python 3.10+. You can install Conc from
[PyPI](https://pypi.org/project/conc) using this command:

``` sh
pip install conc
```

Add the `-U` flag to upgrade if you are already running Conc.
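
If you want to confirm the install from Python, the following is a
minimal sketch that uses only the standard library. The helper name
`installed_version` is illustrative and is not part of Conc.

``` python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Conc's package metadata requires Python 3.10 or later.
python_ok = sys.version_info >= (3, 10)
print(f"Python {sys.version_info.major}.{sys.version_info.minor} supported: {python_ok}")

conc_version = installed_version("conc")
if conc_version is None:
    print("conc is not installed; run: pip install conc")
else:
    print(f"conc {conc_version} is installed")
```

The check reads package metadata only, so it does not import Conc or
any of its dependencies.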

### 2. Install a spaCy model for tokenization

Conc uses a spaCy language model for tokenization. After installing
Conc, install a model. If you are working with English-language texts,
install spaCy’s small English model (which is Conc’s default) like this:

``` sh
python -m spacy download en_core_web_sm
```

If you are working with a different language or want to use a different
‘en’ model, check the [spaCy models
documentation](https://spacy.io/models/) for the relevant model name.
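
spaCy models downloaded this way are installed as ordinary Python
packages, so one way to check that a model is available is to look for a
package with the model's name. This is a minimal sketch using only the
standard library; the helper name `model_is_installed` is illustrative
and is not part of Conc or spaCy.

``` python
from importlib.util import find_spec


def model_is_installed(model_name: str) -> bool:
    """Return True if a package with the model's name can be imported."""
    return find_spec(model_name) is not None


model = "en_core_web_sm"  # the default model named above; change as needed
if model_is_installed(model):
    print(f"{model} is installed")
else:
    print(f"{model} is missing; run: python -m spacy download {model}")
```

The check only confirms that the package can be found. It does not load
the model or verify that it is compatible with your spaCy version.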

## Using Conc

### Getting started

A good place to start is the [Get started with
Conc](https://geoffford.nz/conc/tutorials/start.html) tutorial, which
demonstrates how to build a corpus and output Conc reports. There are
also [simple code
recipes](https://geoffford.nz/conc/tutorials/recipes.html) for common
Conc tasks.

### Conc Documentation

There is a dedicated [Conc documentation
site](https://geoffford.nz/conc/). This includes tutorials, examples
demonstrating how to create reports for analysis, explanations of Conc
functionality and its corpus format, and a reference for Conc’s classes
and methods. Here are links to the documentation site sections:

- [Tutorials](https://geoffford.nz/conc/tutorials) to get you started
  with Conc  
- The [Explanations](https://geoffford.nz/conc/explanations) section
  explains how Conc works and how to use the Conc corpus format and Conc
  results with other Python libraries  
- The [Conc API Reference](https://geoffford.nz/conc/api) provides
  detailed documentation of Conc classes and functions  
- The [Development](https://geoffford.nz/conc/development) section gives
  information on Conc development, including a Roadmap and Developer’s
  Guide
