Metadata-Version: 2.0
Name: fsed
Version: 0.5.1
Summary: Aho-Corasick string replacement utility
Home-page: https://github.com/wroberts/fsed
Author: Will Roberts
Author-email: wildwilhelm@gmail.com
License: MIT
Keywords: string search replace rewrite
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: System Administrators
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Software Development :: Pre-processors
Classifier: Topic :: Text Processing :: Filters
Classifier: Topic :: Text Processing :: Indexing
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Topic :: Utilities
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Requires-Dist: click

================================================
 fsed - Aho-Corasick string replacement utility
================================================

.. image:: https://travis-ci.org/wroberts/fsed.svg?branch=master
    :target: https://travis-ci.org/wroberts/fsed
    :alt: Travis CI build status

.. image:: https://coveralls.io/repos/wroberts/fsed/badge.svg?branch=master&service=github
    :target: https://coveralls.io/github/wroberts/fsed?branch=master
    :alt: Test code coverage

.. image:: https://img.shields.io/pypi/v/fsed.svg
    :target: https://pypi.python.org/pypi/fsed/
    :alt: Latest Version

Copyright (c) 2015 Will Roberts <wildwilhelm@gmail.com>

Licensed under the MIT License (see file ``LICENSE.rst`` for
details).

Search and replace on file(s), with matching on fixed strings.

``fsed`` is a tool specially designed for situations where you have to
do *many* string search-and-replace operations with fixed strings
(that is, ``fsed`` doesn't do regular expressions).  By doing all the
searching and replacing on all the patterns at the same time, ``fsed``
can be much faster than tools that do string rewriting one pattern at
a time (like one-liners in ``sed`` or ``perl``).

To do its searching, ``fsed`` uses the `Aho-Corasick algorithm`_,
which is a very clever way of matching multiple patterns at the same
time, and was used to implement the original `fgrep`_ Unix utility
(now accessed as ``grep -F``).  This algorithm is capable of finding
matches which overlap each other, and in these cases, ``fsed`` must
choose which matches to rewrite.  The policy adopted by ``fsed`` is to
be greedy, and always rewrite the shortest, leftmost match first.

For illustration, imagine a situation where we would like to rewrite
``a`` with ``b``, ``aa`` with ``c``, and ``aaa`` with ``d``.  What
should we do when we see the input string ``aaa``?  Should we produce
``bbb``, ``bc``, ``cb``, or ``d``?  ``fsed`` produces ``bbb`` in this
case.

.. _`Aho-Corasick algorithm`: https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_algorithm
.. _fgrep: https://en.wikipedia.org/wiki/Grep#Variations

Install
=======

``fsed`` is written in Python_; you can install it with pip_::

    pip install fsed

.. _Python: http://www.python.org/
.. _pip: https://en.wikipedia.org/wiki/Pip_(package_manager)

Usage
=====

::

    fsed [OPTIONS] PATTERN_FILE [INPUT_FILE [INPUT_FILE2 ...]]

If one or more ``INPUT_FILEs`` are specified, ``fsed`` reads and
concatenates these as its input; otherwise, ``fsed`` reads the
standard input.

Options:

``--pattern-format=FMT``
    Set FMT to ``tsv`` or ``sed`` (default is ``sed``) to specify the
    format of ``PATTERN_FILE``.

``-o/--output=OUTFILE``
    Specifies that the program output should be written to ``OUTFILE``.
    If this option is not used, ``fsed`` writes to standard output.

``-w/--words``
    Makes ``fsed`` match only on word boundaries; this flag instructs
    ``fsed`` to append ``\b`` to the beginning and end of every
    pattern in ``PATTERN_FILE``.

``--by-line/--across-lines``
    Sets whether ``fsed`` should process the input line by line
    or character by character; the default is ``--across-lines``.

``--slow``
    Indicates that ``fsed`` should try very hard to always find the
    longest matches on the input; this is very slow, and forces
    ``--by-line`` to be on.

``-q``
    Quiet operation, do not emit warnings.

``-v/--verbose``
    Turns on debugging output.

Note: ``fsed`` runs even faster using PyPy_::

    pypy -m fsed.fsed [OPTIONS] PATTERN_FILE [INPUT_FILE [INPUT_FILE2 ...]]

.. _PyPy: http://pypy.org/

Pattern File
============

``PATTERN_FILE`` contains a list of patterns to search and replace in
the input; each pattern is listed on a separate line.  ``fsed``
supports two formats for specifying patterns.  The default, ``sed``,
specifies strings and their replacements the way the ``sed`` utility
does::

    s/SEARCH/REPLACE/

The character following the ``s`` character is the pattern delimiter,
and can be any character (it does not have to be a forward slash).

The other format, ``tsv``, specifies patterns using ``<TAB>``
characters as delimiters::

    SEARCH<TAB>REPLACE

In this format, there must be only one ``<TAB>`` character per line.

Patterns can contain escape characters:

``\\``
    Backslash (\\)

``\a``
    ASCII bell (BEL)

``\b``
    Word boundary

``\f``
    ASCII formfeed (FF)

``\n``
    ASCII linefeed (LF)

``\r``
    Carriage Return (CR)

``\t``
    Horizontal Tab (TAB)

``\v``
    ASCII vertical tab (VT)


