Metadata-Version: 2.0
Name: wrld
Version: 0.2
Summary: simplified bash loops (or, xargs -I on steroids)
Home-page: https://github.com/ninjaaron/wrld
Author: Aaron Christianson
Author-email: ninjaaron@gmail.com
License: BSD
Keywords: evaluate
Platform: UNKNOWN

wrld: Avoid writing loops in shell one-liners
---------------------------------------------

You may think that ``wrld`` is some abbreviated form of "world". This is
not the case. The world is lame. What isn't lame is iterating on stdin.
Probably my favorite thing to do. In the shell, the sanest way to do
this is with a ``while read line; do`` loop. Forget the world. ``wrld``
is the future of iteration.

Raise your hand if you have ever written this loop:

.. code:: bash

  find -name '*foo.bar' -type f|while read line; do
    mv "$line" "$(echo "$line"|sed 's/pat/rep/')"
  done

Or the related loop:

.. code:: bash

  for i in *foo.bar; do
    cp ... # I'm too lazy even to finish this example.
  done

Note:
 if you have ever written a loop that starts with the words ``for i in
 $(ls ...``, you're doing it wrong. Do one of the above instead. (The
 ``while read line; do`` version can also fail if there are filenames
 with newlines, which you might have if you're iterating on filenames
 generated by an idiot.)

With ``wrld``, you can write it like this: ``find -name '*foo.bar' -type f
| wrld mv {} '@sed "s/pat/rep/"'``. You can do something similar with
globs as well: ``wrld mv {} '@sed "s/pat/rep/"' -f *foo.bar``. This is
manifestly better for one-liners in the shell.

You could also think of it as ``xargs -I{}`` or the ``-exec`` flag from
``find`` on steroids, because it iterates on stdin, but it also allows
inlining arbitrary shell commands.

.. code:: bash

    $ ls|wrld mv {} '@awk "{print $2, $1}"'
    mv 'Arnold Palmer' 'Palmer Arnold'
    mv 'Jane Doe' 'Doe Jane'
    mv 'John Doe' 'Doe John'
    mv 'John Wayne' 'Wayne John'
    mv 'Lucy Lawless' 'Lawless Lucy'
    mv 'Ricki Lake' 'Lake Ricki'

As you can see, inlined commands have the current line piped to their
stdin. If you want to use some poorly-designed command that doesn't read
from stdin as the filter, you can also put ``{}`` in the filter, and it
will be replaced with the current line. Use ``\{}`` if you need a
literal ``{}``. However, if you can't do it with sed or awk, there's
always ``perl -pe``, and if you can't do it with ``perl -pe``, I don't
want to know about it. You can also see that ``wrld`` echoes back the
commands it constructs. You can shut it up with ``-q``/``--no-echo``.
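
For the curious, here is roughly what that substitution amounts to, as a
minimal Python sketch. It is not wrld's actual source: ``build_command``
is a made-up name, splitting the filter with ``shlex`` instead of handing
it to a shell is an assumption, and the ``\{}`` escape is ignored. It
also assumes ``sed`` is on your ``PATH``.

.. code:: python

    import shlex
    import subprocess


    def build_command(template, line):
        """Fill in one command for one line of input (hypothetical helper)."""
        args = []
        for arg in template:
            if arg.startswith("@"):
                # Inlined filter: the current line goes to its stdin, and its
                # output (minus the trailing newline) becomes the argument.
                result = subprocess.run(
                    shlex.split(arg[1:]),
                    input=line + "\n",
                    capture_output=True,
                    text=True,
                    check=True,
                )
                args.append(result.stdout.rstrip("\n"))
            else:
                # Plain argument: every {} is replaced with the current line.
                args.append(arg.replace("{}", line))
        return args


    command = build_command(["mv", "{}", '@sed "s/Doe/Roe/"'], "Jane Doe")
    # Quote the arguments for display, the way the echoed commands look.
    echoed = " ".join(shlex.quote(a) for a in command)
    print(echoed)

Note that this spawns one filter process per line, which is exactly the
cost the ``optimize`` section below complains about.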

Because POSIX stupidly allows newlines in file names, the ``ls`` example
above is actually "dangerous" unless you can guarantee there are no
idiot newlines in the file names. For this reason, you may instead
specify a list of file names to iterate over (preferably with a glob)
with the ``-f``/``--file-list`` flag:

.. code:: bash

    $ wrld mv {} '@awk "{print $2, $1}"' -f *
    mv 'Doe Jane' 'Jane Doe'
    mv 'Doe John' 'John Doe'
    mv 'Lake Ricki' 'Ricki Lake'
    mv 'Lawless Lucy' 'Lucy Lawless'
    mv 'Palmer Arnold' 'Arnold Palmer'
    mv 'Wayne John' 'John Wayne'

If you're using a proper shell like fish or zsh, you can do recursive
globbing and get quite a lot done this way.

  One day, in the far distant future, wrld may support splitting stdin
  on the null byte for compatibility with ``find -print0``. It is a
  little-known fact that any task which a computer is capable of
  performing may be performed with the ``find`` command, so compatibility
  is key.
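
Until that day comes, here is a hedged sketch of what null-byte
splitting could look like in Python. None of this is in wrld:
``split_null`` is a hypothetical name, and the byte string simply stands
in for the output of ``find -print0``.

.. code:: python

    import io


    def split_null(stream):
        """Yield names from a binary stream of NUL-terminated records."""
        data = stream.read()
        for item in data.split(b"\0"):
            if item:  # skip the empty chunk after the final terminator
                # surrogateescape keeps undecodable bytes round-trippable
                yield item.decode("utf-8", "surrogateescape")


    # Simulated ``find -print0`` output; the second name contains a newline.
    fake_stdin = io.BytesIO(b"./Jane Doe\0./weird\nname\0")
    names = list(split_null(fake_stdin))
    print(names)

Because the newline is no longer a separator, the second name survives
as a single item instead of being split in two.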

flags
~~~~~
wrld is stupid about flags with the command it wraps. If you want to
send a flag through to whatever binary you use in your loop, it needs a
backslash in front of it. This means you actually have to use a double
backslash ``\\`` in most shells to get it through.

optimize
~~~~~~~~

Note:
 I/O bound tasks will not benefit much from these optimizations.

As you may note, wrld is capable of spawning a lot of processes. If it's
some quick thing, who cares? If you're iterating over a million files, it
might be bad. wrld offers some internal goodies to speed things along,
but they are written in python, so don't expect any miracles! (Kind of
kidding. A few lines of python is way faster than spawning a new
process, but it would be much slower than piping a million lines straight
through ``sed`` or whatever optimized C utility.)

These builtins are for certain common file operations: they have names
like "move", "copy", "hlink" and "slink".

- ``move`` moves files recursively. It's like ``mv`` without any
  options.
- ``copy`` copies files recursively. It's like ``cp -R``.
- ``hlink`` creates hard links. Hard links basically give the same chunk
  of data more than one name on the filesystem. It's called a "hard"
  link because of the physiological response many people experience when
  they realize how powerful this idea can be.
- ``slink`` creates soft links. These are a bit like shortcuts on the
  great and glorious Windows operating system. They are called "soft"
  links because of what happens to you when you realize the original
  file has moved and all your links are broken. You never have this
  problem with "hard" links, but you can't use them across different
  partitions/devices or on directories, so, eh.
- ``srlink`` expands relative paths to absolute paths when soft linking.
  Like ``ln -sr``.
- ``remove`` removes stuff. Recursively. Take care.
- ``makedir`` makes directories... works like ``mkdir -p``
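
To see why builtins like these are cheap, here is a minimal sketch of how
such names could map onto Python's standard library, so that no new
process is spawned per file. The mapping is an assumption for
illustration, not wrld's actual code; ``BUILTINS``, ``copy`` and
``remove`` are hypothetical names, and ``srlink`` is left out.

.. code:: python

    import os
    import shutil
    import tempfile


    def copy(src, dst):
        """Copy a file or a whole directory tree, roughly like ``cp -R``."""
        if os.path.isdir(src):
            shutil.copytree(src, dst)
        else:
            shutil.copy2(src, dst)


    def remove(path):
        """Remove a file, a symlink, or a directory tree."""
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)


    # Hypothetical dispatch table: builtin name -> plain function call.
    BUILTINS = {
        "move": shutil.move,
        "copy": copy,
        "hlink": os.link,
        "slink": os.symlink,
        "remove": remove,
        "makedir": lambda path: os.makedirs(path, exist_ok=True),
    }

    # Try one out in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        old = os.path.join(tmp, "Jane Doe")
        new = os.path.join(tmp, "Doe Jane")
        open(old, "w").close()
        BUILTINS["move"](old, new)
        moved = os.path.exists(new) and not os.path.exists(old)

    print(moved)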

Other builtins may be added as they occur to me or users ask for them.
``mv``, ``cp`` and ``ln`` are commands I frequently find myself needing
in these kinds of loops.

Another way to optimize is by using ``|`` as a prefix to your filters,
rather than ``@``; i.e. ``wrld move {} '|awk "{print $2, $1}"' -f *``.
This opens a single process of ``awk``, filters stdin through that, and
then zips the results together with the main loop. This will create
problems if the filter produces no output for certain lines of input
(like ``grep`` would, though I don't know why you'd use grep in a
context like this...), or if you have filenames with newlines, like a
freak. So, it will work in most cases. One day I may implement this
properly with asynchronous piping, so this won't be a problem.
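
The zipping, and the way it goes wrong, can be sketched in a few lines of
Python. This is an illustration under assumptions, not wrld's
implementation: ``pipe_filter`` is a hypothetical name, and ``sed`` is
assumed to be on your ``PATH``.

.. code:: python

    import subprocess


    def pipe_filter(lines, command):
        """Run ``command`` once, feed it every line, pair input with output."""
        proc = subprocess.run(
            command,
            input="\n".join(lines) + "\n",
            capture_output=True,
            text=True,
            check=True,
        )
        # zip stops at the shorter sequence and knows nothing about gaps.
        return list(zip(lines, proc.stdout.splitlines()))


    names = ["Jane Doe", "John Wayne", "Ricki Lake"]

    # One sed process handles all three lines, and the pairs line up.
    pairs = pipe_filter(names, ["sed", "s/o/0/g"])

    # A filter that drops a line shifts every later pair out of step.
    skewed = pipe_filter(names, ["sed", "/Doe/d"])

    print(pairs)
    print(skewed)

In the second call the filter prints nothing for ``Jane Doe``, so she
gets paired with John Wayne's output and the last line is silently
dropped.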

There are also two builtin filters. ``@py`` allows you to use arbitrary
python expressions as a filter. The current line or filename is
available in the execution context as ``i``.

.. code:: bash

    $ wrld move {} '@py i.upper()' -f *
    move 'Arnold Palmer' 'ARNOLD PALMER'
    move 'Jane Doe' 'JANE DOE'
    move 'John Doe' 'JOHN DOE'
    move 'John Wayne' 'JOHN WAYNE'
    move 'Lucy Lawless' 'LUCY LAWLESS'
    move 'Ricki Lake' 'RICKI LAKE'

``@py`` uses a little namespace magic that will import any module you
happen to use in your expression on demand. Note that only expressions
and not statements are supported. ``@py`` should also do the right thing
with newlines in file names.
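
One common way to get this kind of on-demand import (not necessarily the
way wrld does it) is a ``dict`` subclass whose ``__missing__`` method
tries to import whatever name the expression asks for. In the sketch
below, ``AutoImport`` and ``py_filter`` are hypothetical names.

.. code:: python

    import importlib


    class AutoImport(dict):
        """Namespace that imports a module the first time its name is used."""

        def __missing__(self, name):
            try:
                module = importlib.import_module(name)
            except ImportError:
                # KeyError lets the lookup fall through to the builtins.
                raise KeyError(name)
            self[name] = module
            return module


    def py_filter(expression, line):
        """Evaluate an expression with the current line available as ``i``."""
        # The special namespace has to be the *locals* mapping; globals must
        # be a plain dict.
        return eval(expression, {}, AutoImport(i=line))


    shouted = py_filter("os.path.basename(i).upper()", "/tmp/Ricki Lake")
    length = py_filter("len(i)", "abc")
    print(shouted, length)

``os`` is never imported by hand here: the lookup of the name ``os``
misses, ``__missing__`` imports the module, and evaluation carries on.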

The other builtin filter is ``s``. The syntax looks a bit like ``sed``,
but it's python regex, so refer to the relevant docs if you're not
already familiar with it. It's based on Perl, like the regex in most
popular programming languages (and unlike sed), but it has a few of its
own quirks.

.. code:: bash

    $ wrld move {} 's/[aeiou]/λ/g' -f *
    move 'Arnold Palmer' 'Arnλld Pλlmλr'
    move 'Jane Doe' 'Jλnλ Dλλ'
    move 'John Doe' 'Jλhn Dλλ'
    move 'John Wayne' 'Jλhn Wλynλ'
    move 'Lucy Lawless' 'Lλcy Lλwlλss'
    move 'Ricki Lake' 'Rλckλ Lλkλ'

It accepts any flags that can be used in a python regex in the context of
``(?[flags])``, so, ``aiLmsux``. In addition, the ``g`` flag is
supported, to make it more similar to sed and Perl. While ``/`` is used
as the delimiter by convention, any non-alphanumeric character may be
used.
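
As a rough illustration of how such a spec can be turned into a call to
``re.sub``, here is a minimal sketch. ``substitute`` is a hypothetical
name, and the parsing is deliberately naive: it does not handle a
delimiter escaped inside the pattern or replacement.

.. code:: python

    import re


    def substitute(spec, text):
        """Apply a sed-like ``s/pattern/replacement/flags`` spec to text."""
        delim = spec[1]  # whatever character follows the leading "s"
        pattern, replacement, flags = spec[2:].split(delim)
        # re.sub replaces every match when count is 0, only the first when 1.
        count = 0 if "g" in flags else 1
        inline = flags.replace("g", "")
        if inline:
            # Remaining flags become an inline group such as (?i).
            pattern = "(?%s)%s" % (inline, pattern)
        return re.sub(pattern, replacement, text, count=count)


    every = substitute("s/[aeiou]/λ/g", "Ricki Lake")
    first = substitute("s/[aeiou]/λ/", "Ricki Lake")
    nocase = substitute("s,[aeiou],λ,gi", "Arnold Palmer")
    print(every, first, nocase)

Without ``g`` only the first vowel is replaced, and with ``i`` the
capital ``A`` in ``Arnold`` gets replaced too.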

If the replacement is prefixed with ``\e``, a python expression can be
used, where ``m`` is the regex match object for each match, so that
offers some interesting possibilities.
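
A hedged sketch of the idea: ``re.sub`` accepts a function as the
replacement, so an expression can be evaluated once per match with ``m``
in scope. ``expr_sub`` is a hypothetical name, and this is an
illustration rather than wrld's own code.

.. code:: python

    import re


    def expr_sub(pattern, replacement, text):
        """Substitute, treating a ``\\e``-prefixed replacement as an expression."""
        if replacement.startswith(r"\e"):
            code = compile(replacement[2:].strip(), "<replacement>", "eval")

            def repl(m):
                # ``m`` is the match object for the current match.
                return str(eval(code, {}, {"m": m}))

            return re.sub(pattern, repl, text)
        return re.sub(pattern, replacement, text)


    bumped = expr_sub(r"\d+", r"\e int(m.group()) + 1", "track 9 of 12")
    titled = expr_sub(r"\b\w", r"\e m.group().upper()", "ricki lake")
    print(bumped)
    print(titled)

The first call does arithmetic on each number it finds, which is the
sort of thing that gets painful fast in plain ``sed``.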

I can neither confirm nor deny that there may be another filter in my
mind for doing awk-like things based on python's ``str.filter`` method.


