Metadata-Version: 2.1
Name: qrstu
Version: 0.1.0
Summary: Q – Rainer Schwarzbach’s Text Utilities
Author-email: Rainer Schwarzbach <rainer@blackstream.de>
License: MIT License
        
        Copyright (c) 2023 Rainer Schwarzbach
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://gitlab.com/blackstream-x/qrstu
Project-URL: Documentation, https://blackstream-x.gitlab.io/qrstu
Project-URL: CI, https://gitlab.com/blackstream-x/qrstu/-/pipelines
Project-URL: Bug Tracker, https://gitlab.com/blackstream-x/qrstu/-/issues
Project-URL: Repository, https://gitlab.com/blackstream-x/qrstu.git
Keywords: text,unicode,convert,transcode
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3 :: Only
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE

# Q – Rainer Schwarzbach’s Text Utilities

_Text conversion and transcoding utilities_


## Installation from PyPI

```
pip install qrstu
```

Installation in a virtual environment is strongly recommended.


## Usage

### reduce

The **reduce** module reduces Unicode text
in Latin script to ASCII-encodable Unicode text,
similar to **[Unidecode](https://pypi.org/project/Unidecode/)**
but taking a different approach
(i.e. mostly wrapping functionality from the standard library module
**[unicodedata](https://docs.python.org/3/library/unicodedata.html)**).
Unlike **Unidecode**, which also transliterates characters from non-Latin scripts,
**reduce** stubbornly refuses to handle these.

You can, however, specify an optional `errors=` argument in the
**reduce.reduce_text()** call, which is passed to the internally used
**[codecs.encode()](https://docs.python.org/3/library/codecs.html#codecs.encode)**
function, thus taking advantage of the codecs module’s error handling.

```python
>>> from qrstu import reduce
>>> # Vietnamese text
>>> reduce.reduce_text("Chào mừng đến với Hà Nội!")
'Chao mung dhen voi Ha Noi!'
>>>
>>> # Trying the Unidecode examples …
>>> reduce.reduce_text('kožušček')
'kozuscek'
>>> reduce.reduce_text('30 \U0001d5c4\U0001d5c6/\U0001d5c1')
'30 km/h'
>>> reduce.reduce_text('\u5317\u4EB0')
Traceback (most recent call last):
  File "…/qrstu/src/qrstu/reduce.py", line 354, in reduce_text
    chunk = translations[character.nfc]
            ~~~~~~~~~~~~^^^^^^^^^^^^^^^
KeyError: '北'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "…/qrstu/src/qrstu/reduce.py", line 276, in reduce
    collector.append(PRESET_CHARACTER_REDUCTIONS[codepoint])
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
KeyError: 21271

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "…/qrstu/src/qrstu/reduce.py", line 356, in reduce_text
    chunk = character.reduce(errors=errors)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "…/qrstu/src/qrstu/reduce.py", line 278, in reduce
    encoded = codecs.encode(
              ^^^^^^^^^^^^^^
UnicodeEncodeError: 'ascii' codec can't encode character '\u5317' in position 0: ordinal not in range(128)
>>> reduce.reduce_text('\u5317\u4EB0', errors="ignore")
''
>>> reduce.reduce_text('\u5317\u4EB0', errors="replace")
'??'
>>> reduce.reduce_text('\u5317\u4EB0', errors="backslashreplace")
'\\u5317\\u4eb0'
>>> reduce.reduce_text('\u5317\u4EB0', errors="xmlcharrefreplace")
'&#21271;&#20144;'
>>> reduce.reduce_text('\u5317\u4EB0', errors="namereplace")
'\\N{CJK UNIFIED IDEOGRAPH-5317}\\N{CJK UNIFIED IDEOGRAPH-4EB0}'
>>>

```
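
To illustrate the general idea of combining **unicodedata** with
**codecs.encode()**, here is a minimal standard-library sketch.
It is _not_ the implementation used by **reduce**: the function name
`naive_reduce` is hypothetical, and the sketch assumes that a plain
NFKD decomposition followed by dropping combining marks is good enough.

```python
import codecs
import unicodedata


def naive_reduce(text: str, errors: str = "strict") -> str:
    """Hypothetical helper, not part of qrstu."""
    # NFKD splits accented letters into a base letter plus combining marks
    # and replaces compatibility characters (e.g. mathematical letters)
    # with their plain equivalents.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(
        character
        for character in decomposed
        if not unicodedata.combining(character)
    )
    # Let the codecs error handler decide what happens to anything
    # that still is not ASCII encodable.
    return codecs.encode(stripped, "ascii", errors).decode("ascii")


print(naive_reduce("kožušček"))
print(naive_reduce("\u5317\u4EB0", errors="replace"))
```

Characters without a Unicode decomposition, such as `đ`, are not covered
by this sketch and end up in the error handler, so it cannot reproduce
the Vietnamese example above.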


## Further reading

Please see the documentation at <https://blackstream-x.gitlab.io/qrstu>
for detailed usage information.

If you have found a bug or have a feature suggestion,
please open an issue [here](https://gitlab.com/blackstream-x/qrstu/-/issues).

