.. charex documentation master file, created by
   sphinx-quickstart on Tue May  2 07:09:19 2023.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to the :mod:`charex` documentation!
===========================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   /unicode.rst
   /forms.rst
   /api.rst


Why Did I Make This?
====================
I find the ambiguity of text data interesting. In memory it's all ones
and zeros. There is nothing inherent to the data that makes `0x20` mean
a space character, but we've mostly agreed that it does. That "mostly"
part is what's interesting to me, and it's where a lot of fun problems lie.


How Do I Use This?
==================
It's on PyPI, so you can install it with `pip` as long as you are
using Python 3.12 or higher::

    $ pip install charex

:mod:`charex` has four modes of operation:

*   Direct command line invocation,
*   An interactive shell,
*   A graphical user interface (GUI),
*   An application programming interface (API).


Command Line
------------
To get help for direct invocation from the command line::

    $ charex -h


Interactive Shell
-----------------
To launch the interactive shell::

    $ charex

That will bring you to the :mod:`charex` shell::

    Welcome to the charex shell.
    Press ? for a list of comands.

    charex>

From here you can type `?` to see the list of available commands::

    Welcome to the charex shell.
    Press ? for a list of comands.

    charex> ?
    The following commands are available:

    *  bl: Show the list of blocks.
    *  cd: Decode the given address in all codecs.
    *  ce: Encode the given character in all codecs.
    *  cl: List registered character sets.
    *  clear: Clear the terminal.
    *  ct: Count denormalization results.
    *  dm: Build a denormalization map.
    *  dn: Perform denormalizations.
    *  dt: Display details for a code point.
    *  el: List the registered escape schemes.
    *  es: Escape a string using the given scheme.
    *  fl: List registered normalization forms.
    *  gui: Start the GUI.
    *  help: Display command list.
    *  mf: Create emoji sequence for a region's flag.
    *  nl: Perform normalizations.
    *  ns: Show the list of named sequences.
    *  pf: List characters with a given property value.
    *  sh: Run in an interactive shell.
    *  sv: Show the list of standardized variants.
    *  up: List the Unicode properties.
    *  uv: List the valid values for a Unicode property.
    *  xt: Exit the charex shell.
    *  zc: Show the list of emoji ZWJ sequence categories.
    *  zl: Show the list of emoji ZWJ sequences.

    For help on individual commands, use "help {command}".

    charex>

Then type `help` followed by the name of a command to learn what
it does::

    charex> help dn
    usage: charex dn [-h] [-m MAXDEPTH] [-n NUMBER] [-r] [-s SEED] form base

    Denormalize a string.

    positional arguments:
      form                  The normalization form for the denormalization. Valid
                            options are: casefold, nfc, nfd, nfkc, nfkd.
      base                  The base normalized string.

    options:
      -h, --help            show this help message and exit
      -m MAXDEPTH, --maxdepth MAXDEPTH
                            Maximum number of reverse normalizations to use for
                            each character.
      -n NUMBER, --number NUMBER
                            Maximum number of results to return.
      -r, --random          Randomize the denormalization.
      -s SEED, --seed SEED  Seed the randomized denormalization.

    charex>


GUI
---
To launch the :mod:`charex` GUI::

    $ charex gui


API
---
To use :mod:`charex` in a Python script, import it. For example, to
get a summary of a Unicode character::

    >>> import charex
    >>>
    >>>
    >>> value = 'a'
    >>> char = charex.Character(value)
    >>> print(char.summarize())
    a U+0061 (LATIN SMALL LETTER A)


What Version of Unicode Does This Support?
==========================================
Parts of :mod:`charex` rely on :mod:`unicodedata` in the Python Standard
Library. This limits :mod:`charex` to the Unicode version supported by
the Python version you are running. There may be a bit of a lag as
new Python versions are released, but as of this release :mod:`charex`
supports:

*   Python 3.12: Unicode 15.0
*   Python 3.13: Unicode 15.1
*   Python 3.14: Unicode 16.0
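
To check which Unicode version your own interpreter provides, you can
ask :mod:`unicodedata` directly. This is a minimal sketch that uses only
the standard library, not :mod:`charex` itself, and the output depends
on your Python build::

    import sys
    import unicodedata

    # Version of the Unicode Character Database bundled with this
    # interpreter, as a string such as "15.0.0".
    unicode_version = unicodedata.unidata_version

    # Major and minor version of the running interpreter.
    python_version = '.'.join(str(n) for n in sys.version_info[:2])

    print(f'Python {python_version}: Unicode {unicode_version}')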


What Happened to Unicode 14.0?
------------------------------
I have the dependencies I use to generate the documentation in the
development dependencies. To support Unicode 14.0, I had to support
Python 3.11. To support Python 3.11, I had to use an old version of
Sphinx that has some vulnerabilities. To clean that up, I had to drop
support for Python 3.11 and Unicode 14.0.

Is there a way I could have fixed it without dropping Python 3.11
support? Probably. It's something I'll try to look into when I
get time. If it's a problem for anyone, I'll try to prioritize it.


Common Problems
===============

`ModuleNotFoundError: No module named '_tkinter'` error
-------------------------------------------------------
If you get the above error when running :mod:`charex` or its tests, it's
likely your Python installation was built without :mod:`tkinter` support.
How you fix it depends upon your Python installation. If you are using
Python 3.14 installed with Homebrew on macOS, you can probably fix it
with::

    $ brew install python-tk@3.14
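
To confirm whether :mod:`tkinter` is usable before launching the GUI,
a quick check along these lines may help. It is a sketch using only the
standard library, and `tkinter_available` is a hypothetical helper name,
not part of :mod:`charex`::

    import importlib


    def tkinter_available() -> bool:
        """Return True if tkinter and its _tkinter extension import."""
        try:
            # Importing tkinter also imports the _tkinter extension
            # module, which is the one named in the error message.
            importlib.import_module('tkinter')
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError.
            return False
        return True


    print('tkinter available:', tkinter_available())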


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`