.. charex documentation master file, created by
   sphinx-quickstart on Tue May  2 07:09:19 2023.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to :mod:`charex` documentation!
=======================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   /unicode.rst
   /forms.rst
   /api.rst


Why Did I Make This?
====================
I find the ambiguity of text data interesting. In memory it's all ones
and zeros. There is nothing inherent to the data that makes `0x20`
mean a space character, but we've mostly agreed that it does. That
"mostly" part is what's interesting to me, and it's where a lot of fun
problems lie.


How Do I Use This?
==================
It's in PyPI, so you can install it with `pip`, as long as you are
using Python 3.12 or higher::

    $ pip install charex

:mod:`charex` has four modes of operation:

*   Direct command line invocation,
*   An interactive shell,
*   A graphical user interface (GUI),
*   An application programming interface (API).


Command Line
------------
To get help for direct invocation from the command line::

    $ charex -h


Interactive Shell
-----------------
To launch the interactive shell::

    $ charex

That will bring you to the :mod:`charex` shell::

    Welcome to the charex shell.
    Press ? for a list of comands.

    charex>

From here you can type `?` to see the list of available commands::

    Welcome to the charex shell.
    Press ? for a list of comands.

    charex> ?
    The following commands are available:

      * bl: Show the list of blocks.
      * cd: Decode the given address in all codecs.
      * ce: Encode the given character in all codecs.
      * cl: List registered character sets.
      * clear: Clear the terminal.
      * ct: Count denormalization results.
      * dm: Build a denormalization map.
      * dn: Perform denormalizations.
      * dt: Display details for a code point.
      * el: List the registered escape schemes.
      * es: Escape a string using the given scheme.
      * fl: List registered normalization forms.
      * gui: Start the GUI.
      * help: Display command list.
      * mf: Create emoji sequence for a region's flag.
      * nl: Perform normalizations.
      * ns: Show the list of named sequences.
      * pf: List characters with a given property value.
      * sh: Run in an interactive shell.
      * sv: Show the list of standardized variants.
      * up: List the Unicode properties.
      * uv: List the valid values for a Unicode property.
      * xt: Exit the charex shell.
      * zc: Show the list of emoji ZWJ sequence categories.
      * zl: Show the list of emoji ZWJ sequences.

    For help on individual commands, use "help {command}".

    charex>

Then type `help` followed by the name of a command to learn what that
command does::

    charex> help dn
    usage: charex dn [-h] [-m MAXDEPTH] [-n NUMBER] [-r] [-s SEED] form base

    Denormalize a string.

    positional arguments:
      form                  The normalization form for the denormalization. Valid
                            options are: casefold, nfc, nfd, nfkc, nfkd.
      base                  The base normalized string.

    options:
      -h, --help            show this help message and exit
      -m MAXDEPTH, --maxdepth MAXDEPTH
                            Maximum number of reverse normalizations to use for
                            each character.
      -n NUMBER, --number NUMBER
                            Maximum number of results to return.
      -r, --random          Randomize the denormalization.
      -s SEED, --seed SEED  Seed the randomized denormalization.

    charex>


GUI
---
To launch the :mod:`charex` GUI::

    $ charex gui


API
---
To import :mod:`charex` into your Python script and get a summary of a
Unicode character::

    >>> import charex
    >>>
    >>> value = 'a'
    >>> char = charex.Character(value)
    >>> print(char.summarize())
    a U+0061 (LATIN SMALL LETTER A)


What Version of Unicode Does This Support?
==========================================
Parts of :mod:`charex` rely on :mod:`unicodedata` in the Python
Standard Library. This limits :mod:`charex` to the Unicode version
supported by the version of Python you are running.
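
As a quick check, the following sketch prints the Unicode version
bundled with the running interpreter. It uses only :mod:`sys` and
:mod:`unicodedata` from the standard library, not :mod:`charex`
itself, so treat it as an illustration of the constraint rather than
as how :mod:`charex` determines the version internally::

    import sys
    import unicodedata

    # unidata_version is the version of the Unicode database
    # compiled into this Python interpreter.
    unicode_version = unicodedata.unidata_version
    python_version = f'{sys.version_info.major}.{sys.version_info.minor}'
    print(f'Python {python_version} provides Unicode {unicode_version}')
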
There may be a bit of a lag as new Python versions are released, but
as of this release :mod:`charex` supports:

*   Python 3.12: Unicode 15.0
*   Python 3.13: Unicode 15.1
*   Python 3.14: Unicode 16.0


What happened to Unicode 14.0?
------------------------------
The dependencies I use to generate the documentation are part of the
development dependencies. To support Unicode 14.0, I had to support
Python 3.11. To support Python 3.11, I had to use an old version of
Sphinx that has some vulnerabilities. To clean that up, I had to drop
support for Python 3.11 and Unicode 14.0.

Is there a way I could have fixed it without dropping Python 3.11
support? Probably. It's something I'll try to look into when I get
time. If it's a problem for anyone, I'll try to prioritize it.


Common Problems
===============

`ModuleNotFoundError: No module named '_tkinter'` error
-------------------------------------------------------
If you get the above error when running :mod:`charex` or its tests,
it's likely your Python install doesn't have :mod:`tkinter` linked.
How you fix it depends upon your Python install.

If you are using Python 3.14 installed with `homebrew` on macOS, you
can probably fix it with::

    brew install python-tk@3.14


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`