Welcome to the charex documentation!¶
Why Did I Make This?¶
I find the ambiguity of text data interesting. In memory it’s all ones and zeros. There is nothing inherent to the data that makes 0x20 mean a space character, but we’ve mostly agreed that it does. That “mostly” part is what’s interesting to me, and it’s where a lot of fun problems lie.
How Do I Use This?¶
It’s on PyPI, so you can install it with pip, as long as you are using Python 3.12 or higher:
$ pip install charex
charex has four modes of operation:
Direct command line invocation,
An interactive shell,
A graphical user interface (GUI),
An application programming interface (API).
Command Line¶
To get help for direct invocation from the command line:
$ charex -h
Interactive Shell¶
To launch the interactive shell:
$ charex
That will bring you to the charex shell:
Welcome to the charex shell.
Press ? for a list of comands.
charex>
From here you can type ? to see the list of available commands:
Welcome to the charex shell.
Press ? for a list of comands.
charex> ?
The following commands are available:
* bl: Show the list of blocks.
* cd: Decode the given address in all codecs.
* ce: Encode the given character in all codecs.
* cl: List registered character sets.
* clear: Clear the terminal.
* ct: Count denormalization results.
* dm: Build a denormalization map.
* dn: Perform denormalizations.
* dt: Display details for a code point.
* el: List the registered escape schemes.
* es: Escape a string using the given scheme.
* fl: List registered normalization forms.
* gui: Start the GUI.
* help: Display command list.
* mf: Create emoji sequence for a region's flag.
* nl: Perform normalizations.
* ns: Show the list of named sequences.
* pf: List characters with a given property value.
* sh: Run in an interactive shell.
* sv: Show the list of standardized variants.
* up: List the Unicode properties.
* uv: List the valid values for a Unicode property.
* xt: Exit the charex shell.
* zc: Show the list of emoji ZWJ sequence categories.
* zl: Show the list of emoji ZWJ sequences.
For help on individual commands, use "help {command}".
charex>
Then type help followed by the name of a command to learn what it does:
charex> help dn
usage: charex dn [-h] [-m MAXDEPTH] [-n NUMBER] [-r] [-s SEED] form base
Denormalize a string.
positional arguments:
  form                  The normalization form for the denormalization. Valid
                        options are: casefold, nfc, nfd, nfkc, nfkd.
  base                  The base normalized string.
options:
  -h, --help            show this help message and exit
  -m MAXDEPTH, --maxdepth MAXDEPTH
                        Maximum number of reverse normalizations to use for
                        each character.
  -n NUMBER, --number NUMBER
                        Maximum number of results to return.
  -r, --random          Randomize the denormalization.
  -s SEED, --seed SEED  Seed the randomized denormalization.
charex>
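To make the idea of a "reverse normalization" concrete, here is a minimal sketch that uses only the standard library. It assumes that denormalizing a character means finding other characters that normalize to it under a given form. It is not charex's implementation, and `reverse_normalizations` is a hypothetical helper name used only for this illustration.

```python
import sys
import unicodedata


def reverse_normalizations(target: str, form: str = 'NFKC') -> list[str]:
    """Find single characters that normalize to target (illustrative only)."""
    matches = []
    for code_point in range(sys.maxunicode + 1):
        char = chr(code_point)
        # The target normalizes to itself, so it is not a denormalization.
        if char == target:
            continue
        if unicodedata.normalize(form, char) == target:
            matches.append(char)
    return matches


# Show a few of the characters that NFKC normalization turns into 'a'.
variants = reverse_normalizations('a')
for char in variants[:5]:
    print(f'U+{ord(char):04X} {unicodedata.name(char, "<no name>")}')
```

Under that reading, a denormalized string is built by picking one such variant for each character of the base string. The number of combinations grows quickly, which is presumably why the command offers limits such as --maxdepth and --number.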
GUI¶
To launch the charex GUI:
$ charex gui
API¶
To import charex into a Python script and get a summary of a
Unicode character:
>>> import charex
>>>
>>>
>>> value = 'a'
>>> char = charex.Character(value)
>>> print(char.summarize())
a U+0061 (LATIN SMALL LETTER A)
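For a sense of what goes into that summary line, the same three pieces (the character, its code point, and its Unicode name) can be pulled from the standard library alone. This is a minimal sketch, not charex's implementation. `summarize_char` is a hypothetical helper name used only here.

```python
import unicodedata


def summarize_char(char: str) -> str:
    """Return 'char U+XXXX (NAME)' for one character (hypothetical helper)."""
    if len(char) != 1:
        raise ValueError('Expected exactly one character.')
    # Code points are conventionally written with at least four hex digits.
    code_point = f'U+{ord(char):04X}'
    # Some code points (such as control characters) have no name in
    # unicodedata, so fall back to a placeholder instead of raising.
    name = unicodedata.name(char, '<no name>')
    return f'{char} {code_point} ({name})'


print(summarize_char('a'))
```

The sketch covers only the summary line. It says nothing about the other details charex reports for a code point.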
What Version of Unicode Does This Support?¶
Parts of charex rely on unicodedata in the Python Standard
Library. This limits charex to the Unicode version supported
by the version of Python you are running. There may be a bit of a lag as
new Python versions are released, but as of this release charex
supports:
Python 3.12: Unicode 15.0
Python 3.13: Unicode 15.1
Python 3.14: Unicode 16.0
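If you are not sure which Unicode version your own interpreter provides, the standard library can tell you. This short sketch reads `unicodedata.unidata_version`, which reports the version of the Unicode database bundled with the running Python. It does not query charex itself.

```python
import sys
import unicodedata

# Report the running interpreter and the Unicode database it ships with.
python_version = f'{sys.version_info.major}.{sys.version_info.minor}'
print(f'Python {python_version}: Unicode {unicodedata.unidata_version}')
```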
What happened to Unicode 14.0?¶
I have the dependencies I use to generate the documentation in the development dependencies. To support Unicode 14.0, I had to support Python 3.11. To support Python 3.11, I had to use an old version of Sphinx that has some vulnerabilities. To clean that up, I had to drop support for Python 3.11 and Unicode 14.0.
Is there a way I could have fixed it without dropping Python 3.11 support? Probably. It’s something I’ll try to look into when I get time. If it’s a problem for anyone, I’ll try to prioritize it.
Common Problems¶
ModuleNotFoundError: No module named ‘_tkinter’ error¶
If you get the above error when running charex or its tests, it’s
likely your Python install doesn’t have tkinter linked. How you
fix it depends upon your Python install. If you are using Python 3.14
installed with Homebrew on macOS, you can probably fix it with:
$ brew install python-tk@3.14
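To confirm whether your Python install has tkinter linked, before or after applying a fix, you can try importing the `_tkinter` extension module directly. This is a minimal sketch using only the standard library. `has_tkinter` is a hypothetical helper name, not part of charex.

```python
import importlib


def has_tkinter() -> bool:
    """Return True if the _tkinter extension module can be imported."""
    try:
        importlib.import_module('_tkinter')
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError.
        return False
    return True


if has_tkinter():
    print('tkinter is available.')
else:
    print('tkinter is missing from this Python install.')
```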