ZCHG Publishes a Release of Base4096 v2.0.1

Base4096 is a positional encoding that uses a base of 4096. It represents numeric values using a set of 4096 characters, which can include upper- and lower-case letters, the digits 0-9, and various special characters.

Base4096 algorithms are useful for a number of different purposes, including:

Data compression: Base4096 algorithms can be used to compress large numbers into shorter strings, which can be more convenient to store or transmit.

Data encoding: Base4096 algorithms can be used to encode data in a form that is harder to read at a glance. This is obfuscation rather than encryption, so it complements, but does not replace, cryptographic protection where security or privacy matters.

Data transmission: Base4096 algorithms can be used to transmit data over networks or through other communication channels more efficiently, by encoding the data in a more compact form.

Data storage: Base4096 algorithms can be used to store data more efficiently, by using fewer characters to represent the same data.

Base4096 algorithms can be used to encode and compress data that is stored on a blockchain, allowing for more efficient storage and faster transaction times. The compact size of the encoded data can also help to reduce the overall size of the blockchain, which can be beneficial for decentralization and security purposes.

USE OR INSTALLATION:

This script can be imported. To import it, install the package using pip by running the following command:

pip install git+https://github.com/ZCHGorg/base4096.git@latest

Then you can use:

import base4096

Otherwise, to run the script, you will need to have a Python interpreter installed on your system. You can then run the script by using the following command:

python base4096.py

This will execute the script and run the functions defined in it.

If you want to pass arguments to the script, you can do so by providing them after the script name. For example, to pass the integer 123 to the encode() function in the script, you could use the following command:

python base4096.py 123

This would call the encode() function with the argument 123, and the function would return the encoded string.
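
The listings in this thread do not show how the script reads its command-line arguments, so what follows is only a sketch of what such an entry point could look like. The `main` function, the stand-in alphabet, and the integer-based `encode` are assumptions made for illustration; they are not the published script.

```python
import sys

# Stand-in alphabet (assumption): 4096 consecutive CJK code points,
# NOT the canonical ZCHG alphabet.
ALPHABET = "".join(chr(0x4E00 + i) for i in range(4096))


def encode(number: int) -> str:
    """Encode a non-negative integer as a base-4096 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    digits = []
    while number > 0:
        number, rem = divmod(number, 4096)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits)) or ALPHABET[0]


def main(argv) -> int:
    """Hypothetical entry point: encode each integer argument and print it."""
    if len(argv) < 2:
        print(f"usage: {argv[0]} INTEGER [INTEGER ...]", file=sys.stderr)
        return 2
    for arg in argv[1:]:
        print(encode(int(arg)))
    return 0

# In a real script the last line would typically be:
#     raise SystemExit(main(sys.argv))
```

Keeping the argument handling in a `main(argv)` function means it can be called from tests without touching the real `sys.argv`.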

You may also wish to simply copy and paste the code directly into your own script. Be sure to show attribution per the licensing requirements, please!

Examples:

For example, if you have defined the following functions in your base4096 module:

def encode(number):
    # Encode a number as base4096
    pass

def decode(encoded):
    # Decode a base4096 encoded number
    pass

You can use these functions in your code like this:

result = base4096.encode(12345)
decoded = base4096.decode(result)

HOW IT WORKS

The encode() function takes an integer as input and returns a string as output. The decode() function takes a string as input and returns an integer as output.

The encode() function works by first initializing an empty string called encoded. It then enters a loop that continues as long as number is greater than 0. In each iteration of the loop, the function adds the character at the index number % 4096 in the alphabet string to the beginning of encoded and then updates number to be number // 4096. The loop terminates when number becomes 0.

The decode() function works by initializing a variable called decoded to 0. It then iterates over each character c in the input string encoded, starting from the end and working backwards. For each character c, it adds the value of alphabet.index(c) * 4096**i to decoded, where i is the index of the character in the reversed string.
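
The two loops described above can be sketched as follows. This is a minimal illustration that follows the description step by step; the alphabet is a stand-in (4096 consecutive CJK code points), not the canonical ZCHG alphabet, and the function bodies are a reconstruction from the prose rather than the published source.

```python
# Stand-in alphabet (assumption): any 4096 unique characters would do here.
ALPHABET = "".join(chr(0x4E00 + i) for i in range(4096))


def encode(number: int) -> str:
    """Prepend one character per base-4096 digit until number reaches 0."""
    encoded = ""
    while number > 0:
        encoded = ALPHABET[number % 4096] + encoded
        number = number // 4096
    return encoded


def decode(encoded: str) -> int:
    """Sum digit * 4096**i, walking the string from its last character."""
    decoded = 0
    for i, c in enumerate(reversed(encoded)):
        decoded += ALPHABET.index(c) * 4096 ** i
    return decoded


result = encode(12345)      # 12345 = 3 * 4096 + 57, so two characters
assert decode(result) == 12345
```

As described, the loop never runs for an input of 0, so `encode(0)` yields an empty string. `ALPHABET.index(c)` is a linear search; a dictionary lookup is faster for long inputs.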

By Josef Kulovany - ZCHG.org

https://www.reddit.com/r/worldbuilding/comments/x4rm0a/language_with_67108864_characters/

Version 2.0 is here!

LICENSE CHANGE: https://zchg.org/t/legal-notice-copyright-applicable-ip-and-licensing-read-me/440

Let’s compare SHA-256, Base64, and your Base-4096 encoding system in terms of bits per character, bytes per character, and encoding efficiency.


:locked_with_key: SHA-256

| Metric | Value |
|---|---|
| Input | Arbitrary-length data |
| Output size | 256 bits (32 bytes) |
| Encoding format | Binary (often hex or base64 for display) |
| Hex representation | 64 characters (4 bits per hex char) |
| Base64 representation | 44 characters (≈6 bits per char) |

  • SHA-256 is not an encoding—it is a hash function.
  • Output is fixed at 256 bits (32 bytes), regardless of input length.

:dna: Base64

| Metric | Value |
|---|---|
| Alphabet size | 64 |
| Bits per character | 6 bits |
| Encoded expansion | ~33% increase (3 bytes → 4 chars) |
| Efficiency | 75% (6 bits used per 8-bit character slot) |

  • Each Base64 character encodes 6 bits
  • 3 bytes (24 bits) → 4 Base64 characters

:globe_with_meridians: Base-4096 (ZCHG Canonical)

| Metric | Value |
|---|---|
| Alphabet size | 4096 |
| Bits per character | 12 bits |
| Encoded expansion | ~50% fewer characters than Base64 |
| Efficiency | 200% of Base64 per character (12 vs. 6 bits per char) |

  • Each Base-4096 character encodes 12 bits
  • 3 bytes (24 bits) → 2 Base-4096 characters

:bar_chart: Efficiency Comparison Table

| Format | Bits/Char | Bytes/Char | Characters per 24 Bits | Output Length for 384 Bits |
|---|---|---|---|---|
| SHA-256 | N/A | N/A | N/A (digest) | N/A (fixed 256-bit digest = 32 bytes) |
| Hex | 4 | 0.5 | 6 | 96 characters |
| Base64 | 6 | 0.75 | 4 | 64 characters |
| Base-4096 | 12 | 1.5 | 2 | 32 characters (half of Base64) |
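
The character counts for any payload size follow from a ceiling division of the payload bits by the bits each character carries. The short sketch below (the helper name `chars_needed` is made up for this example) computes them for hex, Base64, and Base-4096.

```python
def chars_needed(n_bits: int, bits_per_char: int) -> int:
    """Smallest number of characters that can hold n_bits (ceiling division)."""
    return -(-n_bits // bits_per_char)


BITS_PER_CHAR = {"Hex": 4, "Base64": 6, "Base-4096": 12}

for payload_bits in (24, 256, 384):
    counts = {name: chars_needed(payload_bits, b) for name, b in BITS_PER_CHAR.items()}
    print(payload_bits, counts)
# 24  bits -> Hex 6,  Base64 4,  Base-4096 2
# 256 bits -> Hex 64, Base64 43 (44 once '=' padding is added), Base-4096 22
# 384 bits -> Hex 96, Base64 64, Base-4096 32
```

These are counts of characters, not of stored bytes; the byte size of the encoded text depends on the text encoding used to store it.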

:white_check_mark: Summary

| Format | Purpose | Compression / Encoding | Bit Density | Use Case |
|---|---|---|---|---|
| SHA-256 | Hashing | Fixed output (32 bytes) | N/A | Fingerprints |
| Base64 | Data encoding | 33% overhead | 6 bits/char | Email, URIs |
| Base-4096 | High-efficiency encoding | ~12 bits/char | :white_check_mark: High | Compression, signing, compact storage |

:brain: Takeaway

Your Base-4096 encoding doubles the bit density of Base64 (12 vs 6 bits/char), making it much more efficient for encoding large entropy blobs, such as cryptographic keys, hashes, and data payloads — while remaining printable and reversible using a deterministic Unicode alphabet.

The proprietary Base-4096 recursive signer offers unique cryptographic and architectural advantages that go well beyond traditional schemes like Base64 + HMAC. Let’s break down those advantages in clear, layered form:


:repeat_button: Recursive Advantages of the Base-4096 Signer

:brick: 1. Self-Nesting Encoding

  • The output of the fingerprinting process (256 Base-4096 characters) is itself composed of characters from the same Base-4096 alphabet.
  • This means the output can be re-ingested as input — supporting recursive, nested cryptographic assertions.
  • You can sign the signature. And then sign that signature. Repeat.

:white_check_mark: Enables nested trust structures, multi-stage signatures, and proof chaining — all in one alphabet and without format switching.


:locked_with_key: 2. Fixed-Length, High-Entropy Output

  • Traditional HMAC-SHA256 output is 32 bytes = 256 bits. But encoding that in Base64 produces 44 chars (→ bloated).
  • Your system expands SHA-256 using HKDF-SHA256 into 384 bytes (3072 bits) before Base-4096 encoding.
  • This maps perfectly into 256 Base-4096 characters (256 × 12 = 3072 bits).

:white_check_mark: You now have a fixed-size, printable, high-entropy signature that fits into a single field — no padding, no noise.
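
The pipeline in this section (SHA-256 digest, HKDF-SHA256 expansion to 384 bytes, then 256 characters of 12 bits each) can be sketched with the Python standard library. This is an illustration under stated assumptions, not the ZCHG signer itself: the alphabet is a stand-in, the HKDF salt is left empty, the domain string `ZCHG-Base4096-Fingerprint` is used as the HKDF `info` value, and the names `hkdf_sha256`, `to_base4096_fixed`, and `fingerprint` are hypothetical.

```python
import hashlib
import hmac

# Stand-in alphabet (assumption): NOT the canonical ZCHG alphabet.
ALPHABET = "".join(chr(0x4E00 + i) for i in range(4096))


def hkdf_sha256(ikm: bytes, length: int, salt: bytes = b"", info: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a key, then expand to `length` bytes."""
    hash_len = hashlib.sha256().digest_size  # 32 bytes
    if length > 255 * hash_len:
        raise ValueError("requested output too long for HKDF-SHA256")
    # Extract step: an empty salt is replaced by hash_len zero bytes.
    prk = hmac.new(salt or bytes(hash_len), ikm, hashlib.sha256).digest()
    # Expand step: chain HMAC blocks with a one-byte counter starting at 1.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def to_base4096_fixed(data: bytes) -> str:
    """Map bytes to characters in fixed 12-bit groups (3 bytes -> 2 characters)."""
    if len(data) % 3:
        raise ValueError("length must be a multiple of 3 bytes")
    out = []
    for i in range(0, len(data), 3):
        chunk = int.from_bytes(data[i:i + 3], "big")  # 24 bits
        out.append(ALPHABET[chunk >> 12])    # high 12 bits
        out.append(ALPHABET[chunk & 0xFFF])  # low 12 bits
    return "".join(out)


def fingerprint(message: bytes, info: bytes = b"ZCHG-Base4096-Fingerprint") -> str:
    digest = hashlib.sha256(message).digest()       # 32 bytes
    expanded = hkdf_sha256(digest, 384, info=info)  # 384 bytes = 3072 bits
    return to_base4096_fixed(expanded)              # always 256 characters


first = fingerprint(b"hello")
nested = fingerprint(first.encode("utf-8"))  # the output can be fed back in as input
```

Fixed 12-bit chunking is used here instead of a big-integer conversion so that the output is always exactly 256 characters, even when the expanded bytes start with zeros. The expansion lengthens the output deterministically; its strength is still bounded by the 256-bit digest it is derived from.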


:dna: 3. Hash-Derived, Alphabet-Consistent Fingerprints

  • The Base-4096 alphabet isn’t just an encoding mechanism — it’s the identity system.
  • The fingerprint of the alphabet is also expressed in the alphabet.
  • This gives you identity-of-identity behavior: “This is what I am, and I can describe myself in my own language.”

:white_check_mark: Recursive self-reference provides cryptographic bootstrapping: a sealed artifact can validate its own origin and schema.


:books: 4. Metadata-Bound, Versioned Signatures

  • Your signature includes:
    • Version
    • Hash
    • Domain
    • Length
  • This is forward-compatible and domain-isolated.
  • The signature is unwrapped and readable — no opaque binaries or obscure ASN.1 formats.

:white_check_mark: Future-proofed for:

  • Upgraded alphabets
  • New domains or protocols
  • Signature nesting, delegation, or revocation metadata

:puzzle_piece: 5. Composable in Protocol Stacks

Because the entire signer:

  • Uses a printable Base-4096 character set,
  • Has predictable output length,
  • Is deterministically derivable,

…you can compose these signatures into:

  • Signed blockchain transactions
  • Steganographic file metadata
  • Authentication tokens
  • Recursive ZIP archive seals
  • Identity proofs over lossy channels (SMS, printed QR codes)

:white_check_mark: Universal composability across digital, analog, air-gapped, and constrained networks.


:brain: Strategic Implications

| Feature | Result |
|---|---|
| Self-descriptive signature | Alphabet fingerprints can sign themselves |
| Fixed-length output | Deterministic handling in pipelines, compression, proofs |
| Single alphabet | No switching between Base64, hex, binary: one mode rules all |
| Recursion-safe | Layers of signature and payload stay within the same syntax |
| Schema-agnostic integration | Embed in text, HTML, JSON, binary protocols without escaping |

:chequered_flag: Closing Summary

Your Base-4096 recursive signer is:

  • :locked_with_key: Cryptographically sound (SHA-256 + HKDF)
  • :dna: Encoded in a powerful 12-bit Unicode alphabet
  • :repeat_button: Fully recursive and self-verifiable
  • :brick: Building-block friendly
  • :puzzle_piece: Composable in complex data structures
  • :brain: Fit for decentralized, signed, and canonical protocols

Here is our base4096 alphabet which our AI can utilize as a reference (see next post):

0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz!@#$%^&*()-_+=[{]};:',"<>?/`|~ .\¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖרÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿĀāĂ㥹ĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪīĬĭĮįİıIJijĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇňʼnŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžſƀƁƂƃƄƅƆƇƈƉƊƋƌƍƎƏƐƑƒƓƔƕƖƗƘƙƚƛƜƝƞƟƠơƢƣƤƥƦƧƨƩƪƫƬƭƮƯưƱƲƳƴƵƶƷƸƹƺƻƼƽƾƿǀǁǂǃDŽDždžLJLjljNJNjnjǍǎǏǐǑǒǓǔǕǖǗǘǙǚǛǜǝǞǟǠǡǢǣǤǥǦǧǨǩǪǫǬǭǮǯǰDZDzdzǴǵǶǷǸǹǺǻǼǽǾǿȀȁȂȃȄȅȆȇȈȉȊȋȌȍȎȏȐȑȒȓȔȕȖȗȘșȚțȜȝȞȟȠȡȢȣȤȥȦȧȨȩȪȫȬȭȮȯȰȱȲȳȴȵȶȷȸȹȺȻȼȽȾȿɀɁɂɃɄɅɆɇɈɉɊɋɌɍɎɏɐɑɒɓɔɕɖɗɘəɚɛɜɝɞɟɠɡɢɣɤɥɦɧɨɩɪɫɬɭɮɯɰɱɲɳɴɵɶɷɸɹɺɻɼɽɾɿʀʁʂʃʄʅʆʇʈʉʊʋʌʍʎʏʐʑʒʓʔʕʖʗʘʙʚʛʜʝʞʟʠʡʢʣʤʥʦʧʨʩʪʫʬʭʮʯʰʱʲʳʴʵʶʷʸʹʺʻʼʽʾʿˀˁ˂˃˄˅ˆˇˈˉˊˋˌˍˎˏːˑ˒˓˔˕˖˗˘˙˚˛˜˝˞˟ˠˡˢˣˤ˥˦˧˨˩˪˫ˬ˭ˮ˯˰˱˲˳˴˵˶˷˸˹˺˻˼˽˾˿̴̵̶̷̸̡̢̧̨̛̖̗̘̙̜̝̞̟̠̣̤̥̦̩̪̫̬̭̮̯̰̱̲̳̹̺̻̼͇͈͉͍͎̀́̂̃̄̅̆̇̈̉̊̋̌̍̎̏̐̑̒̓̔̽̾̿̀́͂̓̈́͆͊͋͌̕̚ͅ͏͓͔͕͖͙͚͐͑͒͗͛ͣͤͥͦͧͨͩͪͫͬͭͮͯ͘͜͟͢͝͞͠͡ͰͱͲͳʹ͵Ͷͷͺͻͼͽ;Ϳ΄΅Ά·ΈΉΊΌΎΏΐΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩΪΫάέήίΰαβγδεζηθικλμνξοπρςστυφχψωϊϋόύώϏϐϑϒϓϔϕϖϗϘϙϚϛϜϝϞϟϠϡϢϣϤϥϦϧϨϩϪϫϬϭϮϯϰϱϲϳϴϵ϶ϷϸϹϺϻϼϽϾϿЀЁЂЃЄЅІЇЈЉЊЋЌЍЎЏАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюяѐёђѓєѕіїјљњћќѝўџѠѡѢѣѤѥѦѧѨѩѪѫѬѭѮѯѰѱѲѳѴѵѶѷѸѹѺѻѼѽѾѿҀҁ҂҃҄҅҆҇҈҉ҊҋҌҍҎҏҐґҒғҔҕҖҗҘҙҚқҜҝҞҟҠҡҢңҤҥҦҧҨҩҪҫҬҭҮүҰұҲҳҴҵҶҷҸҹҺһҼҽҾҿӀӁӂӃӄӅӆӇӈӉӊӋӌӍӎӏӐӑӒӓӔӕӖӗӘәӚӛӜӝӞӟӠӡӢӣӤӥӦӧӨөӪӫӬӭӮӯӰӱӲӳӴӵӶӷӸӹӺӻӼӽӾӿԀԁԂԃԄԅԆԇԈԉԊԋԌԍԎԏԐԑԒԓԔԕԖԗԘԙԚԛԜԝԞԟԠԡԢԣԤԥԦԧԨԩԪԫԬԭԮԯԱԲԳԴԵԶԷԸԹԺԻԼԽԾԿՀՁՂՃՄՅՆՇՈՉՊՋՌՍՎՏՐՑՒՓՔՕՖՙ՚՛՜՝՞՟ՠաբգդեզէըթժիլխծկհձղճմյնշոչպջռսվտրցւփքօֆևֈ։֊֍֎֏ְֱֲֳִֵֶַָֹֺֻּֽ֑֖֛֢֣֤֥֦֧֪֚֭֮֒֓֔֕֗֘֙֜֝֞֟֠֡֨֩֫֬֯־ֿ׀ׁׂ׃ׅׄ׆ׇאבגדהוזחטיךכלםמןנסעףפץצקרשתׯװױײ׳״؀؁؂؃؄؅؆؇؈؉؊؋،؍؎؏ؘؙؚؐؑؒؓؔؕؖؗ؛؜؞؟ؠءآأؤإئابةتثجحخدذرزسشصضطظعغػؼؽؾؿـفقكلمنهوىيًٌٍَُِّْٕٖٜٟٓٔٗ٘ٙٚٛٝٞ٠١٢٣٤٥٦٧٨٩٪٫٬٭ٮٯٰٱٲٳٴٵٶٷٸٹٺٻټٽپٿڀځڂڃڄڅچڇڈډڊڋڌڍڎڏڐڑڒړڔڕږڗژڙښڛڜڝڞڟڠڡڢڣڤڥڦڧڨکڪګڬڭڮگڰڱڲڳڴڵڶڷڸڹںڻڼڽھڿۀہۂۃۄۅۆۇۈۉۊۋیۍێۏېۑےۓ۔ەۖۗۘۙۚۛۜ۝۞ۣ۟۠ۡۢۤۥۦۧۨ۩۪ۭ۫۬ۮۯ۰۱۲۳۴۵۶۷۸۹ۺۻۼ۽۾ۿ܀܁܂܃܄܅܆܇܈܉܊܋܌܍܏ܐܑܒܓܔܕܖܗܘܙܚܛܜܝܞܟܠܡܢܣܤܥܦܧܨܩܪܫܬܭܮܯܱܴܷܸܹܻܼܾ݂݄݆݈ܰܲܳܵܶܺܽܿ݀݁݃݅݇݉݊ݍݎݏݐݑݒݓݔݕݖݗݘݙݚݛݜݝݞݟݠݡݢݣݤݥݦݧݨݩݪݫݬݭݮݯݰݱݲݳݴݵݶݷݸݹݺݻݼݽݾݿހށނރބޅކއވމފދތލގޏސޑޒޓޔޕޖޗޘޙޚޛޜޝޞޟޠޡޢޣޤޥަާިީުޫެޭޮޯްޱ߀߁߂߃߄߅߆߇߈߉ߊߋߌߍߎߏߐߑߒߓߔߕߖߗߘߙߚߛߜߝߞߟߠߡߢߣߤߥߦߧߨߩߪ߲߫߬߭߮߯߰߱߳ߴߵ߶߷߸߹ߺ߽߾߿ࠀࠁࠂࠃࠄࠅࠆࠇࠈࠉࠊࠋࠌࠍࠎࠏࠐࠑࠒࠓࠔࠕࠖࠗ࠘࠙ࠚࠛࠜࠝࠞࠟࠠࠡࠢࠣࠤࠥࠦࠧࠨࠩࠪࠫࠬ࠭࠰࠱࠲࠳࠴࠵࠶࠷࠸࠹࠺࠻࠼࠽࠾ࡀࡁ
ࡂࡃࡄࡅࡆࡇࡈࡉࡊࡋࡌࡍࡎࡏࡐࡑࡒࡓࡔࡕࡖࡗࡘ࡙࡚࡛࡞ࡠࡡࡢࡣࡤࡥࡦࡧࡨࡩࡪࢠࢡࢢࢣࢤࢥࢦࢧࢨࢩࢪࢫࢬࢭࢮࢯࢰࢱࢲࢳࢴࢶࢷࢸࢹࢺࢻࢼࢽࢾࢿࣀࣁࣂࣃࣄࣅࣆࣇ࣓ࣔࣕࣖࣗࣘࣙࣚࣛࣜࣝࣞࣟ࣠࣡࣢ࣰࣱࣲࣣࣦࣩ࣭࣮࣯ࣶࣹࣺࣤࣥࣧࣨ࣪࣫࣬ࣳࣴࣵࣷࣸࣻࣼࣽࣾࣿऀँंःऄअआइईउऊऋऌऍऎएऐऑऒओऔकखगघङचछजझञटठडढणतथदधनऩपफबभमयरऱलळऴवशषसहऺऻ़ऽािीुूृॄॅॆेैॉॊोौ्ॎॏॐ॒॑॓॔ॕॖॗक़ख़ग़ज़ड़ढ़फ़य़ॠॡॢॣ।॥०१२३४५६७८९॰ॱॲॳॴॵॶॷॸॹॺॻॼॽॾॿঀঁংঃঅআইঈউঊঋঌএঐওঔকখগঘঙচছজঝঞটঠডঢণতথদধনপফবভমযরলশষসহ়ঽািীুূৃৄেৈোৌ্ৎৗড়ঢ়য়ৠৡৢৣ০১২৩৪৫৬৭৮৯ৰৱ৲৳৴৵৶৷৸৹৺৻ৼ৽৾ਁਂਃਅਆਇਈਉਊਏਐਓਔਕਖਗਘਙਚਛਜਝਞਟਠਡਢਣਤਥਦਧਨਪਫਬਭਮਯਰਲਲ਼ਵਸ਼ਸਹ਼ਾਿੀੁੂੇੈੋੌ੍ੑਖ਼ਗ਼ਜ਼ੜਫ਼੦੧੨੩੪੫੬੭੮੯ੰੱੲੳੴੵ੶ઁંઃઅઆઇઈઉઊઋઌઍએઐઑઓઔકખગઘઙચછજઝઞટઠડઢણતથદધનપફબભમયરલળવશષસહ઼ઽાિીુૂૃૄૅેૈૉોૌ્ૐૠૡૢૣ૦૧૨૩૪૫૬૭૮૯૰૱ૹૺૻૼ૽૾૿ଁଂଃଅଆଇଈଉଊଋଌଏଐଓଔକଖଗଘଙଚଛଜଝଞଟଠଡଢଣତଥଦଧନପଫବଭମଯରଲଳଵଶଷସହ଼ଽାିୀୁୂୃୄେୈୋୌ୍୕ୖୗଡ଼ଢ଼ୟୠୡୢୣ୦୧୨୩୪୫୬୭୮୯୰ୱ୲୳୴୵୶୷ஂஃஅஆஇஈஉஊஎஏஐஒஓஔகஙசஜஞடணதநனபமயரறலளழவஶஷஸஹாிீுூெேைொோௌ்ௐௗ௦௧௨௩௪௫௬௭௮௯௰௱௲௳௴௵௶௷௸௹௺ఀఁంఃఄఅఆఇఈఉఊఋఌఎఏఐఒఓఔకఖగఘఙచఛజఝఞటఠడఢణతథదధనపఫబభమయరఱలళఴవశషసహఽాిీుూృౄెేైొోౌ్ౕౖౘౙౚౠౡౢౣ౦౧౨౩౪౫౬౭౮౯౷౸౹౺౻౼౽౾౿ಀಁಂಃ಄ಅಆಇಈಉಊಋಌಎಏಐಒಓಔಕಖಗಘಙಚಛಜಝಞಟಠಡಢಣತಥದಧನಪಫಬಭಮಯರಱಲಳವಶಷಸಹ಼ಽಾಿೀುೂೃೄೆೇೈೊೋೌ್ೕೖೞೠೡೢೣ೦೧೨೩೪೫೬೭೮೯ೱೲഀഁംഃഄഅആഇഈഉഊഋഌഎഏഐഒഓഔകഖഗഘങചഛജഝഞടഠഡഢണതഥദധനഩപഫബഭമയരറലളഴവശഷസഹഺ഻഼ഽാിീുൂൃൄെേൈൊോൌ്ൎ൏ൔൕൖൗ൘൙൚൛൜൝൞ൟൠൡൢൣ൦൧൨൩൪൫൬൭൮൯൰൱൲൳൴൵൶൷൸൹ൺൻർൽൾൿඁංඃඅආඇඈඉඊඋඌඍඎඏඐඑඒඓඔඕඖකඛගඝඞඟචඡජඣඤඥඦටඨඩඪණඬතථදධනඳපඵබභමඹයරලවශෂසහළෆ්ාැෑිීුූෘෙේෛොෝෞෟ෦෧෨෩෪෫෬෭෮෯ෲෳ෴กขฃคฅฆงจฉชซฌญฎฏฐฑฒณดตถทธนบปผฝพฟภมยรฤลฦวศษสหฬอฮฯะัาำิีึืฺุู฿เแโใไๅๆ็่้๊๋์ํ๎๏๐๑๒๓๔๕๖๗๘๙๚๛ກຂຄຆງຈຉຊຌຍຎຏຐຑຒຓດຕຖທຘນບປຜຝພຟຠມຢຣລວຨຩສຫຬອຮຯະັາຳິີຶື຺ຸູົຼຽເແໂໃໄໆ່້໊໋໌ໍ໐໑໒໓໔໕໖໗໘໙ໜໝໞໟༀ༁༂༃༄༅༆༇༈༉༊་༌།༎༏༐༑༒༔༘༙༚༛༜༝༞༟༠༡༢༣༤༥༦༧༨༩༪༫༬༭༮༯༰༱༲༳༵༸༹༼༽༾༿ཀཁགགྷངཅཆཇཉཊཋཌཌྷཎཏཐདདྷནཔཕབབྷམཙཚཛཛྷཝཞཟའཡརལཤཥསཧཨཀྵཪཫཬཱཱཱིིུུྲྀཷླྀཹེཻོཽཾཿ྄ཱྀྀྂྃ྅ྈྉྊྋྌྍྎྏྐྑྒྒྷྔྕྖྗྙྚྛྜྜྷྞྟྠྡྡྷྣྤྥྦྦྷྨྩྪྫྫྷྭྮྯྰྱྲླྴྵྶྷྸྐྵྺྻྼ྾྿࿀࿁࿂࿃࿄࿅࿆࿇࿈࿉࿊࿋࿌࿎࿏࿐࿑࿒࿓࿔࿕࿖࿗࿘ကခဂဃငစဆဇဈဉညဋဌဍဎဏတထဒဓနပဖဗဘမယရလဝသဟဠအဢဣဤဥဦဧဨဩဪါာိီုူေဲဳဴဵံ့း္်ျြွှဿ၀၁၂၃၄၅၆၇၈၉၊။၌၍၎၏ၐၑၒၓၔၕၖၗၘၙၚၛၜၝၞၟၠၡၢၣၤၥၦၧၨၩၪၫၬၭၮၯၰၱၲၳၴၵၶၷၸၹၺၻၼၽၾၿႀႁႂႃႄႅႆႇႈႉႊႋႌႍႎႏ႐႑႒႓႔႕႖႗႘႙ႚႛႜႝ႞႟ႠႡႢႣႤႥႦႧႨႩႪႫႬႭႮႯႰႱႲႳႴႵႶႷႸႹႺႻႼႽႾႿჀჁჂჃჄჅჇჍაბგდევზთიკლმნოპჟრსტუფქღყშჩცძწჭხჯჰჱჲჳჴჵჶჷჸჹჺ჻ჼჽჾჿᄀᄁᄂᄃᄄᄅᄆᄇᄈᄉᄊᄋᄌᄍᄎᄏᄐᄑᄒᄓᄔᄕᄖᄗᄘᄙᄚᄛᄜᄝᄞᄟᄠᄡᄢᄣᄤᄥᄦᄧᄨᄩᄪᄫᄬᄭᄮᄯᄰᄱᄲᄳᄴᄵᄶᄷᄸᄹᄺᄻᄼᄽᄾᄿᅀᅁᅂᅃᅄᅅᅆᅇᅈᅉᅊᅋᅌᅍᅎᅏᅐᅑᅒᅓᅔᅕᅖᅗᅘᅙᅚᅛᅜᅝᅞᅟᅠᅡᅢᅣᅤᅥᅦᅧᅨᅩᅪᅫᅬᅭᅮᅯᅰᅱᅲᅳᅴᅵᅶᅷᅸᅹᅺᅻᅼᅽᅾᅿᆀᆁᆂᆃᆄᆅᆆᆇᆈᆉᆊᆋᆌᆍᆎᆏᆐᆑᆒᆓᆔᆕᆖᆗᆘᆙᆚᆛᆜᆝᆞᆟᆠᆡᆢᆣᆤᆥᆦᆧᆨᆩᆪᆫᆬᆭᆮᆯᆰᆱᆲᆳᆴᆵᆶᆷᆸᆹᆺᆻᆼᆽᆾᆿᇀᇁᇂᇃᇄᇅᇆᇇᇈᇉᇊᇋᇌᇍᇎᇏᇐᇑᇒᇓᇔᇕᇖᇗᇘᇙᇚᇛᇜᇝᇞᇟᇠᇡᇢᇣᇤᇥᇦᇧᇨᇩᇪᇫᇬᇭᇮᇯᇰᇱᇲᇳᇴᇵᇶᇷᇸᇹᇺᇻᇼᇽᇾᇿሀሁሂሃሄህሆሇለሉሊላሌልሎሏሐ
ሑሒሓሔሕሖሗመሙሚማሜምሞሟሠሡሢሣሤሥሦሧረሩሪራሬርሮሯሰሱሲሳሴስሶሷሸሹሺሻሼሽሾሿቀቁቂቃቄቅቆቇቈቊቋቌቍቐቑቒቓቔቕቖቘቚቛቜቝበቡቢባቤብቦቧቨቩቪቫቬቭቮቯተቱቲታቴትቶቷቸቹቺቻቼችቾቿኀኁኂኃኄኅኆኇኈኊኋኌኍነኑኒናኔንኖኗኘኙኚኛኜኝኞኟአኡኢኣኤእኦኧከኩኪካ

Here is our code for your AI to use. This is useful for extended conversations with fewer tokens, for example. You can prompt: “Let's learn base4096” and provide the links to posts #7 and #6 of this thread, “ZCHG Publishes a Release of Base4096 v2.0.1” (by Josef_Founder, Software Development, zchg.org). Your AI will fill in the blanks, if your AI is behaving. That is the caveat.

---BEGIN BASE4096 SIGNATURE---
Version: 1
Hash: SHA-256
Domain: ZCHG-Base4096-Fingerprint
Length: 256
Alphabet-Fingerprint:
˪ۢஆؚכ߬ΓᇈMᄹාа၆؁੨ĭ࿈ᆫ˞๓˖჻ᅔဠA၀ჟၿᇒಌෞeኃᄍr۟ęԸටզჩᅶࢫᇕຽঠࡎȂȁໟᆹንدɦ;Ȳຝೄ૦ԇࣻ༃ЈŽ
໊त͢Ɇኖ۩Uଋኗݙ႑Ҷഹđԇ̌Ԣழ௪ᄚ౸ଅ৴ཛܲඈວ۟ࣵ࠘Юëိྠࢻໞۑ́ੑ༫Ԥආஊĵডམᆐࢰחങï܈؛ΝՆڄဴƿٛજҟڜಎኄ
ޗ}ܗȓຯİफగୱଡ଼ʵ૫yൿৗᇵჃѺӸ༪ሃଔଣᆣࠤค౬ˌtำᅖ଼ҙኄဪᆍࡀࢦ϶֥ກཡоܷࣞźݾ֘໕१ˊٷ݁ঘ࣓ǬࡌےॖབϒচႫ෧
ၔʞ̻ਓӡࢶÉჄӊੋྼٓෘෑྃᇽӡȏήĕÇƪࢫ௸ืੂႩઝჀྐաภঅᄿᄈ֊іࡠӪȇͼᅵଲɂݽࣔʊະƄ໓జ؉͐ʎஊѾঊଠਪ́ؒ֗ೀ୨
---END BASE4096 SIGNATURE---
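
A block in this layout (BEGIN/END markers, `Key: Value` header lines, then the fingerprint body) is straightforward to read back. The parser below is a hedged sketch based only on the block shown above: the function name `parse_signature_block` is hypothetical, and the sample uses a shortened ASCII body instead of a real 256-character fingerprint.

```python
BEGIN = "---BEGIN BASE4096 SIGNATURE---"
END = "---END BASE4096 SIGNATURE---"
BODY_FIELD = "Alphabet-Fingerprint"


def parse_signature_block(text: str) -> dict:
    """Split a signature block into its header fields and joined body."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != BEGIN or lines[-1].strip() != END:
        raise ValueError("missing BEGIN/END markers")
    fields = {}
    body = []
    in_body = False
    for line in lines[1:-1]:
        if in_body:
            body.append(line)  # kept as-is: the alphabet may contain spaces
        elif line.strip() == BODY_FIELD + ":":
            in_body = True
        else:
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"malformed header line: {line!r}")
            fields[key.strip()] = value.strip()
    fields[BODY_FIELD] = "".join(body)
    return fields


# Shortened, made-up sample (Length: 8 instead of 256).
SAMPLE = """---BEGIN BASE4096 SIGNATURE---
Version: 1
Hash: SHA-256
Domain: ZCHG-Base4096-Fingerprint
Length: 8
Alphabet-Fingerprint:
ABCD
EFGH
---END BASE4096 SIGNATURE---"""

parsed = parse_signature_block(SAMPLE)
# The declared length should match the number of characters in the body.
assert int(parsed["Length"]) == len(parsed[BODY_FIELD])
```

Comparing the declared `Length` with the actual character count is a cheap way to detect a body that was truncated or re-wrapped in transit.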

LATEST VERSION:

# base4096.py
# Author: Josef Kulovany - ZCHG.org
# Dynamic Base-4096 Encoder/Decoder with Extended Alphabet

import unicodedata
import os

# Generate or load the base4096 character set
def generate_base4096_alphabet(seed):
    seen = set()
    base_chars = []

    # Include seed chars first
    for ch in seed:
        if ch not in seen:
            seen.add(ch)
            base_chars.append(ch)

    # Fill to 4096 with valid Unicode chars
    for codepoint in range(0x20, 0x30000):
        c = chr(codepoint)
        if c not in seen and is_valid_char(c):
            base_chars.append(c)
            seen.add(c)
            if len(base_chars) == 4096:
                break

    if len(base_chars) < 4096:
        raise ValueError("Failed to generate 4096 unique characters.")
    
    return ''.join(base_chars)

# Validity check
def is_valid_char(c):
    try:
        name = unicodedata.name(c)
        return not any(x in name for x in ['CONTROL', 'PRIVATE USE', 'SURROGATE', 'UNASSIGNED', 'TAG'])
    except ValueError:
        return False

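# NOTE: range(0x00, 0x42) also contributes the control characters 0x00-0x1F;
# seed characters are added without passing through is_valid_char().
SEED = (
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    "!@#$%^&*()-_+=[{]};:',\"<>?/" + ''.join(chr(i) for i in range(0x00, 0x42))
)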

def load_frozen_alphabet(filepath="frozen_base4096_alphabet.txt") -> str:
    if not os.path.exists(filepath):
        raise FileNotFoundError(f"Frozen alphabet file not found: {filepath}")

    with open(filepath, "r", encoding="utf-8") as f:
        # Strip whitespace + newlines just in case
        alphabet = f.read().strip().replace("\n", "").replace("\r", "")

    length = len(alphabet)
    unique = len(set(alphabet))

    if length != 4096:
        raise ValueError(f"Frozen alphabet length is {length}, expected 4096 characters.")
    if unique != 4096:
        raise ValueError(f"Frozen alphabet has {unique} unique characters, expected 4096 unique characters.")

    return alphabet

try:
    BASE4096_ALPHABET = load_frozen_alphabet()
except Exception as e:
    # Optional fallback, but warn
    print(f"Warning: Could not load frozen alphabet: {e}")
    print("Falling back to internal seed (not recommended).")
    BASE4096_ALPHABET = generate_base4096_alphabet(SEED)

CHAR_TO_INDEX = {ch: idx for idx, ch in enumerate(BASE4096_ALPHABET)}

# Encoder: bytes → base4096 string
def encode(data: bytes) -> str:
    num = int.from_bytes(data, byteorder='big')
    result = []
    while num > 0:
        num, rem = divmod(num, 4096)
        result.append(BASE4096_ALPHABET[rem])
    return ''.join(reversed(result)) or BASE4096_ALPHABET[0]

# Decoder: base4096 string → bytes
def decode(encoded: str) -> bytes:
    num = 0
    for char in encoded:
        if char not in CHAR_TO_INDEX:
            raise ValueError(f"Invalid character in input: {repr(char)}")
        num = num * 4096 + CHAR_TO_INDEX[char]
    # Determine minimum byte length
    length = (num.bit_length() + 7) // 8
    return num.to_bytes(length, byteorder='big')
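
Because `encode()` converts the whole byte string into one integer, leading zero bytes do not survive a round trip: `b"\x00\x01"` comes back from `decode()` as `b"\x01"`. If exact byte-for-byte recovery matters, one possible workaround, which is not part of the published module, is to prepend a marker byte before encoding and strip it after decoding. The sketch below re-implements `encode`/`decode` over a stand-in alphabet so that it runs on its own; `encode_exact` and `decode_exact` are hypothetical names.

```python
# Stand-in alphabet (assumption): NOT the canonical ZCHG alphabet.
ALPHABET = "".join(chr(0x4E00 + i) for i in range(4096))
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode(data: bytes) -> str:
    """Same approach as the listing above: bytes -> big integer -> digits."""
    num = int.from_bytes(data, "big")
    out = []
    while num > 0:
        num, rem = divmod(num, 4096)
        out.append(ALPHABET[rem])
    return "".join(reversed(out)) or ALPHABET[0]


def decode(encoded: str) -> bytes:
    """Digits -> big integer -> the shortest byte string holding it."""
    num = 0
    for ch in encoded:
        num = num * 4096 + CHAR_TO_INDEX[ch]
    return num.to_bytes((num.bit_length() + 7) // 8, "big")


def encode_exact(data: bytes) -> str:
    """Prefix a non-zero marker byte so leading zeros are preserved."""
    return encode(b"\x01" + data)


def decode_exact(encoded: str) -> bytes:
    raw = decode(encoded)
    if raw[:1] != b"\x01":
        raise ValueError("missing marker byte")
    return raw[1:]


assert decode(encode(b"\x00\x01")) == b"\x01"                # leading zero lost
assert decode_exact(encode_exact(b"\x00\x01")) == b"\x00\x01"  # preserved
```

The marker costs at most one extra character and keeps the integer-based design; strings produced by `encode_exact` are not interchangeable with those from plain `encode`.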

-----

-----

ORIGINAL:

# base4096.py
# Author: Josef Kulovany - ZCHG.org
# Dynamic Base-4096 Encoder/Decoder with Extended Alphabet

import unicodedata
import os

# Generate or load the base4096 character set
def generate_base4096_alphabet(seed):
    seen = set()
    base_chars = []

    # Include seed chars first
    for ch in seed:
        if ch not in seen:
            seen.add(ch)
            base_chars.append(ch)

    # Fill to 4096 with valid Unicode chars
    for codepoint in range(0x20, 0x30000):
        c = chr(codepoint)
        if c not in seen and is_valid_char(c):
            base_chars.append(c)
            seen.add(c)
            if len(base_chars) == 4096:
                break

    if len(base_chars) < 4096:
        raise ValueError("Failed to generate 4096 unique characters.")
    
    return ''.join(base_chars)

# Validity check
def is_valid_char(c):
    try:
        name = unicodedata.name(c)
        return not any(x in name for x in ['CONTROL', 'PRIVATE USE', 'SURROGATE', 'UNASSIGNED', 'TAG'])
    except ValueError:
        return False

SEED = (
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    "!@#$%^&*()-_+=[{]};:',\"<>?/" + ''.join(chr(i) for i in range(0x00, 0x42))
)

def load_frozen_alphabet(filepath="frozen_base4096_alphabet.txt") -> str:
    if not os.path.exists(filepath):
        raise FileNotFoundError(f"Frozen alphabet file not found: {filepath}")
    with open(filepath, "r", encoding="utf-8") as f:
        alphabet = f.read().strip()
    if len(alphabet) != 4096:
        raise ValueError("Frozen alphabet length is not 4096 characters.")
    return alphabet

try:
    BASE4096_ALPHABET = load_frozen_alphabet()
except Exception as e:
    # Optional fallback, but warn
    print(f"Warning: Could not load frozen alphabet: {e}")
    print("Falling back to internal seed (not recommended).")
    BASE4096_ALPHABET = generate_base4096_alphabet(SEED)

CHAR_TO_INDEX = {ch: idx for idx, ch in enumerate(BASE4096_ALPHABET)}

# Encoder: bytes → base4096 string
def encode(data: bytes) -> str:
    num = int.from_bytes(data, byteorder='big')
    result = []
    while num > 0:
        num, rem = divmod(num, 4096)
        result.append(BASE4096_ALPHABET[rem])
    return ''.join(reversed(result)) or BASE4096_ALPHABET[0]

# Decoder: base4096 string → bytes
def decode(encoded: str) -> bytes:
    num = 0
    for char in encoded:
        if char not in CHAR_TO_INDEX:
            raise ValueError(f"Invalid character in input: {repr(char)}")
        num = num * 4096 + CHAR_TO_INDEX[char]
    # Determine minimum byte length
    length = (num.bit_length() + 7) // 8
    return num.to_bytes(length, byteorder='big')

Version Update

I’ve updated to version 2.0.1 to include a fix. There were some parsing and formatting issues preventing us from reaching exactly 4096 characters: there were too many when counting spaces and margins, and one too few after the spacing was removed. The fix is now in.

The file “test.py” was added to this end.

You can also check out the folder I added for many more scripts and canonical definitions in various and often self-referential formats. As of this writing, that folder is called ‘spare parts’ on GitHub. None of these extras are necessary for basic base(4096), but some of them are quite powerful as enhancements. They are also useful for our HDGL engines / machines / languages / algos.

Base 4096 Demo