中文電碼 Chinese Telegraph Code (CTC)

For the main reference, see Jim Reeds' Chinese Telegraph Code page (the link is now dead).

Here I address only the ccf Debian package. Its documentation should include the following, to save the user the hours of experiments it took me to arrive at it; the results do match many characters seen elsewhere. I consider only big5. To then convert from big5 to Unicode, see iconv(1), and then perhaps uni2ascii(1).

Actually, the Unihan database records the correspondences between these systems (see its kMainlandTelegraph and kTaiwanTelegraph fields).
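
If one wanted to take the Unihan route instead, a filter along these lines might do. It is only a sketch: it assumes the tab-separated layout of the Unihan data files (code point, field name, value), and the helper name unihan_telegraph is made up. The sample line fed to it is in that assumed layout, not copied from a real Unihan release.

```shell
# unihan_telegraph: hypothetical filter. Reads Unihan-style lines on stdin
# and prints "field code codepoint" for the two telegraph fields only.
unihan_telegraph() {
    perl -wne '
        chomp;
        next if /^#/ or !length;                  # skip comments and blanks
        my ($cp, $field, $value) = split /\t/;
        next unless defined $value
            && $field =~ /^k(Mainland|Taiwan)Telegraph$/;
        print "$field $value $cp\n";
    '
}

# An illustrative line in the assumed layout
printf 'U+4E00\tkMainlandTelegraph\t0001\n' | unihan_telegraph
```

In real use one would pipe the relevant Unihan data file through the function rather than a single printf line.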

Convert big5 to CTC

$ echo 營養米 | ccf -bl | perl -wpe 's/[\x80-\xFE][\x40-\x7E\xA1-\xFE]/ sprintf "%04d ", (ord $&) * 94 + (ord reverse $&) - 15295/ge;'

3602 7402 4717

Convert CTC to big5

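$ echo 3602 7402 4717 | perl -anwle '
for $c (@F) {
    if ( $c =~ /^\d{4}$/ ) {
        # row = int(c/94), cell = c % 94, each counted from 0xA1 (161);
        # 41377 is 161 * 256 + 161. This keeps codes that are multiples
        # of 94 (0094, 0188, ...) inside the 0xA1-0xFE byte range.
        $v = int( $c / 94 ) * 256 + $c % 94 + 41377;
        printf "%s%s", chr( $v / 256 ), chr( $v % 256 );

        #   show CTC when debugging
        #   printf "%s ", $c;
    }
    else {
        printf "%s ", $c;    # pass anything else through [could do better]
    }
}
' | ccf -lb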
營養米

Dump all CTC codes, viewed via big5

$ perl -we '
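$s = 161 * 257;    # 0xA1A1, first cell of the 94 x 94 grid
$a = ord q{A};     # base letter for the three-letter form
$i = $c = 0;       # $i: offset from $s, $c: CTC number
while (1) {
    # only the first 94 low-byte offsets in each row are valid cells
    if ( ( $i % 256 ) < 94 ) {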
        if ( $c == 8038 ) {
            exit;
        } #end of useful
          #characters that
          #I saw in ccf
        printf "%04d %c%c%c %c%c\n",
          $c, $c / 26 / 26 + $a,
          $c / 26 % 26 + $a,
          $c % 26 + $a,
          ( $i + $s ) / 256,
          ( $i + $s ) % 256;
        $c++;
    }
    $i++;
}

' | ccf -lb | perl -wne 'print unless /\xa1\@$/ #empties' | tee d

0001 AAB 一
0002 AAC 丁
0003 AAD 七...

A quick comparison against big5tele.txt gives

$ join -2 2 d big5tele.txt | perl -wane 'print if $F[2] eq $F[3]' |
wc -l - d big5tele.txt

4992 -
7394 d
8968 big5tele.txt

So 4992 of the 7394 entries in d agree with big5tele.txt. Around 5000 characters in common is not bad, I suppose, as they are the more frequently used Chinese characters.

Conclusions

So we see that ccf uses a 94-based system, perhaps the "quwei" system Reeds mentions. Either way, I was able to match most characters, and I can now decode any 4-digit CTC strings I encounter (if ever), so I don't intend to proceed further. This should work until the day ccf breaks.

I didn't look at the ccf source. I just used the .deb.

Maybe the Chinese "numbers lady" clandestine shortwave radio stations, which one can still hear in Taiwan in 2004, still make their secret codes by writing numbers on the edge of each page and reordering the pages, a method I saw in the first book listed under References. Good thing I am not going to bother to try to decode them.

By the way, Reeds' page asks about 7193 1032 4316; this means 電報碼 dian4 bao4 ma3 "telegraphic code".

References

ISBN 957-14-1136-1 最新無線電通信術, 1985 edition, mainly an old radio operation clerk's book, with a few snippets of the code tables. I bought this book in Feb. 2004 at 三民書局 Sanmin Bookstore, who are also its publishers, 61 Chongqing S. Rd. Sec. 1, Taipei, Taiwan. Next time I should also buy the tiny ISBN 666521593 明密電碼新書 code book itself from the stack there. Maybe the ISBN 666599387 童軍訓練訊號電碼本 boy scout book is another.


積丹尼 Dan Jacobson

Last modified: 2017-06-24 19:41:30 +0800