Import Upstream version 0.112

This commit is contained in:
denghao 2022-09-23 09:44:28 +03:00
commit 12923c9669
108 changed files with 4291 additions and 0 deletions

7
COPYING Normal file

@ -0,0 +1,7 @@
NOTE: this distribution is a non public prerelease!
Copyright (C) 1998, 1999, 2000 Martin Schwartz. All rights reserved.
This program is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.

70
Changes Normal file

@ -0,0 +1,70 @@
0.112 (2002-Mar-20)
* Added: support for EUC-KR
* Minor update on EUC-JP entry (0x7e --> 0x007e)
0.111 (2001-Jan-05)
* Added: support for EUC-JP
* Added: a new transcoding test (EUC-JP ->unicode)
0.110 (2000-Aug-21)
* Little fixes.
* Map.pm changed $/ (record separator) permanently. Anyway this could
happen only when creating own mapping files, so few people complained ;)
* reverse_unicode (deprecated) malfunctioned on some systems. Its
fixed behaviour is: in a void context the string passed is altered,
otherwise a new string is created on the fly. Uses code from Gisle Aas'
Unicode::String.
0.109 (29.6.2000)
* Forgot adding APPLE-GUJARATI and APPLE-GURMUKHI in last release.
0.108 (25.6.2000)
+ Optional warnings for deprecated usage
+ Optional warnings for usage incompatible with Unicode::Map8
* Added basic support for n -> m mappings (like in APPLE-DEVANAGA).
* Fixed some structural problems about the file REGISTRY.
* Added an entry "srcURL" to REGISTRY. The URL given here points
to the original text based map file.
+ Utility "mirrorMappings" uses the srcURL entries of file REGISTRY
to create a local copy of the original text mappings. The utility
uses LWP::Simple (from distribution libwww-perl).
0.107 (20.6.2000) [non public prerelease]
* Fixed annoying "use of uninitialized value" warnings.
0.106 (19.6.2000) [non public prerelease]
* Added: support for GB2312, a mixed one byte, two byte encoding
of GB2312-80 and 8859-1.
* Updated and added some map files (described in next public release)
* Fixed: compatibility: some machines couldn't read binary mappings.
Thanks to Masahito Kagawa <mkagawa@eGain.com>!
? Fixed? Some systems didn't understand "dowarn" and failed compiling.
- Deprecating various stuff, particularly the use of module Startup.
0.105 (18.2.98)
* Fixed: works now also on machines that demand to have 16bit and
32bit integers on even addresses.
0.104 (12.2.98)
* Partial mappings are now allowed.
* Utility "map" got long options.
* Support for Gisle's binary map file format.
* Added some asian map files.
* Support for three / multi column text mappings.
0.103 (2.2.98)
* Sped up loading of the big eastern asia mapfiles.
* Changed structure of mapfiles a bit.
0.102 (26.01.98)
* Small fixes.
0.101 (24.01.98)
* Quite a coincidence: Gisle Aas did much the same job. We're trying to
coordinate the work. Find his module at CPAN in .../by-authors/Gisle_Aas/
* Sped up via C extension:
methods reverse_unicode, from_unicode and to_unicode
0.100 (20.01.98)
* Initial release
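
The 0.110 entry above describes two behaviours for reverse_unicode: altering the string in place, or building a new string. The following is a minimal sketch of that byte-pair swap in plain C, without the Perl API. The function names swap_byte_pairs and swapped_copy are hypothetical and are not part of the distribution; dropping a trailing odd byte mirrors the "len--" handling in Map.xs.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Swap every pair of bytes from src into dest. dest may be the same
 * buffer as src (in-place variant). A trailing odd byte is ignored.
 * Returns the number of bytes actually processed.
 */
static size_t swap_byte_pairs(const char *src, char *dest, size_t len)
{
    size_t even = len - (len % 2);
    size_t i;

    for (i = 0; i < even; i += 2) {
        char tmp = src[i];        /* read both bytes before writing, */
        dest[i] = src[i + 1];     /* so that dest == src stays safe  */
        dest[i + 1] = tmp;
    }
    return even;
}

/*
 * Copying variant: returns a freshly allocated, NUL-terminated buffer
 * holding the swapped bytes, or NULL if allocation fails. The caller
 * frees the result.
 */
static char *swapped_copy(const char *src, size_t len, size_t *out_len)
{
    size_t even = len - (len % 2);
    char *copy = malloc(even + 1);

    if (copy == NULL) {
        return NULL;
    }
    *out_len = swap_byte_pairs(src, copy, len);
    copy[even] = '\0';
    return copy;
}
```

Passing the same buffer as src and dest gives the in-place behaviour; swapped_copy corresponds to the case where a new string is created.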

41
INSTALL Normal file

@ -0,0 +1,41 @@
STANDARD INSTALLATION
Installing the package involves these steps:
1. perl Makefile.PL
2. make
3. make test
If everything succeeded, make sure you have write permissions for your Perl
library directories. Then:
4. make install
"perl Makefile.PL" creates a Makefile for your system. "make" builds the
package, "make test" checks whether the module really works on your system,
and "make install" installs it on your system. This includes 3 tools:
map - maps characters from and to unicode encoding
mkmapfile - creates binary mapfiles from textual mapfiles
mkCSGB2312 - creates a GB2312 mapfile
Typically you will not use the latter two tools.
INDIVIDUAL INSTALLATION
If you need to store the package somewhere other than the default location,
you can do so by adding a PREFIX parameter to the first step mentioned above.
For example:
perl Makefile.PL PREFIX=/usr/local/
"make test" FAILED
Oops? Why? Please let me know.

108
MANIFEST Normal file

@ -0,0 +1,108 @@
COPYING
Changes
INSTALL
MANIFEST
Makefile.PL
Map.pm
Map.xs
Map/ADOBE/STDENC.map
Map/ADOBE/SYMBOL.map
Map/ADOBE/ZDINGBAT.map
Map/APPLE/ARABIC.map
Map/APPLE/CENTEURO.map
Map/APPLE/CHINSIMP.map
Map/APPLE/CHINTRAD.map
Map/APPLE/CROATIAN.map
Map/APPLE/CYRILLIC.map
Map/APPLE/DEVANAGA.map
Map/APPLE/DINGBATS.map
Map/APPLE/GREEK.map
Map/APPLE/GUJARATI.map
Map/APPLE/GURMUKHI.map
Map/APPLE/HEBREW.map
Map/APPLE/ICELAND.map
Map/APPLE/JAPANESE.map
Map/APPLE/KOREAN.map
Map/APPLE/ROMAN.map
Map/APPLE/ROMANIAN.map
Map/APPLE/SYMBOL.map
Map/APPLE/THAI.map
Map/APPLE/TURKISH.map
Map/EASTASIA/BIG5.map
Map/EASTASIA/CNS-11643-1986.map
Map/EASTASIA/EUC-JP.map
Map/EASTASIA/EUC-KR.map
Map/EASTASIA/GB12345-80.map
Map/EASTASIA/GB2312-80.map
Map/EASTASIA/GB2312.map
Map/EASTASIA/JIS-X-0201.map
Map/EASTASIA/JIS-X-0208.map
Map/EASTASIA/JIS-X-0212.map
Map/EASTASIA/JOHAB.map
Map/EASTASIA/KSC1001.map
Map/EASTASIA/KSC5601-1992.map
Map/EASTASIA/SHIFTJIS.map
Map/IBM/IBM038.map
Map/ISO/8859-1.map
Map/ISO/8859-10.map
Map/ISO/8859-13.map
Map/ISO/8859-14.map
Map/ISO/8859-15.map
Map/ISO/8859-2.map
Map/ISO/8859-3.map
Map/ISO/8859-4.map
Map/ISO/8859-5.map
Map/ISO/8859-6.map
Map/ISO/8859-7.map
Map/ISO/8859-8.map
Map/ISO/8859-9.map
Map/ISO/ISO646-US.map
Map/MS/DOS/CP437.map
Map/MS/DOS/CP737.map
Map/MS/DOS/CP775.map
Map/MS/DOS/CP850.map
Map/MS/DOS/CP852.map
Map/MS/DOS/CP855.map
Map/MS/DOS/CP857.map
Map/MS/DOS/CP860.map
Map/MS/DOS/CP861.map
Map/MS/DOS/CP862.map
Map/MS/DOS/CP863.map
Map/MS/DOS/CP864.map
Map/MS/DOS/CP865.map
Map/MS/DOS/CP866.map
Map/MS/DOS/CP869.map
Map/MS/DOS/CP874.map
Map/MS/EBCDIC/CP037.map
Map/MS/EBCDIC/CP1026.map
Map/MS/EBCDIC/CP500.map
Map/MS/EBCDIC/CP875.map
Map/MS/MAC/CYRILLIC.map
Map/MS/MAC/GREEK.map
Map/MS/MAC/ICELAND.map
Map/MS/MAC/LATIN2.map
Map/MS/MAC/ROMAN.map
Map/MS/MAC/TURKISH.map
Map/MS/WIN/CP1250.map
Map/MS/WIN/CP1251.map
Map/MS/WIN/CP1252.map
Map/MS/WIN/CP1253.map
Map/MS/WIN/CP1254.map
Map/MS/WIN/CP1255.map
Map/MS/WIN/CP1256.map
Map/MS/WIN/CP1257.map
Map/MS/WIN/CP1258.map
Map/MS/WIN/CP932.map
Map/MS/WIN/CP936.map
Map/MS/WIN/CP949.map
Map/MS/WIN/CP950.map
Map/NEXT/NEXTSTEP.map
Map/REGISTRY
README
t/basic.t
t/deprecated.t
t/map.t
tools/map
tools/mirrorMappings
tools/mkCSGB2312
tools/mkmapfile

22
Makefile.PL Executable file

@ -0,0 +1,22 @@
#!/usr/bin/perl
use ExtUtils::MakeMaker;
WriteMakefile (
"NAME" => "Unicode::Map",
"VERSION_FROM" => "Map.pm",
"LIBS" => [""],
"DEFINE" => "",
"INC" => "",
"dist" => {
"COMPRESS" => "gzip",
"SUFFIX" => "gz"
},
"EXE_FILES" => [
"tools/map",
"tools/mirrorMappings",
"tools/mkCSGB2312",
"tools/mkmapfile"
],
);

1366
Map.pm Normal file

File diff suppressed because it is too large.

724
Map.xs Normal file

@ -0,0 +1,724 @@
/*
* $Id: Map.xs,v 1.28 1998/03/23 23:57:46 schwartz Exp $
*
* ALPHA version
*
* Unicode::Map - C extensions
*
* Interface documentation at Map.pm
*
* Copyright (C) 1998, 1999, 2000 Martin Schwartz. All rights reserved.
* This program is free software; you can redistribute it and/or
* modify it under the same terms as Perl itself.
*
* Contact: Martin Schwartz <martin@nacho.de>
*/
#ifdef __cplusplus
extern "C" {
#endif
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#ifdef __cplusplus
}
#endif
/*
* It seems that dowarn isn't defined on some systems, PL_dowarn not on
* others. Gisle Aas deals with it this way:
*/
#include "patchlevel.h"
#if PATCHLEVEL <= 4 && !defined(PL_dowarn)
#define PL_dowarn dowarn
#endif
/*
*
* "Map.h"
*
*/
#define M_MAGIC 0xb827 /* magic word */
#define MAP8_BINFILE_MAGIC_HI 0xfffe /* magic word for Gisle's file format */
#define MAP8_BINFILE_MAGIC_LO 0x0001 /* */
#define M_END 0 /* end */
#define M_INF 1 /* infinite subsequent entries (default) */
#define M_BYTE 2 /* 1..255 subsequent entries */
#define M_VER 4 /* (Internal) file format revision. */
#define M_AKV 6 /* key1, val1, key2, val2, ... (default) */
#define M_AKAV 7 /* key1, key2, ..., val1, val2, ... */
#define M_PKV 8 /* partial key value mappings */
#define M_CKn 10 /* compress keys not */
#define M_CK 11 /* compress keys (default) */
#define M_CVn 13 /* compress values not */
#define M_CV 14 /* compress values (default) */
#define I_NAME 20 /* Info: (wstring) Character Set Name */
#define I_ALIAS 21 /* Info: (wstring) Charset alias (several entries ok) */
#define I_VER 22 /* Info: (wstring) Mapfile revision */
#define I_AUTH 23 /* Info: (wstring) Mapfile authRess */
#define I_INFO 24 /* Info: (wstring) Some userEss definable string */
#define T_BAD 0 /* Type: unknown */
#define T_MAP8 1 /* Type: Map8 style */
#define T_MAP 2 /* Type: Map style */
#define num1_DEFAULT M_INF
#define method1_DEFAULT M_AKV
#define keys1_DEFAULT M_CK
#define values1_DEFAULT M_CV
/* No function prototypes (as very old C-Compilers don't like them) */
/*
*
* "Map.c"
*
*/
U8 _byte(char** buf) {
U8* tmp = (U8*) *buf; *buf+=1; return tmp[0];
}
U16 _word(char** buf) {
U16 tmp; memcpy ((char*) &tmp, *buf, 2); *buf+=2; return ntohs(tmp);
}
U32 _long(char** buf) {
U32 tmp; memcpy ((char*) &tmp, *buf, 4); *buf+=4; return ntohl(tmp);
}
AV* __system_test (void) {
/*
* If this test suite passes, the C methods will probably work.
*/
char* check = "\x01\x04\xfe\x83\x73\xf8\x04\x59\x19";
char* buf;
AV* list = newAV();
U32 i, k;
/*
* Do the Unn types have the byte sizes I assume?
*/
if (sizeof(U8)!=1) { av_push (list, newSVpv("1a", 2)); }
if (sizeof(U16)!=2) { av_push (list, newSVpv("1b", 2)); }
if (sizeof(U32)!=4) { av_push (list, newSVpv("1c", 2)); }
/*
* Does _byte work?
*/
buf = check;
if (_byte(&buf) != 0x01) { av_push(list, newSVpv("2a", 2)); }
if (_byte(&buf) != 0x04) { av_push(list, newSVpv("2b", 2)); }
if (_byte(&buf) != 0xfe) { av_push(list, newSVpv("2c", 2)); }
if (_byte(&buf) != 0x83) { av_push(list, newSVpv("2d", 2)); }
/*
* Are _word and _long really reading Network order?
*/
if (_word(&buf) != 0x73f8) { av_push(list, newSVpv("3a", 2)); }
if (_word(&buf) != 0x0459) { av_push(list, newSVpv("3b", 2)); }
buf = check + 1;
if (_byte(&buf) != 0x04) { av_push(list, newSVpv("4a", 2)); }
if (_long(&buf) != 0xfe8373f8) { av_push(list, newSVpv("4b", 2)); }
/*
* Is U32 really not an I32?
*/
buf = check + 2;
i = _long(&buf);
i ++;
if (i != 0xfe8373f9) { av_push(list, newSVpv("5", 1)); }
k = htonl(0x12345678);
if (memcmp((char*)&k+(4-1), "\x78", 1)) {
av_push(list, newSVpv("6a", 2));
}
if (memcmp((char*)&k+(4-2), "\x56\x78", 2)) {
av_push(list, newSVpv("6b", 2));
}
if (memcmp((char*)&k+(4-4), "\x12\x34\x56\x78", 4)) {
av_push(list, newSVpv("6c", 2));
}
return (list);
}
int
__limit_ol (SV* string, SV* o, SV* l, char** ro, U32* rl, U16 cs) {
/*
* Checks, if offset and length are valid. If offset is negative, it is
* treated like a negative offset in perl.
*
* When successful, sets ro (real offset) and rl (real length).
*/
STRLEN slen;
char* address;
I32 offset;
U32 length;
*ro = 0;
*rl = 0;
if (!SvOK(string)) {
if (PL_dowarn) { warn ("String undefined!"); }
return (0);
}
address = SvPV (string, slen);
offset = SvOK(o) ? SvIV(o) : 0;
length = SvOK(l) ? SvIV(l) : slen;
if (offset < 0) {
offset += slen;
}
if (offset < 0) {
offset = 0;
length = slen;
if (PL_dowarn) { warn ("Bad negative string offset!"); }
}
if (offset > slen) {
offset = slen;
length = 0;
if (PL_dowarn) { warn ("String offset too big!"); }
}
if (offset + length > slen) {
length = slen - offset;
if (PL_dowarn) { warn ("Bad string length!"); }
}
if (length % cs != 0) {
if (length>cs) {
length -= (length % cs);
} else {
length = 0;
}
if (PL_dowarn) { warn("Bad string size!"); }
}
*ro = address + offset;
*rl = length;
return (1);
}
int
__get_mode (char** buf, U8* num, U8* method, U8* keys, U8* values) {
U8 type, size;
type = _byte(buf);
size = _byte(buf); *buf += size;
switch (type) {
case M_INF:
case M_BYTE:
*num = type; break;
case M_AKV:
case M_AKAV:
case M_PKV:
*method = type; break;
case M_CKn:
case M_CK:
*keys = type; break;
case M_CVn:
case M_CV:
*values = type; break;
}
return (type);
}
/*
* void = __read_binary_mapping (bufS, oS, UR, CR)
*
* Table of mode combinations:
*
* Mode | n1 n2 | INF BYTE | CK CKn | CV CVn
* ---------------------------------------------------------
* AKV | | | |
* AKAV | | | |
* PKV ok | ==1 ==1 | ok | ok | ok
*/
int
__read_binary_mapping (SV* bufS, SV* oS, SV* UR, SV* CR) {
char* buf;
U32 o;
HV* U; SV* uR; HV* u;
HV* C; SV* cR; HV* c;
int buflen;
char* bufmax;
U8 cs1, cs1b, cs2, cs2b;
U32 n1, n2;
U16 check;
U16 type=T_BAD;
U8 num1, method1, keys1, values1;
I16 kn, vn;
U32 kbegin, vbegin;
SV* Ustr;
SV* Cstr;
SV** tmp_spp;
buf = SvPVX (bufS);
o = SvIV (oS);
U = (HV *) SvRV (UR);
C = (HV *) SvRV (CR);
buflen = SvCUR(bufS); if (buflen < 2) {
/*
* Too short file. (No place for magic)
*/
if ( PL_dowarn ) { warn ( "Bad map file: too short!" ); }
return (0);
}
bufmax = buf + buflen;
buf += o;
check = _word(&buf);
if (check == M_MAGIC) {
type = T_MAP;
} else if (
( check == MAP8_BINFILE_MAGIC_HI ) &&
( _word(&buf) == MAP8_BINFILE_MAGIC_LO )
) {
type = T_MAP8;
}
if (type == T_BAD) {
if ( PL_dowarn ) { warn ( "Unknown map file format!" ); }
return (0);
}
num1 = num1_DEFAULT;
method1 = method1_DEFAULT;
keys1 = keys1_DEFAULT;
values1 = values1_DEFAULT;
while (buf<bufmax) {
U8 num2, method2, keys2, values2;
num2=num1; method2=method1; keys2=keys1; values2=values1;
if (type == T_MAP) {
cs1 = _byte (&buf);
if (!cs1) {
if (__get_mode(&buf, &num1, &method1, &keys1, &values1) == M_END) {
break;
}
continue;
} else {
n1 = _byte (&buf);
cs2 = _byte (&buf);
n2 = _byte (&buf);
}
cs1b = (cs1+7)/8;
cs2b = (cs2+7)/8;
} else if (type == T_MAP8) {
cs1b=1; n1=1; cs2b=2; n2=1;
}
Ustr = newSVpvf ("%d,%d,%d,%d", cs1b, n1, cs2b, n2);
Cstr = newSVpvf ("%d,%d,%d,%d", cs2b, n2, cs1b, n1);
/*
* Get, create hash for submapping of %U
*/
if (!hv_exists_ent(U, Ustr, 0)) {
hv_store_ent(U, Ustr, newRV_inc((SV*) newHV()), 0);
}
tmp_spp = hv_fetch(U, SvPVX(Ustr), SvCUR(Ustr), 0);
if (!tmp_spp) {
if ( PL_dowarn ) { warn ( "Can't retrieve U submapping!" ); }
return (0);
} else {
uR = (SV *) *tmp_spp;
u = (HV *) SvRV (uR);
}
/*
* Get, create hash for submapping of %C
*/
if (!hv_exists_ent(C, Cstr, 0)) {
hv_store_ent(C, Cstr, newRV_inc((SV*) newHV()), 0);
}
tmp_spp = hv_fetch(C, SvPVX(Cstr), SvCUR(Cstr), 0);
if (!tmp_spp) {
if ( PL_dowarn ) { warn ( "Can't retrieve C submapping!" ); }
return (0);
} else {
cR = (SV *) *tmp_spp;
c = (HV *) SvRV (cR);
}
if (type == T_MAP8) {
/*
* Map8 mode
*/
/*
* => All (key, value) pairs
*/
SV* tmpk; SV* tmpv;
while (buf<bufmax) {
if (buf[0] != '\0') {
if ( PL_dowarn ) { warn ( "Bad map file!" ); }
return (0);
}
tmpk = newSVpv(buf+1, 1); buf += 2;
tmpv = newSVpv(buf , 2); buf += 2;
if (buf > bufmax) { break; }
hv_store_ent(u, tmpk, tmpv, 0);
hv_store_ent(c, tmpv, tmpk, 0);
}
} else if (method1==M_AKV) {
/*
* Map mode
*/
U32 ksize = n1*cs1b; SV* tmpk;
U32 vsize = n2*cs2b; SV* tmpv;
if ( num1==M_INF ) {
/*
* All (key, value) pairs
*/
while (buf<bufmax) {
if ( buf+ksize+vsize>bufmax ) {
buf += ( ksize+vsize );
break;
}
tmpk = newSVpv(buf, ksize); buf += ksize;
tmpv = newSVpv(buf, vsize); buf += vsize;
hv_store_ent(c, tmpv, tmpk, 0);
hv_store_ent(u, tmpk, tmpv, 0);
}
} else if ( num1==M_BYTE ) {
while ( buf<bufmax ) {
if (!(kn=_byte(&buf))) {
if (__get_mode(&buf,&num2,&method2,&keys2,&values2)==M_END) {
break;
}
}
while ( kn>0 ) {
if ( buf+ksize+vsize>bufmax ) {
buf += ( ksize+vsize );
break;
}
tmpk = newSVpv(buf, ksize); buf += ksize;
tmpv = newSVpv(buf, vsize); buf += vsize;
hv_store_ent(c, tmpv, tmpk, 0);
hv_store_ent(u, tmpk, tmpv, 0);
kn--;
}
}
}
} else if (method1==M_AKAV) {
/*
* First all keys, then all values
*/
if ( PL_dowarn ) { warn ( "M_AKAV not supported!" ); }
return (0);
} else if (method1==M_PKV) {
/*
* Partial
*/
if (num1==M_INF) {
/* no infinite mode */
if ( PL_dowarn ) { warn ( "M_INF not supported for M_PKV!" ); }
return (0);
}
while(buf<bufmax) {
U8 num3, method3, keys3, values3;
num3=num2; method3=method2; keys3=keys2; values3=values2;
if (!(kn = _byte(&buf))) {
if (__get_mode(&buf,&num2,&method2,&keys2,&values2)==M_END) {
break;
}
continue;
}
switch (cs1b) {
case 1: kbegin = _byte(&buf); break;
case 2: kbegin = _word(&buf); break;
case 4: kbegin = _long(&buf); break;
default:
if ( PL_dowarn ) { warn ( "Unknown element size!" ); }
return (0);
}
while (kn>0) {
if (values3==M_CV) {
/*
* Partial, keys compressed, values compressed
*/
SV* tmpk; U32 k;
SV* tmpv; U32 v;
U32 max;
vn = _byte(&buf);
if (!vn) {
if(__get_mode(&buf,&num3,&method3,&keys3,&values3)==M_END){
break;
}
continue;
}
if ((n1 != 1) || (n2 != 1)) {
/*
* n (n>1) characters cannot be mapped to one integer
*/
if ( PL_dowarn ) { warn("Bad map file: count mismatch!"); }
return (0);
}
switch (cs2b) {
case 1: vbegin = _byte(&buf); break;
case 2: vbegin = _word(&buf); break;
case 4: vbegin = _long(&buf); break;
default:
if ( PL_dowarn ) { warn ( "Unknown element size!" ); }
return (0);
}
max = kbegin + vn;
for (; kbegin<max; kbegin++, vbegin++) {
k = htonl(kbegin);
tmpk = newSVpv((char *) &k + (4-cs1b), cs1b);
v = htonl(vbegin);
tmpv = newSVpv((char *) &v + (4-cs2b), cs2b);
hv_store_ent(c, tmpv, tmpk, 0);
hv_store_ent(u, tmpk, tmpv, 0);
}
kn-=vn;
} else if (values3==M_CVn) {
/*
* Partial, keys compressed, values not compressed
*/
U32 v;
U32 vsize = n2*cs2b;
SV* tmpk;
SV* tmpv;
if (n1 != 1) {
if ( PL_dowarn ) { warn ( "Bad map file: mismatch 2!" ); }
return (0);
}
while (kn--) {
v = htonl(kbegin);
tmpk = newSVpv((char *) &v + (4-cs1b), cs1b);
tmpv = newSVpv(buf, vsize); buf += vsize;
hv_store_ent(u, tmpk, tmpv, 0);
hv_store_ent(c, tmpv, tmpk, 0);
kbegin++;
}
} else {
/*
* Unknown value compression.
*/
if ( PL_dowarn ) { warn ( "Unknown compression!" ); }
return (0);
}
}
}
} else {
/*
* unknown method
*/
if ( PL_dowarn ) { warn ( "Unknown method!" ); }
return (0);
}
}
return (1);
}
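
The compressed branch of the M_PKV case above stores a run as a start key, a start value and a count, then emits one key/value pair per step, each as the low bytes of a big-endian integer. A minimal sketch of that expansion outside the Perl API is given below. The names put_be and expand_run are hypothetical, and writing the pairs into a flat output buffer (key bytes followed by value bytes, as in the M_AKV layout) is an assumption made for illustration; the real code stores the pairs into two Perl hashes instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Store the low `size` bytes of `value` at `out`, most significant first. */
static void put_be(uint32_t value, size_t size, unsigned char *out)
{
    size_t i;

    for (i = size; i > 0; i--) {
        out[i - 1] = (unsigned char)(value & 0xffu);
        value >>= 8;
    }
}

/*
 * Expand the run (kbegin, vbegin, count) into `count` records of
 * ksize + vsize bytes each. Element sizes must be 1, 2 or 4 bytes,
 * matching the switch statements in the original code.
 * Returns the number of bytes written, or 0 on invalid sizes or if
 * the output buffer is too small.
 */
static size_t expand_run(uint32_t kbegin, uint32_t vbegin, uint32_t count,
                         size_t ksize, size_t vsize,
                         unsigned char *out, size_t outcap)
{
    size_t rec = ksize + vsize;
    size_t used = 0;
    uint32_t i;

    if ((ksize != 1 && ksize != 2 && ksize != 4) ||
        (vsize != 1 && vsize != 2 && vsize != 4)) {
        return 0;
    }
    if (count > outcap / rec) {
        return 0;
    }
    for (i = 0; i < count; i++) {
        put_be(kbegin + i, ksize, out + used);          /* key   */
        put_be(vbegin + i, vsize, out + used + ksize);  /* value */
        used += rec;
    }
    return used;
}
```

For example, a run starting at key 0x41 and value 0x0041 with count 3, one-byte keys and two-byte values yields the records 41 0041, 42 0042 and 43 0043.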
/*
*
* "Map.xs"
*
*/
MODULE = Unicode::Map PACKAGE = Unicode::Map
PROTOTYPES: DISABLE
#
# $text = $Map -> reverse_unicode($text)
#
SV*
_reverse_unicode(Map, text)
SV* Map
SV* text
PREINIT:
int i;
char c;
STRLEN len;
char* src;
char* dest;
PPCODE:
src = SvPV (text, len);
if ((len % 2) != 0) {
if (PL_dowarn) { warn("Bad string size!"); }
len--;
}
/* Code below adapted from GAAS's Unicode::String */
if ( GIMME_V == G_VOID ) {
if ( SvREADONLY(text) ) {
die ( "reverse_unicode: string is readonly!" );
}
dest = src;
} else {
SV* dest_sv = sv_2mortal ( newSV(len+1) );
SvCUR_set ( dest_sv, len );
*SvEND ( dest_sv ) = 0;
SvPOK_on ( dest_sv );
PUSHs ( dest_sv );
dest = SvPVX ( dest_sv );
}
for ( ; len>=2; len-=2 ) {
char tmp = *src++;
*dest++ = *src++;
*dest++ = tmp;
}
#
# $mapped_str = $Map -> _map_hash($string, \%mapping, $bytesize, offset, length)
#
# bytesize, offset, length in terms of bytes.
#
# bytesize gives the size of one character for this mapping.
#
SV*
_map_hash(Map, string, mappingR, bytesize, o, l)
SV* Map
SV* string
SV* mappingR
SV* bytesize
SV* o
SV* l
PREINIT:
char* offset; U32 length; U16 bs;
char* smax;
HV* mapping;
SV** tmp;
CODE:
bs = SvIV(bytesize);
__limit_ol (string, o, l, &offset, &length, bs);
smax = offset + length;
RETVAL = newSV((length/bs+1)*2);
mapping = (HV *) SvRV(mappingR);
for (; offset<smax; offset+=bs) {
if (tmp = hv_fetch(mapping, offset, bs, 0)) {
if ( SvOK(RETVAL) ) {
sv_catsv(RETVAL, *tmp);
} else {
sv_setsv(RETVAL, *tmp);
}
} else {
/* No mapping character found! */
}
}
OUTPUT:
RETVAL
#
# $mapped_str = $Map -> _map_hashlist($string, [@{\%mapping}], [@{$bytesize}])
#
# bytesize gives the size of one character for this mapping.
#
SV*
_map_hashlist(Map, string, mappingRLR, bytesizeLR, o, l)
SV* Map
SV* string
SV* mappingRLR
SV* bytesizeLR
SV* o
SV* l
PREINIT:
int j, max;
AV* mappingRL; HV* mapping;
AV* bytesizeL; int bytesize;
SV** tmp;
char* offset; U32 length; char* smax;
CODE:
__limit_ol (string, o, l, &offset, &length, 1);
smax = offset + length;
RETVAL = newSV((length+1)*2);
mappingRL = (AV *) SvRV(mappingRLR);
bytesizeL = (AV *) SvRV(bytesizeLR);
max = av_len(mappingRL);
if (max != av_len(bytesizeL)) {
warn("$#mappingRL != $#bytesizeL!");
} else {
max++;
for (; offset<smax; ) {
for (j=0; j<=max; j++) {
if (j==max) {
/* No mapping character found!
* How many bytes does this unknown character consume?
* Sigh, assume 2.
*/
offset += 2;
} else {
if (tmp = av_fetch(mappingRL, j, 0)) {
mapping = (HV *) SvRV((SV*) *tmp);
if (tmp = av_fetch(bytesizeL, j, 0)) {
bytesize = SvIV(*tmp);
if (tmp = hv_fetch(mapping, offset, bytesize, 0)) {
if ( SvOK(RETVAL) ) {
sv_catsv(RETVAL, *tmp);
} else {
sv_setsv(RETVAL, *tmp);
}
offset+=bytesize;
break;
}
}
}
}
}
}
}
OUTPUT:
RETVAL
#
# status = $S->_read_binary_mapping($buf, $o, \%U, \%C);
#
SV*
_read_binary_mapping (MapS, bufS, oS, UR, CR)
SV* MapS
SV* bufS
SV* oS
SV* UR
SV* CR
CODE:
RETVAL = newSViv(__read_binary_mapping(bufS, oS, UR, CR));
OUTPUT:
RETVAL
#
# 0 || errornum = $S->_test ()
#
AV*
_system_test (void)
CODE:
RETVAL = __system_test();
OUTPUT:
RETVAL
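
Map.xs reads all multi-byte values in network (big-endian) order and recognises two binary formats by their leading magic words: 0xb827 for the native format and 0xfffe 0x0001 for Gisle Aas' Map8 format. The sketch below shows the same idea in portable C using byte shifts instead of memcpy plus ntohs/ntohl, which also avoids any alignment requirement. The names read_be16, read_be32, detect_map_type and the enum are hypothetical; the explicit length check before the second magic word is an added assumption that the original code does not make.

```c
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as T_BAD, T_MAP8 and T_MAP in Map.xs. */
enum map_type { MAP_BAD = 0, MAP_MAP8 = 1, MAP_MAP = 2 };

/* Read a 16-bit big-endian value from any (possibly odd) address. */
static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Read a 32-bit big-endian value from any (possibly odd) address. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Classify a buffer by its leading magic word(s). */
static enum map_type detect_map_type(const unsigned char *buf, size_t len)
{
    if (len >= 2 && read_be16(buf) == 0xb827) {
        return MAP_MAP;
    }
    if (len >= 4 && read_be16(buf) == 0xfffe && read_be16(buf + 2) == 0x0001) {
        return MAP_MAP8;
    }
    return MAP_BAD;
}
```

The values used by __system_test can serve as a check: in the byte sequence 01 04 fe 83 73 f8 04 59 19, the 16-bit value at offset 4 is 0x73f8 and the 32-bit value at offset 2 is 0xfe8373f8.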

BIN
Map/ADOBE/STDENC.map Normal file

Binary file not shown.

BIN
Map/ADOBE/SYMBOL.map Normal file

Binary file not shown.

BIN
Map/ADOBE/ZDINGBAT.map Normal file

Binary file not shown.

BIN
Map/APPLE/ARABIC.map Normal file

Binary file not shown.

BIN
Map/APPLE/CENTEURO.map Normal file

Binary file not shown.

BIN
Map/APPLE/CHINSIMP.map Normal file

Binary file not shown.

BIN
Map/APPLE/CHINTRAD.map Normal file

Binary file not shown.

BIN
Map/APPLE/CROATIAN.map Normal file

Binary file not shown.

BIN
Map/APPLE/CYRILLIC.map Normal file

Binary file not shown.

BIN
Map/APPLE/DEVANAGA.map Normal file

Binary file not shown.

BIN
Map/APPLE/DINGBATS.map Normal file

Binary file not shown.

BIN
Map/APPLE/GREEK.map Normal file

Binary file not shown.

BIN
Map/APPLE/GUJARATI.map Normal file

Binary file not shown.

BIN
Map/APPLE/GURMUKHI.map Normal file

Binary file not shown.

BIN
Map/APPLE/HEBREW.map Normal file

Binary file not shown.

BIN
Map/APPLE/ICELAND.map Normal file

Binary file not shown.

BIN
Map/APPLE/JAPANESE.map Normal file

Binary file not shown.

BIN
Map/APPLE/KOREAN.map Normal file

Binary file not shown.

BIN
Map/APPLE/ROMAN.map Normal file

Binary file not shown.

BIN
Map/APPLE/ROMANIAN.map Normal file

Binary file not shown.

BIN
Map/APPLE/SYMBOL.map Normal file

Binary file not shown.

BIN
Map/APPLE/THAI.map Normal file

Binary file not shown.

BIN
Map/APPLE/TURKISH.map Normal file

Binary file not shown.

BIN
Map/EASTASIA/BIG5.map Normal file

Binary file not shown.

Binary file not shown.

BIN
Map/EASTASIA/EUC-JP.map Normal file

Binary file not shown.

BIN
Map/EASTASIA/EUC-KR.map Normal file

Binary file not shown.

BIN
Map/EASTASIA/GB12345-80.map Normal file

Binary file not shown.

BIN
Map/EASTASIA/GB2312-80.map Normal file

Binary file not shown.

BIN
Map/EASTASIA/GB2312.map Normal file

Binary file not shown.

BIN
Map/EASTASIA/JIS-X-0201.map Normal file

Binary file not shown.

BIN
Map/EASTASIA/JIS-X-0208.map Normal file

Binary file not shown.

BIN
Map/EASTASIA/JIS-X-0212.map Normal file

Binary file not shown.

BIN
Map/EASTASIA/JOHAB.map Normal file

Binary file not shown.

BIN
Map/EASTASIA/KSC1001.map Normal file

Binary file not shown.

Binary file not shown.

BIN
Map/EASTASIA/SHIFTJIS.map Normal file

Binary file not shown.

BIN
Map/IBM/IBM038.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-1.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-10.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-13.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-14.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-15.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-2.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-3.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-4.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-5.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-6.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-7.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-8.map Normal file

Binary file not shown.

BIN
Map/ISO/8859-9.map Normal file

Binary file not shown.

BIN
Map/ISO/ISO646-US.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP437.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP737.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP775.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP850.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP852.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP855.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP857.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP860.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP861.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP862.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP863.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP864.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP865.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP866.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP869.map Normal file

Binary file not shown.

BIN
Map/MS/DOS/CP874.map Normal file

Binary file not shown.

BIN
Map/MS/EBCDIC/CP037.map Normal file

Binary file not shown.

BIN
Map/MS/EBCDIC/CP1026.map Normal file

Binary file not shown.

BIN
Map/MS/EBCDIC/CP500.map Normal file

Binary file not shown.

BIN
Map/MS/EBCDIC/CP875.map Normal file

Binary file not shown.

BIN
Map/MS/MAC/CYRILLIC.map Normal file

Binary file not shown.

BIN
Map/MS/MAC/GREEK.map Normal file

Binary file not shown.

BIN
Map/MS/MAC/ICELAND.map Normal file

Binary file not shown.

BIN
Map/MS/MAC/LATIN2.map Normal file

Binary file not shown.

BIN
Map/MS/MAC/ROMAN.map Normal file

Binary file not shown.

BIN
Map/MS/MAC/TURKISH.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP1250.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP1251.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP1252.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP1253.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP1254.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP1255.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP1256.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP1257.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP1258.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP932.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP936.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP949.map Normal file

Binary file not shown.

BIN
Map/MS/WIN/CP950.map Normal file

Binary file not shown.

BIN
Map/NEXT/NEXTSTEP.map Normal file

Binary file not shown.

840
Map/REGISTRY Normal file

@ -0,0 +1,840 @@
# $Id$
#
# This is a control file for Unicode::Map. It serves two purposes:
#
# 1. To relate the names, aliases and map table of character sets.
# When loading a charset, this file is consulted.
#
# 2. To store the path of a source file containing the textual map
# file for a mapping. For efficiency these original files need
# to be stored in a more quickly accessible binary form. You can simply
# ignore these entries.
# Note: if you really want to create your own binary mapfiles read the
# note [*] below.
#
# First, it selects source files and defines the Unicode::Map storage
# hierarchy for binary character mappings. Secondly it defines the names
# and alias names for character sets.
#
# The mapfiles are created from textual mapfiles. Sources are the Internet
# character sets collections from Unicode [1] and Keld Simonsen [3]. The
# number and quality of map files once differed strongly. Most problematic
# has been that for ISO-8859 the Unicode mappings omitted the control
# characters. This has been fixed with table revision 1.0. Keld's collection
# is more or less of historical interest nowadays. The same is true for the
# pages of Roman Czyborra [4], whose value can hardly be overestimated.
#
# REFERENCES:
#
# [1] Mapping files collected at the Unicode Consortium:
# ftp://ftp.unicode.org/MAPPINGS/
#
# [2] "Official names for character sets that may be used in the Internet":
# http://www.isi.edu/in-notes/iana/assignments/character-sets
#
# [3] Keld Simonsen:
# ftp://dkuug.dk/i18n/charmaps/
#
# [4] Roman Czyborra:
# http://www.czyborra.com
#
#
# CREATING YOUR OWN MAPPINGS:
#
##
## The following defines and the src/dest entries below will only have
## effect, if you're going to create your own set of binary mapfiles.
## (as done with "mkmapfile -U"). Normally you should not bother about this
## at all.
##
DEFINE:
##
## Define segment. Syntax sugar:
## $foobar Refers to an environment variable. If no such environment
## variable is defined, it refers to a variable defined in this file.
## Note (again):
## 1. For keys: the User Environment overrides file settings
## 2. For values: the file settings are applied only if
## the variable isn't defined in the user environment.
## Example:
## You want to create a set of binary mappings for testing
## purposes in your /home/myself/Unicode. Simply set an
## environment variable: "DestMap" to "/home/myself/Unicode"
## and run "mkmapfile -U".
##
## '$xyz' Literal mode, $xyz will not be evaluated as env variable.
## $$ Magic value. Refers to the mappings directory of the
## Unicode::Map instance. File REGISTRY is stored in there.
## ~ Your personal home directory.
##
# Binary mappings are stored here. (Note that the installation procedure
# expects it set to "$$")
DestMap = "$$"
# Copies of original text mappings would be placed in directory "unicode"
# in your home directory:
DestBase = "~/unicode"
SrcUnicode = "ftp://ftp.unicode.org/Public/MAPPINGS"
DestUnicode = "$DestBase/MAPPINGS"
SrcKeld = "ftp://dkuug.dk/i18n/charmaps"
DestKeld = "$DestBase/charmaps"
# Gisle = "/usr/lib/perl5/site_perl/Unicode/Map8/maps"
DATA:
##
## Data segment: separate entries with an empty line.
## Variables defined in data segment can be used indicated by leading $.
##
## *Not* supported in this segment:
## - Environment variables
## - $$
## - ~
## - ""
## - ''
##
## Possible Entries are:
##
## name: Name of character set.
## alias: Alias name for character set.
## srcURL: Source of the textual mapping for this charset.
## style: Style of source text file. Defaults to "unicode".
## map: FilePath for binary mapping.
##
## style can be:
## unicode : two columns, first vendor, second unicode
## reverse : two columns, second vendor, first unicode
## n m : several columns, column n is vendor, column m is unicode
## keld : three columns, matches like: '$escx([^\s]+)\s+<U([^>]+)'
## where $escx is a special char plus an 'x'.
##
##
## --- Adobe charsets ------------------------------------------------------
##
name: ADOBE-DINGBATS
srcURL: $SrcUnicode/VENDORS/ADOBE/zdingbat.txt
src: $DestUnicode/VENDORS/ADOBE/zdingbat.txt
style: reverse
map: $DestMap/ADOBE/ZDINGBAT.map
name: ADOBE-STANDARD
srcURL: $SrcUnicode/VENDORS/ADOBE/stdenc.txt
src: $DestUnicode/VENDORS/ADOBE/stdenc.txt
style: reverse
map: $DestMap/ADOBE/STDENC.map
alias: csAdobeStandardEncoding
alias: Adobe-Standard-Encoding
#mib: 2005
name: ADOBE-SYMBOL
srcURL: $SrcUnicode/VENDORS/ADOBE/symbol.txt
src: $DestUnicode/VENDORS/ADOBE/symbol.txt
style: reverse
map: $DestMap/ADOBE/SYMBOL.map
alias: csHPPSMath
#mib: 2020
##
## --- Apple charsets ------------------------------------------------------
##
name: APPLE-ARABIC
srcURL: $SrcUnicode/VENDORS/APPLE/ARABIC.TXT
src: $DestUnicode/VENDORS/APPLE/ARABIC.TXT
map: $DestMap/APPLE/ARABIC.map
name: APPLE-CENTEURO
srcURL: $SrcUnicode/VENDORS/APPLE/CENTEURO.TXT
src: $DestUnicode/VENDORS/APPLE/CENTEURO.TXT
map: $DestMap/APPLE/CENTEURO.map
name: APPLE-CHINSIMP
srcURL: $SrcUnicode/VENDORS/APPLE/CHINSIMP.TXT
src: $DestUnicode/VENDORS/APPLE/CHINSIMP.TXT
map: $DestMap/APPLE/CHINSIMP.map
name: APPLE-CHINTRAD
srcURL: $SrcUnicode/VENDORS/APPLE/CHINTRAD.TXT
src: $DestUnicode/VENDORS/APPLE/CHINTRAD.TXT
map: $DestMap/APPLE/CHINTRAD.map
name: APPLE-CROATIAN
srcURL: $SrcUnicode/VENDORS/APPLE/CROATIAN.TXT
src: $DestUnicode/VENDORS/APPLE/CROATIAN.TXT
map: $DestMap/APPLE/CROATIAN.map
name: APPLE-CYRILLIC
srcURL: $SrcUnicode/VENDORS/APPLE/CYRILLIC.TXT
src: $DestUnicode/VENDORS/APPLE/CYRILLIC.TXT
map: $DestMap/APPLE/CYRILLIC.map
alias: APPLE-UKRAINE
name: APPLE-DEVANAGA
srcURL: $SrcUnicode/VENDORS/APPLE/DEVANAGA.TXT
src: $DestUnicode/VENDORS/APPLE/DEVANAGA.TXT
map: $DestMap/APPLE/DEVANAGA.map
name: APPLE-DINGBATS
srcURL: $SrcUnicode/VENDORS/APPLE/DINGBATS.TXT
src: $DestUnicode/VENDORS/APPLE/DINGBATS.TXT
map: $DestMap/APPLE/DINGBATS.map
# Not yet supported: Can't deal with <LR> and <RL>!
# name: APPLE-FARSI
# srcURL: $SrcUnicode/VENDORS/APPLE/FARSI.TXT
# src: $DestUnicode/VENDORS/APPLE/FARSI.TXT
# map: $DestMap/APPLE/FARSI.map
name: APPLE-GREEK
srcURL: $SrcUnicode/VENDORS/APPLE/GREEK.TXT
src: $DestUnicode/VENDORS/APPLE/GREEK.TXT
map: $DestMap/APPLE/GREEK.map
# Not yet supported: Can't deal with from(x+y) mappings!
# name: APPLE-GUJARATI
# srcURL: $SrcUnicode/VENDORS/APPLE/GUJARATI.TXT
# src: $DestUnicode/VENDORS/APPLE/GUJARATI.TXT
# map: $DestMap/APPLE/GUJARATI.map
# Not yet supported: Can't deal with from(x+y) mappings!
# name: APPLE-GURMUKHI
# srcURL: $SrcUnicode/VENDORS/APPLE/GURMUKHI.TXT
# src: $DestUnicode/VENDORS/APPLE/GURMUKHI.TXT
# map: $DestMap/APPLE/GURMUKHI.map
# Current mapping not yet supported: Can't deal with <LR> and <RL>!
# An older mapping file is used instead. Unfortunately, the older
# mapping is no longer publicly available.
name: APPLE-HEBREW
src: $DestUnicode/VENDORS/APPLE/HEBREW.OLD.TXT
map: $DestMap/APPLE/HEBREW.map
name: APPLE-ICELAND
srcURL: $SrcUnicode/VENDORS/APPLE/ICELAND.TXT
src: $DestUnicode/VENDORS/APPLE/ICELAND.TXT
map: $DestMap/APPLE/ICELAND.map
name: APPLE-JAPANESE
srcURL: $SrcUnicode/VENDORS/APPLE/JAPANESE.TXT
src: $DestUnicode/VENDORS/APPLE/JAPANESE.TXT
map: $DestMap/APPLE/JAPANESE.map
name: APPLE-KOREAN
srcURL: $SrcUnicode/VENDORS/APPLE/KOREAN.TXT
src: $DestUnicode/VENDORS/APPLE/KOREAN.TXT
map: $DestMap/APPLE/KOREAN.map
name: APPLE-ROMAN
srcURL: $SrcUnicode/VENDORS/APPLE/ROMAN.TXT
src: $DestUnicode/VENDORS/APPLE/ROMAN.TXT
map: $DestMap/APPLE/ROMAN.map
name: APPLE-ROMANIAN
srcURL: $SrcUnicode/VENDORS/APPLE/ROMANIAN.TXT
src: $DestUnicode/VENDORS/APPLE/ROMANIAN.TXT
map: $DestMap/APPLE/ROMANIAN.map
name: APPLE-SYMBOL
srcURL: $SrcUnicode/VENDORS/APPLE/SYMBOL.TXT
src: $DestUnicode/VENDORS/APPLE/SYMBOL.TXT
map: $DestMap/APPLE/SYMBOL.map
name: APPLE-THAI
srcURL: $SrcUnicode/VENDORS/APPLE/THAI.TXT
src: $DestUnicode/VENDORS/APPLE/THAI.TXT
map: $DestMap/APPLE/THAI.map
name: APPLE-TURKISH
srcURL: $SrcUnicode/VENDORS/APPLE/TURKISH.TXT
src: $DestUnicode/VENDORS/APPLE/TURKISH.TXT
map: $DestMap/APPLE/TURKISH.map
##
## --- IBM / MS codepages -------------------------------------------------
##
name: CP037
srcURL: $SrcUnicode/VENDORS/MICSFT/EBCDIC/CP037.TXT
src: $DestUnicode/VENDORS/MICSFT/EBCDIC/CP037.TXT
map: $DestMap/MS/EBCDIC/CP037.map
alias: IBM037
alias: ebcdic-cp-us
alias: ebcdic-cp-ca
alias: ebcdic-cp-wt
alias: ebcdic-cp-nl
alias: csIBM037
#mib: 2028
name: CP437
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP437.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP437.TXT
map: $DestMap/MS/DOS/CP437.map
alias: IBM437
alias: 437
alias: csPC8CodePage437
#mib: 2011
name: CP500
srcURL: $SrcUnicode/VENDORS/MICSFT/EBCDIC/CP500.TXT
src: $DestUnicode/VENDORS/MICSFT/EBCDIC/CP500.TXT
map: $DestMap/MS/EBCDIC/CP500.map
alias: IBM500
alias: ebcdic-cp-be
alias: ebcdic-cp-ch
alias: csIBM500
#mib: 2044
name: CP737
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP737.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP737.TXT
map: $DestMap/MS/DOS/CP737.map
name: CP775
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP775.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP775.TXT
map: $DestMap/MS/DOS/CP775.map
alias: IBM775
alias: csPC775Baltic
#mib: 2087
name: CP850
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP850.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP850.TXT
map: $DestMap/MS/DOS/CP850.map
alias: IBM850
alias: 850
alias: csPC850Multilingual
#mib: 2009
name: CP852
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP852.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP852.TXT
map: $DestMap/MS/DOS/CP852.map
alias: IBM852
alias: 852
alias: csPCp852
#mib: 2010
name: CP855
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP855.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP855.TXT
map: $DestMap/MS/DOS/CP855.map
alias: IBM855
alias: 855
alias: csIBM855
#mib: 2046
name: CP857
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP857.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP857.TXT
map: $DestMap/MS/DOS/CP857.map
alias: IBM857
alias: 857
alias: csIBM857
#mib: 2047
name: CP860
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP860.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP860.TXT
map: $DestMap/MS/DOS/CP860.map
alias: IBM860
alias: 860
alias: csIBM860
#mib: 2048
name: CP861
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP861.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP861.TXT
map: $DestMap/MS/DOS/CP861.map
alias: IBM861
alias: 861
alias: cp-is
alias: csIBM861
#mib: 2049
name: CP862
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP862.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP862.TXT
map: $DestMap/MS/DOS/CP862.map
alias: IBM862
alias: 862
alias: csPC862LatinHebrew
#mib: 2013
name: CP863
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP863.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP863.TXT
map: $DestMap/MS/DOS/CP863.map
alias: IBM863
alias: 863
alias: csIBM863
#mib: 2050
name: CP864
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP864.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP864.TXT
map: $DestMap/MS/DOS/CP864.map
alias: IBM864
alias: csIBM864
#mib: 2051
name: CP865
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP865.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP865.TXT
map: $DestMap/MS/DOS/CP865.map
alias: IBM865
alias: 865
alias: csIBM865
#mib: 2052
name: CP866
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP866.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP866.TXT
map: $DestMap/MS/DOS/CP866.map
alias: IBM866
alias: 866
alias: csIBM866
#mib: 2086
name: CP869
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP869.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP869.TXT
map: $DestMap/MS/DOS/CP869.map
alias: IBM869
alias: 869
alias: cp-gr
alias: csIBM869
#mib: 2054
#name: CP870
#name: CP871
name: CP874
srcURL: $SrcUnicode/VENDORS/MICSFT/PC/CP874.TXT
src: $DestUnicode/VENDORS/MICSFT/PC/CP874.TXT
map: $DestMap/MS/DOS/CP874.map
name: CP875
srcURL: $SrcUnicode/VENDORS/MICSFT/EBCDIC/CP875.TXT
src: $DestUnicode/VENDORS/MICSFT/EBCDIC/CP875.TXT
map: $DestMap/MS/EBCDIC/CP875.map
name: CP932
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP932.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP932.TXT
map: $DestMap/MS/WIN/CP932.map
name: CP936
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP936.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP936.TXT
map: $DestMap/MS/WIN/CP936.map
name: CP949
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP949.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP949.TXT
map: $DestMap/MS/WIN/CP949.map
name: CP950
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP950.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP950.TXT
map: $DestMap/MS/WIN/CP950.map
name: CP1026
srcURL: $SrcUnicode/VENDORS/MICSFT/EBCDIC/CP1026.TXT
src: $DestUnicode/VENDORS/MICSFT/EBCDIC/CP1026.TXT
map: $DestMap/MS/EBCDIC/CP1026.map
alias: IBM1026
alias: csIBM1026
#mib: 2063
name: CP1250
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP1250.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP1250.TXT
map: $DestMap/MS/WIN/CP1250.map
alias: windows-1250
#mib: 2250
name: CP1251
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP1251.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP1251.TXT
map: $DestMap/MS/WIN/CP1251.map
alias: windows-1251
#mib: 2251
name: CP1252
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP1252.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP1252.TXT
map: $DestMap/MS/WIN/CP1252.map
alias: windows-1252
name: CP1253
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP1253.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP1253.TXT
map: $DestMap/MS/WIN/CP1253.map
alias: windows-1253
#mib: 2253
name: CP1254
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP1254.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP1254.TXT
map: $DestMap/MS/WIN/CP1254.map
alias: windows-1254
#mib: 2254
name: CP1255
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP1255.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP1255.TXT
map: $DestMap/MS/WIN/CP1255.map
alias: windows-1255
#mib: 2255
name: CP1256
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP1256.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP1256.TXT
map: $DestMap/MS/WIN/CP1256.map
alias: windows-1256
#mib: 2256
name: CP1257
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP1257.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP1257.TXT
map: $DestMap/MS/WIN/CP1257.map
alias: windows-1257
#mib: 2257
name: CP1258
srcURL: $SrcUnicode/VENDORS/MICSFT/WINDOWS/CP1258.TXT
src: $DestUnicode/VENDORS/MICSFT/WINDOWS/CP1258.TXT
map: $DestMap/MS/WIN/CP1258.map
alias: windows-1258
#mib: 2258
name: IBM038
srcURL: $SrcKeld/CP038
src: $DestKeld/CP038
map: $DestMap/IBM/IBM038.map
style: Keld
alias: EBCDIC-INT
alias: CP038
alias: csIBM038
#mib: 2029
##
## --- ISO 8859 -----------------------------------------------------------
##
name: ISO-8859-1
srcURL: $SrcUnicode/ISO8859/8859-1.TXT
src: $DestUnicode/ISO8859/8859-1.TXT
map: $DestMap/ISO/8859-1.map
alias: ISO-IR-100
alias: ISO_8859-1:1987
alias: LATIN1
alias: L1
alias: IBM819
alias: CP819
##
## locale support for ISO-8859-1
##
alias: en_US.ISO8859-1
alias: de_DE.ISO8859-1
alias: en_US
alias: de_DE
alias: en
alias: de
alias: english
alias: german
alias: english.iso88591
alias: german.iso88591
name: ISO-8859-2
srcURL: $SrcUnicode/ISO8859/8859-2.TXT
src: $DestUnicode/ISO8859/8859-2.TXT
map: $DestMap/ISO/8859-2.map
alias: ISO-IR-101
alias: ISO_8859-2:1987
alias: LATIN2
alias: L2
name: ISO-8859-3
srcURL: $SrcUnicode/ISO8859/8859-3.TXT
src: $DestUnicode/ISO8859/8859-3.TXT
map: $DestMap/ISO/8859-3.map
alias: ISO-IR-109
alias: ISO_8859-3:1988
alias: LATIN3
alias: L3
name: ISO-8859-4
srcURL: $SrcUnicode/ISO8859/8859-4.TXT
src: $DestUnicode/ISO8859/8859-4.TXT
map: $DestMap/ISO/8859-4.map
alias: ISO-IR-110
alias: ISO_8859-4:1988
alias: LATIN4
alias: L4
name: ISO-8859-5
srcURL: $SrcUnicode/ISO8859/8859-5.TXT
src: $DestUnicode/ISO8859/8859-5.TXT
map: $DestMap/ISO/8859-5.map
alias: ISO-IR-144
alias: ISO_8859-5:1988
alias: CYRILLIC
##
## locale support for ISO-8859-5
##
alias: ru_RU.ISO8859-5
alias: ru_RU
alias: ru
alias: russian
alias: russian.iso88595
name: ISO-8859-6
srcURL: $SrcUnicode/ISO8859/8859-6.TXT
src: $DestUnicode/ISO8859/8859-6.TXT
map: $DestMap/ISO/8859-6.map
alias: ISO-IR-127
alias: ISO_8859-6:1987
alias: ECMA-114
alias: ASMO-708
alias: ARABIC
name: ISO-8859-7
srcURL: $SrcUnicode/ISO8859/8859-7.TXT
src: $DestUnicode/ISO8859/8859-7.TXT
map: $DestMap/ISO/8859-7.map
alias: ISO-IR-126
alias: ISO_8859-7:1987
alias: ELOT_928
alias: ECMA-118
alias: GREEK
alias: GREEK8
name: ISO-8859-8
srcURL: $SrcUnicode/ISO8859/8859-8.TXT
src: $DestUnicode/ISO8859/8859-8.TXT
map: $DestMap/ISO/8859-8.map
alias: ISO-IR-138
alias: ISO_8859-8:1988
alias: HEBREW
name: ISO-8859-9
srcURL: $SrcUnicode/ISO8859/8859-9.TXT
src: $DestUnicode/ISO8859/8859-9.TXT
map: $DestMap/ISO/8859-9.map
alias: ISO-IR-148
alias: ISO_8859-9:1989
alias: LATIN5
alias: L5
name: ISO-8859-10
srcURL: $SrcUnicode/ISO8859/8859-10.TXT
src: $DestUnicode/ISO8859/8859-10.TXT
map: $DestMap/ISO/8859-10.map
alias: ISO-IR-157
alias: ISO_8859-10:1993
alias: L6
alias: LATIN6
name: ISO-8859-13
srcURL: $SrcUnicode/ISO8859/8859-13.TXT
src: $DestUnicode/ISO8859/8859-13.TXT
map: $DestMap/ISO/8859-13.map
name: ISO-8859-14
srcURL: $SrcUnicode/ISO8859/8859-14.TXT
src: $DestUnicode/ISO8859/8859-14.TXT
map: $DestMap/ISO/8859-14.map
name: ISO-8859-15
srcURL: $SrcUnicode/ISO8859/8859-15.TXT
src: $DestUnicode/ISO8859/8859-15.TXT
map: $DestMap/ISO/8859-15.map
##
## --- MS Macintosh charsets ----------------------------------------------
##
name: MS-CYRILLIC
srcURL: $SrcUnicode/VENDORS/MICSFT/MAC/CYRILLIC.TXT
src: $DestUnicode/VENDORS/MICSFT/MAC/CYRILLIC.TXT
map: $DestMap/MS/MAC/CYRILLIC.map
name: MS-GREEK
srcURL: $SrcUnicode/VENDORS/MICSFT/MAC/GREEK.TXT
src: $DestUnicode/VENDORS/MICSFT/MAC/GREEK.TXT
map: $DestMap/MS/MAC/GREEK.map
name: MS-ICELAND
srcURL: $SrcUnicode/VENDORS/MICSFT/MAC/ICELAND.TXT
src: $DestUnicode/VENDORS/MICSFT/MAC/ICELAND.TXT
map: $DestMap/MS/MAC/ICELAND.map
name: MS-LATIN2
srcURL: $SrcUnicode/VENDORS/MICSFT/MAC/LATIN2.TXT
src: $DestUnicode/VENDORS/MICSFT/MAC/LATIN2.TXT
map: $DestMap/MS/MAC/LATIN2.map
name: MS-ROMAN
srcURL: $SrcUnicode/VENDORS/MICSFT/MAC/ROMAN.TXT
src: $DestUnicode/VENDORS/MICSFT/MAC/ROMAN.TXT
map: $DestMap/MS/MAC/ROMAN.map
name: MS-TURKISH
srcURL: $SrcUnicode/VENDORS/MICSFT/MAC/TURKISH.TXT
src: $DestUnicode/VENDORS/MICSFT/MAC/TURKISH.TXT
map: $DestMap/MS/MAC/TURKISH.map
##
## --- ASCII --------------------------------------------------------------
##
name: US-ASCII
srcURL: $SrcKeld/US-ASCII
src: $DestKeld/US-ASCII
map: $DestMap/ISO/ISO646-US.map
style: Keld
alias: ANSI_X3.4-1968
alias: iso-ir-6
alias: ANSI_X3.4-1986
alias: ISO_646.irv:1991
alias: ASCII
alias: ISO646-US
alias: us
alias: IBM367
alias: cp367
alias: csASCII
##
## --- NeXT ---------------------------------------------------------------
##
name: NEXT
srcURL: $SrcUnicode/VENDORS/NEXT/NEXTSTEP.TXT
src: $DestUnicode/VENDORS/NEXT/NEXTSTEP.TXT
map: $DestMap/NEXT/NEXTSTEP.map
alias: NeXT
alias: NEXTSTEP
##
## --- Eastasia charsets (Unicode) -----------------------------------------
##
name: GB12345-80
srcURL: $SrcUnicode/EASTASIA/GB/GB12345.TXT
src: $DestUnicode/EASTASIA/GB/GB12345.TXT
map: $DestMap/EASTASIA/GB12345-80.map
name: GB2312-80
srcURL: $SrcUnicode/EASTASIA/GB/GB2312.TXT
src: $DestUnicode/EASTASIA/GB/GB2312.TXT
map: $DestMap/EASTASIA/GB2312-80.map
alias: GB_2312-80
alias: iso-ir-58
alias: chinese
alias: csISO58GB231280
# The text source of this mapping is generated from GB2312.TXT with the
# tool mkCSGB2312. Unfortunately you need to do this by hand for now:
# 1. chdir to $DestMap/EASTASIA/
# 2. mkCSGB2312
name: GB2312
src: $DestUnicode/EASTASIA/GB/CSGB2312.TXT
map: $DestMap/EASTASIA/GB2312.map
alias: csGB2312
#mib: 2025
name: JIS-X-0201
srcURL: $SrcUnicode/EASTASIA/JIS/JIS0201.TXT
src: $DestUnicode/EASTASIA/JIS/JIS0201.TXT
map: $DestMap/EASTASIA/JIS-X-0201.map
alias: JIS_X0201
alias: X0201
alias: csHalfWidthKatakana
#mib: 15
name: JIS-X-0208
srcURL: $SrcUnicode/EASTASIA/JIS/JIS0208.TXT
src: $DestUnicode/EASTASIA/JIS/JIS0208.TXT
map: $DestMap/EASTASIA/JIS-X-0208.map
style: 2 3
alias: JIS_C6226-1983
alias: iso-ir-87
alias: X0208
alias: JIS_X0208-1983
alias: csISO87JISX0208
#mib: 63
name: JIS-X-0212
srcURL: $SrcUnicode/EASTASIA/JIS/JIS0212.TXT
src: $DestUnicode/EASTASIA/JIS/JIS0212.TXT
map: $DestMap/EASTASIA/JIS-X-0212.map
name: Shift-JIS
srcURL: $SrcUnicode/EASTASIA/JIS/SHIFTJIS.TXT
src: $DestUnicode/EASTASIA/JIS/SHIFTJIS.TXT
map: $DestMap/EASTASIA/SHIFTJIS.map
name: BIG5
srcURL: $SrcUnicode/EASTASIA/OTHER/BIG5.TXT
src: $DestUnicode/EASTASIA/OTHER/BIG5.TXT
map: $DestMap/EASTASIA/BIG5.map
# This mapping is probably defective. It is actually a 20 bit -> 16 bit
# encoding, but the mapping expands the 20 bits to 24 bits. I haven't found
# time to take care of this yet... martin [2000-Jun-25]
name: CNS-11643-1986
srcURL: $SrcUnicode/EASTASIA/OTHER/CNS11643.TXT
src: $DestUnicode/EASTASIA/OTHER/CNS11643.TXT
map: $DestMap/EASTASIA/CNS-11643-1986.map
name: JOHAB
srcURL: $SrcUnicode/EASTASIA/KSC/JOHAB.TXT
src: $DestUnicode/EASTASIA/KSC/JOHAB.TXT
map: $DestMap/EASTASIA/JOHAB.map
name: KSC5601-1992
srcURL: $SrcUnicode/EASTASIA/KSC/KSC5601.TXT
src: $DestUnicode/EASTASIA/KSC/KSC5601.TXT
map: $DestMap/EASTASIA/KSC5601-1992.map
name: KSCX-1001
srcURL: $SrcUnicode/EASTASIA/KSC/KSX1001.TXT
src: $DestUnicode/EASTASIA/KSC/KSX1001.TXT
map: $DestMap/EASTASIA/KSC1001.map
# The text source is created from JIS0201.TXT, JIS0208.TXT and JIS0212.TXT.
# Sample Perl code for this conversion can be obtained by contacting chunchichen@hotmail.com
# Michael Chen [2000-Dec-29]
name: EUC-JP
srcURL: $SrcUnicode/EASTASIA/JIS/EUC-JP.TXT
src: $DestUnicode/EASTASIA/JIS/EUC-JP.TXT
map: $DestMap/EASTASIA/EUC-JP.map
#mib: ????
# The text source is created from ASCII.TXT and KSC5601.TXT.
# Sample Perl code for this conversion can be obtained by contacting chunchichen@hotmail.com
# Michael Chen [2002-Mar-20]
name: EUC-KR
srcURL: $SrcUnicode/EASTASIA/KSC/EUC-KR.TXT
src: $DestUnicode/EASTASIA/KSC/EUC-KR.TXT
map: $DestMap/EASTASIA/EUC-KR.map
#mib: ????
##
## --- Test ---------------------------------------------------------------
##
# name: Test_Latin6
# map: DestMappingsGisle/latin6.bin
# alias: Latin6_from_another_binary_format
