Introduction:
=============

This file documents an approximate correlation between the data files
provided in the !Unicode distribution and the encoding headers in GNU
libiconv 1.9.1.

Entries with '?' in the iconv column are either not represented in iconv
or I've missed the relevant header file ;)

A number of encodings are present in the iconv distribution but not
in !Unicode. These are documented at the end of this file.

Changelog:
==========

v 0.01 (09-Sep-2004)
~~~~~~~~~~~~~~~~~~~~
Initial Incarnation

v 0.02 (11-Sep-2004)
~~~~~~~~~~~~~~~~~~~~
Documented additional encodings supported by the Iconv module.
Corrected list of !Unicode deficiencies.


!Unicode->iconv:
================

Unicode:			iconv:			notes:

Acorn.Latin1			riscos1.h

Apple.CentEuro			mac_centraleurope.h
Apple.Cyrillic			mac_cyrillic.h
Apple.Roman			mac_roman.h
Apple.Ukrainian			mac_ukraine.h

BigFive				big5.h

ISO2022.C0.40[ISO646]		?

ISO2022.C1.43[IS6429]		?

ISO2022.G94.40[646old]		iso646_cn.h
ISO2022.G94.41[646-GB]		?
ISO2022.G94.42[646IRV]		?
ISO2022.G94.43[FinSwe]		?
ISO2022.G94.47[646-SE]		?
ISO2022.G94.48[646-SE]		?
ISO2022.G94.49[JS201K]		jisx0201.h		top of JIS range
ISO2022.G94.4A[JS201R]		jisx0201.h iso646_jp.h	bottom of JIS range
ISO2022.G94.4B[646-DE]		?
ISO2022.G94.4C[646-PT]		?
ISO2022.G94.54[GB1988]		?
ISO2022.G94.56[Teltxt]		?
ISO2022.G94.59[646-IT]		?
ISO2022.G94.5A[646-ES]		?
ISO2022.G94.60[646-NO]		?
ISO2022.G94.66[646-FR]		?
ISO2022.G94.69[646-HU]		?
ISO2022.G94.6B[Arabic]		?
ISO2022.G94.6C[IS6397]		?
ISO2022.G94.7A[SerbCr]		?

ISO2022.G94x94.40[JS6226]	?
ISO2022.G94x94.41[GB2312]	gb2312.h
ISO2022.G94x94.42[JIS208]	jisx0208.h
ISO2022.G94x94.43[KS1001]	ksc5601.h
ISO2022.G94x94.44[JIS212]	jisx0212.h
ISO2022.G94x94.47[CNS1]		cns11643_1.h		the tables differ
ISO2022.G94x94.48[CNS2]		cns11643_2.h
ISO2022.G94x94.49[CNS3]		cns11643_3.h
ISO2022.G94x94.4A[CNS4]		cns11643_4.h
ISO2022.G94x94.4B[CNS5]		cns11643_5.h
ISO2022.G94x94.4C[CNS6]		cns11643_6.h
ISO2022.G94x94.4D[CNS7]		cns11643_7.h

ISO2022.G96.41[Lat1]		iso8859_1.h
ISO2022.G96.42[Lat2]		iso8859_2.h
ISO2022.G96.43[Lat3]		iso8859_3.h
ISO2022.G96.44[Lat4]		iso8859_4.h
ISO2022.G96.46[Greek]		iso8859_7.h
ISO2022.G96.47[Arabic]		iso8859_6.h		ISO-8859-6 ignored
ISO2022.G96.48[Hebrew]		iso8859_8.h
ISO2022.G96.4C[Cyrill]		iso8859_5.h
ISO2022.G96.4D[Lat5]		iso8859_9.h
ISO2022.G96.50[LatSup]		?
ISO2022.G96.52[IS6397]		?
ISO2022.G96.54[Thai]		tis620.h
ISO2022.G96.56[Lat6]		iso8859_10.h
ISO2022.G96.58[L6Sami]		?
ISO2022.G96.59[Lat7]		iso8859_13.h
ISO2022.G96.5C[Welsh]		?
ISO2022.G96.5D[Sami]		?
ISO2022.G96.5E[Hebrew]		?
ISO2022.G96.5F[Lat8]		iso8859_14.h
ISO2022.G96.62[Lat9]		iso8859_15.h

KOI8-R				koi8_r.h

Microsoft.CP1250		cp1250.h
Microsoft.CP1251		cp1251.h
Microsoft.CP1252		cp1252.h
Microsoft.CP1254		cp1254.h
Microsoft.CP866			cp866.h
Microsoft.CP932			cp932.h cp932ext.h
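
The table above is regular enough to be read mechanically. The following is a minimal Python sketch, not part of either distribution, that assumes the three columns are separated by runs of tab characters and that '?' means no header was found. The names Row and parse_row are hypothetical and exist only for this illustration.

```python
import re
from typing import List, NamedTuple


class Row(NamedTuple):
    unicode_name: str   # name of the !Unicode data file
    headers: List[str]  # libiconv header files; empty when the column is '?'
    note: str           # free-text note; empty when absent


def parse_row(line: str) -> Row:
    """Split one table row into its columns (assumes tab-separated columns)."""
    fields = re.split(r"\t+", line.strip())
    name = fields[0]
    # A '?' in the iconv column means no matching header was identified.
    if len(fields) < 2 or fields[1] == "?":
        headers: List[str] = []
    else:
        headers = fields[1].split()
    note = fields[2] if len(fields) > 2 else ""
    return Row(name, headers, note)


if __name__ == "__main__":
    sample = [
        "Acorn.Latin1\t\t\triscos1.h",
        "ISO2022.G94.41[646-GB]\t\t?",
        "ISO2022.G94.4A[JS201R]\t\tjisx0201.h iso646_jp.h\tbottom of JIS range",
    ]
    for row in map(parse_row, sample):
        print(row.unicode_name, row.headers or "no iconv header", row.note)
```

Filtering the parsed rows for an empty headers list gives the entries that still carry a '?'.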

iconv->!Unicode:
================

Iconv has the following encodings, which are not present in !Unicode.
Providing a suitable data file for !Unicode is trivial. Whether UnicodeLib
will then act upon the addition of these is unknown.
This list is ordered as per libiconv's NOTES file.

European & Semitic languages:

	ISO-8859-16 (iso8859_16.h)
	KOI8-{U,RU,T} (koi8_xx.h)
	CP125{3,5,6,7} (cp125n.h)
	CP850 (cp850.h)
	CP862 (cp862.h)
	Mac{Croatian,Romania,Greek,Turkish,Hebrew,Arabic} (mac_foo.h)

Japanese:

	None, as far as I can tell.

Simplified Chinese:

	GB18030 (gb18030.h, gb18030ext.h)
	HZ-GB-2312 (hz.h)

Traditional Chinese:

	CP950 (cp950.h)
	BIG5-HKSCS (big5hkscs.h)

Korean:

	CP949 (cp949.h)

Armenian:

	ARMSCII-8 (armscii_8.h)

Georgian:

	Georgian-Academy, Georgian-PS (georgian_academy.h, georgian_ps.h)

Thai:

	CP874 (cp874.h)
	MacThai (mac_thai.h)

Laotian:

	MuleLao-1, CP1133 (mulelao.h, cp1133.h)

Vietnamese:

	VISCII, TCVN (viscii.h, tcvn.h)
	CP1258 (cp1258.h)

Unicode:

	BE/LE variants of normal encodings. I assume UnicodeLib handles
	these, but can't be sure.
	C99 / JAVA - well, yes.
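
To illustrate what the BE/LE variants amount to, here is a small Python sketch using only the standard codecs. It is not UnicodeLib's or libiconv's code. The helper decode_utf16 is hypothetical, and its fallback to big-endian when no byte-order mark is present is an assumption made for the example.

```python
import codecs


def decode_utf16(data: bytes) -> str:
    """Decode UTF-16, choosing the byte order from a leading BOM if present."""
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[2:].decode("utf-16-le")
    # Assumption for this sketch: no BOM means big-endian.
    return data.decode("utf-16-be")


if __name__ == "__main__":
    # The same character differs only in byte order between the variants.
    print("A".encode("utf-16-be"))  # bytes 00 41
    print("A".encode("utf-16-le"))  # bytes 41 00
    print(decode_utf16(codecs.BOM_UTF16_LE + "A".encode("utf-16-le")))
```

The BE and LE variants share one character repertoire, so a converter only needs to swap bytes within each code unit to support both.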


Iconv Module:
=============

The iconv module is effectively a thin veneer around UnicodeLib. However,
8-bit encodings are implemented within the module rather than using the
support in UnicodeLib. The rationale is simply that, although UnicodeLib
will understand (and, reportedly, act upon) additions to the ISO2022
Unicode resource, it ignores other encodings. As the vast majority of
outstanding encodings fall into this category, and the code is fairly
simple, it made sense to implement them within the module.
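
As a rough idea of why 8-bit support is simple, the sketch below shows a generic table-driven converter in Python. It is not the module's actual code. The function names are hypothetical, the lower half of the table is assumed to be plain ASCII, and the two upper-half entries are a tiny illustrative excerpt (taken from CP1252) rather than a full table.

```python
from typing import Dict, List, Optional

REPLACEMENT = "\ufffd"  # emitted for bytes with no mapping


def build_table(high_half: Dict[int, int]) -> List[Optional[int]]:
    """Build a 256-entry byte -> code point table (ASCII in the lower half)."""
    table: List[Optional[int]] = list(range(0x80)) + [None] * 0x80
    for byte, code_point in high_half.items():
        table[byte] = code_point
    return table


def decode_8bit(data: bytes, table: List[Optional[int]]) -> str:
    """Each byte is a single table lookup; unmapped bytes become U+FFFD."""
    return "".join(
        REPLACEMENT if table[b] is None else chr(table[b]) for b in data
    )


def encode_8bit(text: str, table: List[Optional[int]]) -> bytes:
    """Invert the table; unencodable characters become '?'."""
    reverse = {cp: b for b, cp in enumerate(table) if cp is not None}
    return bytes(reverse.get(ord(ch), ord("?")) for ch in text)


if __name__ == "__main__":
    # Illustrative excerpt only: 0x80 -> EURO SIGN, 0xE9 -> e with acute.
    table = build_table({0x80: 0x20AC, 0xE9: 0x00E9})
    print(decode_8bit(b"caf\xe9 \x80", table))
    print(encode_8bit("caf\u00e9 \u20ac", table))
```

With this shape, adding another 8-bit encoding means supplying another 128-entry upper half; no per-encoding logic is needed.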

With use of the iconv module, the list of outstanding encodings is
reduced to:

	CP1255 (requires state-based transcoding)

	GB18030 (not 8-bit - reportedly a requirement of the PRC)
	HZ-GB-2312 (not 8-bit - supported by IE4)

	CP950 (not 8-bit - a (MS) variant of Big5)
	BIG5-HKSCS (not 8-bit - again, a Big5 variant)

	CP949 (not 8-bit)

	ARMSCII-8 (easily implemented, if required)

	VISCII (easily implemented, if required)
	CP1258, TCVN (require state-based transcoding)
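
One plausible reading of "state-based transcoding" is that the converter must remember the previous character, because a following combining mark may merge with it into a single precomposed character. The Python sketch below illustrates that idea under that assumption. ComposingDecoder and compose_stream are hypothetical names, and Unicode NFC normalisation stands in for whatever composition table a real converter would carry.

```python
import unicodedata
from typing import Iterable, Optional


class ComposingDecoder:
    """Holds back one character so a later combining mark can merge with it."""

    def __init__(self) -> None:
        self._pending: Optional[str] = None  # this is the converter state

    def feed(self, ch: str) -> str:
        """Accept one already-decoded character; return output that is ready."""
        if self._pending is None:
            self._pending = ch
            return ""
        if unicodedata.combining(ch):
            composed = unicodedata.normalize("NFC", self._pending + ch)
            if len(composed) == 1:
                # Merged: keep holding it, as more marks may follow.
                self._pending = composed
                return ""
        out, self._pending = self._pending, ch
        return out

    def flush(self) -> str:
        """Emit whatever is still held back at end of input."""
        out, self._pending = self._pending or "", None
        return out


def compose_stream(chars: Iterable[str]) -> str:
    decoder = ComposingDecoder()
    return "".join(decoder.feed(c) for c in chars) + decoder.flush()


if __name__ == "__main__":
    # 'a' + COMBINING CIRCUMFLEX + COMBINING ACUTE, as in Vietnamese text.
    print(compose_stream("a\u0302\u0301"))
```

The need for flush() is what separates this from a plain table lookup: output for a given byte may only become known once the next byte, or the end of input, has been seen.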

Additionally, the remaining CodePage encodings that iconv implements
but that are not listed above (because the iconv documentation omits
them) are implemented by the iconv module.