One of my passions is classic and vintage cars. As part of that, I sometimes end up trolling for information in very old, unmaintained portions of the public Internet. I’m looking for old brochures, service manuals, bulletins, that kind of thing. If the documents are not in English, I’ll run them through a translation program in order to read the content.
A lot of these old documents were created before character encodings were ‘standardized’, or the source documentation was never intended to be used outside the region where it was created. The ‘sensible defaults’ were not that sensible somewhere else, or the equipment used to create the documents was very specific to the region.
One of the Field Service Manuals (FSM) I obtained displays text looking like this:
�iƒGƒ“ƒWƒ“•Ò�j ƒGƒ“ƒWƒ“‚`‚x�EƒVƒŠƒ“ƒ_ƒuƒ�ƒbƒN1.6�E1.8 ƒVƒŠƒ“ƒ_ƒuƒ�ƒbƒN2.7�EÌײβ°Ùʳ¼ÞݸÞ�Eµ²Ų̀×ÀÞ¸Ä ¼ØÝÀÞͯÄÞ�E¸×ݸ¼¬ÌÄ�•Ëß½ÄÝ1.6�E1.8 ¸×ݸ¼¬ÌÄ�•Ëß½ÄÝ2.7�E¶Ñ¼¬ÌÄ�•¹°½
Now I’m not fluent in the source language (Japanese in this case) but I’m pretty sure that is not what it is supposed to look like.
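To illustrate how this kind of garbling can arise, here is a minimal Python sketch. It assumes the file was saved as Shift_JIS and then displayed as Windows-1252 (cp1252); the second part is my guess based on the characters that show up, not something the file itself confirms.

```python
# Assumption: the page was saved as Shift_JIS but displayed as Windows-1252.
original = "エンジン"  # the first word of the sample, "engine"

# The bytes as they would sit on disk in a Shift_JIS file.
raw_bytes = original.encode("shift_jis")

# Reading those bytes with the wrong code page produces the garbled text.
# errors="replace" stands in for byte values that Windows-1252 leaves undefined.
garbled = raw_bytes.decode("cp1252", errors="replace")
print(garbled)  # ƒGƒ“ƒWƒ“

# No byte was replaced in this short sample, so reversing the two steps
# recovers the Japanese text.
recovered = garbled.encode("cp1252").decode("shift_jis")
print(recovered)  # エンジン
```

The garbled string matches a fragment of the text shown above, which is what makes Shift_JIS a plausible guess for the source encoding.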
If we take the content of that file and run it through the ‘iconv’ command, telling it to convert from Shift_JIS (-f) to UTF-8 (-t), we get properly formatted Japanese output. At least, as best I can tell, this is what it is really supposed to look like.
iconv -f SHIFT_JIS -t UTF-8 list.html -o list_2.html
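If iconv is not at hand, roughly the same conversion can be sketched in Python. The function name convert_to_utf8 is my own invention for illustration, and the Shift_JIS default is the same assumption made in the iconv command above.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="shift_jis"):
    """Re-encode a text file to UTF-8 (hypothetical helper, not a library call)."""
    # Read raw bytes so Python does not guess an encoding for us.
    raw = Path(src).read_bytes()

    # Interpret the bytes using the encoding we believe the author used.
    # This raises UnicodeDecodeError if the guess is wrong for some bytes.
    text = raw.decode(source_encoding)

    # Write bytes rather than text so line endings pass through unchanged.
    Path(dst).write_bytes(text.encode("utf-8"))
    return text


# Example call, using the file names from the iconv command:
# convert_to_utf8("list.html", "list_2.html")
```

If the decode step fails on a few characters, Python's cp932 codec (Microsoft's variant of Shift_JIS) may be worth a try as source_encoding. That is a guess on my part; I have not confirmed which variant this particular file uses.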
The output looks like Japanese.
(エンジン編) エンジンAY・シリンダブロック1.6・1.8 シリンダブロック2.7・フライホイールハウジング・オイルフィラダクト シリンダヘッド・クランクシャフト&ピストン1.6・1.8 クランクシャフト&ピストン2.7・カムシャフト&ケース
Then if we cut and paste this content into a translation program we get something that looks like English.
(Engine) Engine AY / Cylinder block 1.6 / 1.8 Cylinder block 2.7, flywheel housing, oil filter Cylinder Head Crankshaft & Piston 1.6 / 1.8 Crankshaft & Piston 2.7 / Camshaft & Case
Much more useful!
Thanks for reading.