By 五花肉
From Drums and Horns to Ten Thousand “Codes” at Full Gallop
Encoding and the Standardization of Chinese Character Information Transmission
Each officer has six flags of his own, each on a staff 8.34 meters long, with cloth measuring 5 meters. When the enemy reaches the bank of the moat, the defenders beat the drum three times and raise one flag. When the enemy has climbed halfway up the rampart, the defenders beat the drum without stopping. At night, the defenders replace the flags with torches, equal in number to the flags. If the enemy retreats, the defenders hang the same number of flags but do not beat the drum.
(Mozi, Book 15)
One stands for the horizontal stroke; two and three stand for the vertical stroke; four and five stand for the left-falling stroke; six stands for the dot and the right-falling stroke; seven stands for the cross; and eight and nine stand for the left and right hooks.
(Hu Shi, preface to The Four-Corner Number Method of Character Lookup)
From oracle bone script, bronze inscriptions, seal script, and clerical script to regular script, and on to the computer-encoded characters of the information age, Chinese characters were born with Chinese civilization and have flourished with it. Beyond traditional means of recording and dissemination such as paper, handwriting, and printing, the Chinese people have also, by advancing the standardization of character forms, developed a way of transmitting Chinese character information with text as the content and encoding as the carrier.
Beacon smoke and banners, drums and horns: these phrases customarily stand for the battlefield. For thousands of years they were the army's usual means of exchanging intelligence and relaying orders, and they were also the earliest germ of using encoding to transmit information. Although the military strategists of ancient China set standards for the use of these signals, the information they carried could never reach beyond the range of human sight and hearing.
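Read as a code, the signalling rules in the Mozi passage quoted at the head of this article map a battlefield situation to a combination of drum beats and flags. The following is a minimal Python sketch of that idea, not a reconstruction of the historical system. The function name `encode_signal` and its parameters are illustrative assumptions, and only the rules stated in the quoted passage are modelled: three drum beats and one flag at the moat, torches in place of flags at night, and flags without drums on retreat.

```python
def encode_signal(drum_beats: int, flags: int, night: bool = False,
                  retreating: bool = False) -> tuple[int, str, int]:
    """Return (drum beats, marker kind, marker count) for one signal.

    Hypothetical helper: it models only the rules given in the quoted passage.
    """
    # At night, torches replace flags in equal number.
    marker = "torch" if night else "flag"
    # On retreat, the same markers are shown but the drum stays silent.
    beats = 0 if retreating else drum_beats
    return beats, marker, flags


# Enemy at the bank of the moat: three drum beats and one flag.
print(encode_signal(3, 1))                   # (3, 'flag', 1)
print(encode_signal(3, 1, night=True))       # (3, 'torch', 1)
print(encode_signal(3, 1, retreating=True))  # (0, 'flag', 1)
```

Every symbol in this code has to be seen or heard directly by the receiver, which is the limitation the paragraph above describes.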
It was not until 1925, after telegraph code had been introduced and put to use in modern China, that Wang Yunwu of Shanghai built on it to develop the four-corner code, which could be used for character lookup. The most primitive Chinese character encoding was born. Although this code produces too many duplicate codes (different characters sharing one code) to serve as a computer input encoding, the insight it offered was epoch-making: combining certain features of a character with ordered symbols makes characters orderable and systematic. This formed the prototype of Chinese character information processing. Telegraph code and the four-corner code also became the two great achievements of that period in digitizing and standardizing the characters used in Chinese society.
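The idea of assigning ordered symbols to character features, and the duplicate-code problem that follows from it, can be sketched in a few lines of Python. The digit-to-stroke table below follows the Hu Shi quotation at the head of this article rather than any later revision of the four-corner scheme, and reading "left and right hooks" as 8 and 9 respectively is an assumption. The character entries are placeholders with made-up codes, not real dictionary data.

```python
from collections import defaultdict

# Digit-to-stroke table as listed in the quoted preface.
# Assumption: 8 = left hook, 9 = right hook.
STROKE_OF_DIGIT = {
    "1": "horizontal",
    "2": "vertical", "3": "vertical",
    "4": "left-falling", "5": "left-falling",
    "6": "dot or right-falling",
    "7": "cross",
    "8": "left hook", "9": "right hook",
}


def describe(code: str) -> list[str]:
    """Spell out the stroke type behind each digit of a code."""
    return [STROKE_OF_DIGIT.get(d, "not covered by the quote") for d in code]


def build_index(entries):
    """Group characters by code; a code held by several characters is a duplicate code."""
    index = defaultdict(list)
    for char, code in entries:
        index[code].append(char)
    return dict(index)


# Hypothetical entries: placeholder names and invented codes.
entries = [("char_A", "1267"), ("char_B", "1267"), ("char_C", "4789")]
index = build_index(entries)
duplicates = {code: chars for code, chars in index.items() if len(chars) > 1}

print(describe("1267"))  # ['horizontal', 'vertical', 'dot or right-falling', 'cross']
print(duplicates)        # {'1267': ['char_A', 'char_B']}
```

A lookup dictionary can tolerate duplicates because the reader picks from a short list, whereas an input encoding with many duplicates forces a selection step on nearly every keystroke sequence.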
In the 1980s, with the release of the national standard Code of Chinese Graphic Character Set for Information Interchange, Primary Set (GB 2312-80), the Chinese language entered the information age. In just 30 years, China introduced more than a thousand Chinese character encoding methods and dozens of input methods, a scene of ten thousand “codes” at full gallop. In recent years, unified by standardization, these have settled into the mainstream input methods: phonetic codes, shape-based codes, and handwriting and voice input. “Chinese character information processing and the printing revolution” came to be ranked as a major engineering achievement of 20th-century China, second only to “Two Bombs, One Satellite.”
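To make the character set standard named above concrete, the sketch below uses the `gb2312` codec from Python's standard library, which stores each Chinese character as two bytes (the EUC-CN form). Subtracting 0xA0 from each byte gives the character's row and cell in the standard's table. The helper name `row_cell` is an assumption for illustration, and it expects a single Chinese character as input.

```python
def row_cell(ch: str) -> tuple[int, int]:
    """Return the (row, cell) position of one Chinese character in the GB 2312 table."""
    high, low = ch.encode("gb2312")  # two bytes per Chinese character
    return high - 0xA0, low - 0xA0


print("啊".encode("gb2312").hex())  # b0a1
print(row_cell("啊"))                # (16, 1)

# The primary set covers simplified forms; a traditional form such as 漢
# is outside it and cannot be encoded with this codec.
try:
    "漢".encode("gb2312")
except UnicodeEncodeError:
    print("not in GB 2312")
```

Because every character has exactly one fixed position in the table, any two systems that follow the standard can exchange text without ambiguity, unlike the lookup codes of the 1920s.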
Today, with the dawn of the big data era and breakthroughs in speech recognition, Chinese character information processing has once again reached a peak of development. Chinese speech recognition is widely used on smartphone platforms such as iOS and Android; Chinese domain names are increasingly common; and the standing of Chinese characters and Chinese language culture in the “global village” is rising steadily. In the future, with the rejuvenation of the Chinese nation, Chinese characters will surely make Chinese civilization shine even more brilliantly in the information society!
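Chinese domain names are themselves an encoding exercise: the domain name system carries only ASCII, so each Chinese label is converted to an ASCII form beginning with `xn--`. The sketch below uses the `idna` codec built into Python, which implements the older IDNA 2003 rules. The helper names `to_ascii` and `to_unicode` and the `.example` suffix are assumptions for illustration.

```python
def to_ascii(domain: str) -> str:
    """Convert a domain name with Chinese labels to its ASCII-compatible form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII-compatible domain name back to its readable form."""
    return domain.encode("ascii").decode("idna")


ascii_name = to_ascii("中国.example")
print(ascii_name)              # xn--fiqs8s.example
print(to_unicode(ascii_name))  # 中国.example
```

The user types and reads the Chinese form, while the network only ever sees the ASCII form.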
(Supporting organization: Shanghai Institute of Quality and Standardization)