《淮南子》詞彙研究

A lexical study of Huainanzi

Student thesis: Doctoral Thesis

Author(s)

  • Kam Tang LAU

Detail(s)

Awarding Institution
Supervisors/Advisors
  • Pang Fei KWOK (Supervisor)
Award date: 14 Feb 2014

Abstract

《淮南子》,又名《淮南鴻烈》,是西漢淮南王劉安(公元前179-前122)及其門客集體撰寫的一部著作,大約起草於漢景帝(劉啟,公元前188-前141,前156年-前141年在位)後元三年(公元前141年),成書於漢武帝(劉徹,公元前156-前87,前141年-前87年在位)建元元年(公元前140年),並於建元二年進京獻書。原書本有內篇二十一卷,外篇三十三卷,至今存世的只有內篇。該書的思想內容接近道家,同時夾雜着先秦各家學說。

西漢建立了中國歷史上繁榮的統一國家,其科技、文化相比於先秦時期有了較大的發展,社會生活更為豐盛,促使了語言詞彙的豐富與發展,新的詞語不斷產生,形成了新的語言面貌。古漢語詞彙研究目的,就是要描寫出最接近於古漢語詞彙發展事實的古漢語詞彙歷史。其中專書詞彙研究,就是以專書為對象,研究當時的語言特色。如果能把某一時代的重要語料都進行深入研究,那麼一個共時層面的詞彙狀況就能描寫清楚了。本研究就是基於以上目的,就《淮南子》展開研究,以詞彙學、漢語史角度來分析書中的詞彙特點,找出該書中所呈現的漢初語言特色。

本研究通過對《淮南子》詞彙的分析,探討漢初的語言現象,探討先秦與漢初的詞彙特徵,這樣一方面填補了漢語史上由先秦至兩漢間語言轉變的研究空隙,另一方面亦將有助解讀《淮南子》,進一步深化其思想內容的探索。

本研究具體工作有三:第一,把《淮南子》一書中的詞彙切分開來,判定哪些是單音詞、複音詞或固定短語。第二,對以上所得詞彙,進行靜態描寫,找出它們的詞彙特徵,首先分析單音詞和複音詞,然後再分析詞義的關係。第三,找出《淮南子》的新詞新義,由於詞彙是會隨着時間而變化的,有必要作縱向的歷史比較和動態分析,以便可以知道該書詞彙的詞義、詞形、詞性的變遷。

具體所得有四:第一,呈現《淮南子》單音詞的同形、通假、兼用及活用狀況;第二,把《淮南子》複音詞的複合詞、單純詞和派生詞的意義關係弄清楚;第三,整理了《淮南子》中的固定短語,包括專有名詞、成語、慣用語、諺語等;第四,找出《淮南子》一書的新詞、新義,以顯示該書在詞彙學上的地位。

最後介紹以《淮南子》為文本的上古漢語分詞及詞性標注語料庫及其構建過程。我們採取了自動分詞與詞性標注並結合人工校正的方法構建該語料庫,其中自動過程使用領域適應方法優化標注模型,在分詞和詞性標注上均顯著提升了標注性能。我們分析了上古漢語的詞彙特點,並以此為基礎描述了一些顯式的詞彙形態特徵,將其運用於我們的自動分詞及詞性標注中,特別對詞性標注系統帶來了有效幫助。我們總結並分析了自動分詞和詞性標注中出現的錯誤,最後描述了整個語料庫的詞彙和詞性分佈特點。我們提出的方法在《淮南子》的標注過程中得到了驗證,為日後擴展到其它古漢語資源提供了參考。同時,基於本研究工作得到的《淮南子》語料庫也為日後的古漢語研究提供了有益的資源。

Huainanzi, also known as Huainan Honglie, is a collective work written by Liu An (179–122 BC), Prince of Huainan, and a group of his retainers during the Western Han dynasty. The Western Han was one of the most prosperous periods in Chinese history. Science and technology developed rapidly, and cultural life, including education and music, flourished; these changes enriched the Chinese language, and many new words and expressions were coined during the period. This study focuses on the lexicon of Classical Chinese and provides a thorough investigation of the characteristics of the vocabulary of Huainanzi.

The dissertation has seven chapters. The first chapter presents the research background and methodology. The second defines the Classical Chinese lexicon and describes its formation. The third studies monosyllabic words under four categories: homographs, interchangeable words, multi-category words, and temporary multi-category words. The fourth studies the forms and characteristics of polysyllabic words (including idioms). The fifth analyzes the newly coined words and new word usages found in Huainanzi, together with their meanings and features.

Finally, the dissertation presents a segmented and part-of-speech (POS) tagged Archaic Chinese corpus along with its construction process, which combines automatic segmentation and tagging with manual correction as post-processing. We train the word segmenter and POS tagger on both Modern and Archaic Chinese labeled data, and improve them further with domain adaptation techniques and with linguistic and morphological features derived from the characteristics of Archaic Chinese. The experimental results show the effectiveness of our approach; in particular, the domain adaptation techniques and the added features significantly improve POS tagging performance. During manual correction, we categorize the errors resulting from automatic segmentation and POS tagging and investigate their sources. Finally, we report statistics for the resulting corpus on the distributions of words and POS tags. Our work is a preliminary study that could be easily extended to annotating other Archaic Chinese texts, and the resulting corpus is a valuable resource for research on the Archaic Chinese language.
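
The corpus statistics mentioned in the abstract (distributions of words and POS tags) could be computed along the following lines. This is only a minimal sketch, not the thesis's actual tooling: the `word/TAG` token format, the function name `corpus_statistics`, the tag labels, and the sample tokens are all assumptions made for illustration.

```python
from collections import Counter


def corpus_statistics(tagged_lines):
    """Count words, POS tags, and word lengths in a segmented, tagged corpus.

    Assumes (hypothetically) that each line holds space-separated tokens
    of the form word/TAG; the thesis does not specify its storage format.
    """
    word_counts = Counter()
    tag_counts = Counter()
    length_counts = Counter()  # 1 = monosyllabic word, 2+ = polysyllabic
    for line in tagged_lines:
        for token in line.split():
            # Split on the last slash so the tag is always the final field.
            word, sep, tag = token.rpartition("/")
            if not sep or not word:
                continue  # skip tokens that do not match word/TAG
            word_counts[word] += 1
            tag_counts[tag] += 1
            length_counts[len(word)] += 1
    return word_counts, tag_counts, length_counts


# Invented sample lines with made-up tags, for illustration only.
sample = ["天/n 地/n 未/d 形/v", "道/n 始/v 於/p 虛霩/n"]
words, tags, lengths = corpus_statistics(sample)
print(tags.most_common())
print(lengths)
```

Counting word length in characters gives a rough split between monosyllabic and polysyllabic words, since each Chinese character corresponds to one syllable.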

    Research areas

  • Word formation, 淮南子, Terms and phrases, Chinese language, Language, Huainan zi