Abstract
Complex networks are constructed to study the co-occurrence of characters and words in the Chinese language. Two types of networks are investigated. In the first type, nodes correspond to Chinese characters; in the second type, nodes correspond to Chinese words. In both types, edges connect characters or words that occur consecutively. The networks are built from a collection of Chinese texts of four different styles, namely essays, novels, popular science articles, and news reports. Their statistical properties are studied in terms of several complex network parameters, including average degree, diameter, average path length, clustering coefficient, and degree distribution, as well as connected subnetworks. It is found that although the two kinds of networks have different parameter values, they display qualitatively similar properties: both exhibit small-world and scale-free features. This qualitative equivalence between the network of Chinese characters and the network of Chinese words provides a valid basis on which either type of network can be used for comparing different languages, regardless of the incompatibility between the linguistic roles that words play in Chinese and in other languages.
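
The construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an undirected, unweighted graph, drops self-loops, and does no punctuation filtering, none of which the abstract specifies. The function names (`build_cooccurrence_network`, `average_degree`, `clustering_coefficient`, `average_path_length`) are hypothetical and chosen only for illustration. For the word network, the same functions would take a list of already segmented words; the segmentation step is not shown.

```python
from collections import defaultdict, deque


def build_cooccurrence_network(tokens):
    """Return an adjacency map: one node per distinct token, with an
    undirected edge between tokens that occur consecutively.
    Assumption: unweighted edges, self-loops dropped."""
    adj = defaultdict(set)
    for token in tokens:
        adj[token]  # make sure isolated tokens still appear as nodes
    for a, b in zip(tokens, tokens[1:]):
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj) if adj else 0.0


def clustering_coefficient(adj):
    """Mean local clustering coefficient; nodes with fewer than two
    neighbours contribute zero."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count each linked pair of neighbours once.
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of distinct,
    mutually reachable nodes, using breadth-first search from each node."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    # Character network: every character of the text is a token.
    text = "我爱中文我爱中国"
    network = build_cooccurrence_network(list(text))
    print("nodes:", len(network))
    print("average degree:", average_degree(network))
    print("clustering coefficient:", clustering_coefficient(network))
    print("average path length:", average_path_length(network))
```

A small-world network in the usual sense combines a short average path length with a clustering coefficient well above that of a random graph of the same size, so these two quantities are the ones to compare once the sketch is run on a realistic corpus rather than on a toy string.
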
| Original language | English |
|---|---|
| Title of host publication | 2008 International Symposium on Nonlinear Theory and Its Applications, NOLTA'08 |
| Pages | 94-97 |
| Publication status | Published - Sept 2008 |
| Externally published | Yes |
| Event | 2008 International Symposium on Nonlinear Theory and Its Applications (NOLTA 2008), Budapest, Hungary. Duration: 7 Sept 2008 → 10 Sept 2008 |
Conference
| Conference | 2008 International Symposium on Nonlinear Theory and Its Applications (NOLTA 2008) |
|---|---|
| Abbreviated title | NOLTA'08 |
| Place | Hungary |
| City | Budapest |
| Period | 7/09/08 → 10/09/08 |