Research Interests: Computational linguistics / terminology, machine translation / computer-aided translation / human-computer interactive translation / cognitive studies of translation / poetry translation / terminology translation, Chinese/computational poetry/poetics, machine learning of natural language, Taichi & Chinese/Buddhist/Taoist philosophy ...
Teaching: Computational linguistics, machine translation / computer-aided translation / human-computer interactive translation, terminology / terminology & translation, language technology, ...
Publications [corresponding author marked with *] and my Google Scholar citations
Meng, Y., Wan, Y. & Kit, C.* (2025). Sound symbolism is not
“marginal” in Chinese: Evidence from diachronic rhyme books. PLOS ONE, 20(5): e0322044. [SCI
IF 2.9 (2023), Q1 32/134, Multidisciplinary Sciences]
Meng, Y., Wan, Y. & Kit, C.* (2025). Listening to the Verses:
Unveiling phonetic contrasts in Li Bai and Du Fu’s poetry. Humanities and Social Sciences
Communications, 12: 365. [SCI IF 3.7 (2023), Q1 1/411 in
Humanities, Multidisciplinary & Q1 13/267 in Social Sciences,
Interdisciplinary]
Li, S., Kit, C., & Cheng, L.* (2024). Unveiling the landscape
of onomastics from 1972 to 2022: A bibliometric analysis. Names, 72(3):40–64. Recipient
of the 2024 Best Article of the Year.
Xie, T., Wan, Y., Zhou, Y., Huang, W., Liu, Y., Linghu, Q., Wang,
S., Kit, C.*, Grazian, C., Zhang, W., & Hoex, B.* (2024). Creation
of a structured solar cell material dataset and performance prediction
using large language models. Patterns,
5(5): 100955. [SCI IF 6.7 (2023), Q1 31/197 Computer Science, AI
& Q1 26/249 Computer Science, Information Systems]
Xie, T., Wan, Y., Wang, H., Østrøm, I., Wang, S.,
He, M., Deng, R., Wu, X., Grazian, C., Kit, C.*, and Hoex, B.* (2024).
Opinion mining by convolutional neural networks for maximizing
discoverability of nanomaterials. Journal
of Chemical Information and Modeling, 64(7): 2746–59. [SCI
IF 5.6 (2023), Q1 10/72 Chemistry, Medicinal / Q2 60/230 Chemistry,
Multidisciplinary]
Meng, Y., Wan, Y. & Kit, C.* (2024). Du Fu’s conspicuous
negativity and Li Bai’s hidden positivity: A sentiment comparison and
exploration. Digital Scholarship in
the Humanities, 39(1):280–295. Oxford University Press. [SCI
IF 0.7 (2023), Q3 177/297 Linguistics; JCI Q1 60/406 Humanities,
multidisciplinary & Q2 75/297 in Linguistics]
Wu, Y. & Kit, C.* (2023). Hong Kong Corpus of Chinese Sentence
and Passage Reading. Scientific Data,
10, 899. [SCI IF 5.8, Q1 16/134 Multidisciplinary Sciences]
Kit, C. 2022. An Embroidering
Needle through Ribs: Poem selection 2012-2020 (《一枚繡花針在肋骨間穿行》).
Hong Kong: Manuscript Publishing Ltd. Recipient of Project Grant, HK
Arts Development Council.
Chunyu Kit & Yingying Meng. 2022. A Corpus-based Comparative
Study of Li Bai and Du Fu's poetry, Chinese
Language Review (Hong
Kong), Issue 125 (2021 Nov.), pp. 20-33. 14 p. The Chinese Language
Society of Hong Kong. http://huayuqiao.org/DOCC/DOC125/NO_020.php
Li, Siyue
and Kit, Chunyu. 2021. Legislative discourse of digital governance: a
corpus-driven comparative study of laws in the European Union and
China, International
Journal of Legal Discourse, 6(2): 349-379. De
Gruyter. 30
p. https://doi.org/10.1515/ijld-2021-2059
Chunyu Kit. 2018. Crossing a
River on a Petal of Sound: Poem selection 1982-2011 (《乘一朵声音过河》).
Hong Kong:
Manuscript Publishing Ltd. Recipient of Project Grant, HK Arts
Development Council.
Chunyu Kit & Meichun Liu. 2018. Frontiers of Empirical and
Corpus Linguistics 《实
证和语料库语言学前沿》. Beijing: China Social Sciences Press.
Chunyu Kit. 2018. The origins and frontiers of empirical
linguistics. In Chunyu Kit & Meichun Liu (eds.), Frontiers of Empirical and Corpus
Linguistics, pp. 1-28. Beijing: China Social Sciences Press.
Hai Zhao, Deng Cai, Changning Huang & Chunyu Kit. 2018.
Chinese word segmentation: Another decade review (2007-2017). In Chunyu
Kit & Meichun Liu (eds.), Frontiers
of Empirical and Corpus Linguistics, pp. 139-162. Beijing:
China Social Sciences Press.
Nannan Zhou & Chunyu Kit*. 2018. The effects of spacing on
the reading of Chinese: more evidence from eye movements. In Chunyu Kit
& Meichun Liu (eds.), Frontiers
of Empirical and Corpus Linguistics, pp. 257-300. Beijing: China
Social Sciences Press.
Ting-Wei Wu, Nannan Zhou, & Chunyu Kit*. 2018. An eye
tracking
study of cognitive effort allocation across translation subtasks. In
Chunyu Kit & Meichun Liu (eds.), Frontiers
of Empirical and Corpus Linguistics, pp. 301-341. Beijing:
China Social Sciences Press.
Chaochao Wang, Deyi Xiong, Min Zhang, & Chunyu Kit. 2015.
Learning bilingual distributed phrase representations for statistical
machine translation. In Proceedings of MT Summit XV, Vol. 1,
pp. 32-43. Oct 30-Nov. 3, 2015. Miami.
Xiaojun Quan & Chunyu Kit*. 2015. Towards
non‐monotonic sentence alignment. Information
Sciences, 323:34-47. Elsevier. [SCI 3.364, Q1, 8/145 Computer
Science, Information Systems]
Chunyu Kit & Tak-Ming Wong. 2015. Evaluation in machine
translation and computer-aided translation. Chapter 12 in
Sin-wai Chan (ed.), The
Routledge Encyclopedia of Translation Technology,
pp. 213-236. Routledge.
Chunyu Kit. 2013. Recent
advances
in computational
linguistics. A book chapter in Jenny Wang & Dongdong
Chen (eds.), Linguistics,
pp. 585-633. China Renmin University Press. (In Chinese)
Jianqiang Ma, Chunyu Kit, & Dale Gerdemann. 2012. Semi-automatic
annotation of Chinese word structure. In the
2nd CIPS-SIGHAN Joint Conference on
Chinese Language Processing (CLP-2012), pp.9-17. Dec 20-21,
2012,
Tianjin, China.
Billy T.M. Wong, Cecilia F.K. Pun,
Chunyu Kit, & Jonathan J. Webster. 2011. Lexical cohesion for
evaluation of machine translation at document level. In the 7th International Conference
on Natural Language Processing and Knowledge Engineering (NLP-KE 2011),
pp.238-242. Nov. 27-29, 2011, Tokushima, Japan.
Xiao Chen & Chunyu Kit. 2011. Improving
part-of-speech tagging for context-free parsing. In Proceedings of
the 5th
International Joint Conference on Natural Language Processing
(IJCNLP
2011),
pp.1260-1268.
November 8-13, 2011, Chiang Mai, Thailand.
Hio Tong Chan & Chunyu
Kit. 2010. Two cores in Chinese negation system: A
corpus-based view. In Proceedings of the
2010 International Conference on Asian Language Processing (IALP
2010), pp. 87-90. Dec. 28-30, 2010. Harbin, China.
Billy Wong & Chunyu Kit. 2010. The
parameter-optimized
ATEC metric for MT evaluation. In Proceedings
of the Joint Fifth
Workshop on Statistical
Machine Translation and MetricsMATR, pp.360-364. July
15–16, 2010, Uppsala, Sweden.
Yan Song & Chunyu
Kit. 2010. Does
joint decoding really outperform cascade processing in
English-to-Chinese transliteration generation? The role of
syllabification. In Proceedings
of 2010 International Conference on Machine Learning and Cybernetics
(ICMLC
2010),
pp.3323-3328. July 11-14, 2010, Qingdao,
China.
Ruifeng Xu & Chunyu Kit. 2010b.
Opinion retrieval
based on mutual reinforcement between opinion analysis and relevance
estimation. In Proceedings of 2010
International Conference on Machine Learning and Cybernetics
(ICMLC 2010), pp.3347-3352. July 11-14, 2010, Qingdao,
China.
Hai Zhao, Chunyu Kit, & Yan Song.
2009. Character
dependency tree based lexical and syntactic all-in-one parsing for
Chinese. In Maosong Sun & Qunxiu Chen (eds.), Advances in
computational linguistics in
China, the Proceedings of the 10th Chinese National
Conference on
Computational Linguistics
(CNCCL-2009), pp. 82-88. Yantai, China, July 24-26, 2009. (in Chinese)
Chunyu Kit & Zhiwei Feng. 2009. Ontology-based
definition of term. Terminology
Standardization & Information
Technology, 2009 Issue 02, pp.4-8 (part 1), Issue 03, pp.14-23
(part 2). (in Chinese)
Hai Zhao & Chunyu Kit. 2009. A simple
and efficient model pruning method for conditional random fields. In W.
Li & D. Mollá-Aliod
(Eds.): ICCPOL 2009, LNAI
5459, pp. 145–155, 2009.
Springer.
Xiaoyue Liu, Jonathan Webster, & Chunyu Kit. 2009. An
Extractive Text Summarizer Based on Significant Words. In W. Li &
D.
Mollá-Aliod (Eds.): ICCPOL
2009, LNAI 5459, pp.
168–178, 2009. Springer.
Billy T-M Wong & Chunyu Kit. 2009. Meta-evaluation of machine
translation using parallel legal texts. In W. Li & D.
Mollá-Aliod
(Eds.): ICCPOL 2009, LNAI
5459, pp. 337–344, 2009.
Springer.
Xiaoyue Liu & Chunyu Kit. 2008. An
Improved Corpus
Comparison Approach to Domain Specific Term Recognition. In PACLIC 22:
Proceedings
of the 22nd Pacific Asia Conference on Language, Information, &
Computation, pp. 253-261. Cebu,
Philippines, Nov. 20-22, 2008.
Hai Zhao & Chunyu Kit*. 2008. Scaling
conditional random fields by one-against-the-other decomposition. Journal of
Computer Science & Technology, 23(4):612-619 (July
2008).
Springer. [SCI 0.5760, Q4, 68/84 Computer Science, Software Engineering]
Guohong Fu, Chunyu Kit & Jonathan Webster. 2008. A
morpheme-based lexical chunking system for Chinese. In Proceedings of the 7th
International Conference on Machine Learning & Cybernetics
(ICMLC 2008), pp. 2455-2460, Kunming, China, 12-15 July 2008.
Risong Na, Chunyu Kit, & Zhiwei Feng.
2008. Statistical
analysis of legal terms in Hong Kong bilingual laws information system.
Terminology Standardization & Information
Technology, 2008 Issue 02, pp.32-35. February 2008. (in Chinese)
Hai Zhao & Chunyu Kit. 2007a.
Effective subsequence-based tagging for Chinese word segmentation (in
Chinese). Journal
of Chinese Information Processing, 21(5):8-13. Also in
Maosong
Sun & Qunxiu Chen (eds.), Frontiers
of Content Computing: Research
& Application, the Proceedings of the 9th Chinese National
Conference on Computational Linguistics (CNCCL-2007, formerly
JSCL-2007), pp. 45-51, Tsinghua University Press.
Dalian, Aug. 6-8, 2007.
Chunyu Kit & Xiaoyue Liu. 2007.
Mono-word
termhood as rank difference in domain & background corpora.
In International
Conference: Keyness in Text, pp. 41-45. Pontignano,
Siena, Italy, June 26-30
2007.
Jing Li, Le Sun, Chunyu Kit & Jonathan
Webster.
2007. A query-focused multi-document summarizer based on lexical
chains. In DUC 2007, NIST,
Rochester, NY, USA, 26-27 April 2007.
http://www-nlpir.nist.gov/projects/duc/pubs.html.
Zhiming Xu, Chunyu Kit & Jonathan J. Webster. 2006.
Integration algorithm of
English-Chinese word segmentation & alignment. In Proceedings of the
2006 International
Conference on Machine Learning & Cybernetics (ICMLC 2006),
pp.5105-4110.
Zhiming Xu, Qiang Wang, Jonathan J.
Webster,
Chunyu Kit. 2006. An
empirical study of the effectiveness of n-gram modeling in
probabilistic Chinese word segmentation. International Journal of
Computer Science
& Network Security (IJCSNS), 6(1):81-86.
Zhiming Xu, Jonathan J. Webster & Chunyu Kit.
2006. A new dictionary-based word alignment algorithm. Journal of
Chinese Language & Computing, 16(4):225-238.
Chunyu Kit,
Xiaoyue Liu, & Jonathan J. Webster. 2006. Abbreviation
recognition with
Maxent
model. In Alexander Gelbukh (ed.), Computational Linguistics &
Intelligent
Text Processing: CICLing 2006, LNCS Vol. 3878, pp.117-120.
Springer. [SCI expanded 0.402, Q4, 61/72 Computer Science, Theory & Method]
Chunyu Kit & Xiaoyue Liu. 2005. Period disambiguation with Maxent
model. In R. Dale, K-F Wong, J. Su & O Y Kwong (eds.), Natural
Language Processing - IJCNLP 2005,
LNCS 3651, pp.223-232. Springer. [SCI expanded 0.402, Q4, 61/72 Computer Science, Theory & Method]
Haodi Feng, Kang Chen, Chunyu Kit, & Xiaotie
Deng. 2005. Unsupervised
segmentation of Chinese corpus using accessor variety. In K. Y. Su,
J. Tsujii, J. H. Lee & O. Y. Kwong (eds.), Natural Language
Processing - IJCNLP 2004,
LNCS
3248, pp. 694-703. Springer.
[SCI expanded 0.402, Q4, 61/72 Computer Science, Theory & Method]
Qinan Hu, Haihua Pan, & Chunyu Kit.
2005.
An
example-based
study on Chinese word segmentation using critical fragments. In K. Y.
Su,
J. Tsujii, J. H. Lee & O. Y. Kwong (eds.), Natural
Language Processing - IJCNLP 2004, LNCS
3248, pp.714-722. Springer. [SCI expanded 0.402, Q4, 61/72 Computer Science, Theory & Method]
Chunyu Kit, Jonathan J. Webster, King
Kui
Sin,
Haihua Pan,
& Heng Li. 2003. Clause
alignment for bilingual HK legal texts with available lexical resources.
In Maosong Sun, Tianshun Yao, & Chunfa Yuan (eds.), Advances in
Computation
of Oriental Languages: 20th ICCPOL Proceedings, pp.286-292.
Shenyang,
4-6 Aug., 2003.
Chunyu Kit. 1998. Ba & bei as multi-valence prepositions in Chinese.
In Benjamin K. T'sou (ed.), Studia Linguistica Sinica, pp.497-522.
Language Information Sciences Research Centre, City University of Hong Kong.
Jonathan J. Webster & C. Kit. 1995.
Computational
analysis
of Chinese & English texts with the functional semantic processor
& the C-LFG parser. Literary & Linguistic Computing,
10(3):203-211.
Oxford University Press.
Chunyu Kit & J. Webster. 1992. Parsing Chinese pivotal
constructions in an LFG approach. In Proceedings of 1992 International
Conference on Computer Processing of Chinese & Oriental Languages,
Dec. 15-19, 1992, Sand Key, Florida.
Chunyu Kit. 1992. Parsing Chinese ba
& bei
constructions: An LFG approach. In Proceedings
of 3rd
International
Conference on Chinese Information Processing, October
12-16,
1992, Beijing.
Chunyu Kit. 1991. Design & implementation of
the applied system of Chinese automatic word segmentation - CASS. Journal of Chinese Information
Processing, 5(4):27-34. (In Chinese)
Chunyu Kit, Yuan Liu & Nanyuan
Liang. 1989.
On
methods
of automatic Chinese word segmentation. Journal of
Chinese Information Processing, 3(1):1-9. (in Chinese) Presented
in the
First Conference on
Computational Linguistics of China, June, 1988, Tsinghua
University,
Beijing.