Program

Please note that, because of the way the website system works, names written in Japanese appear with the family name and given name in reverse order.

Appearance frequency of idiomatic phrase in fixed-length tag set of BCCWJ corpus

Stream: Syllabaries and characters/Vocabulary (文字/語彙)
Date: Saturday 12th July
Time: 10:00 AM – 12:00 PM
Speaker(s): Tatsuya Kitamura, Kaede Tanijiri, Saya Kanazawa, Yoshiko Kawamura, Konan University

Presentation type (発表形態): Poster Presentation (ポスター発表)
Language of presentation (発表言語): English (英語)

It is not easy for language learners to master the idiomatic phrases of a target language. To determine how much time and effort learners of Japanese should allot to idiomatic phrases, and which phrases they should learn first, we must establish how frequently idiomatic phrases are used in daily life. We therefore measured this frequency in a Japanese text corpus.

In the present study, we measured the appearance frequency of idiomatic phrases in the fixed-length sample sets of the Balanced Corpus of Contemporary Written Japanese (BCCWJ), compiled by the National Institute for Japanese Language and Linguistics. BCCWJ is a tagged corpus of one hundred million words of written Japanese, and the fixed-length sample sets consist of 1,000-character samples from 1,473 newspaper articles, 1,996 magazine articles, 10,551 books, and 1,250 white papers.

Over six thousand Japanese idiomatic phrases were obtained automatically from the Weblio web dictionary (http://www.weblio.jp/), and notational variants were then added manually. Using this list, we developed a program that finds idiomatic phrases in text files while taking the conjugation of verbs and adjectives into account, and we calculated the appearance frequency, that is, the average number of idiomatic phrases per 1,000 characters.

The appearance frequencies were 0.65 for newspaper articles, 0.77 for magazine articles, 1.12 for books, and 0.09 for white papers. In other words, fewer than one idiomatic phrase appeared per 1,000-character sample of newspaper articles, magazine articles, and white papers; the frequency in white papers was especially low. Even in books, only about one idiomatic phrase appeared per 1,000-character sample. This suggests that learners should learn only the most frequently used idiomatic phrases.

We will present the top 100 high-frequency idiomatic phrases at the Sydney ICJLE conference 2014.
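
The frequency measure described in the abstract (idiomatic phrases per 1,000 characters) can be illustrated with a minimal sketch. This is not the authors' program. It assumes that conjugated forms are listed by hand as surface variants of each phrase, whereas the actual study may have used morphological information from the tagged corpus. The function names and the two example idioms are hypothetical and chosen only for illustration.

```python
import re


def build_pattern(variants):
    """Compile one regex that matches any of the given surface forms."""
    # Longest forms first, so a longer variant wins over a shorter prefix.
    ordered = sorted(set(variants), key=len, reverse=True)
    return re.compile("|".join(re.escape(v) for v in ordered))


def appearance_frequency(samples, idiom_variants):
    """Average number of idiom matches per 1,000 characters.

    samples: list of text samples (strings).
    idiom_variants: dict mapping a dictionary form to its surface variants.
    """
    all_variants = [v for forms in idiom_variants.values() for v in forms]
    total_chars = sum(len(text) for text in samples)
    if not all_variants or total_chars == 0:
        return 0.0
    pattern = build_pattern(all_variants)
    total_hits = sum(len(pattern.findall(text)) for text in samples)
    return 1000 * total_hits / total_chars


# Hypothetical idiom list; variants are assumed, hand-written conjugated stems.
EXAMPLE_IDIOMS = {
    "腹が立つ": ["腹が立つ", "腹が立っ", "腹が立ち", "腹が立た"],
    "気になる": ["気になる", "気になっ", "気になり", "気になら"],
}

if __name__ == "__main__":
    texts = ["彼の態度に腹が立った。", "それが気になって眠れない。"]
    print(appearance_frequency(texts, EXAMPLE_IDIOMS))
```

Listing variants explicitly keeps the sketch self-contained. A real implementation would more likely match on lemmas from a morphological analyzer to avoid both missed conjugations and accidental substring matches.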

Keywords (キーワード)

Appearance frequency, Idiomatic phrase, Fixed-length tag set, BCCWJ corpus, Weblio web dictionary
