
Home

■What's New
・2010/04/28 39574 words
・2010/05/10 48671 words (+9097 words)
・2011/10/13 267442 words (+218771 words)


■Outline & Purpose
The CbLLE POS Research Engine (Written Japanese by Japanese learners) searches the part-of-speech (POS) tags of the Learner's Language Corpus of Japanese, which was developed by the Global COE Program ‘Corpus-based Linguistics and Language Education’ (CbLLE). POS tagging is based on the ChaSen morphological analyzer. The program developed this engine to support systematic analysis of Japanese compositions written by learners of Japanese.
References:
http://chasen-legacy.sourceforge.jp/

■Corpus Description
This search engine searches part-of-speech tags in the Learner's Language Corpus of Japanese, which consists of data collected in Taiwan and data from native speakers.
1. Data collected in Taiwan (total word count: about 200,000 words)

Phase         Period of data collection                  Kind of tasks
First Phase   February-June (once every 2 weeks)         8 functional tasks (f1-f8); 8 diary writing tasks (d1-d8)
Second Phase  September-December (once every 2 weeks)    8 diary writing tasks (d9-d16)

2. Data from native speakers (total word count: about 60,000 words)
Data was collected from Japanese native speakers in a single session using the same functional tasks (f1-f8) and one diary task.

Contents of functional tasks:

Task    Task number  Contents
Task 1  f1           Introducing yourself
Task 2  f2           Writing about the characteristics of something
Task 3  f3           Writing that you intend to do something
Task 4  f4           Saying someone must not do something / Telling someone how to do something
Task 5  f5           Writing about someone's experience(s)
Task 6  f6           Writing to ask for permission
Task 7  f7           Advising someone
Task 8  f8           Writing what you hope will happen


Name of the data:
1) Tw 01 f1 -D
   Tw = Place; 01 = Writer's number; f1 = Task number; -D = Use of dictionary
2) Tw 01 d1
   Tw = Place; 01 = Writer's number; d1 = Task number; (no suffix) = No use of dictionary
3) NS 01 d
   NS = Native speaker; 01 = Writer's number; d = Task number


1) ‘Tw01f1-D’ indicates a composition collected in Taiwan and written by Learner No. 1 for functional task 1, with use of a dictionary.
2) ‘Tw01d1’ indicates a composition collected in Taiwan and written by Learner No. 1 for diary-writing task 1, without use of a dictionary.
3) ‘NS01d’ indicates a composition written by Native Speaker No. 1 for the diary-writing task.
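
For readers who process the corpus files with a script, the naming scheme above can be decoded mechanically. The following is a minimal Python sketch, not part of the search engine itself. The regular expression, the `DataName` class, and the `parse_data_name` function are hypothetical names introduced here, and the pattern is inferred only from the three example names shown above. In particular, it assumes that native-speaker functional tasks would be named analogously (for example ‘NS01f1’), which this page does not state.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern inferred from the three example names on this page (an assumption):
# place/group prefix, writer number, task type letter, optional task index,
# and an optional "-D" suffix marking use of a dictionary.
NAME_PATTERN = re.compile(r"^(Tw|NS)(\d+)([fd])(\d*)(-D)?$")

SOURCES = {"Tw": "Taiwan", "NS": "native speaker"}
TASK_TYPES = {"f": "functional", "d": "diary"}


@dataclass
class DataName:
    source: str                # "Taiwan" or "native speaker"
    writer: int                # writer's number, e.g. 1 for "01"
    task_type: str             # "functional" or "diary"
    task_index: Optional[int]  # None when the name carries no index ("NS01d")
    dictionary: bool           # True only when the "-D" suffix is present


def parse_data_name(name: str) -> DataName:
    """Decode a data name such as 'Tw01f1-D', 'Tw01d1', or 'NS01d'."""
    # Accept the spaced form used in the diagram ("Tw 01 f1 -D") as well.
    compact = name.replace(" ", "")
    match = NAME_PATTERN.match(compact)
    if match is None:
        raise ValueError(f"unrecognized data name: {name!r}")
    prefix, writer, kind, index, dict_flag = match.groups()
    return DataName(
        source=SOURCES[prefix],
        writer=int(writer),
        task_type=TASK_TYPES[kind],
        task_index=int(index) if index else None,
        dictionary=dict_flag is not None,
    )


if __name__ == "__main__":
    for example in ("Tw01f1-D", "Tw01d1", "NS01d"):
        print(example, "->", parse_data_name(example))
```

For native-speaker names the `dictionary` field only reports whether a ‘-D’ suffix is present. The page does not say whether native speakers used dictionaries, so that field should not be read as a statement about them.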
References:
http://cblle.tufs.ac.jp/llc/ja/

■Research Team
Core Member in Charge: Umino, Tae (GSACS)
Researcher within the university: Lin, ChunChen (GSACS)
Graduate students: Suzuki, Ayano (Doctoral Student); Yang, ChiaChen (Doctoral Student); Inokawa, Mutsumi (Doctoral Student); Torii, Aya (Master's Student)
Undergraduate students: Hayashi, Nami; Kashimoto, Rui
Data Providers: Peng, ChunYang (Tamkang University, Taiwan); Kazuo, Horikoshi (Tamkang University, Taiwan)
Informants:
Learners: Taiwanese university students, intermediate level
Native speakers: Japanese university students


