|
JEFLL Corpus
|
A corpus of English compositions written by Japanese secondary school
learners. It currently covers more than 10,000 learners from Year 7 to
Year 12, for a total of approximately 650,000 running words. The online
query system is available free of charge for research use.
Please visit the JEFLL Corpus online query page.
|
|
|
International Corpus of Crosslinguistic Interlanguage (ICCI)
|
Collections of younger learners' English essays from 7 different nations
and regions (China, Taiwan, Hong Kong, Spain, Austria, Poland, Israel),
comparable in design to the JEFLL Corpus. The total size is about half a
million tokens.
Official website for ICCI (part of Global COE Program at TUFS)
|
|
Automatic identification of learner errors using edit distance & parallel LC
|
A proofread version of JEFLL has been prepared, and the differences between
the original writings and the corrected ones are automatically extracted
from aligned sentence pairs using Levenshtein distance. The results can be
tagged as omission, addition, or misformation errors. This heuristic can
help avoid time-consuming manual error annotation.
More information will be available soon!
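As a rough illustration of the idea, the sketch below aligns one original
sentence with its corrected version by word-level Levenshtein distance and
tags each difference. It is not the project's actual tool: the function
name align_errors, the whitespace tokenization, and the mapping of edit
operations to error tags (insertion = omission by the learner, deletion =
addition by the learner, substitution = misformation) are all assumptions
made for this example.

```python
def align_errors(original, corrected):
    """Return a list of (tag, learner_word, corrected_word) tuples.

    Hypothetical helper: tokenizes on whitespace and aligns the two
    sentences with word-level Levenshtein distance.
    """
    src, tgt = original.split(), corrected.split()
    n, m = len(src), len(tgt)

    # dist[i][j] = edit distance between src[:i] and tgt[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if src[i - 1] == tgt[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # word only in the original
                dist[i][j - 1] + 1,         # word only in the correction
                dist[i - 1][j - 1] + cost,  # match or substitution
            )

    # Walk back through the table to recover the edit operations.
    edits = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and src[i - 1] == tgt[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            i, j = i - 1, j - 1  # identical words, no error
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            edits.append(("misformation", src[i - 1], tgt[j - 1]))
            i, j = i - 1, j - 1
        elif j > 0 and dist[i][j] == dist[i][j - 1] + 1:
            # The corrector inserted a word the learner left out.
            edits.append(("omission", None, tgt[j - 1]))
            j -= 1
        else:
            # The corrector deleted a word the learner added.
            edits.append(("addition", src[i - 1], None))
            i -= 1
    edits.reverse()
    return edits


print(align_errors("I go to school yesterday", "I went to school yesterday"))
print(align_errors("He is student", "He is a student"))
print(align_errors("She can to swim", "She can swim"))
```

Each call prints the tagged differences for one aligned sentence pair.
Word-order errors and multi-word corrections would need additional
handling beyond this simple alignment.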
|
|
|
CEFR and corpus-based identification of criterial features
|
This is a new project to identify criterial features for CEFR levels by
examining learner corpora at different CEFR levels. I am a collaborator
with the English Profile Programme, together with Professor Masashi
Negishi at TUFS.
Official website for EPP
|
|