MORA TIMING BY BRITISH LEARNERS OF JAPANESE

Katsumi Nagai
Osaka University Graduate School of Language and Culture

1. Introduction

Japanese has long been classified as a mora-timed language, which has an opposition between heavy (bimoraic) syllables and light (monomoraic) syllables. A closed syllable, or a syllable with a long vowel as its nucleus, is counted as two morae in Japanese. Japanese poets who write haiku count the number of morae to achieve the meter. Traditionally, morae in Japanese were considered to have equal duration. Han (1962a) reported that long vowels without preceding consonants were approximately twice as long as short vowels. However, more recent studies have presented evidence against the isochrony of mora timing, such as Beckman's claim (1982) that the mora has no phonetic basis. She says that it is merely a phonological unit of length or weight without phonetic reality. Recent definitions state that the mora is 'a term used in traditional studies of metrics to refer to a minimal unit of metrical time or weight' (Crystal 1997:248).

On the other hand, English is called a stress-timed language, in which stresses occur at almost equal intervals. While stressed syllables in English are pronounced with a greater amount of energy than unstressed syllables, as Ladefoged (1993) explains, Japanese speakers mainly change pitch height to accentuate a segment. The rules of stress alignment, which can apply recursively to realize a well-formed description, have been studied by a number of researchers. However, it must be noted that stress-timing and mora-timing are categorizations made by different criteria. Since stress is a suprasegmental feature of utterances, stress can be placed on both syllables and morae. It is necessary to examine the phonetic reality of mora-timing in both native speakers and learners of Japanese because mora-timing by speakers of stress-timed languages has been relatively neglected.

The concept of the mora as an abstract isochronous unit of timing in Japanese was comprehensively re-examined by Port et al. (1987). They stated that the mora needs to be defined by the segments which comprise it, rather than as a traditional CV syllable which yields compensation within itself. Unlike Beckman's experiment (1982), their experiments included utterances containing more than two morae. They concluded that "the concept of the mora as an abstract isochronous unit of timing in Japanese captures many of the most salient features of timing in this language" (Port et al. 1987:1584). If so, acquisition of mora-timing would be indispensable for learners of Japanese. The experiments in this paper try to answer the following questions by examining the reality and importance of British learners' mora-timing in Japanese: (i) "Does the length of time British learners have studied Japanese affect the total duration and mora timing of Japanese words?" and (ii) "Do the learners count morae or syllables when the number of morae and the number of syllables differ?"

2. Experiment 1

2.1. Aim

Japanese words begin and end at mora boundaries. If the number of morae in a word controls the word duration regardless of the syllabic structure, the length of the word should increase as the number of morae increases. If each added mora has the same duration as the other morae, the total word duration should increase by a constant amount. Port et al. (1987) showed that each additional mora adds nearly the same amount to the total duration of a Japanese word. This may seem self-evident to native speakers of Japanese. However, it must be noted that adding syllables to an English word does not yield constant lengthening of the word (Lehiste 1972). Therefore, the question in Experiment 1 is whether native English speakers show the same kind of lengthening when they speak Japanese. It is plausible that English speakers' timing control of Japanese words is affected by their English style of timing control. If so, the English speakers' data would deviate more from the regression line of word duration on the number of morae than the data of native Japanese speakers. The hypothesis in this experiment is H: the correlation between word duration and the number of morae is higher for native Japanese speakers than for British elementary learners. This is not a statistical null hypothesis because it is dangerous to test the difference of means in the design of Experiment 1 (Guilford and Fruchter 1978:151ff.).
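The regression idea behind this hypothesis can be illustrated with a minimal sketch. It fits a least-squares line of word duration on the number of morae and reports the squared correlation. The function name `fit_line` and the duration values are hypothetical and chosen only for illustration; they are not measurements from this study, and the sketch does not reproduce the statistical package used for the actual analysis.

```python
def fit_line(n_morae, duration_ms):
    """Least-squares fit of duration = intercept + slope * n_morae.

    Returns (slope, intercept, r_squared).
    """
    n = len(n_morae)
    mean_x = sum(n_morae) / n
    mean_y = sum(duration_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in n_morae)
    syy = sum((y - mean_y) ** 2 for y in duration_ms)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(n_morae, duration_ms))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical word durations (ms) for words of 1 to 7 morae.
# These numbers are invented for illustration only.
morae = [1, 2, 3, 4, 5, 6, 7]
durations = [150, 270, 390, 510, 630, 750, 870]

slope, intercept, r_squared = fit_line(morae, durations)
print(slope, intercept, r_squared)
```

With perfectly mora-timed data such as the invented values above, every added mora contributes the same increment (the slope) and the squared correlation is 1. The more a speaker's durations scatter around the line, the lower the squared correlation becomes, which is the quantity compared between groups under hypothesis H.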

2.2. Design

2.2.1. Materials
Three sets of words (Table 1) are written in Japanese orthography (hiragana). They are taken from Port et al. (1987) with a few minor changes so that the results can be compared with their data from native Japanese speakers. All words are embedded in the carrier sentence below:

Watashi-wa     ...-to      ii-mashi-ta.
I-TOP          ...-QUOT    say-POLITE-PAST
"I said ...."


# of morae  ra-set          ka-set          si-set
1           ra              ka              si
2           raku            kata            sita
3           rakuda          katana          sitaku
4           rakudaga        katanasi        sitakusu
5           rakudagata      katanarasi      sitakusuru
6           rakudagataka    katanarasida    sitakusuruka
7           rakudagatakasi  katanarasidake  sitakusurukana

Table 1 Test words in Experiment 1

2.2.2. Subjects
Twelve subjects are classified into three groups. They are native speakers of Japanese, British elementary learners of Japanese, and British advanced learners of Japanese. Each group consists of four subjects. All the native speakers are language teachers in Japan. All of them speak a Tokyo dialect.

The British elementary learners of Japanese are second-year students at the University of Edinburgh. They have studied Japanese for nearly one and a half years, with six lectures and one tutorial per week at the Centre for Japanese Studies. They have no experience of studying abroad.

The British advanced learners are senior students at the same university, who lived in Japan for one year as exchange students to study Japanese language. The curriculum they finished includes Japanese grammar, translation into and from Japanese, conversation and discussion, and Kanji. Their vocabulary in Japanese is somewhat limited and some unnatural accentuation still remains, but they have little difficulty in making themselves understood in Japanese.

2.2.3. Procedure
The recording was made in a sound-proofed recording room. The test sentences were printed on one sheet. The procedure was briefly explained to the subjects in English, and they practiced reading the list for a few minutes. Then the subjects were asked to read the list five times at a slow tempo and another five times at a fast tempo. In both cases they were asked to read the list comfortably. The subjects were instructed not to put a pause between words. In addition, the author sat next to the subject throughout the recording and asked for a repetition whenever an unnatural pause was heard, because the distinction between an unnatural pause and the natural closure of /k/ was difficult even for native speakers of Japanese.

Their utterances were recorded using a sensitive condenser microphone (Sennheiser MKH-815). The signal was sent to a microphone amplifier (Soundcraft 200B) and recorded digitally on a DAT recorder (SONY PCM-2700A) at 16 bit/44.1 kHz sampling. The recordings were analyzed on a UNIX workstation (Sun SPARCstation) with a D/A and A/D conversion board. The duration of the segmental units of words was measured on wide-band spectrograms and time-domain waveforms on a VDT screen of the X-Waves analyzer, with the signal digitized at a 16 kHz sampling rate.

Consistency was the first priority in measuring duration. The criteria for segmentation followed the standards in Ladefoged (1993:199ff.). The beginning of /s/ was taken as the onset of the noise pattern in the higher frequencies, and the beginning of /k/ as the onset of the closure. The end of each word was the onset of the closure of /t/ in the carrier sentence.

Apart from the confusion of psychological and physical scales, segmentation of speech sound is said to be very difficult because speech is a continuum. Beckman and Shoji (1984) also state that "a central problem in the study of speech production and perception is the difficulty of reconciling linguistic representations of an utterance as a series of discrete, static, temporally unspecified phonemic segments with the lack of such units in the acoustic signal". However, as suggested by Ganong and Zatorre (1980), who tested the reliability of four methods of measuring phoneme boundaries, recent sound spectrographs and waveforms created precisely by computers are quite reliable and powerful tools for analyzing speech sound. Blumstein and Stevens's pioneering work (1979:1002) says, 'the observation, based on acoustic theory, that short-time spectra sampled at consonantal release show distinctive gross characteristics for different places of articulation suggests that these properties are utilized by the human speech perception mechanism in order to extract information conveying place of articulation.' Careful investigation of the spectra, where amplitude is plotted against frequency by computer, should therefore reveal a close correlation between foreign learners' segmentation control and the acoustic signal.

One of the most difficult tasks was pinpointing the onset of /r/. Unlike stops, which have a clear onset and offset of closure, /r/ has an unclear onset, especially after /a/ in the carrier sentence. Intensity curves and time-domain waveforms were essential aids there.

2.3. Results

Means and standard deviations of the test words are shown in Figure 1 through Figure 3 (all of the figures are placed at the end of the paper). They show the duration, pooled across the subjects of each group, as a function of the number of morae. Although the slopes of the si-set look the least steep in all three groups of subjects, little difference can be observed among the three sets of test words and across the two tempos. The result clearly shows that word length increases almost equally with each additional mora. The difference between the test word sets does not yield a difference in word duration across the figures. This result is consistent with the data from native speakers in Port et al. (1987). Correlations were calculated on data pooled across the speakers of each group. The correlation for native speakers (for example, r²=0.9968 for the ra-set) is somewhat higher than that for learners of Japanese (r²=0.9846 for elementary learners and r²=0.9892 for advanced learners). This result suggests a tendency for the correlation of native speakers to be higher than that of British learners.

In contrast to the linear relation between word duration and the number of morae, the number of morae seems to have some effect on mora duration, as seen in Figure 4 through Figure 6. Analyses of variance show that the number of morae has an effect on duration in each group. For example, native speakers' total word duration is significantly affected by the number of morae (F(6,161)=2.29, p<0.05). The result is consistent with that of native speakers in Port et al. (1987). They claim that this effect of the number of morae on the duration of the mora is a "hint of resemblance" to the case of English, though the "magnitude of the effect is clearly far smaller here than in Germanic languages (Lehiste 1972)". However, the number of morae has a significant effect on the total word duration of elementary and advanced learners too (elementary learners: F(6,161)=4.90, p<0.01; advanced learners: F(6,161)=5.76, p<0.01).
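A short worked relation may help to see how a nearly linear word duration can coexist with an effect of mora count on mora duration. It is only a sketch under two assumptions that are not stated in the data above: that the mora duration in Figures 4 to 6 is the mean duration per mora (word duration divided by the number of morae), and that word duration D(n) for a word of n morae follows a straight line with intercept a and slope b (symbols introduced here for illustration).

```latex
\begin{align}
D(n) &= a + b\,n, \\
\frac{D(n)}{n} &= b + \frac{a}{n}.
\end{align}
```

Under these assumptions, whenever the intercept a is positive the mean mora duration falls as n grows, and the largest drop (a/2) occurs between one-mora and two-mora words. A larger intercept would therefore produce a steeper fall from one to two morae even when the word duration itself remains highly linear.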

2.4. Discussion

The result indicates that word duration in Japanese depends heavily on the number of morae in the word. A high correlation between word duration and the number of morae is observed in the data of both native speakers and British learners of Japanese. If the values of all three sets of test words are pooled, a flatter curve like Figure 4 in Port et al. (1987) is obtained. English speakers' difference in duration between one-mora words and two-mora words is much larger than that of Japanese speakers, as seen clearly in Figure 5 and Figure 6. Mora duration, especially in the ka-set, decreases sharply when the number of morae increases from one to two. The result is consistent with Sugito's data (1996:278), which indicate that English speakers' durations of one-syllable words and two-syllable words are similar. Note here that all the native Japanese speakers speak the Tokyo dialect, which is usually free from the pitch change within one-syllable words such as hi ('sun') and me ('eye') found in the Osaka dialect.

Where the focus is on the difference between native speakers and learners of Japanese, the si-set of the test words must be examined carefully because the set is expected to include devoiced high vowels after /s/. If /i/ in the si-set is constantly devoiced, the number of syllables in the si-set would be different, because adding only a consonant makes CCV ([ʃta]) and CCVCCV ([ʃtakʰsu]) structures. English speakers might be distinguished from native Japanese speakers when the number of morae and the number of syllables in the word differ, as in the case of sita (2 morae/1 syllable) and sita.kusu (4 morae/2 syllables) in the si-set, probably because English speakers are used to syllable-based segmentation. However, such an effect in the si-set is hardly seen in the result of Experiment 1. English speakers can hardly be distinguished from Japanese speakers in terms of word duration. There are two possible reasons. (i) English speakers' segmentation at CVC boundaries of Japanese words is different from that of English words. That is, they can realize the isochrony of morae when they speak Japanese CVC syllables by putting two phonological peaks on the CVC word, as is seen in the moraic consonants of the Osaka dialect. (ii) The high vowels in the test words (/i/ and /u/) are not devoiced. In other words, the test set does not have a CVC structure but a CV.CV structure.

If (i) above is right, it would be necessary to examine English speakers' segmentation of Japanese CVC words more closely. Why do their utterances in Japanese sound less natural than those of native speakers despite the linear increase in duration? This question leads to the aim of Experiment 2, in which the difference between mora-timing and syllable-timing is tested directly. On the other hand, (ii) could be a counter-proposition to Port et al. (1987). The spectrograms were double-checked to see whether the vowels in the si-set were devoiced or deleted. The close examination showed that /i/ between /s/ and /t/ (in the carrier sentence) was often devoiced, especially when native speakers spoke at a fast tempo. One speaker, who frequently voiced /i/ between /s/ and /t/, sometimes pronounced it more weakly or devoiced it. When the same subject spoke at a fast tempo, /i/ was often deleted in the spectrogram. It was observed in his data that the more slowly and carefully the speaker spoke, the less devoicing and deletion of /i/ occurred. However, it was impossible to deduce a general rule or environment in which the devoicing occurred.

Japanese is often referred to as a language with voiceless vowels. Ladefoged and Maddieson (1996:49) explain that vibration of the vocal folds is prevented by opening the glottis so widely that the folds are too far apart to vibrate, or by subglottal pressure that is too low or too high, even if the other articulatory organs are set appropriately. Phonologically, the simplest rule would be that "high vowels (/i/ and /u/) are devoiced when preceded and followed by voiceless obstruents," as seen in Fromkin and Rodman (1983:37). Devoicing and deletion of Japanese vowels appear complicated because most preceding studies include subjective conditions such as "in slow or careful speech." Sakuma (1963:232) and Vance (1987:48), for example, claim that devoicing is applied "in careless pronunciation". Observation of the data in Experiment 1 revealed that devoicing in Japanese occurs like allophonic free variation in both native and non-native speakers of Japanese. A pedagogically more important fact is that devoicing or deleting vowels has little effect on the total word duration. In other words, word duration increases almost linearly as the number of morae increases, even if a vowel in the test word is deleted or devoiced.

3. Experiment 2

3.1. Aim

The number of morae in a word controls Japanese word duration as examined in Experiment 1. However, the number of morae in Experiment 1 was also the number of syllables at the same time. If isochrony of morae has phonetic reality, the effect of morae should be observed in the case that the number of morae and syllables are different. If English speakers cannot be free from syllable-based timing, they may have trouble reading Japanese mora-timed words. If so, the difference between native speakers and learners of Japanese might be used as a yardstick of the learners' achievement.

In order to test whether English speakers' mora-timing is properly at work or not, duration of test words which have two syllables and three morae (e.g. baa.ku and bak.ku) and test words which have three syllables and three morae (e.g. ba.ku.do) are compared. If elementary learners of Japanese cannot use a mora-counting tactic when they speak, duration of the test words which have two syllables and three morae (baa.ku and bak.ku) would be shorter than that of the test words which have three syllables and three morae (ba.ku.do). Therefore, the null hypothesis in this experiment is that H0: the elementary learners' duration of three-mora/three-syllable words (ba.ku.do and bi.ku.do) is not longer than that of three-mora/two-syllable words (baa.ku, bak.ku, bii.ku, and bik.ku).

3.2. Design

3.2.1. Materials
The following sets of test words, all up to three syllables in length, are read by the subjects. Two-mora/two-syllable words (ba.ku and bi.ku) are also examined for comparison. The ba.ku set is the same as in Port et al. (1987).

baku (2/2)  baaku (3/2)  bakku (3/2)  bakudo (3/3)
biku (2/2)  biiku (3/2)  bikku (3/2)  bikudo (3/3)

(Figures in parentheses: number of morae/number of syllables)

Table 2 Test words in Experiment 2
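The morae/syllables notation in Table 2 can be checked mechanically. The following is a minimal sketch, assuming a simplified romanization in which every vowel letter is one mora, a doubled vowel letter is a long vowel, and a doubled consonant letter is a geminate. The function name `count_morae_and_syllables` is hypothetical, and the sketch does not handle the moraic nasal, vowel sequences across morpheme boundaries, or devoiced vowels.

```python
VOWELS = set("aeiou")


def count_morae_and_syllables(word):
    """Return (morae, syllables) for a simply romanized word.

    Assumptions (illustrative only): each vowel letter is one mora;
    the second letter of a doubled vowel and the first letter of a
    doubled consonant are morae that do not open a new syllable.
    """
    morae = 0
    non_syllabic = 0  # morae that belong to the preceding syllable
    for i, ch in enumerate(word):
        prev_ch = word[i - 1] if i > 0 else ""
        next_ch = word[i + 1] if i + 1 < len(word) else ""
        if ch in VOWELS:
            morae += 1
            if ch == prev_ch:
                non_syllabic += 1  # second half of a long vowel
        elif ch == next_ch:
            morae += 1
            non_syllabic += 1  # first half of a geminate consonant
    return morae, morae - non_syllabic


for w in ["baku", "baaku", "bakku", "bakudo",
          "biku", "biiku", "bikku", "bikudo"]:
    print(w, count_morae_and_syllables(w))
```

Applied to the eight test words, the sketch returns the same mora and syllable counts as the figures in parentheses in Table 2, which makes explicit that baaku and bakku share their mora count with bakudo but their syllable count with baku.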

Two of them (bikku and bakku) are loan words from English (devoiced 'big' and 'bag'), and another two, which are less common in Japanese, are real words (biku 'creel' and baku 'tapir'). All the rest are nonsense words. The sentence list is written in hiragana orthography.

3.2.2. Subjects
Twelve subjects are classified into three groups. Each group consists of four subjects. The criteria of classification are the same as in Experiment 1. All the native speakers are language teachers in Japan, and all of the subjects have normal hearing.

3.2.3. Procedure
The subjects were asked to read a set of words in the same frame sentences as in Experiment 1. No information about the place of accent was given to the subjects. After a brief oral explanation, they practiced reading a few times to avoid the effects of hesitation and misreading. Then they read the list five times at a normal and comfortable tempo. The author was also in the recording room with the subject and asked him or her to read the list again when the subject placed an unnatural pause before the target phrase. The five tokens of eight test words by twelve subjects yielded 480 utterances.

Measurement of intervals on a visual display followed the method and criteria of Port et al. (1987) and Port and Rotunno (1979). The amplitude and waveform windows of X-Waves were essential aids because the relative degree of darkness in a wide-band spectrogram sometimes showed only a rough cut of the segment. In such cases the magnified waveform needed to be examined on the screen. Close observation of the amplitude of the waveform and repeated audio playback between the two cursors enabled fairly precise segmentation. When more than two burst spikes were observed before the offset of the consonantal closure, the first one was used to measure the voice onset time. In order to minimize sampling and measurement error, a reliability check with a newly sampled spectrogram was conducted, and the cursors on the VDT window were reset at every measurement (Ohala and Lyberg 1976).

All data were analyzed with the SPSS statistical package (4.0 for UNIX System V/386). The MANOVA command was used because the same experimental unit was measured repeatedly. The repeated data were set out horizontally so that all of a subject's scores across occasions resided in one case. This type of multivariate data setup prevents subjects from being involved in a random effect nested under between-subject factors (SPSS 1988:33, Davidson 1996:158-160). As in Port et al. (1987), voice onset time was calculated as a part of the following vowel (/u/) in the statistical tests.

3.3. Results

Although accent in Japanese cannot be assigned to geminate consonants or the second half of long vowels, all subjects put stress on the first syllable. The means are plotted in Figure 7, Figure 8, and Figure 9. The Japanese speakers' data clearly show that mora-timing is at work because the two-syllable/three-mora words (baaku/bakku and biiku/bikku) are not shorter than the three-syllable/three-mora words (bakudo: F(1,10)=0.17 n.s. and bikudo: F(1,10)=0.11 n.s.). By contrast, the elementary learners' three-syllable/three-mora words are longer than their two-syllable/three-mora words (bakudo: F(1,10)=9.15 p<0.05 and bikudo: F(1,10)=6.92 p<0.05). This result enables the null hypothesis to be rejected. Advanced learners come between the two groups, and only one of the two three-syllable/three-mora words is significantly longer (bakudo: F(1,10)=1.94 n.s. and bikudo: F(1,10)=8.61 p<0.05).

The duration of two-mora words (baku and biku) should be two-thirds as long as that of three-mora words if each mora duration is constant. Native speakers' mean duration of two-mora words (281 ms) is 73.2% of that of three-mora words (388 ms). The ratio is close to 75%, and two-mora words are shorter than three-mora words (baku: F(1,10)=15.33 p<0.01, biku: F(1,10)=13.35 p<0.01). Most two-mora words by non-native speakers are also shorter than three-mora words (elementary learners: F(1,10)=5.16 p<0.05 for baku and F(1,10)=4.46 p=0.06 for biku; advanced learners: F(1,10)=6.06 p<0.05 for baku and F(1,10)=9.86 p<0.01 for biku), but the elementary learners' two-mora words are relatively longer (86.5% of three-mora words) than those of advanced learners (81.4% of three-mora words).
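The arithmetic behind this comparison can be made explicit with a small sketch. Under strict isochrony a two-mora word should take two-thirds of the time of a three-mora word; the observed ratio is simply the mean duration of the shorter words divided by that of the longer ones. The function name `duration_ratio` is hypothetical; the two input values are the rounded native-speaker means quoted above.

```python
def duration_ratio(short_ms, long_ms):
    """Ratio of the shorter word's duration to the longer word's."""
    return short_ms / long_ms


# Prediction under strictly constant mora duration: 2 morae / 3 morae.
expected = 2 / 3

# Rounded native-speaker means quoted in the text (ms).
observed = duration_ratio(281, 388)

print(f"expected {expected:.1%}, observed {observed:.1%}")
# prints "expected 66.7%, observed 72.4%"
```

Computed from the rounded means, the ratio comes out at about 72%, in the neighbourhood of the reported 73.2%; the small gap presumably reflects rounding of the means or averaging per word, which is an assumption here. Either way, the native speakers' ratio lies nearer the isochronous prediction than the 86.5% and 81.4% reported for the two learner groups.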

The native speakers' initial /b/ is longer in the two-syllable/three-mora words. For example, /b/ in baaku and bakku is longer than that in baku and bakudo (F(1,14)=4.75 p<0.05), and the same is true of biiku and bikku (F(1,14)=5.42 p<0.05). This result is consistent with Port et al. (1987). However, the elementary learners' /b/ in baaku and bakku is not longer than that in baku and bakudo (F(1,14)=0.06 n.s.), and /b/ in biiku/bikku is not longer either (F(1,14)=0.01 n.s.). Advanced learners are indistinguishable from elementary learners with respect to /b/ duration (/b/ in baaku/bakku: F(1,14)=0.21 n.s.; /b/ in biiku/bikku: F(1,14)=0.21 n.s.).

It is natural that /k/ in bakku or bikku is longer than in the other words because the control of closure duration plays an important role in the naturalness of Japanese stop consonants (Han 1992). Native speakers (2.38 times) and advanced learners (2.36 times) more than double the closure duration. However, the ratio is smaller for elementary learners (1.72 times). This contributes to the elementary learners' foreign accent because insufficient closure of Japanese geminate consonants yields unnaturalness and the perception of a single consonant (Sugito 1989:169, Nagai 1994). It is also true that opinions are still divided on the relation between the closure duration of stop consonants and their naturalness, as seen in Beckman (1982) and Han (1992). Campbell's statistical survey (1992) reports that strong correlations are observed between the duration of consonants and their type, gemination, and position in the breath group. He shows that there is another strong correlation between vowel duration and the neighboring phonemic environment or the position in the sentence. However, it is also important that extra duration is said to go into pauses in Japanese (Sugito 1982:343ff.). The same effect is reported for English (Goldman-Eisler 1968).

Port et al. (1987) showed that native speakers' /k/ in baaku and biiku is longer than that in baku, bakudo, biku, and bikudo. However, such effects are rarely seen in the result of this experiment. Native speakers' /k/ in baaku/biiku is not longer than /k/ in baku/biku and bakudo/bikudo (baaku: F(1,10)=3.23 n.s.; biiku: F(1,10)=1.36 n.s.). Port et al. (1987) claimed that their finding of longer /k/ after longer vowels was evidence of the incorrectness of positing isochronous syllable-timed compensation. This discrepancy might imply that the voicing of the second consonant is more influential than the duration of its preceding vowel.

The diversity of speaking rates between groups must be mentioned lastly. The mean durations of three-mora words (baaku, bakku, bakudo, biiku, bikku, and bikudo) pooled within groups are 388 ms for native speakers, 713 ms for elementary learners, and 516 ms for advanced learners. This may indicate that the higher the level learners achieve, the faster they can speak. According to Huggins (1964), the normal range of conversational speaking rate in English is 4 to 7 syllables per second (250 ms to 143 ms per syllable). A large number of the values by elementary learners are outside this range.
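The conversion between speaking rate and duration per syllable used above can be sketched as follows. The function names are hypothetical. Because the pooled group means mix two-syllable and three-syllable words, applying the conversion to a mean is only a rough illustration, not a reanalysis of the data.

```python
def ms_per_syllable(syllables_per_second):
    """Duration of one syllable (ms) at a given speaking rate."""
    return 1000.0 / syllables_per_second


def syllables_per_second(word_ms, n_syllables):
    """Speaking rate implied by a word of n_syllables lasting word_ms."""
    return n_syllables * 1000.0 / word_ms


def in_normal_range(rate, low=4.0, high=7.0):
    """True if the rate lies within the 4 to 7 syllables/second range."""
    return low <= rate <= high


# Bounds of the normal conversational range, expressed per syllable.
print(ms_per_syllable(4), round(ms_per_syllable(7)))

# Pooled mean durations of three-mora words from the text (ms),
# treated here as if they belonged to a two-syllable word.
for group, mean_ms in [("native", 388),
                       ("elementary", 713),
                       ("advanced", 516)]:
    rate = syllables_per_second(mean_ms, 2)
    print(group, round(rate, 2), in_normal_range(rate))
```

On this rough reading, a two-syllable word lasting 713 ms corresponds to fewer than three syllables per second, below the lower bound of the normal range, whereas the same calculation for the native speakers' 388 ms falls inside it.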

3.4. Discussion

Klatt (1975) refers to the intrinsic or inherent phonological duration of phonetic segments in English. He classified the factors that vary segmental duration and said "individual articulators seem to begin their movements at different times and with different velocities, presumably because there has already been a complex recording of the articulatory program to take into account (a) permissible anticipatory coarticulation, (b) other minimizations of articulatory effort, (c) the distance that an articulator must travel, and (d) the different physical constraints imposed on the subglottal, laryngeal, and supraglottal apparatus" (Klatt 1976:1209). His remark shows the difficulty of defining segmental duration in only one of the phonological, articulatory, or perceptual domains, because various phonological factors (speaking rate, stress or emphasis), syntactic factors (constituent structures, locations of boundaries), and lexical factors (phonemic string and lexical stress) interact. As Daniloff and Hammarberg (1973) put it, "allophonic variation is any change in the canonical forms that constitute the plan for an utterance, whereas coarticulation is a by-product of the execution of the plan." The inherent phonological duration is obscured by physiological limits and efforts to minimize articulatory effort (Klatt 1976:1215).

Needless to say, the physiological mechanisms of phonation do not differ among speakers of different languages. This means that English speakers and Japanese speakers share the same articulatory system. This may be taken for granted; however, some language teachers still cannot free themselves from superstitions such as "Japanese people cannot speak as rhythmically as English-speaking people because we lack the innately good rhythmic sense they have." Naturally, it takes more effort to open the jaw over a longer distance. The test words baaku and biiku include long vowels, which should be longer than the vowels in baku/bakudo and biku/bikudo because shorter vowels tend to involve less jaw opening.

Mora-timing in Japanese might not be compatible with the models constructed for syllable-timed and stress-timed languages (Port, Al-Ani, and Maeda 1980). Even though segmental durations have linguistic 'free variations,' language teachers need to make them 'learnable' to English speakers as language-specific characteristics. If speaking a second language is a mapping from their stress-timed segmentation into the mora-timed one, it would be interesting to posit a function between, for example, the duration of /k/ by native speakers and /k/ by language learners. If the mapping onto mora timing is a filter of English phonology, it might be described as a bundle of rules extracted from the differences between the two languages. For example, English speakers' lip rounding for the [u] sound, which begins with the articulation of a preceding sound such as [s] in the word strew (Borden, Harris, and Raphael 1994:156), must be filtered out there. The rules need to include a "converter" from syllable-counting to mora-counting, such as a "syllable-opener" to make basic Japanese CV syllables out of CVC syllables. Because typical problems of assigning ill-formed timing seem to occur when a CVC syllable is reorganized into a moraic CV structure, studying the segmentation problem of closed syllables might be a shortcut for research on second language learning. At a phonological level, Kubozono (1995) succeeds in explaining a relation between Japanese loan words and their English source words. Takagi and Mann (1994) advocate the hypothesis that the length of vowels and consonants in Japanese loan words can be predicted from their source words. Their perceptual experiment revealed that lax vs. tense vowels in English words systematically correspond to short vs. long vowels respectively, and that the stops after lax vs. tense vowels correspond to geminate vs. single consonants in the Japanese loan words.

The two experiments in this paper are attempts to bridge the gap between the preceding studies and second language learning. Further experimental studies with a variety of words and pitch patterns would be essential to apply the findings to teaching the pronunciation of the Japanese language.

4. Conclusion

A foreign accent in the pronunciation of Japanese is closely connected with the timing-control of articulation. Experienced teachers of Japanese know that lack of appropriate mora-timing, which is one of the most difficult but important targets for learners, results in their unnatural accent. The experiments in this paper revealed that mora-timing has phonetic reality even if the strict isochrony is sometimes blurred. British learners' length of learning Japanese clearly affected total duration and mora timing of Japanese words. In the case that the number of morae and syllables is different, stress-timing of English speakers needs to be adjusted to mora-timing of Japanese in order to avoid foreign accent. In other words, they need to count morae. Teachers of Japanese need to emphasize this unit of timing with confidence because the traditional view of the mora, which claims that each mora should take the same length of time to speak, still has its pedagogical value. However elusive mora-timing in laboratory experiments is, it cannot be denied that there exists a tendency for syllable and word duration in Japanese to depend on the number of morae.

5. References

Beckman, M. (1982) 'Segmental duration and the 'mora' in Japanese' Phonetica:39. 113-135.
Beckman, M. and Shoji Atsuko. (1984) "Spectral and Perceptual Evidence for CV Coarticulation in Devoiced /si/ and /syu/ in Japanese." Phonetica:41. 61-71.
Borden, G. J., Harris, K. S., and Raphael, L. J.. (1994) Speech Science Primer 3rd Ed.. Williams & Wilkins.
Blumstein, S. E. and Stevens, K. N. (1979) "Acoustic invariance in speech production: Evidence from measurements of the spectral characteristics of stop consonants." Journal of the Acoustical Society of America (JASA): 66. 1001-1017.
Campbell, N. (1992) 'Speech Timing in English and Japanese.' Proceedings for International Symposium on Japanese Prosody. Nara, Japan.
Crystal, D. (1997) A dictionary of linguistics and phonetics. 4th Ed. Oxford: Blackwell.
Davidson, Fred. (1996) Principles of statistical data handling. London: Sage.
Daigakuto Kagaku Kokai Symposium Soshikiiinkai. (1993) Kokusaikasuru Nihongo. Tokyo: KUBAPRO.
Fromkin, V. A. and Rodman, R. (1983) An Introduction to Language. New York: Holt, Rinehart and Winston.
Ganong, W. F. and Zatorre, R. J. (1980) 'Measuring phoneme boundaries in four ways.' JASA:68. 431-439.
Guilford, J.P. and Fruchter, B. (1978) Fundamental statistics in psychology and education. McGraw-Hill.
Huggins, A. W. (1964) 'Distortion of the Temporal Pattern of Speech.' JASA: 36. 1055-1064.
Han, M. S. (1962a) 'The Feature of Duration in Japanese.' Onsei no Kenkyu: 10. 65-80.
Han, M. S. (1962b) 'Unvoicing of Vowels in Japanese.' Onsei no Kenkyu: 10. 81-100.
Han, M. S. (1992) 'The Timing Control of Geminate and Single Stop Consonants in Japanese: A Challenge for Nonnative Speakers.' Phonetica: 49. 102-127.
Kaiki N. and Sagisaka Y. (1992) 'The control of segmental duration in speech synthesis using statistical methods.' In Tokura (ed.). Speech Perception, Production and Linguistic Structure. IOS Press.
James, A. and Leather, J. (eds.) (1986) Sound Patterns in Second Language Acquisition. Dordrecht: Foris.
Klatt, D. H. (1975) "Vowel lengthening is Syntactically Determined in a Connected Discourse." Journal of Phonetics :3. 129-140.
Klatt, D. H. (1976) "Linguistic uses of segmental duration in English" JASA:59. 5. 1208-1221.
Koizumi Tamotsu. (1980) 'Onsetsuno Kozo'. In Shibata Takeshi Ed. Gengono Kozo :45-83. Taishukan.
Kubozono Haruo. (1992) 'Nihongono mora: sono yakuwarito tokusei' In Haraguchi S. ed. Nihongono morato onsetsukozoni kansuru sogoteki kenkyu. Mombusho Japan 1992: 48-59.
Kubozono Haruo. (1995) Gokeiseito On'inkozo. Tokyo: Kuroshio.
Ladefoged, P. (1993) A Course in Phonetics, 3rd Ed. New York: Harcourt Brace Jovanovich.
Ladefoged, P. and Maddieson, I. (1996) The Sounds of the World's Languages. Oxford: Blackwell.
Laver, J. (1994) Principles of phonetics. Cambridge: Cambridge University Press.
Lehiste, I. (1970) Suprasegmentals. Cambridge: MIT Press.
Lehiste, I. (1972) "The timing of utterances and linguistic boundaries." JASA: 51. 2018-2024.
Mehler, J., Dommergues, J., Frauenfelder, U., and Segui, J. (1981) 'The syllable's role in speech segmentation.' Journal of Verbal Learning and Verbal Behavior: 9. 295-302.
Meyer, A. S. (1992) 'Investigation of phonological encoding through speech error analyses: Achievements, limitations, and alternatives.' Cognition: 42. 181-211.
Morton, J. and Frankish, C. (1976) "Perceptual centres (P-centers)." Psychological Review :83. 405-408.
Nagai Katsumi. (1995) "A study of rhythm perception model." Gengo Bunka-gaku: 5. Osaka University.
Ohala, J., and Lyberg S. (1976) "Comments on 'temporal interactions within a phrase and sentence.'" JASA:59. 990-992.
Port, R. F., Dalby, J., and O'Dell, M. (1987) 'Evidence for mora timing in Japanese'. JASA: 81(5). 1574-1585.
Port, R. F., Al-Ani, S., and Maeda S. (1980) 'Temporal compensation and universal phonetics,' Phonetica :37. 235-252.
Port, R. F., and Rotunno, R. (1979) "Relation between voice-onset time and vowel duration." JASA:66. 654-662.
Sakuma, K. Nihon Onseigaku. Tokyo: Kazama Shobo.
SPSS Inc. (1988) SPSS-X User's Guide. 3rd Ed. Chicago: SPSS Inc.
Sugito Miyoko. (1989) 'Nihongoto Eigono accent-to intonation'. In Sugito Ed. 1989. Nihongoto Nihongo Kyoiku. Vol. 2. Tokyo: Meiji Shoin.
Sugito Miyoko. (1996) Nihonjin no eigo. Osaka: Izumi Shoin.
Takagi Naoyuki and Mann, V. (1994) "A perceptual basis for the systematic phonological correspondences between Japanese load(sic.) words and their English source words." Journal of Phonetics. 22. 343-356.
Vance, T. J. (1987) An introduction to Japanese phonology. New York: State University of New York Press

Figure 1 Mean word duration by native speakers (at fast tempo)


Figure 2 Mean word duration by elementary learners (at fast tempo)


Figure 3 Mean word duration by advanced learners (at fast tempo)


Figure 4 Mean mora duration by native speakers (at fast tempo)


Figure 5 Mean mora duration by elementary learners (at fast tempo)


Figure 6 Mean mora duration by advanced learners (at fast tempo)


Figure 7 Segmental and word duration by native speakers


Figure 8 Segmental and word duration by British elementary learners of Japanese


Figure 9 Segmental and word duration by British advanced learners of Japanese


(c) Katsumi NAGAI 1998, Centre for Research and Educational Development in Higher Education, and Faculty of Education, Kagawa University, 760-8521 JAPAN