The main objective of this paper is to define psycho-acoustically confirmed acoustic correlates of rhythmic structure across various languages, accents and speaking styles. To this end, speech rhythm is described in terms of grouped sequences of
beats, which are characterized by their prominence structure. Duration is identified as the major acoustic correlate of speech rhythm organization: it signals both beginnings and ends of rhythmic groups at different
hierarchical levels of rhythmic organization and is the most robust cue to perceptual prominence. To visualize the impact of relative durations across rhythmically salient beat transitions, e.g., at phrase
boundaries or stresses, time-delay plots are introduced. The method is evaluated quantitatively, both statistically and with a k-nearest-neighbor (KNN) classifier.
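As a minimal sketch of what a time-delay plot contains, the example below pairs each duration with the one that follows it, so that every beat transition becomes one point (x = current duration, y = next duration). The abstract does not spell out the exact construction, so the lag-1 pairing, the function names `time_delay_pairs` and `fraction_lengthening`, and the duration values are all illustrative assumptions rather than the paper's method or data.

```python
from typing import List, Sequence, Tuple


def time_delay_pairs(durations: Sequence[float], lag: int = 1) -> List[Tuple[float, float]]:
    """Pair each duration with the one `lag` steps later.

    Each returned tuple is one point of a time-delay plot:
    (duration at position i, duration at position i + lag).
    The lag-1 default is an assumption made for this sketch.
    """
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [(durations[i], durations[i + lag]) for i in range(len(durations) - lag)]


def fraction_lengthening(pairs: Sequence[Tuple[float, float]]) -> float:
    """Share of transitions in which the second duration is longer than the first.

    In the plot, these are the points lying above the diagonal y = x.
    """
    if not pairs:
        return 0.0
    return sum(1 for x, y in pairs if y > x) / len(pairs)


# Hypothetical beat-to-beat durations in seconds (made up for illustration).
example_durations = [0.18, 0.21, 0.35, 0.17, 0.19, 0.40]
pairs = time_delay_pairs(example_durations)

for current, following in pairs:
    print(f"{current:.2f} -> {following:.2f}")
print("fraction of lengthening transitions:", fraction_lengthening(pairs))
```

Points far above the diagonal correspond to transitions with strong relative lengthening, points near it to roughly equal successive durations. The pairs can be passed to any scatter-plot routine to draw the plot itself.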
The visualization technique reveals different relative timing patterns for French and English, which provide prototypical cases for the classic distinction between stress and syllable timing. An analysis of relative
timing across several languages shows that the traditional classification into stress and syllable timing falls short. An appropriate quantitative model of speech rhythm must be multidimensional and take psycho-acoustic facts into account.