Recent
publications in linguistics with Gabriel
Altmann since 2006 |
|
Some
aspects of word frequencies, Glottometrics,
vol. 13, 23-46 (2006) 1812 KB |
Abstract: In the present article some
new aspects of word frequency distributions are presented, namely the h-point,
the k-point, the m-point, the n-point, the Lorenz curve and Gini's coefficient,
all of which characterize some properties of texts. Their relationship
to the measurement of vocabulary richness is scrutinized. It is an attempt
at transferring some views from other domains of science to linguistics. |
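
As a rough illustration of two of the measures named in this abstract, the following sketch computes a Lorenz curve and a Gini coefficient from a list of word frequencies. It uses the standard textbook definitions; the normalization used in the article itself may differ. The function names and the whitespace tokenization are assumptions made for this example only.

```python
from collections import Counter


def word_frequencies(text):
    # Assumption: word forms are lowercased, whitespace-separated tokens.
    return list(Counter(text.lower().split()).values())


def lorenz_curve(freqs):
    # Points (share of word types, share of tokens), types sorted by
    # ascending frequency, starting at (0, 0) and ending at (1, 1).
    xs = sorted(freqs)
    n, total = len(xs), sum(xs)
    points = [(0.0, 0.0)]
    cumulated = 0
    for i, x in enumerate(xs, start=1):
        cumulated += x
        points.append((i / n, cumulated / total))
    return points


def gini(freqs):
    # Standard Gini coefficient: 0 means all types are equally frequent;
    # values near 1 mean a few types carry almost all tokens.
    xs = sorted(freqs)
    n, total = len(xs), sum(xs)
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n
```

For instance, `gini(word_frequencies(some_text))` gives a single number summarizing how unevenly the tokens of `some_text` are spread over its vocabulary.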
Some
geometric properties of word frequency distributions, Göttinger
Beitr. Sprachwiss., Heft 13, 87-98 (2006) 830 KB |
Abstract: The present article shows two
complementary methods for estimating the technique of word exploitation
(repetition) with a given vocabulary. For the sake of simplicity word forms
are counted. The methods are based on the geometric properties of the rank-frequency
distribution and the frequency spectrum. |
On
the dynamics of word classes in text, Glottometrics,
vol. 14, 58-71 (2007), also with Karl-Heinz Best 632 KB |
Abstract: In this study, the distribution
of certain parts of speech, especially auxiliaries, is investigated. Using
the definition of the h-point, we define the thematic concentration of
the text and introduce the concentration unit tcu. |
Confidence
intervals and tests for the h-point and related text characteristics, Glottometrics,
vol. 15, 45-52 (2007), also with Ján Macutek 587 KB |
Abstract: Confidence intervals and tests
for recently introduced text characteristics (the h-point and its relatives)
are derived. |
Writer's
view of text generation, Glottometrics,
vol. 15, 71-81 (2007) 1259 KB |
Abstract: Generally, a "writer's view",
defined by the angle between the ends of the word rank-frequency distribution,
as seen from the h-point, should be limited to the interval [pi/2, pi].
However, as shown in the present paper with 176 texts from 20 languages,
the lower limit actually appears to be the golden number Phi = 1.618...,
rather than pi/2 = 1.57... |
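
The angle described in this abstract can be sketched as the angle between two vectors that start at the h-point. The coordinates are assumptions, since the abstract does not state them: the upper end of the distribution is taken as (1, f(1)), the lower end as (V, 1), where V is the vocabulary size, and the h-point as (h, h), with h simplified to the largest rank whose frequency is at least that rank. The function names are hypothetical.

```python
import math


def h_point(ranked):
    # Simplification: largest rank r with frequency f(r) >= r.
    h = 0
    for rank, freq in enumerate(ranked, start=1):
        if freq >= rank:
            h = rank
        else:
            break
    return h


def writers_view_angle(ranked):
    # ranked: word frequencies sorted in decreasing order.
    h = h_point(ranked)
    v = len(ranked)      # vocabulary size V
    f1 = ranked[0]       # highest frequency f(1)
    # Vectors from the assumed h-point (h, h) to the assumed ends
    # (1, f(1)) and (V, 1) of the rank-frequency distribution.
    ax, ay = 1 - h, f1 - h
    bx, by = v - h, 1 - h
    norm = math.hypot(ax, ay) * math.hypot(bx, by)
    if norm == 0:
        raise ValueError("degenerate distribution: angle is undefined")
    cosine = (ax * bx + ay * by) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cosine)))
```

Under these assumptions the dot product is never positive when h > 1, f(1) > h and V > h, so the result (in radians) falls in [pi/2, pi], which matches the interval given in the abstract.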
Text
ranking by the weight of highly frequent words, In:
Exact methods in the study of language and text, Grzybek, P., Köhler,
R. (eds), Berlin / New York: de Gruyter, pp. 553-562 (accepted December
2005, published July 2007) 208 KB |
Excerpt: ... The present work is aimed
to bring empirical arguments for the transfer of the h-index concept
from scientometrics to linguistics, in other words, to switch the problem
from paper citation ranking to word frequency ranking. Three main classes
of web text sources were used for this purpose, namely The Bible, classical
works, and Nobel lectures. ... In summary, a simple and objective measure
is proposed for text evaluation by a single criterion, namely the percent
of the cumulated number of the first h decreasingly ranked words
out of the total word count. Any (electronic) text can be evaluated in
this way in a matter of seconds. |
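
The measure described in this excerpt can be sketched as follows, assuming that h is computed as in scientometrics (the largest rank whose frequency is at least that rank) and that words are lowercased alphanumeric tokens. Both choices, and all function names, are assumptions for illustration rather than the procedure of the paper.

```python
import re
from collections import Counter


def ranked_frequencies(text):
    # Assumption: a "word" is a maximal run of word characters, lowercased.
    tokens = re.findall(r"\w+", text.lower())
    return sorted(Counter(tokens).values(), reverse=True)


def h_index(ranked):
    # Largest rank r such that the r-th most frequent word occurs >= r times.
    h = 0
    for rank, freq in enumerate(ranked, start=1):
        if freq >= rank:
            h = rank
        else:
            break
    return h


def h_coverage_percent(ranked):
    # Percent of all tokens covered by the first h ranked words.
    total = sum(ranked)
    if total == 0:
        raise ValueError("text contains no words")
    return 100.0 * sum(ranked[:h_index(ranked)]) / total
```

A call such as `h_coverage_percent(ranked_frequencies(text))` then yields one number per text, which is what makes ranking a collection of texts by this criterion straightforward.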
Zipf's
mean and language typology, Glottometrics
16, 31-37 (2008) 201 KB |
Abstract: Zipf's law is not only an expression
of the rank-frequency relationship of words; it also enables us to make
statements about some morphological features of language. In
the present study, several indicators are proposed and their mutual relations
are studied. The data are taken from 20 languages. |
Some
problems of musical texts, Glottometrics
vol. 16, 80-110 (2008), also with Zuzana Martináková and
Ján Macutek 1359 KB |
Abstract: The aim of this article is to
find fixed points and regularities in musical texts, set up statistical
tests for their comparison and observe their development. The analysis
is based on rank-frequency distributions of pitches. The following indicators
are described: the h-point and its angle, the a-indicator, the H-point
and the H-coverage having an affinity to the golden section, and the A-ratio.
Different curves capturing the trends are proposed. The analysis has been
performed on 266 compositions of 12 European composers from Palestrina
to Ligeti. |
Autosemantic
compactness of texts, printed in
"Problems of General, Germanic and Slavic Linguistics", Papers for 70th
anniversary of Professor V. Levickij, Chernivtsi: Books - XXI, pp. 472-480
(2008) |
Abstract: In the article some new concepts
of text building based on the h-point are introduced, namely the thematic
concentration considering the proportion of autosemantics above the h-point,
autosemantic pace filling expressing the proportions of autosemantics in
h-intervals, and autosemantic compactness being a function of the parameters
of the exponential curve representing the pace filling. Tests for comparing
texts are proposed. |
On
diversity of word frequencies and language typology,
Göttinger Beitr. Sprachwiss., Heft 14, 83-91 (2007) |
Abstract: The article shows
that different morphological structures of languages give rise to different
word frequency distributions. The forms of the distributions evaluated
by means of repeat rate and entropy show that strongly synthetic and strongly
analytic languages are situated on the two poles of a continuous scale. |
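
The two measures named in this abstract can be sketched with their usual definitions: the repeat rate as the sum of squared relative frequencies, and the entropy as the Shannon entropy of the relative frequencies. The base-2 logarithm and the function names are assumptions; the article may use a different base or a normalized variant.

```python
import math


def repeat_rate(freqs):
    # Sum of squared relative frequencies; 1.0 when only one word type occurs.
    total = sum(freqs)
    return sum((f / total) ** 2 for f in freqs)


def entropy(freqs):
    # Shannon entropy in bits (base-2 logarithm is an assumption here).
    total = sum(freqs)
    return -sum((f / total) * math.log2(f / total) for f in freqs if f > 0)
```

A flatter frequency distribution gives a lower repeat rate and a higher entropy, so the two values move in opposite directions when texts or languages are compared.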