Regression model of rank-frequency data of Tamil text

  • Authors

    • S Lakshmisridevi, Hindustan Institute of Technology and Science
    • R Devanathan
    2018-07-20
    https://doi.org/10.14419/ijet.v7i3.11653
  • Zipf's Law, ZM Law, Chi-Square Test, Goodness of Fit.
  • Zipf's law applies widely, not only in linguistics but also in various other areas. Mandelbrot modified Zipf's law into the Zipf-Mandelbrot (ZM) law, and we have further proposed a modification of the ZM law for modeling rank-frequency data of linguistic text. Our model generalizes the ZM law into a linear regression model involving an arbitrary order of the Zipfian rank of words in a text. The performance of the proposed model was studied for an English text and shown to compare favorably with that of the ZM law using the Chi-Square goodness-of-fit test. In this paper we apply the model to a Tamil text, where it also performs satisfactorily, as confirmed by the Chi-Square test. The model mainly addresses the lower ranks, and we propose to extend the work to higher-order ranks using an LNRE model in the future.
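
The workflow outlined in the abstract (rank the words, fit a regression on rank, check the fit with a Chi-Square statistic) can be illustrated with a minimal sketch. The abstract does not give the exact regression form, so the code below assumes one possible reading: log frequency regressed on a polynomial of arbitrary order in log rank. The function names, the synthetic data, and this polynomial form are all illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return word frequencies sorted in descending order (rank 1 first)."""
    return sorted(Counter(tokens).values(), reverse=True)


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_log_polynomial(freqs, order):
    """Least-squares fit of log f(r) = sum_k beta_k * (log r)**k, k = 0..order.

    ASSUMPTION: this polynomial-in-log-rank form is a stand-in for the
    paper's regression model, whose exact form the abstract does not state.
    """
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    # Normal equations (X^T X) beta = X^T y for the polynomial design matrix.
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(order + 1)]
           for i in range(order + 1)]
    xty = [sum((x ** i) * y for x, y in zip(xs, ys))
           for i in range(order + 1)]
    return solve_linear(xtx, xty)


def predict(beta, n_ranks):
    """Expected frequency at ranks 1..n_ranks under the fitted model."""
    return [math.exp(sum(b * math.log(r) ** k for k, b in enumerate(beta)))
            for r in range(1, n_ranks + 1)]


def chi_square(observed, expected):
    """Pearson Chi-Square statistic: sum of (O - E)^2 / E over ranks."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Synthetic frequencies following pure Zipf (f = 1000 / r), not real data.
    freqs = [1000.0 / r for r in range(1, 21)]
    beta = fit_log_polynomial(freqs, order=1)
    print("coefficients:", beta)
    print("chi-square:", chi_square(freqs, predict(beta, len(freqs))))
```

With `order=1` this assumed form reduces to Zipf's law (a straight line in log-log coordinates); higher orders add curvature. For a real corpus, the input to `rank_frequency` would be the list of word tokens, for example obtained by splitting the Tamil text on whitespace, and the resulting statistic would be compared against a Chi-Square critical value for the appropriate degrees of freedom.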

     

  • References

      [1] Zipf, G. K., The Psycho-Biology of Language, Houghton Mifflin, Boston (1935).

      [2] Zipf, G. K., Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology, Hafner, New York (1949).

      [3] Wyllys, Ronald E. "Empirical and theoretical bases of Zipf's law." Library Trends 30.1 (1981): 53-64.

      [4] Mandelbrot, B., An information theory of statistical structure of language, in W. E. Jackson (ed.), Communication Theory, Academic Press, New York (1953), pp. 503-512.

      [5] Mandelbrot, B., On the theory of word frequencies and on related Markovian models of discourse, in R. Jakobson (ed.), Structure of Language and its Mathematical Aspects, American Mathematical Society, Providence, Rhode Island (1962), pp. 190-219.

      [6] Montemurro, Marcelo A. "Beyond the Zipf–Mandelbrot law in quantitative linguistics." Physica A: Statistical Mechanics and its Applications 300.3 (2001): 567-578. https://doi.org/10.1016/S0378-4371(01)00355-7.

      [7] Khmaladze, E. V., The statistical analysis of large number of rare events, Technical Report MS-R8804, Dept. of Mathematical Statistics, CWI, Amsterdam: Center of Mathematics and Computer Science (1987).

      [8] Evert, Stefan. "A simple LNRE model for random character sequences." Proceedings of JADT. Vol. 2004. (2004).

      [9] Popescu, Ioan-Ioviț. Word Frequency Studies. Vol. 64. Walter de Gruyter, 2009.

      [10] https://book.ponniyinselvan.in/

  • How to Cite

    Lakshmisridevi, S., & Devanathan, R. (2018). Regression model of rank-frequency data of Tamil text. International Journal of Engineering & Technology, 7(3), 1558-1560. https://doi.org/10.14419/ijet.v7i3.11653