An approach to spectral analysis of psychologically influenced speech

  • Authors

    • Bhagyalaxmi Jena
    • Sudhansu Sekhar Singh
    https://doi.org/10.14419/ijet.v7i1.2.8993

    Received date: December 30, 2017

    Accepted date: December 30, 2017

    Published date: December 28, 2017

  • Keywords: Speech Signal, Stress, Fast Fourier Transform, Spectrogram, Power Spectral Density.
  • Abstract

    The significant part of any speech signal lies in its information content and in its emotional content, such as stress or fatigue, at a particular period of time. The various types of stress and their effects are classified and defined here. To analyze how stressed speech differs from normal speech, a database has been created to investigate stress among students during examinations at our college. In this paper, spectral analysis of speech is carried out, with emphasis on the Fast Fourier Transform (FFT), the spectrogram and the Power Spectral Density (PSD). These quantities have been computed using MATLAB code. They are also compared between normal speech and psychologically stressed speech.
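
    The three spectral measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' MATLAB code: it is written in Python using only the standard library, and the two test signals are synthetic pure tones (hypothetical stand-ins for a "normal" and a "stressed" utterance, with assumed fundamentals of 125 Hz and 187.5 Hz at an assumed 8 kHz sampling rate). All function names and parameters below are assumptions made for illustration.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def periodogram(x, fs):
    """One-sided periodogram PSD estimate of a Hann-windowed frame."""
    n = len(x)
    w = hann(n)
    scale = fs * sum(v * v for v in w)  # normalises to power per Hz
    spec = fft([a * b for a, b in zip(x, w)])
    psd = [abs(c) ** 2 / scale for c in spec[: n // 2 + 1]]
    for k in range(1, n // 2):  # fold negative frequencies into interior bins
        psd[k] *= 2
    freqs = [k * fs / n for k in range(n // 2 + 1)]
    return freqs, psd


def spectrogram(x, fs, frame=256, hop=128):
    """Short-time PSD: one periodogram per overlapping frame."""
    freqs = [k * fs / frame for k in range(frame // 2 + 1)]
    frames = []
    for start in range(0, len(x) - frame + 1, hop):
        frames.append(periodogram(x[start:start + frame], fs)[1])
    return freqs, frames


def peak_frequency(freqs, psd):
    """Frequency of the bin with the largest PSD value."""
    return freqs[max(range(len(psd)), key=psd.__getitem__)]


def tone(f0, fs, n):
    """Synthetic sinusoid; a hypothetical stand-in for recorded speech."""
    return [math.sin(2 * math.pi * f0 * i / fs) for i in range(n)]


FS = 8000  # assumed sampling rate in Hz
N = 1024   # number of samples (power of two for the radix-2 FFT)

normal = tone(125.0, FS, N)    # assumed "normal" fundamental
stressed = tone(187.5, FS, N)  # assumed "stressed" fundamental

f_normal, psd_normal = periodogram(normal, FS)
f_stressed, psd_stressed = periodogram(stressed, FS)
spec_freqs, spec_frames = spectrogram(stressed, FS)

print("PSD peak, normal signal  :", peak_frequency(f_normal, psd_normal), "Hz")
print("PSD peak, stressed signal:", peak_frequency(f_stressed, psd_stressed), "Hz")
print("Spectrogram frames:", len(spec_frames), "x", len(spec_frames[0]), "bins")
```

    With real recordings, the synthetic tones would be replaced by sampled speech, and the comparison would look at how the PSD and spectrogram differ between the two conditions rather than at a single peak.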

  • How to Cite

    Jena, B., & Sekhar Singh, S. (2017). An approach to spectral analysis of psychologically influenced speech. International Journal of Engineering and Technology, 7(1.2), 66-70. https://doi.org/10.14419/ijet.v7i1.2.8993