Bayesian probabilistic approach by blind source separation for instantaneous mixtures


  • Pallavi Agrawal Maulana Azad National Institute of Technology
  • Madhu Shandilya Maulana Azad National Institute of Technology





Gaussian Distribution, Markov Chain Monte Carlo, Noise Covariance, Signal Distortion Ratio, Signal To Interference Ratio.


This work discusses a novel method of blind source separation for instantaneous mixtures based on a Bayesian probabilistic approach. The source separation problem is well suited to the Bayesian approach, which provides a natural and logically consistent way to incorporate prior knowledge when estimating the most probable solution. The distributions of the coefficients of the sources in the basis are modeled by a generalized Gaussian distribution (GGD), which depends on the sparsity parameter q. The method also uses a prior distribution on the sparsity parameter of the sources present in the mixture. Once the prior distributions for all parameters (the mixing matrix, source matrix, sparsity parameter, and error or noise covariance matrix) are defined, a Markov chain Monte Carlo (MCMC) method is used to estimate the posterior distributions of the mixing matrix, source matrix, sparsity parameter, and error covariance matrix. Separation results are reported as signal-to-distortion ratio (SDR), signal-to-artifacts ratio (SAR), and signal-to-interference ratio (SIR) at different signal-to-noise ratios (SNRs).
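The model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an instantaneous mixture X = A S + E with isotropic Gaussian noise, GGD-distributed sources with unit scale, and flat priors on the mixing matrix, the sparsity parameter q, and the noise variance (the paper defines priors for each of these; they are omitted here for brevity). All function names are hypothetical, and the projection-based ratio is a simplified stand-in for the SDR, SAR, and SIR metrics named in the abstract. The log joint density is the unnormalized target that an MCMC sampler would explore.

```python
import math
import numpy as np


def sample_ggd(q, size, rng, scale=1.0):
    """Draw samples with density proportional to exp(-|s / scale| ** q)."""
    # If G ~ Gamma(1/q, 1), then sign * scale * G ** (1/q) follows a GGD with shape q.
    magnitude = rng.gamma(shape=1.0 / q, scale=1.0, size=size) ** (1.0 / q)
    sign = rng.choice([-1.0, 1.0], size=size)
    return scale * sign * magnitude


def ggd_log_prior(S, q, scale=1.0):
    """Sum of GGD log densities over all source coefficients."""
    log_norm = math.log(q) - math.log(2.0 * scale) - math.lgamma(1.0 / q)
    return S.size * log_norm - np.sum(np.abs(S / scale) ** q)


def make_mixture(n_sources=2, n_sensors=2, n_samples=5000, q=1.0, snr_db=20.0, seed=0):
    """Simulate X = A S + E at a requested SNR (illustrative settings only)."""
    rng = np.random.default_rng(seed)
    S = sample_ggd(q, (n_sources, n_samples), rng)
    A = rng.normal(size=(n_sensors, n_sources))
    clean = A @ S
    # Choose the noise variance so that mean signal power / noise power matches snr_db.
    noise_var = np.mean(clean ** 2) / 10.0 ** (snr_db / 10.0)
    E = rng.normal(scale=math.sqrt(noise_var), size=clean.shape)
    return A, S, clean + E, noise_var


def log_joint(X, A, S, q, noise_var):
    """Gaussian log likelihood plus GGD log prior on S (flat priors elsewhere, by assumption)."""
    resid = X - A @ S
    log_lik = (-0.5 * np.sum(resid ** 2) / noise_var
               - 0.5 * X.size * math.log(2.0 * math.pi * noise_var))
    return log_lik + ggd_log_prior(S, q)


def projection_sdr_db(reference, estimate):
    """Simplified distortion ratio: energy of the estimate along the reference vs. the rest."""
    gain = np.dot(estimate, reference) / np.dot(reference, reference)
    target = gain * reference
    residual = estimate - target
    return 10.0 * np.log10(np.sum(target ** 2) / np.sum(residual ** 2))
```

In this parameterization q = 2 gives a Gaussian source, q = 1 gives a Laplacian source, and q < 1 gives heavier tails and sparser coefficients, which is why q serves as the sparsity parameter.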




