Publications
-
Jaiswal, Rahul Kumar; Elnourani, Mohamed; Deshmukh, Siddharth & Beferull-Lozano, Baltasar
(2023).
Location-free Indoor Radio Map Estimation using Transfer learning,
2023 IEEE 97th Vehicular Technology Conference (VTC2023-Spring).
IEEE conference proceedings.
ISBN 979-8-3503-1114-3.
doi:
10.1109/VTC2023-Spring57618.2023.10200979.
Full text in Research Archive
Accurate estimation of radio maps is important for various applications of wireless communications, such as network planning and resource allocation. To learn accurate radio map models, one needs accurate knowledge of transmitter and receiver locations. However, accurate locations are difficult to obtain in practice, especially in scenarios with a high degree of wireless multi-path. Alternatively, time of arrival (ToA) features, which are easier to obtain, can be employed for estimating radio maps. To this end, this paper investigates the application of a transfer learning method using ToA features for estimating radio maps in indoor wireless communications. The performance is compared with scenarios where only the receiver locations, or both ToAs and receiver locations, are used for estimating radio maps, assuming that the locations are known. Due to changes in propagation characteristics, a radio map model learned in a specific wireless environment cannot be directly employed in a new wireless environment. To address this issue, a data-driven transfer learning method is designed that transfers and fine-tunes a deep neural network model learned for a radio map from a source wireless environment to other, distinct (target) wireless environments. Our proposed method predicts the training data required in the new wireless environments using a data-driven similarity measure. Our results demonstrate that using ToA (location-free) features requires fewer sensor measurements to estimate radio maps with good accuracy than a location-based approach, where accurate location estimates may be difficult to obtain. This leads to a saving of 70-90% of the necessary sensor measurement data at a mean square error (MSE) of 0.004.
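The transfer-and-fine-tune idea summarized above can be sketched with a toy stand-in: a linear model plays the role of the DNN, synthetic numbers play the role of measurements, and the target-environment model is initialized from the source weights rather than from scratch. This is only an illustration of the general technique, not the authors' implementation.

```python
import random

def fit(X, y, w=None, lr=0.05, epochs=100):
    """Train a linear model y ~ w.x by stochastic gradient descent on MSE."""
    if w is None:
        w = [0.0] * len(X[0])          # training from scratch
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
    return w

random.seed(0)
# Source environment: plenty of (feature, received-power) pairs.
src_X = [[1.0, random.random()] for _ in range(200)]
src_y = [2.0 + 3.0 * x[1] for x in src_X]
w_src = fit(src_X, src_y)

# Target environment: slightly different propagation, few measurements.
tgt_X = [[1.0, random.random()] for _ in range(10)]
tgt_y = [2.2 + 2.8 * x[1] for x in tgt_X]

# Transfer: initialize from the source weights and fine-tune briefly,
# instead of learning the target model from zero.
w_tgt = fit(tgt_X, tgt_y, w=list(w_src), epochs=20)
```

Because fine-tuning starts close to a good solution, it needs far fewer target samples and iterations than training from scratch, which mirrors in spirit the measurement savings reported in the abstract.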
-
Jaiswal, Rahul Kumar & Dubey, Rajesh Kumar
(2023).
CAQoE: A Novel No-Reference Context-aware Speech Quality Prediction Metric.
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP).
ISSN 1551-6857.
19(1),
p. 1–23.
doi:
10.1145/3529394.
The quality of speech degrades while communicating over Voice over Internet Protocol applications, for example, Google Meet, Microsoft Skype, and Apple FaceTime, due to different types of background noise present in the surroundings. This reduces the human-perceived Quality of Experience (QoE). Along this line, this article proposes a novel speech quality prediction metric that can meet humans' desired QoE level. Our motivation is driven by the lack of speech quality metrics that can distinguish different noise degradations before predicting the quality of speech. The quality of speech in noisy environments is improved by speech enhancement algorithms, and objective speech quality metrics are used for measuring and monitoring the quality of speech. Integrating these components, this article proposes a novel no-reference context-aware QoE prediction metric (CAQoE), which first identifies the context (noise or degradation type) of the input noisy speech signal and then predicts context-specific speech quality for that signal. Knowing the types of degradations causing poor speech quality, along with the quality metric, is of great importance in selecting speech enhancement algorithms. Results demonstrate that the proposed CAQoE metric outperforms, across different contexts, a metric that does not identify contexts before predicting speech quality, even with the limited-size speech corpus covering different contexts available from the NOIZEUS speech database.
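The two-stage structure described above (identify the context first, then apply a context-specific quality model) can be sketched as follows; the classifier rule, feature names, and per-context regressors here are invented placeholders, not the CAQoE components.

```python
def detect_context(features):
    """Stage 1: noise-type classifier (a placeholder threshold rule)."""
    return "babble" if features["modulation"] > 0.5 else "car"

# Stage 2: one quality regressor per context (illustrative stand-ins).
QUALITY_MODELS = {
    "babble": lambda f: 4.5 - 3.0 * f["noise_level"],
    "car":    lambda f: 4.5 - 2.0 * f["noise_level"],
}

def predict_quality(features):
    """Route the signal's features through the model for its context."""
    ctx = detect_context(features)
    mos = QUALITY_MODELS[ctx](features)
    return ctx, max(1.0, min(5.0, mos))  # clamp to the MOS scale 1-5

ctx, mos = predict_quality({"modulation": 0.8, "noise_level": 0.5})
# -> ("babble", 3.0)
```

The design point is the dispatch itself: a model specialized to the detected degradation type can weight the same features differently than a single one-size-fits-all predictor.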
-
Jaiswal, Rahul Kumar & Dubey, Rajesh Kumar
(2023).
Multiple time-instances features based approach for reference-free speech quality measurement.
Computer Speech and Language.
ISSN 0885-2308.
79.
doi:
10.1016/j.csl.2022.101478.
This paper investigates the problem of measuring the quality of a received speech signal without employing the original speech signal. Deterioration of speech quality arises due to noise present in the surroundings. To this end, we propose a multiple time-instances (MTI) features-based model for reference-free speech quality measurement. A voice activity detector (VAD) is first used to identify the active speech chunks of a speech signal. For these chunks and their successive combinations, here called batches, multi-resolution auditory model (MRAM), mel-frequency cepstral coefficient (MFCC), and line spectral frequency (LSF) features are extracted and called MTI features. It is hypothesized that the MTI features can capture the distortions caused by time-localized effects of short-time transients and impulsive noise, and their differences from plosive sounds. The MTI metric estimates (MTI-ME) corresponding to these MTI features are calculated using the Gaussian mixture model (GMM) probabilistic technique. The overall objective speech quality of a speech signal is then determined as a linear combination of optimally weighted MTI-ME corresponding to the distinct active speech chunks and their successive combinations (batches) of that speech signal. The optimal weights are computed using a minimum mean square error criterion or Pearson's correlation maximization criterion. In addition, a deep neural network (DNN)-based speech quality model is developed for calculating a single objective speech quality score while considering all active speech chunks together. Pearson's correlation coefficient and weighted average correlation are used to evaluate performance. Results demonstrate that the proposed model achieves a promising improvement over the standard speech quality model (P.563), improving correlation values by around 37%.
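The final combination step above, a linear mix of per-batch estimates with weights chosen by MSE minimization, can be illustrated with synthetic numbers (two estimates per utterance; the data below is constructed for the example, not taken from the paper):

```python
def optimal_weights(estimates, targets):
    """Least-squares weights w minimizing sum((E w - t)^2) for two
    per-batch metric estimates; solves the 2x2 normal equations."""
    a = sum(e[0] * e[0] for e in estimates)
    b = sum(e[0] * e[1] for e in estimates)
    c = sum(e[1] * e[1] for e in estimates)
    p = sum(e[0] * t for e, t in zip(estimates, targets))
    q = sum(e[1] * t for e, t in zip(estimates, targets))
    det = a * c - b * b
    return [(c * p - b * q) / det, (a * q - b * p) / det]

# Two per-batch quality estimates per utterance vs. subjective MOS.
E = [[3.1, 2.9], [4.0, 4.2], [2.0, 2.2], [3.5, 3.4]]
mos = [3.0, 4.1, 2.1, 3.45]
w = optimal_weights(E, mos)
combined = [w[0] * e[0] + w[1] * e[1] for e in E]
```

By construction the subjective scores here equal the average of the two estimates, so least squares recovers w close to [0.5, 0.5]; with real data the weights reflect how informative each batch's estimate is.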
-
Jaiswal, Rahul Kumar; Deshmukh, Siddharth; Elnourani, Mohamed & Beferull-Lozano, Baltasar
(2022).
Transfer Learning Based Joint Resource Allocation for Underlay D2D Communications,
2022 IEEE Wireless Communications and Networking Conference (WCNC).
IEEE conference proceedings.
ISBN 978-1-6654-4266-4.
p. 1479–1484.
doi:
10.1109/WCNC51071.2022.9771636.
Full text in Research Archive
In this paper, we investigate the application of transfer learning to train a Deep Neural Network (DNN) model for joint channel and power allocation in underlay device-to-device (D2D) communication. With traditional optimization solutions, generating a training dataset for scenarios with perfect channel state information (CSI) is not computationally demanding, compared to scenarios with imperfect CSI. Thus, a transfer learning-based approach can be exploited to transfer the DNN model trained for the perfect CSI scenarios to the imperfect CSI scenarios. We also consider the issue of defining the similarity between the two types of resource allocation tasks. For this, we first determine the value of the outage probability for which the two resource allocation tasks are the same, that is, for which our numerical results show a minimal need for relearning from the transferred DNN model. For other values of the outage probability, there is a mismatch between the two tasks, and our results illustrate a more efficient relearning of the transferred DNN model. Our results show that the dataset required for relearning the transferred DNN model is significantly smaller than the training dataset required for a DNN model without transfer learning.
-
Jaiswal, Rahul Kumar; Elnourani, Mohamed; Deshmukh, Siddharth & Beferull-Lozano, Baltasar
(2022).
Deep Transfer Learning Based Radio Map Estimation for Indoor Wireless Communications,
2022 IEEE 23rd International Workshop on Signal Processing Advances in Wireless Communication (SPAWC).
IEEE conference proceedings.
ISBN 978-1-6654-9455-7.
doi:
10.1109/SPAWC51304.2022.9833974.
Full text in Research Archive
This paper investigates the problem of transfer learning in radio map estimation for indoor wireless communications, which can be exploited for different applications, such as channel modelling, resource allocation, network planning, and reducing the number of necessary power measurements. Due to the nature of wireless communications, a radio map model developed in a particular environment cannot be directly used in a new environment because of changes in the propagation characteristics; creating a new model for every environment generally requires a large amount of data and is computationally demanding. To address these issues, we design an effective novel data-driven transfer learning procedure that transfers and fine-tunes a deep neural network (DNN)-based radio map model learned in an original indoor wireless environment to other, different indoor wireless environments. Our method makes it possible to predict the amount of training data needed in new indoor wireless environments when performing transfer learning, using our similarity measure. Our simulation results illustrate that the proposed method achieves a saving of 60-70% in sensor measurement data and is able to adapt to a new wireless environment with a small amount of additional data.
-
Jaiswal, Rahul Kumar
(2022).
Automatic Noise Class Detection For Improving Speech Quality Using Artificial Neural Networks,
2022 7th International Conference on Communication and Electronics Systems (ICCES).
IEEE conference proceedings.
ISBN 978-1-6654-9634-6.
p. 1285–1288.
doi:
10.1109/ICCES54183.2022.9836006.
Speech quality degrades in noisy environments while using a voice over internet protocol (VoIP) application, for example, Microsoft Skype or Apple FaceTime. Speech enhancement algorithms separate noise from the degraded speech, and a no-reference speech quality metric (SQM), such as P.563, measures the quality of speech. To this end, this study develops a novel approach for extracting features from the noisy speech samples available in the NOIZEUS corpus to detect the noise class (noise type and SNR) using a deep neural network (DNN). It integrates speech enhancement algorithms with the SQM to estimate the speech quality (MOS score) of the noisy samples, which is then used as a feature vector to train the DNN. Results demonstrate that the DNN outperforms machine learning classifiers in detecting the noise class, tested with different noise classes. This suggests developing a noise-sensitive speech quality prediction model for real-time measuring and monitoring of speech quality.
-
Dayal, Aveen; Yeduri, Sreenivasa Reddy; Koduru, Balu Harshavardan; Jaiswal, Rahul Kumar; J, Soumya; M. B., Srinivas et al. (8 contributors in total)
(2022).
Lightweight deep convolutional neural network for background sound classification in speech signals.
Journal of the Acoustical Society of America.
ISSN 0001-4966.
151(4),
p. 2773–2786.
doi:
10.1121/10.0010257.
Recognizing background information in human speech signals is extremely useful in a wide range of practical applications, and many articles on background sound classification have been published. The problem has not, however, been addressed with background sounds embedded in real-world human speech signals. Thus, this work proposes a lightweight deep convolutional neural network (CNN) in conjunction with spectrograms for efficient background sound classification in practical human speech signals. The proposed model classifies 11 different background sounds, namely airplane, airport, babble, car, drone, exhibition, helicopter, restaurant, station, street, and train sounds, embedded in human speech signals. The proposed deep CNN model consists of four convolution layers, four max-pooling layers, and one fully connected layer. The model is tested on human speech signals with varying signal-to-noise ratios (SNRs). The proposed deep CNN model utilizing spectrograms achieves an overall background sound classification accuracy of 95.2% on human speech signals with a wide range of SNRs. The proposed model also outperforms the benchmark models in terms of both accuracy and inference time when evaluated on edge computing devices.
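The spectrogram front end mentioned above can be illustrated with a minimal, dependency-free sketch; a real pipeline would use an optimized FFT and typically log-mel scaling, and the frame sizes here are arbitrary choices, not the paper's configuration:

```python
import cmath
import math

def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram via a naive DFT over sliding frames
    (illustrative only; real pipelines use an FFT library)."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        mags = []
        for k in range(frame_len // 2 + 1):       # non-negative bins only
            s = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n, x in enumerate(frame))
            mags.append(abs(s))
        frames.append(mags)
    return frames  # time x frequency grid, the CNN's 2-D input

# A pure tone that falls exactly on bin 8 of a 64-point frame.
sig = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
S = spectrogram(sig)
```

Each row of the resulting time-frequency grid is one frame's spectrum; stacking the rows yields the image-like input that a small CNN such as the one described can classify.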
-
Jaiswal, Rahul Kumar
(2021).
Influence of Silence and Noise Filtering on Speech Quality Monitoring,
2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD).
IEEE conference proceedings.
ISBN 978-1-6654-2786-9.
p. 109–113.
doi:
10.1109/SpeD53181.2021.9587364.
-
Jaiswal, Rahul Kumar & Dubey, Rajesh Kumar
(2021).
Concatenative Text-to-Speech Synthesis System for Communication Recognition,
2021 5th International Conference on Electronics, Communication and Aerospace Technology (ICECA).
IEEE conference proceedings.
ISBN 978-1-6654-3524-6.
doi:
10.1109/ICECA52323.2021.9675855.
-
Jaiswal, Rahul Kumar & Romero, Daniel
(2021).
Implicit Wiener Filtering for Speech Enhancement In Non-Stationary Noise,
2021 11th International Conference on Information Science and Technology (ICIST).
IEEE conference proceedings.
ISBN 978-1-6654-1266-7.
p. 39–47.
doi:
10.1109/ICIST52614.2021.9440639.
Full text in Research Archive
-
Jaiswal, Rahul Kumar & Hines, Andrew
(2020).
Towards a Non-Intrusive Context-Aware Speech Quality Model,
2020 31st Irish Signals and Systems Conference (ISSC).
IEEE conference proceedings.
ISBN 978-1-7281-9418-9.
doi:
10.1109/ISSC49989.2020.9180171.
View all works in Cristin
-
Jaiswal, Rahul Kumar; Elnourani, Mohamed; Deshmukh, Siddharth & Beferull-Lozano, Baltasar
(2023).
Location-free Indoor Radio Map Estimation using Transfer learning.
-
Jaiswal, Rahul Kumar
(2021).
Speech Activity Detection under Adverse Noisy Conditions at Low SNRs.
-
Published Apr. 16, 2024 11:33 AM