Communications through speech-to-speech pipelines

Dantas, Rafael

thesis

posted on 2022-12-22, 14:21 authored by Rafael Dantas

Increased urbanisation drastically reduces the area a mobile Internet Service Provider (ISP) may be required to cover to provide a reasonable service. While ISP networks are constantly being upgraded, market pressures may influence the order in which these upgrades are performed, prioritising denser population centres while leaving the more sparsely populated regions of a country relegated to older and slower infrastructure. In such places, there is still value in reducing the amount of data required by Voice over IP (VoIP) applications to carry calls over the Internet. This thesis proposes the combined use of speech-to-text to encode the message being transmitted and a text-to-speech synthesiser to decode the message back into audible waves, resulting in great savings in bandwidth. A black-box experiment was conducted to analyse the performance of 10 popular mobile VoIP applications on the Android platform with respect to both network usage and perceived call quality. There was a clear correlation between the quality reported by the users and the amount of data used by the application to represent the conversation. Finally, another experiment was conducted to test the viability of using a speech-to-speech pipeline as a coding method by measuring the average Word Error Rate (WER) of users transcribing Semantically Unpredictable Sentences (SUS) presented by either a prerecorded human voice or a synthesised one. The experiment demonstrated a very small WER difference between the prerecorded human speech and the synthesised speech. In conclusion, the results of our experiments imply that a speech-to-speech pipeline can be used in place of regular speech coding for massive data savings, at the cost of the extra-textual information. Additionally, this speech-to-speech pipeline can be made entirely independent of traditional Cloud-based solutions.
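The Word Error Rate mentioned in the abstract is commonly computed as the word-level edit distance between a reference sentence and a transcription, divided by the number of words in the reference. The following is a minimal sketch of that standard definition, not the scoring procedure used in the thesis: the function name `word_error_rate`, the lowercasing, and the whitespace tokenisation are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are compared case-insensitively after splitting on
    whitespace; the thesis may normalise transcriptions differently.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

As a hypothetical usage example (the sentence is invented here, not taken from the experiment), a listener who transcribes "the table walked through the blue truth" as "the table talked through the blue truth" has one substituted word out of seven, giving a WER of 1/7. Averaging this value over all sentences heard in each condition would give the per-condition figure the abstract compares.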

History

Faculty

Faculty of Science and Engineering

Degree

Master (Research)

First supervisor

Exton, Chris

Second supervisor

Le Gear, Andrew

Note

peer-reviewed

Other Funding information

SFI, ERDF

Language

English

Also affiliated with

LERO - The Irish Software Research Centre

Department or School

Computer Science & Information Systems
