
Paper note - [Week 6]

A customized residual neural network and bi-directional gated recurrent unit-based automatic speech recognition model

Motivation

  • Voice technology is now employed across many industries, helping businesses and consumers drive digitization and automation.

  • Speech recognition is one of the most challenging topics in computer science because of the difficulty of separating phonetically similar sentences and handling the smearing problem.

Contribution

  • The paper proposes a stack of five customized ResNet layers and seven Bi-GRU layers, each followed by layer normalization with learnable element-wise affine parameters, without requiring an external language model (a minimal layer-normalization sketch follows this list).

  • Including a Gaussian error linear unit (GELU) layer together with dense and dropout layers for the classification head proved worthwhile for performance.

  • It demonstrates that the volume of training data significantly affects the model’s recognition performance.
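A minimal sketch of the layer normalization mentioned above, assuming PyTorch: `nn.LayerNorm` with `elementwise_affine=True` provides the learnable per-feature scale and shift; the feature dimension used here is a hypothetical value, not one taken from the paper.

```python
import torch
import torch.nn as nn

# Layer normalization with learnable element-wise affine parameters
# (per-feature gamma and beta), applied after each layer in the model.
feat_dim = 128  # hypothetical feature dimension, not from the paper
layer_norm = nn.LayerNorm(feat_dim, elementwise_affine=True)

x = torch.randn(8, 50, feat_dim)   # (batch, time, features)
y = layer_norm(x)                  # normalized over the last dimension
print(y.shape)                     # torch.Size([8, 50, 128])
```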

Method

[Figure: overview of the proposed method]

Mel spectrogram

[Figure: Mel spectrogram]
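As a rough illustration of the front end, the following sketch converts a raw waveform into a log-Mel spectrogram with `torchaudio`; the sample rate, FFT size, hop length, and number of Mel bins are assumptions, not values reported in the paper.

```python
import torch
import torchaudio

# Hypothetical front-end: raw waveform -> (log-)Mel spectrogram.
mel_transform = torchaudio.transforms.MelSpectrogram(
    sample_rate=16000,
    n_fft=400,
    hop_length=160,
    n_mels=128,
)

waveform = torch.randn(1, 16000)      # 1 s of dummy 16 kHz audio
mel = mel_transform(waveform)         # (channel, n_mels, time)
log_mel = torch.log(mel + 1e-6)       # log compression for numerical stability
print(log_mel.shape)
```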

Residual neural network

[Figure: residual neural network block]
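Below is a hedged sketch of what one customized residual CNN block could look like, following the layer normalization + GELU + convolution + skip-connection pattern described in the contributions; kernel size, channel count, and dropout rate are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualCNNBlock(nn.Module):
    """Sketch of one residual CNN block over the spectrogram: two convolutions,
    each preceded by layer normalization and GELU, plus a skip connection.
    Kernel size, dropout, and channel counts are assumptions, not paper values."""

    def __init__(self, channels, kernel_size=3, dropout=0.1, n_feats=128):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_feats)
        self.ln2 = nn.LayerNorm(n_feats)
        self.conv1 = nn.Conv2d(channels, channels, kernel_size, padding=kernel_size // 2)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size, padding=kernel_size // 2)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):                       # x: (batch, channels, n_feats, time)
        residual = x
        out = x.transpose(2, 3)                 # put the feature axis last for LayerNorm
        out = self.ln1(out).transpose(2, 3)
        out = self.conv1(self.dropout(F.gelu(out)))
        out = out.transpose(2, 3)
        out = self.ln2(out).transpose(2, 3)
        out = self.conv2(self.dropout(F.gelu(out)))
        return out + residual                   # skip connection
```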

Bi-directional gated recurrent units

[Figure: gated recurrent unit]

[Figure: bi-directional GRU]
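A minimal sketch of a single Bi-GRU layer with layer normalization and GELU, matching the recurrent stage described above; the hidden size and dropout rate are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiGRULayer(nn.Module):
    """Sketch of one Bi-GRU layer preceded by layer normalization and GELU.
    Hidden size and dropout are assumptions, not values from the paper."""

    def __init__(self, input_size, hidden_size, dropout=0.1):
        super().__init__()
        self.ln = nn.LayerNorm(input_size)
        self.bigru = nn.GRU(input_size, hidden_size,
                            batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):              # x: (batch, time, input_size)
        out = F.gelu(self.ln(x))
        out, _ = self.bigru(out)       # (batch, time, 2 * hidden_size)
        return self.dropout(out)
```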

Speech recognition model

[Figure: speech recognition model architecture]
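Putting the pieces together, the following sketch stacks five residual CNN blocks and seven Bi-GRU layers ahead of a GELU/dense/dropout classification head, reusing the `ResidualCNNBlock` and `BiGRULayer` sketches above. The per-frame character output (suitable for a CTC-style decoder, which is what lets the model work without an external language model) and all layer sizes are assumptions, not details confirmed by the paper.

```python
import torch
import torch.nn as nn

class SpeechRecognitionModel(nn.Module):
    """Sketch of the end-to-end model: an initial convolution, five residual
    CNN blocks, seven Bi-GRU layers, and a GELU/dense/dropout classifier that
    emits per-frame character logits. All sizes are assumptions."""

    def __init__(self, n_feats=128, n_classes=29, rnn_dim=512, dropout=0.1):
        super().__init__()
        self.stem = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.res_blocks = nn.Sequential(
            *[ResidualCNNBlock(32, n_feats=n_feats) for _ in range(5)]
        )
        self.to_rnn = nn.Linear(32 * n_feats, rnn_dim)
        self.bigru_layers = nn.ModuleList(
            [BiGRULayer(rnn_dim if i == 0 else 2 * rnn_dim, rnn_dim)
             for i in range(7)]
        )
        self.classifier = nn.Sequential(
            nn.Linear(2 * rnn_dim, rnn_dim),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(rnn_dim, n_classes),
        )

    def forward(self, x):                          # x: (batch, 1, n_feats, time)
        out = self.stem(x)
        out = self.res_blocks(out)                 # (batch, 32, n_feats, time)
        b, c, f, t = out.size()
        out = out.permute(0, 3, 1, 2).reshape(b, t, c * f)
        out = self.to_rnn(out)                     # (batch, time, rnn_dim)
        for layer in self.bigru_layers:
            out = layer(out)
        return self.classifier(out)                # (batch, time, n_classes)
```

In this sketch, a forward pass on a batch of log-Mel spectrograms of shape (batch, 1, n_mels, time) yields per-frame logits of shape (batch, time, n_classes).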

Experiments

[Figure: experimental results]

Conclusion

[Figure: conclusion summary]

This post is licensed under CC BY 4.0 by the author.