In early July I wanted to start an AI project and chose Hugging Face Spaces as the framework to begin with. After looking through several collections of HF models, I decided to build an app myself.
Previously I had written UIs for some personal projects from scratch, with little code being reused. Having become interested in tasks in the Natural Language Processing and Audio domains, I planned to build a unified UI framework that could apply to all of these scenarios. Through the development process I gradually recognized the importance of automatic language detection in speech models, and eventually arrived at an idea that is a bit different from other demos on HF: an app that detects what an audio clip says, and then conveniently translates it to help us understand it, Audio_Transcribe_Translate. Fortunately, the integrated models are available, and the application is on my HF Hub: STT Translate & Transcribe.
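The core flow described above (auto-detect the spoken language, transcribe, then translate) can be sketched roughly as follows. This is only an illustrative outline assuming the `transformers` library; the model names (`openai/whisper-small`, `facebook/nllb-200-distilled-600M`) and the function name are my own placeholder choices, not necessarily what the actual Space uses.

```python
# Hypothetical sketch of the transcribe-then-translate flow.
# Model names are illustrative assumptions, not the app's actual models.

def transcribe_and_translate(audio_path, src_lang="fra_Latn", tgt_lang="eng_Latn"):
    """Transcribe an audio file, then translate the transcript."""
    # Imported lazily so the sketch is cheap to load;
    # actually running it requires `pip install transformers`.
    from transformers import pipeline

    # Whisper auto-detects the spoken language when none is forced,
    # which is the auto-detection behaviour mentioned above.
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
    transcript = asr(audio_path)["text"]

    # NLLB expects FLORES-style language codes such as "fra_Latn"/"eng_Latn".
    translator = pipeline("translation", model="facebook/nllb-200-distilled-600M")
    translation = translator(
        transcript, src_lang=src_lang, tgt_lang=tgt_lang
    )[0]["translation_text"]
    return transcript, translation
```

Given a French recording, for example, calling `transcribe_and_translate("clip.wav")` would return the French transcript together with its English translation.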
As a newborn project, it will inevitably have problems and defects. Development will be a progressive process, and continuous effort will be devoted to improving quality and adding new features. If you find a bug or have a feature request, please don't hesitate to raise an issue and let me know.