Facilitating speech recognition through computing devices is generally known. For example, the Android™ operating system provides an application programming interface (API), android.speech, through which Android apps receive transcripts generated from users' speech inputs. iOS, on the other hand, uses a remote speech recognition service, Siri, to recognize speech inputs from users. Various other operating systems, such as Microsoft Windows™, provide speech recognition development tool kits that allow application developers to include program code that performs speech recognition during runtime of an application running on those operating systems.
Conventional development of a speech recognition feature in a cross-platform application typically involves separate coding against platform-specific speech recognition APIs in order to invoke the speech recognition functionalities provided by different operating systems. For example, to develop an application that runs on Android, iOS and other operating systems, the developers of the application are required to know the speech recognition APIs provided by each of those operating systems in order to invoke the speech recognition functionalities of the client devices on which those operating systems are installed. The code base of such an application may therefore comprise platform-specific code sections corresponding to those APIs; in other cases, multiple versions of the application, each corresponding to a different operating system, may be deployed.
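By way of illustration only, the platform-specific code sections described above can be sketched as follows. This is a minimal sketch under stated assumptions: the three `recognize_with_*` functions and the `platform` strings are hypothetical stand-ins for calls into each operating system's speech recognition API, and they return placeholder text rather than invoking any real API.

```python
# Hypothetical stand-ins for platform-specific speech recognition calls.
# None of these functions corresponds to a real library function; each one
# returns a placeholder transcript so the sketch can run anywhere.
def recognize_with_android_speech(audio: bytes) -> str:
    return "transcript from android.speech (stub)"


def recognize_with_ios_service(audio: bytes) -> str:
    return "transcript from iOS speech service (stub)"


def recognize_with_windows_toolkit(audio: bytes) -> str:
    return "transcript from Windows toolkit (stub)"


def recognize(audio: bytes, platform: str) -> str:
    """Dispatch to the code section written for the given platform."""
    # Each branch is a separate platform-specific section that the
    # developer must write and maintain.
    if platform == "android":
        return recognize_with_android_speech(audio)
    if platform == "ios":
        return recognize_with_ios_service(audio)
    if platform == "windows":
        return recognize_with_windows_toolkit(audio)
    raise ValueError(f"unsupported platform: {platform}")
```

Supporting an additional operating system in this arrangement means adding another branch and another platform-specific function, which is the maintenance burden noted above.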
Some speech recognition packages have been developed to encapsulate platform-specific speech recognition APIs. Those packages typically provide their own APIs with generic functional controls for speech recognition that are independent of the underlying operating systems. While this approach somewhat reduces the maintenance and programming effort of developing and deploying cross-platform applications with speech recognition features, the selection of speech recognition functionalities for different operating systems in an application employing such a package is typically made statically at a configuration stage, e.g., during the development stage of the application. Under this approach, the decision to invoke a specific speech recognition functionality for a given type of operating system is typically predetermined by the provider of the package (e.g., hardcoded in the package), regardless of the speech recognition functionalities actually available on client devices at runtime. For example, an application employing such a package is typically linked with android.speech for deployment on Android devices, as hardcoded by the package, regardless of whether the Android devices will actually have android.speech or some other speech functionality available on the device.
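The static selection described above can be illustrated with the following minimal sketch. It is not taken from any actual package: the `HARDCODED_ENGINE` table, the `select_engine` function, and the engine names other than android.speech and Siri are assumptions introduced solely for illustration.

```python
from typing import Set

# Hypothetical mapping fixed by the package provider at development time.
# Only "android.speech" and "Siri" are named in the text above; the other
# entry is an illustrative placeholder.
HARDCODED_ENGINE = {
    "android": "android.speech",
    "ios": "Siri",
    "windows": "windows-speech-toolkit",
}


def select_engine(platform: str, available_on_device: Set[str]) -> str:
    """Return the engine the package was hardcoded to use for a platform.

    The available_on_device argument is deliberately ignored, to show
    that the choice does not depend on what the client device actually
    offers at runtime.
    """
    return HARDCODED_ENGINE[platform]


# An Android device that only offers some other speech functionality
# is still assigned android.speech.
chosen = select_engine("android", {"some-other-engine"})
```

In this sketch `chosen` is "android.speech" even though that engine is absent from the set of functionalities available on the device, which is the shortcoming identified above.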