Certain cloud services may offer a Web Socket streaming connection as an alternative to firing repeated REST requests or polling. To make working with REST APIs in Unity more convenient I built a REST client for Unity based on UnityWebRequest which supports abstract types for JSON / XML serialisation. But given a real-time scenario like “speech to text” using a Web Socket client instead of REST gives an option to stream the audio data and get intermediate results back which provides responsive feedback to users for an improved user experience.
|Bing Speech||REST||Web Socket|
|Audio stream duration||15 secs||180 secs - 10 mins|
|Stream audio with intermediate results||No||Yes|
Also when it comes to client app development in Unity there are a couple of very useful message events you receive from the WebSockets server:
To use Web Sockets in Unity you can use the WebSocket-Sharp library, but this only supports the Unity Editor and the mono target platforms. In order to use Web Sockets when targeting Windows Mixed Reality headsets you have to use Universal Windows Platform (UWP) APIs like MessageWebSocket. To make things easier I have created a common Unity Web Socket interface to use WebSocket-Sharp inside the Editor and mono platforms and then use
MessageWebSocket when targeting the Windows Store App platform for MR headsets.
|ConfigureWebSocket(url)||Configures web socket with url and optional headers|
|ConnectAsync()||Connect to web socket|
|CloseAsync()||Close web socket connection|
|SendAsync(data)||Send binary byte or UTF-8 text string with optional callback|
|IsOpen()||Check if web socket status is open|
|Url()||Return the URL being used by the web socket|
OnError(object sender, WebSocketErrorEventArgs e); OnOpen(object sender, EventArgs e); OnMessage(object sender, WebSocketMessageEventArgs e); OnClose(object sender, WebSocketCloseEventArgs e);
If you are interested in the example above there I have also prepared a Unity Web Socket demo project to show Bing Speech service to LUIS for controlling scene game objects using natural speech commands for Mixed Reality scenarios.