Created several proof-of-concept webapps in 2017 experimenting with ideas to make virtual meetings more immersive. Our approach was largely inspired by the Star Wars™ films: the Jedi Council held meetings in which remote participants appeared in their seats as holographic projections and, in turn, could see the council chamber. I prototyped the same two-way immersive meeting idea using WebGL 3-D and WebRTC in web browsers.
The final prototype used 180° and 360° panoramic cameras and monitors placed at each seat in the physical meeting room where a remote participant would have a seat at the table. (This was our substitute for the per-seat holographic projections in Star Wars™. To save costs, two remote participants with adjacent seats in the meeting room shared the same 360° camera, as shown in the diagram below.) The virtual seats, controlled by their remote participants, could then interact with the physical participants and with other remote participants in the meeting room. Each remote seat used an Intel Skull Canyon gaming NUC running Linux, and each was portable and on wheels so that the hardware could be quickly rolled to the meeting table in place of a real seat.
The 360° camera at each station (or pair of adjacent stations) let the remote participants look anywhere in the meeting room from their laptop screens. Likewise, meeting-room participants could see each remote participant's face on the associated virtual seat monitor and could tell which way that participant's 140° slice of the camera stream was pointing: CSS 3-D perspective warping of the remote person's face showed the viewing angle of each participant's video slice. Even though pairs of remote participants shared the same 360° camera, each remote participant had independent control over their virtual camera direction. This was possible because the full 360° stream was delivered to each remote participant's laptop, so the software in each laptop could select its own 140° virtual camera view via WebGL (with GPU acceleration). Digital zooming within the 140° slice was also implemented.
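The per-laptop slice selection can be sketched as a small mapping function; the names and layout below are my own illustration, not the original code, assuming an equirectangular stream where the full 360° spans the texture width.

```javascript
// Sketch: map each participant's independently chosen yaw to a horizontal
// texture window over the shared equirectangular stream.
const SLICE_FOV = 140; // degrees: the virtual camera's horizontal field of view

// Returns the [uMin, uMax] horizontal texture-coordinate range (0..1 spans
// the full 360°) for a given yaw in degrees; yaw 0 faces the panorama center.
function sliceWindow(yawDeg, fovDeg = SLICE_FOV) {
  const center = (((yawDeg % 360) + 360) % 360) / 360; // normalize yaw to 0..1
  const half = fovDeg / 360 / 2;
  return [center - half, center + half]; // values may wrap; the shader wraps U
}
```

Because each laptop evaluates this locally, two participants sharing one camera can point their virtual cameras in completely different directions.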
Each remote participant had a 140° wide-screen view of the room on their laptop monitor, divided into three adjacent subviews: a normal rectilinear subview in the center and a perspectively warped subview on either side, so that each remote user saw a full 140° panorama on a standard laptop monitor. The side subviews were shown angled in and distorted to give the user a kind of ultra-wide three-screen theater experience.
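The three-panel split divides the field of view evenly; the following sketch (with assumed numbers, not the original code) shows the geometry of one rectilinear center panel flanked by two warped side panels.

```javascript
// Sketch: split a 140° panorama into three adjacent panels and compute the
// yaw at the center of each panel, relative to the overall view direction.
function subviewAngles(totalFovDeg = 140, panels = 3) {
  const per = totalFovDeg / panels; // horizontal FOV covered by each panel
  const centerYaws = [];
  for (let i = 0; i < panels; i++) {
    centerYaws.push(-totalFovDeg / 2 + per / 2 + i * per);
  }
  return { panelFov: per, centerYaws };
}
```

Each panel then gets its own virtual camera at its center yaw, which is what keeps the center panel rectilinear while the side panels carry the perspective distortion.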
The important thing was that no web-browser plug-ins were needed. Any web browser capable of supporting WebGL (with enough GPU power) and WebRTC would work. This was my first deep dive into either WebGL or WebRTC. For WebRTC signaling we used the open-source Signal Master signaling server along with our own STUN and TURN servers. The setup is shown below:
The system was monitored via MQTT messaging and a Node-RED graphical UI to provide business intelligence and stats on how well the prototypes were working.
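As a hypothetical illustration of that monitoring path, each seat's NUC could publish a small status message per seat; the topic layout and field names below are assumptions of mine, not the actual schema we used.

```javascript
// Sketch: build a per-seat MQTT status message for the Node-RED dashboard.
// Topic and fields are illustrative placeholders.
function statusMessage(seatId, fps, rttMs) {
  return {
    topic: `meeting/seat/${seatId}/status`,          // one topic per remote seat
    payload: JSON.stringify({ seatId, fps, rttMs }), // MQTT payloads are strings/bytes
  };
}
```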
One of the problems with using WebRTC was firewall and NAT router issues, even within the corporate network. This forced us to put more effort into getting the TURN server to work over TCP. Most of the issues were related to full-cone vs. symmetric NAT routers and whether UDP ports could be reused. Some typical scenarios we needed to solve are diagrammed below:
I was also aware that this prototype solution would not scale well, but we were targeting fewer than 5 remote participants per meeting.
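The TURN-over-TCP fallback can be sketched as a browser-side `RTCPeerConnection` configuration. This is a minimal sketch under assumed hostnames, ports, and credentials, not our actual servers.

```javascript
// Sketch: ICE server configuration with a TCP TURN URL so media can still be
// relayed when UDP is blocked by a firewall or symmetric NAT.
function buildRtcConfig(stunHost, turnHost, user, cred) {
  return {
    iceServers: [
      { urls: [`stun:${stunHost}:3478`] },
      {
        // listing a ?transport=tcp TURN URL gives ICE a TCP relay candidate
        urls: [`turn:${turnHost}:3478`, `turn:${turnHost}:443?transport=tcp`],
        username: user,
        credential: cred,
      },
    ],
  };
}
// In the browser:
//   const pc = new RTCPeerConnection(buildRtcConfig("stun.example", "turn.example", "u", "p"));
```

Port 443 is a common choice for the TCP TURN listener because corporate firewalls rarely block it.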
The software/network flow of the final prototype is diagrammed below:
The 360° camera provides either a distorted dual-fisheye video or a distorted equirectangular video. In either case, WebGL (with GPU acceleration) is needed to project the distorted video onto a sphere in 3-D space; a virtual camera within the 3-D scene can then select a 140° slice of the projected video and turn it into a rectilinear view that can be segmented into 3 adjacent video sections. GPU acceleration is required because the 3-D textures are updated in real time at 30 or more FPS. (And yes, the Theta S 360° camera and Skull Canyon NUC did run hot and needed some external cooling to keep from going into thermal shutdown.) I used the three.js WebGL library to handle most of the 3-D rendering tasks, but I still had to create custom UV maps for the dual-hemisphere and 180° PanaCast 2 cameras.
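For the simple equirectangular case, the sphere texturing comes down to a standard longitude/latitude-to-UV mapping; a sketch is below. The dual-fisheye and 180° PanaCast cases needed the custom UV maps mentioned above and are not shown.

```javascript
// Sketch: equirectangular UV mapping. A view direction (yaw, pitch) in
// degrees maps to a (u, v) texture coordinate in 0..1 on the video frame.
function equirectUV(yawDeg, pitchDeg) {
  const u = (yawDeg + 180) / 360;  // longitude → horizontal texture coordinate
  const v = (pitchDeg + 90) / 180; // latitude → vertical texture coordinate
  return { u, v };
}
```

In the real pipeline this mapping is baked into the sphere geometry's UVs once, and only the video texture is re-uploaded each frame, which is why GPU acceleration carries the 30+ FPS load.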
The direction of the virtual camera was transmitted back to the NUC at the virtual participant's seat. The software in the NUC then perspectively warped the remote participant's face to give those in the room visual feedback on the direction the remote participant's view was pointing. That way the in-room participants could see whether a remote participant was looking toward a particular in-room participant or at something else. This made the overall meeting experience more immersive for both local and remote participants.
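That warp can be expressed as a CSS 3-D transform on the face video element; the clamp and perspective values below are assumptions for illustration, not the original tuning.

```javascript
// Sketch: turn the remote participant's reported view yaw into a CSS 3-D
// transform for their face video on the virtual-seat monitor.
function faceWarpTransform(viewYawDeg, maxTiltDeg = 45) {
  // clamp so extreme yaws never turn the face fully edge-on
  const tilt = Math.max(-maxTiltDeg, Math.min(maxTiltDeg, viewYawDeg));
  return `perspective(600px) rotateY(${tilt}deg)`;
}
// e.g. in the browser: faceVideoEl.style.transform = faceWarpTransform(remoteYaw);
```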
Here is a more detailed diagram of how the 3-D rendering was done:
The 3-D scene was transformed back into a set of 2-D canvases by WebGL/three.js that could then be displayed in real time by the render loop. The success of this advanced prototype landed me a research position at Barco Labs a month later.