I Preface
WebRTC (Web Real-Time Communication) brings real-time communication into the browser: users can communicate in real time between browsers without installing any other software or plug-ins. This article describes how to implement a one-to-one audio and video chat room based on WebRTC. The browser requests the front-end page resources (HTML, CSS, JS) from the Web server over HTTP and then interacts with the signaling server. The signaling server provides room management and signaling message forwarding. Media data flows directly between the two browsers when a P2P connection can be established with the help of the STUN server, and is relayed through the TURN server otherwise.
II Function display
The figure above shows a 1v1 chat room. After joining the room, a user can chat with the other user in the room, or click the audio and video call button to start an audio and video call.
III Architecture description
The figure above describes the architecture of the one-to-one audio and video chat room. The call initiator's process is on the left and the callee's process is on the right. The initiator first detects its audio and video devices and captures audio and video data, then creates a PeerConnection (peer-to-peer connection) configured with the STUN/TURN server, generates an offer SDP and sends it to the callee. After receiving the initiator's SDP, the callee likewise detects its devices and captures audio and video data, then generates an answer SDP and sends it to the initiator.
IV Code implementation
1. Server implementation
a. Implementation of Web + signaling server
First, we need a Web server from which the browser pulls the front-end resources. This article uses the Node.js express module, which makes it easy to build a static file server: once the server exposes the directory that holds the front-end code, the browser can request the files over HTTP.
Second, we need a signaling server (Signal Server), which maintains the room state and forwards messages between users. In the code below it receives and handles user requests over socket.io connections attached to the same HTTPS server.
The following shows part of the server code (Web + Signal Server). It exposes the public directory through express. After obtaining the front-end code, the client sends request messages to the signaling server in response to the user's actions, such as the join-room request join_req, the leave-room request leave_req, and control messages that the signaling server forwards to the other user in the same room.
// Note: the following code is incomplete and only key parts are provided
var fs = require('fs');
var https = require('https');
var express = require('express');
var serveIndex = require('serve-index');
var socketIO = require('socket.io');
var log4js = require('log4js');

// MAX_NUMBER_IN_ROOM, fetchUserInRoom, addUserToRoom and delUserFromRoom
// are not shown here.
var logger = log4js.getLogger();

// Web server: serve the front-end files in ./public
var app = express();
app.use(serveIndex('./public'));
app.use(express.static('./public'));

var options = {
  key : fs.readFileSync('./certificate/server.key'),
  cert : fs.readFileSync('./certificate/server.pem')
};
var httpsServer = https.createServer(options, app);
httpsServer.listen(443, '0.0.0.0');

// Signaling server: socket.io attached to the HTTPS server
var httpsIO = socketIO.listen(httpsServer);
httpsIO.sockets.on('connection', (socket) => {
  socket.on('join_req', (roomId, userName) => {
    socket.join(roomId);
    var room = httpsIO.sockets.adapter.rooms[roomId];
    // The count already includes the socket that has just joined
    var memberNumInRoom = Object.keys(room.sockets).length;
    if (memberNumInRoom > MAX_NUMBER_IN_ROOM) {
      socket.emit('member_full', roomId, socket.id);
      socket.leave(roomId);
      return;
    }
    fetchUserInRoom(socket, roomId);
    addUserToRoom(roomId, userName);
    logger.info(`${userName} join room: (${roomId}), current number in room is ${memberNumInRoom}`);
    socket.emit('join_res', roomId, socket.id);
    socket.to(roomId).emit('other_joined', roomId, socket.id, userName);
  });

  socket.on('leave_req', (roomId, userName) => {
    socket.leave(roomId);
    // The room entry disappears once its last member has left
    var room = httpsIO.sockets.adapter.rooms[roomId];
    var memberNumInRoom = room ? Object.keys(room.sockets).length : 0;
    delUserFromRoom(roomId, userName);
    logger.info(`${userName} leave room: (${roomId}), current number in room is ${memberNumInRoom}`);
    socket.emit('leave_res', roomId, socket.id);
    socket.to(roomId).emit('other_leaved', roomId, socket.id, userName);
  });

  // The remaining handlers forward the message to the other user in the room
  socket.on('ctrl_message', (roomId, data) => {
    logger.debug('recv a ctrl message from ' + socket.id);
    socket.to(roomId).emit('ctrl_message', roomId, socket.id, data);
  });

  socket.on('start_call', (roomId, data) => {
    logger.debug('recv start_call from ' + socket.id);
    socket.to(roomId).emit('start_call', roomId, socket.id, data);
  });

  socket.on('message', (roomId, data) => {
    logger.debug('recv a message from ' + socket.id);
    socket.to(roomId).emit('message', roomId, socket.id, data);
  });
});
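The server code above calls addUserToRoom, delUserFromRoom and fetchUserInRoom without showing them. The following is only a minimal sketch of what such room bookkeeping could look like. The RoomRegistry class, its method names and the member limit of 2 are assumptions made for illustration, not the original helpers.

```javascript
// Hypothetical in-memory room registry; not the article's actual helpers.
class RoomRegistry {
  constructor(maxMembers = 2) {
    this.maxMembers = maxMembers; // 2 for a one-to-one room (assumed)
    this.rooms = new Map();       // roomId -> Set of user names
  }

  // Returns true if the user is in the room afterwards, false if the room is full.
  add(roomId, userName) {
    const members = this.rooms.get(roomId) || new Set();
    if (!members.has(userName) && members.size >= this.maxMembers) {
      return false;
    }
    members.add(userName);
    this.rooms.set(roomId, members);
    return true;
  }

  // Removes the user and drops the room once it is empty.
  remove(roomId, userName) {
    const members = this.rooms.get(roomId);
    if (!members) return;
    members.delete(userName);
    if (members.size === 0) this.rooms.delete(roomId);
  }

  // Lists the current members, e.g. to tell a new client who is already present.
  list(roomId) {
    return Array.from(this.rooms.get(roomId) || []);
  }
}

const registry = new RoomRegistry(2);
registry.add('room1', 'alice');
registry.add('room1', 'bob');
console.log(registry.add('room1', 'carol')); // false: the room is full
console.log(registry.list('room1'));         // [ 'alice', 'bob' ]
```

Keeping this state in a plain Map is enough for a single-process demo; it is lost when the server restarts.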
b. Set up STUN/TURN server
As shown in the architecture diagram above, we also need a STUN/TURN server. The STUN server tells a client its address as seen after NAT mapping. Each client learns its own mapped address and sends it to the opposite end through signaling, and the two ends then try to communicate directly (P2P). If the P2P connection cannot be established, the data is relayed through the TURN server. The TURN server is generally deployed on a machine with a public IP address, so that two clients in different network environments can still exchange data through it when P2P fails.
If you are not familiar with how STUN/TURN works, refer to How STUN works and How TURN works. You can use coturn to build the STUN/TURN server; for details, please refer to this blog.
2. Client implementation
a. Create RTCPeerConnection
RTCPeerConnection is the WebRTC component that implements point-to-point communication: it provides a way to connect to a peer and exchange data with it. A peer connection raises events, and the application registers callbacks to handle them. For example, on an icecandidate event the local candidate must be sent to the opposite end, and a track event signals that a remote media track has been received.
// 'urls' is the standard key; the older 'url' key is deprecated
var ICE_CFG = {
  'iceServers': [{
    'urls': 'stun:stun.l.google.com:19302'
  }, {
    'urls': 'turn:xxx.xxx.xxx.xxx:3478',
    'username': 'xxx',
    'credential': 'xxx'
  }]
};
var pConn = new RTCPeerConnection(ICE_CFG);
Here stun:stun.l.google.com:19302 is the STUN server used for NAT traversal, and the turn: entry is the TURN relay server. When the P2P connection fails, the media is relayed through the TURN server.
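When testing a TURN deployment it can help to force all traffic through the relay. The sketch below builds the configuration object from its parts and optionally sets iceTransportPolicy to 'relay'. The helper name buildIceConfig and its parameter names are assumptions for illustration; the server addresses and credentials are the same placeholders as above.

```javascript
// Hypothetical helper; parameter names are assumptions for illustration.
function buildIceConfig({ stunUrl, turnUrl, username, credential, forceRelay = false }) {
  const iceServers = [];
  if (stunUrl) {
    iceServers.push({ urls: stunUrl });
  }
  if (turnUrl) {
    // TURN servers normally require credentials.
    iceServers.push({ urls: turnUrl, username, credential });
  }
  return {
    iceServers,
    // 'relay' restricts ICE to TURN candidates, useful for verifying the relay path;
    // 'all' (the default) also allows host and server-reflexive candidates.
    iceTransportPolicy: forceRelay ? 'relay' : 'all'
  };
}

const cfg = buildIceConfig({
  stunUrl: 'stun:stun.l.google.com:19302',
  turnUrl: 'turn:xxx.xxx.xxx.xxx:3478',
  username: 'xxx',
  credential: 'xxx',
  forceRelay: true
});
console.log(cfg.iceTransportPolicy); // relay
// In the browser: var pConn = new RTCPeerConnection(cfg);
```

If a call connects with forceRelay set to true, the TURN server and its credentials are working; for normal use leave it false so that direct P2P paths are preferred.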
b. Get the local stream and add it to PeerConnection
WebRTC provides the getUserMedia API, which gives easy access to the audio and video devices and returns their stream data.
Basic usage: var promise = navigator.mediaDevices.getUserMedia(constraints);
For more on the getUserMedia API, please refer to this blog.
function GetLocalMediaStream(mediaStream) {
  // Show the local stream in the local preview element
  localStream = mediaStream;
  localAv.srcObject = localStream;

  // Add every local track to the peer connection
  localStream.getTracks().forEach((track) => {
    myPconn.addTrack(track, localStream);
  });

  // Only the call initiator creates the offer
  if (isInitiator == true) {
    var offerOptions = {
      offerToReceiveAudio: 1,
      offerToReceiveVideo: 1
    };
    myPconn.createOffer(offerOptions)
      .then(GetOfferDesc)
      .catch(HandleError);
  }
}
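getUserMedia returns a promise that is rejected when no matching device exists or when the user denies permission, so the call is usually wrapped with some error handling. The following is a minimal sketch; the helper name openLocalMedia and the audio-only fallback policy are assumptions for illustration, and the mediaDevices object is passed in as a parameter so that the function can be exercised outside a browser.

```javascript
// Hypothetical helper: in the browser, pass navigator.mediaDevices as mediaDevices.
async function openLocalMedia(mediaDevices, constraints = { audio: true, video: true }) {
  try {
    return await mediaDevices.getUserMedia(constraints);
  } catch (err) {
    // NotFoundError: no matching device; NotAllowedError: the user denied permission.
    if (err.name === 'NotFoundError' && constraints.video) {
      // Assumed fallback policy: retry with audio only when the first request fails
      // because a device is missing.
      return mediaDevices.getUserMedia({ audio: constraints.audio, video: false });
    }
    throw err;
  }
}

// Browser usage (GetLocalMediaStream and HandleError are the article's functions):
// openLocalMedia(navigator.mediaDevices).then(GetLocalMediaStream).catch(HandleError);
```

A permission error is deliberately not retried, because asking again would only show the same denied prompt.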
c. Media negotiation
For the call initiator, after the audio and video tracks have been added to the RTCPeerConnection, it generates the local SDP with the createOffer method, sets it on the RTCPeerConnection with setLocalDescription, and then sends the SDP to the callee through signaling. After receiving the offer SDP, the callee sets it as the remote SDP with RTCPeerConnection.setRemoteDescription, then generates its local SDP with RTCPeerConnection.createAnswer, sets it with setLocalDescription, and finally sends it to the initiator through signaling.
For an introduction to SDP, please refer to this blog. For an introduction to createOffer, please refer to this blog.
// Initiator: create the offer
if (isInitiator == true) {
  var offerOptions = {
    offerToReceiveAudio: 1,
    offerToReceiveVideo: 1
  };
  myPconn.createOffer(offerOptions)
    .then(GetOfferDesc)
    .catch(HandleError);
}

// Initiator: set the offer locally and send it to the callee
function GetOfferDesc(desc) {
  myPconn.setLocalDescription(desc);
  socket.emit('ctrl_message', roomId.value, desc);
}

// Callee: create the answer
function CreateAnswerDesc() {
  myPconn.createAnswer().then(GetAnswerDesc).catch(HandleError);
}

// Callee: set the answer locally and send it to the initiator
function GetAnswerDesc(desc) {
  myPconn.setLocalDescription(desc);
  socket.emit('ctrl_message', roomId.value, desc);
}

// Dispatch control messages forwarded by the signaling server
socket.on('ctrl_message', (roomId, socketId, data) => {
  if (data) {
    if (data.hasOwnProperty('type') && data.type === 'offer') {
      HandleOfferDesc(data);
      CreateAnswerDesc();
    } else if (data.hasOwnProperty('type') && data.type === 'answer') {
      HandleAnswerDesc(data);
    } else if (data.hasOwnProperty('type') && data.type === 'candidate') {
      HandleCandidate(data);
    } else {
      console.log('Unknown ctrl message, ' + data);
    }
  }
});
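The order of the calls in the offer/answer exchange matters, and it is easier to see when each step is awaited explicitly. The following sketch restates the negotiation above with async/await. The function names sendOffer, answerOffer and acceptAnswer are assumptions for illustration; pc stands for an RTCPeerConnection and send for any function that forwards a message through signaling (for example a wrapper around socket.emit('ctrl_message', ...)).

```javascript
// Hypothetical helpers; pc is an RTCPeerConnection (or an object with the same
// methods) and send forwards a message through the signaling channel.

// Initiator: create the offer, apply it locally, then send it.
async function sendOffer(pc, send) {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  send({ type: offer.type, sdp: offer.sdp });
}

// Callee: apply the remote offer first, then create, apply and send the answer.
async function answerOffer(pc, offer, send) {
  await pc.setRemoteDescription(offer);
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  send({ type: answer.type, sdp: answer.sdp });
}

// Initiator: applying the answer completes the negotiation.
async function acceptAnswer(pc, answer) {
  await pc.setRemoteDescription(answer);
}
```

Awaiting each step also means that a failure in setLocalDescription or setRemoteDescription surfaces as a rejected promise instead of being silently ignored.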
d. Exchange candidates
As shown below, an onicecandidate callback is registered when the PeerConnection is created. For the call initiator, for example, candidate gathering starts after setLocalDescription is called. Each time a candidate is gathered, the onicecandidate callback is invoked, and the local candidate is then sent to the opposite end. After receiving the remote candidate, the opposite end can start connectivity checks and select a usable connection among the candidate pairs for the call.
For a detailed description of candidates, please refer to this blog, which introduces host candidates, srflx candidates, relay candidates, and so on.
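When debugging connectivity it is useful to see which kind of candidate each side produces. The type appears after the typ token in the candidate string, so it can be read with a small parser. The helper name candidateType is an assumption for illustration, and the example lines use made-up values with documentation-range addresses.

```javascript
// Hypothetical helper: extracts the candidate type from an ICE candidate string.
function candidateType(candidateLine) {
  const match = /\btyp\s+(host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? match[1] : null;
}

// Example candidate strings (all values are made up).
console.log(candidateType(
  'candidate:1 1 udp 2122260223 192.168.1.10 54321 typ host')); // host
console.log(candidateType(
  'candidate:2 1 udp 1686052607 203.0.113.7 54321 typ srflx raddr 192.168.1.10 rport 54321')); // srflx
console.log(candidateType(
  'candidate:3 1 udp 41885439 198.51.100.3 3478 typ relay raddr 203.0.113.7 rport 54321')); // relay
```

Seeing only host candidates suggests the STUN server is unreachable, while a missing relay candidate usually points to a TURN address or credential problem.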
When the peer connection is created, the ontrack callback is also set; it handles the stream data received from the peer. The demo attaches the stream to the <video> tag.
function CreatePeerConnection() {
  if (myPconn) {
    console.log('peer connection has already been created.');
    return;
  }
  myPconn = new RTCPeerConnection(ICE_CFG);

  // Send every gathered local candidate to the opposite end through signaling
  myPconn.onicecandidate = (event) => {
    if (event.candidate) {
      socket.emit('ctrl_message', roomId.value, {
        type: 'candidate',
        label: event.candidate.sdpMLineIndex,
        id: event.candidate.sdpMid,
        candidate: event.candidate.candidate
      });
    }
  };

  // Called when a remote media track is received
  myPconn.ontrack = GetRemoteMediaStream;
}
function HandleCandidate(data) {
  // Rebuild the remote candidate; the property name is sdpMLineIndex (capital L)
  var candidate = new RTCIceCandidate({
    sdpMLineIndex: data.label,
    sdpMid: data.id,
    candidate: data.candidate
  });
  myPconn.addIceCandidate(candidate);
}
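One practical detail: addIceCandidate is rejected if no remote description has been set yet, and since candidates travel through the same signaling channel as the SDP, a candidate can arrive before setRemoteDescription has finished. A common way to handle this is to buffer early candidates. The following is a minimal sketch under that assumption; createCandidateQueue and its add/flush methods are hypothetical names, not part of the demo.

```javascript
// Hypothetical helper; pc is an RTCPeerConnection or an object with the same API.
function createCandidateQueue(pc) {
  const pending = [];
  let remoteReady = false;

  return {
    // Call for every candidate received through signaling.
    async add(candidateInit) {
      if (!remoteReady) {
        pending.push(candidateInit); // too early: keep it for later
        return;
      }
      await pc.addIceCandidate(candidateInit);
    },
    // Call once setRemoteDescription has resolved.
    async flush() {
      remoteReady = true;
      while (pending.length > 0) {
        await pc.addIceCandidate(pending.shift());
      }
    }
  };
}
```

In the demo, HandleCandidate would call add with the rebuilt candidate, and flush would run after the remote offer or answer has been applied.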
function GetRemoteMediaStream(event) {
  // Attach the first remote stream to the remote video element
  if (event && event.streams.length > 0) {
    remoteStream = event.streams[0];
    remoteAv.srcObject = remoteStream;
  }
}