We need to capture a live video stream from the client's webcam using WebRTC (or any other capture mechanism; since this is a PoC, it does not have to be supported in all browsers).
This live video needs to be handled by a server component (ASP.NET MVC / Web API). I imagine that the code on the server will look like:
[HttpPost]
public ActionResult HandleVideoStream(Stream videoStream)
{
    // Handle the live stream
    return new HttpStatusCodeResult(200);
}
Looking for any keyword or helpful link.
We have already implemented a way to send individual frames as base64-encoded JPEGs, but this is not useful: base64 encoding adds considerable overhead (roughly a third more bytes per frame), and a real video codec would be far more efficient (for example, VPx/VP8 sends only the differences between frames). The required solution needs to capture video from the client's webcam and send it live (not recorded) to the server (ASP.NET) as a stream, or as chunks of data representing the new video data.
Your question is too broad, and asking for off-site resources is considered off-topic on Stack Overflow. To avoid opinion-prone statements, I will restrict the answer to general concepts.
Flash/RTMP
WebRTC is not yet available in all browsers, so the most widely used way of capturing webcam input from a browser is currently via a plugin. The most common solution uses the Adobe Flash Player, whether people like it or not. This is due to the H.264 encoding support in recent versions, along with AAC, MP3, etc. for audio.
The streaming is accomplished using the RTMP protocol, which was initially designed for Flash communication. The protocol works on TCP and has multiple flavors like RTMPS (RTMP over TLS/SSL for encryption) and RTMPT (RTMP encapsulated in HTTP for firewall traversal).
The stream usually uses the FLV container format.
You can easily find open-source projects that use Flash to capture webcam input and stream it to an RTMP server.
On the server-side you have two options:
- implement a basic RTMP server to talk directly to the sending library and read the stream
- use one of the open-source RTMP servers and implement just a client in ASP (you can also transcode the incoming stream on the fly depending on what you're trying to do with your app).
WebRTC
With WebRTC you can either:
- record small media chunks on a timer and upload them to the server, where the stream is reconstructed (this needs concatenating and re-stamping the chunks to avoid discontinuities). See this answer for links.
- use the peer-to-peer communication features of WebRTC with the server being one of the peers.
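The first option can be sketched in the browser roughly as follows. This is a minimal sketch, not a tested production recipe: `startChunkedUpload`, the `uploadUrl` endpoint, and the `X-Chunk-Index` header are hypothetical names, and the `"video/webm"` MIME type is assumed to be supported by the browser. Only `MediaRecorder` and `fetch` are standard browser APIs; they are passed in as options so the logic can also be exercised with stand-ins.

```javascript
// Sketch: record the webcam in timed chunks and POST each chunk in order.
// `uploadUrl` and the X-Chunk-Index header are hypothetical; the server is
// assumed to append chunks in index order to rebuild the stream.
function startChunkedUpload(stream, uploadUrl, options = {}) {
  const {
    timesliceMs = 1000,
    mimeType = "video/webm", // assumption: supported by the browser
    Recorder = globalThis.MediaRecorder,
    fetchFn = globalThis.fetch,
  } = options;

  const recorder = new Recorder(stream, { mimeType });
  let index = 0;
  // Chain the uploads so chunks reach the server in recording order.
  let pending = Promise.resolve();

  recorder.ondataavailable = (event) => {
    if (!event.data || event.data.size === 0) return; // skip empty chunks
    const chunkIndex = index++;
    pending = pending.then(() =>
      fetchFn(uploadUrl, {
        method: "POST",
        headers: {
          "Content-Type": mimeType,
          "X-Chunk-Index": String(chunkIndex),
        },
        body: event.data,
      })
    );
  };

  recorder.start(timesliceMs); // emit a chunk roughly every timesliceMs

  return {
    // Resolves once the final chunk has been emitted and every upload is done.
    stop: () =>
      new Promise((resolve, reject) => {
        recorder.onstop = () => pending.then(resolve, reject);
        recorder.stop();
      }),
  };
}
```

In a page, `stream` would come from `navigator.mediaDevices.getUserMedia({ video: true, audio: true })`. Chunks produced by a single recorder are pieces of one continuous recording, so the server still has to concatenate them in order before it can decode anything.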
A possible solution for the second scenario, which I haven't personally tested yet, is offered by Adam Roach:
- Browser retrieves a webpage with JavaScript in it.
- Browser executes the JavaScript, which:
  - Gets a handle to the camera using getUserMedia,
  - Creates an RTCPeerConnection
  - Calls createOffer and setLocalDescription on the RTCPeerConnection
  - Sends a request to the server containing the offer (in SDP format)
- The server processes the offer SDP and generates its own answer SDP, which it returns to the browser in its response.
- The JavaScript calls setRemoteDescription on the RTCPeerConnection to start the media flowing.
- The server starts receiving DTLS/SRTP packets from the browser, which it then does whatever it wants to, up to and including storing in an easily readable format on a local hard drive.
Source
This will typically use VP8 video and Opus audio, sent as RTP payloads (not inside a WebM container) over SRTP (UDP, can also use TCP).
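The browser-side steps of the quoted flow can be sketched as below. This is an illustration under assumptions, not a tested implementation: `negotiateWithServer`, the `signalingUrl` endpoint, and the choice to exchange raw SDP text in the HTTP body are all hypothetical, and ICE candidate handling is left out for brevity. `getUserMedia`, `RTCPeerConnection`, `createOffer`, `setLocalDescription`, and `setRemoteDescription` are the standard APIs named in the list; they are injectable here so the sequence can be checked with stand-ins.

```javascript
// Sketch of the browser side of the offer/answer exchange with a server peer.
// `signalingUrl` and the plain-SDP request/response bodies are assumptions.
async function negotiateWithServer(signalingUrl, deps = {}) {
  const {
    mediaDevices = globalThis.navigator?.mediaDevices,
    PeerConnection = globalThis.RTCPeerConnection,
    fetchFn = globalThis.fetch,
  } = deps;

  // 1. Get a handle to the camera (and microphone).
  const stream = await mediaDevices.getUserMedia({ video: true, audio: true });

  // 2. Create the peer connection and attach the captured tracks.
  const pc = new PeerConnection();
  stream.getTracks().forEach((track) => pc.addTrack(track, stream));

  // 3. Create the offer and install it as the local description.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  // 4. Send the offer SDP to the server and read the answer SDP back.
  const response = await fetchFn(signalingUrl, {
    method: "POST",
    headers: { "Content-Type": "application/sdp" },
    body: pc.localDescription.sdp,
  });
  if (!response.ok) throw new Error("Signaling failed: " + response.status);
  const answerSdp = await response.text();

  // 5. Apply the answer; media can flow once ICE and DTLS have completed.
  await pc.setRemoteDescription({ type: "answer", sdp: answerSdp });
  return pc;
}
```

The hard part remains on the server, which has to parse the offer, produce a valid answer, and terminate ICE, DTLS, and SRTP; none of that is shown here.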
Unless you can implement RTCPeerConnection directly in ASP with a wrapper, you'll need a way to forward the stream to your server app.
The PeerConnection API is a powerful feature of WebRTC. It is currently used by the WebRTC version of Google Hangouts. You can read: How does Hangouts use WebRTC.
Agreed that this is an off-topic question, but I recently bumped into the same issue/requirement, and my solution was to use MultiStreamRecorder from WebRTCExperiments. This basically gives you a "blob" of the audio/video stream every X seconds, and you can upload this to your ASP.NET MVC or WebAPI controller as demonstrated here. You can either live-process the blobs on the server part by part, or concatenate them to a file and then process once the stream stops. Note that the APIs used in this library are not fully supported in all browsers, for example there is no iOS support as of yet.
My server-side analysis required the user to speak full sentences, so in addition I used PitchDetect.js to detect silences in the audio stream before sending the partial blob to the server. With this type of setup, you can configure your client to send partial blobs to the server once the user finishes talking, rather than every X seconds.
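To illustrate the idea of flushing a blob when the speaker pauses, here is a minimal sketch. It does not use PitchDetect.js or reproduce its API; it swaps in a plain RMS energy threshold, which is a cruder technique. The names `rmsLevel` and `createPauseDetector`, and the default `threshold` and `framesNeeded` values, are hypothetical and would need tuning for a real microphone.

```javascript
// Root-mean-square level of one frame of PCM samples in the range [-1, 1].
function rmsLevel(samples) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  return Math.sqrt(sumSquares / samples.length);
}

// Returns a function that is fed one audio frame at a time and reports true
// exactly once per pause: when `framesNeeded` consecutive quiet frames
// follow at least one frame of speech.
function createPauseDetector({ threshold = 0.01, framesNeeded = 10 } = {}) {
  let quietFrames = 0;
  let heardSpeech = false;
  return function onFrame(samples) {
    if (rmsLevel(samples) >= threshold) {
      heardSpeech = true;
      quietFrames = 0;
      return false;
    }
    quietFrames += 1;
    if (heardSpeech && quietFrames >= framesNeeded) {
      heardSpeech = false; // wait for new speech before firing again
      return true;
    }
    return false;
  };
}
```

In a browser, the frames could come from a Web Audio `AnalyserNode` via `getFloatTimeDomainData`, and a `true` result would be the cue to upload the blob recorded so far.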
As for achieving 1-2 second delay, I would suggest looking into WebSockets for delivery, rather than HTTP POST - but you should play with these options and choose the best channel for your requirements.
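A minimal sketch of the WebSocket route, assuming the chunks come from something like a `MediaRecorder` and that the server exposes a WebSocket endpoint that accepts binary messages. `createChunkSender` is a hypothetical helper; it relies only on the standard WebSocket interface (`readyState`, `send`, `bufferedAmount`, and the `open` event).

```javascript
// Sketch: push recorded chunks over one long-lived WebSocket instead of
// issuing a separate HTTP POST per chunk. `socket` is a standard WebSocket
// or any object exposing the same interface.
function createChunkSender(socket) {
  const OPEN = 1; // numeric value of WebSocket.OPEN
  const queue = [];

  // Send everything queued so far, preserving order.
  function flush() {
    while (queue.length > 0 && socket.readyState === OPEN) {
      socket.send(queue.shift()); // Blob, ArrayBuffer, or typed array
    }
  }

  // Chunks produced before the connection is ready are sent on open.
  socket.addEventListener("open", flush);

  return {
    send(chunk) {
      queue.push(chunk);
      flush();
    },
    // Bytes handed to the socket but not yet transmitted. A steadily growing
    // value means the link cannot keep up with the encoder.
    backlogBytes: () => socket.bufferedAmount,
  };
}
```

With a recorder in place, wiring it up would look like `recorder.ondataavailable = (e) => sender.send(e.data)`. Whether this actually gets the delay down to 1-2 seconds depends on the chunk duration and on how quickly the server can consume the data, so it is worth measuring.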
Most IP cameras these days use H.264 encoding or MJPEG. You aren't clear about what sort of cameras are being used.
I think the real question is what components are out there for authoring/editing video, and which video format they require. Only once you know what format you need can you transcode/transform your video as necessary so you can handle it on the server side.
There are any number of media servers that can transform/transcode, and something like FFmpeg or Unreal Media Server can transform, decode, etc. on the server side to get the video into a format you can work with. Most of the IP cameras I have seen just use an H.264 web-based browser player.
EDIT: Your biggest enemy is going to be your delay. 1-2 seconds of delay is going to be difficult to achieve.