ldang edited this page Jan 19, 2014 · 26 revisions

Goal

Develop software/tools to run on the A/V capture computer to handle audio/video streaming and archiving for the Southern California Linux Expo (SCaLE). We are investigating the use of GStreamer. (See GStreamer notes)

At SCaLE, we will have a PTZ (Pan-Tilt-Zoom) dome camera in various conference rooms, streaming video via RTSP (Real Time Streaming Protocol). Audio will be hooked up from handheld or beltpack microphones into the camera. We may also have scan converters and video encoders to capture what is being projected onto the screen.

Minimum requirement

We need to be able to stream the video to YouTube and/or ?. We also need to save the files to disk so they can be uploaded to YouTube later.

Due to the odd sampling rate and format of the audio output from the camera, we would need to filter the RTSP stream:

  • Keep the video stream as-is.
  • Upsample the audio and re-encode it as 48 kHz AAC-LC.
  • Multiplex (mux) the video and audio.
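
The three steps above can be sketched as a GStreamer 1.0 pipeline description. This is only a starting point under stated assumptions: the camera's video is assumed to be H.264 (hence rtph264depay and h264parse; the page only says "MPEG"), the audio is assumed to arrive as mu-law RTP (rtppcmudepay, mulawdec), and the Matroska container, the function name, the URL, and the file name are illustrative choices rather than decisions recorded here.

```python
def build_record_pipeline(rtsp_url, out_path, audio_rate=48000, aac_bitrate=128000):
    """Return a gst-launch-1.0 style description for the three filter steps."""
    # Step 1: keep the video as-is (depayload and parse only, no re-encode).
    # ASSUMPTION: the camera sends H.264; swap the depayloader/parser otherwise.
    video = "src. ! rtph264depay ! h264parse ! queue ! mux."
    # Step 2: decode mu-law, resample to the target rate, re-encode as AAC-LC.
    audio = (
        "src. ! rtppcmudepay ! mulawdec ! audioconvert ! audioresample"
        " ! audio/x-raw,rate={rate} ! voaacenc bitrate={bitrate} ! queue ! mux."
    ).format(rate=audio_rate, bitrate=aac_bitrate)
    # Step 3: mux both branches and write the result to disk.
    # ASSUMPTION: Matroska is used only because it accepts H.264 and AAC.
    return " ".join([
        "rtspsrc location={} name=src".format(rtsp_url),
        video,
        audio,
        "matroskamux name=mux ! filesink location={}".format(out_path),
    ])


if __name__ == "__main__":
    # Hypothetical camera URL and output file.
    print(build_record_pipeline("rtsp://camera.example/stream", "room.mkv"))
```

The printed description could be tried with gst-launch-1.0 (adding -e so the file is finalized on Ctrl-C) before committing to any programmatic approach.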

Additional requirements

Capture slides

Capture slides in a separate stream using a scan converter and video encoder; display both the speaker and the slides using picture-in-picture

  • @omwah, from an archival point of view, thinks that we should write the video stream, the slide stream, and then the picture-in-picture to a file.
  • @mproctor13 is concerned that there is no way to sync the slides to the video. The easiest way is to put them in the same stream.
  • A possible solution is to add timecodes, but the way it's done professionally is to have an external clock on the cameras; if we do it via GStreamer, it is too late.
  • @jbermudes suggested that we just start/stop the recording for each session. But @mproctor13 is concerned about start/stop, because there is a strong possibility that the operator will forget to start and the video won't be captured.

@mproctor13 suggested another possible solution to handle slides that would also solve the issue of being able to display SCALE content on a projector between sessions. See below.

Display SCALE content on projector between sessions

When there is no speaker in a room, or between sessions, it would be nice to be able to display our own content instead of a blank screen.

@mproctor13: We can replace the M7001 with a USB video capture dongle and a Linux system (like our existing SCALE A/V little computers or Raspberry Pis). We could use a ~$25 KVM to switch between the speaker laptop and the recording computer, which would be displaying slides. A bonus of this solution is that every room is an overflow room.

This opens up new items to investigate:

  • How do you control the KVM? We either need a human to flip the switch, or make it possible for the computer to control the KVM.

@mproctor13: "I was assuming that we would control the KVM from the computer. I think those computers have parallel ports, which make for easy relay flipping and maybe bit-banging I2C to talk to the EDID on the VGA connector. Pushing the buttons on the scan converter and easy I2C bus access were the original reasons I thought of the Raspberry Pi. But maybe if those computers have parallel ports, that will work. Or we just detect when there is video coming from the speaker laptop and automatically switch to it."

  • How do we make sure the speaker laptop outputs at the correct resolution for the scan converter? This might involve either hacking VGA cables or the scan converter, to handle the resolution negotiation. Resolution: We discovered that the scan converter does seem to manage the resolution negotiation well, and on our side, what we can do is write a note specifying 1024x768 @ 60Hz.

  • Will the Raspberry Pi work for our purposes? Does the software work? Is it stable? Can it keep up? (See Raspberry Pi notes)

  • Do the SCALE A/V computers from last year have parallel ports we can use to control scan converter and Raspberry Pi?

                       |----------------|         |-----------|       |-----------------|
                       |                | S-Video |           |  USB  |                 |
VGA from speaker  ---->| Scan converter |---------| USB Video |-------| Slides Computer |
Audio from speaker --| |                |         |           |       |                 |
                     | |----------------|         |-----------|       |-----------------|
                     |         |
                     |         |
                     |-|----------------|      |-----------|
                       |                |      |           |
VGA from slides   ---->|      KVM       |------| Projector |
Audio from slides      |                |      |           |
                       |----------------|      |-----------|
                               |
                               |
                       |----------------|
                       |                |
                       |  Sound system  |
                       |                |
                       |----------------|

As an added bonus, provide a web interface, where we can decide whether to stream video from another room or just display slides. See subsequent requirement.

Support overflow rooms

There are times when a presentation is standing room only. Last year, we were able to support overflow to empty conference rooms by streaming the video via VLC on one of the SCALE A/V computers. (We used a bash script that invoked cvlc.)
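
As a rough illustration of the overflow setup, the sketch below launches cvlc from Python instead of bash. It does not reproduce last year's script, whose contents are not recorded here; the room names, stream URLs, and function names are hypothetical, and only the standard VLC options --fullscreen and --no-video-title-show are used.

```python
import subprocess

# HYPOTHETICAL room names and stream URLs; not taken from the SCaLE setup.
ROOM_STREAMS = {
    "room-a": "rtsp://camera-a.example/stream",
    "room-b": "rtsp://camera-b.example/stream",
}


def overflow_command(room):
    """Build the cvlc command line that plays one room's stream full screen."""
    return ["cvlc", "--fullscreen", "--no-video-title-show", ROOM_STREAMS[room]]


def start_overflow(room):
    """Launch cvlc; the caller keeps the Popen handle to stop or switch later."""
    return subprocess.Popen(overflow_command(room))


if __name__ == "__main__":
    print(" ".join(overflow_command("room-a")))
```

Keeping the command builder separate from the launcher makes it easy to check the command line without a display or a camera attached.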

It would be nice to offer people the ability to vote on what session to view within a particular room. @LeStarch is interested in this.

Adjust/boost the volume on the audio

  • An issue with the SCaLE videos on YouTube is that the audio seems to be quite a bit lower than other videos on YouTube, maybe 50% quieter or more.
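
To put a number on the boost, the sketch below reads "50% quieter" as half the signal amplitude, which is an assumption; perceived loudness does not map onto amplitude this simply. Under that reading the correction is a linear gain of 2, or about 6 dB. The function names and file names are hypothetical; the volume audio filter is a standard ffmpeg filter.

```python
import math


def gain_to_db(linear_gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(linear_gain)


def ffmpeg_boost_args(src, dst, linear_gain):
    """Build an ffmpeg command that boosts audio and copies video untouched."""
    db = gain_to_db(linear_gain)
    return [
        "ffmpeg", "-i", src,
        "-c:v", "copy",                       # leave the video stream as-is
        "-af", "volume={:.1f}dB".format(db),  # audio is re-encoded with the gain
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical file names; gain of 2 assumes the audio is at half amplitude.
    print(" ".join(ffmpeg_boost_args("session.mp4", "session_louder.mp4", 2.0)))
```

A fixed gain can clip loud passages, so the right value would need to be checked against real recordings.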

Cut video at SCaLE

To cut down on the time and workload involved in getting the videos ready to be posted online, we should figure out a way to cut the video at SCaLE. A way to partly automate this is:

  • Develop a mobile app that can be used to just record the start/stop times for prepopulated sessions in various conference rooms. (Ideally, the timestamp comes from the video capture machine.)
  • Use the output from above to automate a solution where ffmpeg can be used to slice up the room recording into appropriately named session files.
  • Have volunteers check over the video and/or do further editing before uploading.
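
The slicing step could look roughly like the sketch below, which turns one recorded start/stop pair into an ffmpeg command. The function names, the file naming scheme, and the idea of expressing times as seconds from the start of the room recording are assumptions for illustration. Stream copy avoids re-encoding, but cuts then land on keyframes, so the volunteer check in the last step still matters.

```python
import re


def session_filename(room, title):
    """Derive a file name such as room-a_intro-to-gstreamer.mp4 (hypothetical scheme)."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return "{}_{}.mp4".format(room, slug)


def slice_command(recording, room, title, start_s, stop_s):
    """Build the ffmpeg command that cuts one session out of a room recording."""
    if stop_s <= start_s:
        raise ValueError("stop time must be after start time")
    return [
        "ffmpeg",
        "-ss", str(start_s),          # seek to the session start (seconds)
        "-i", recording,
        "-t", str(stop_s - start_s),  # keep only the session's duration
        "-c", "copy",                 # stream copy: no re-encode
        session_filename(room, title),
    ]


if __name__ == "__main__":
    # Hypothetical session from 0:10:00 to 1:10:00 of the room recording.
    print(" ".join(slice_command("room-a.mkv", "room-a", "Intro to GStreamer!", 600, 4200)))
```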

Stress/Load Testing

  • Can the video capture machine handle the load of managing all the video streams?
  • Will the video capture machine hold up throughout the day, i.e., will processes start crashing at some point?

Hardware Specifications

PTZ cameras:

  • Samsung SNP6200H/SNP6200 outputs MPEG video over an RTSP stream with 8 kHz (8000 Hz) mu-law audio.

@mproctor13 thinks that when we read RTSP streams using the standard libraries (most of which are based on Live555), we get separate audio and video streams.

There are firmware updates for the camera, which may have an impact on this software development. (See Firmware update notes)

For the scan converter and video encoder:

  • Generic PC-to-TV converter has VGA input + loop-thru; it has S-Video/RGB output ports. There is a front panel which allows control of pan/zoom via an on-screen display. The supported resolutions/refresh rates are 640x480 @ 60/72/75/85 Hz, 800x600 @ 60/75 Hz, 1024x768 @ 60/75 Hz, and 1280x1024 @ 60 Hz.

  • Axis M7001 video encoder can stream both H.264 and MJPEG over RTSP. It can control PTZ for certain PTZ devices. It can be powered by Power over Ethernet (PoE) or a DC adapter.

YouTube Specifications

This is what YouTube requires for live streaming:

  • Protocol: RTMP Flash Streaming
  • Video codec: H.264, Main 4.1 (surprisingly, they won't accept WebM)
  • Frame rate: 30 fps
  • Keyframe frequency: 2 seconds
  • Audio codec: AAC-LC (Audio required)
  • Audio sample rate: 44.1 kHz
  • Audio bitrate: 128 kbps stereo
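
Two of these settings combine into one encoder parameter: at 30 fps with a keyframe every 2 seconds, the keyframe interval (GOP size) is 60 frames. The sketch below records the listed values and maps them onto ffmpeg options. It only applies if the video ends up being re-encoded, which is an assumption; libx264, the native aac encoder, the function names, and the RTMP URL are illustrative, not choices made on this page.

```python
# Values copied from the requirements listed above.
YOUTUBE_LIVE = {
    "fps": 30,
    "keyframe_seconds": 2,
    "audio_rate_hz": 44100,
    "audio_bitrate": "128k",
    "audio_channels": 2,
}


def gop_size(spec=YOUTUBE_LIVE):
    """Frames between keyframes: frame rate times keyframe interval."""
    return spec["fps"] * spec["keyframe_seconds"]


def ffmpeg_live_args(src, rtmp_url, spec=YOUTUBE_LIVE):
    """Build an ffmpeg command that re-encodes src to match the spec."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264", "-profile:v", "main", "-level", "4.1",
        "-r", str(spec["fps"]), "-g", str(gop_size(spec)),
        "-c:a", "aac", "-ar", str(spec["audio_rate_hz"]),
        "-b:a", spec["audio_bitrate"], "-ac", str(spec["audio_channels"]),
        "-f", "flv", rtmp_url,  # RTMP carries an FLV stream
    ]


if __name__ == "__main__":
    # Hypothetical source and ingest URL.
    print(" ".join(ffmpeg_live_args("rtsp://camera.example/stream",
                                    "rtmp://ingest.example/live/STREAM-KEY")))
```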

Notes from @irabinovitch:

  • YouTube can do transcoding for us on the fly. So if we push higher bit rates, they'll create versions as needed for lower-bandwidth environments and for mobile.
  • I don't believe our cameras support 44.1 kHz audio, so we'll need to take care of that in whatever tooling we use to transcode.
  • Each live stream can last up to 4 hours, so we'll need to create a "new" event every 4 hours. This may mean needing to update the embed code on the site throughout the day.
  • If we want to do the video editing / clipping in YouTube, we'll need to break our "live events" into 2-hour segments. YouTube doesn't let you edit anything
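
The event limits above translate into a simple schedule calculation. The sketch below splits a conference day into back-to-back events no longer than a given number of hours; the function name and the example day are hypothetical.

```python
from datetime import datetime, timedelta


def split_events(day_start, day_end, max_hours):
    """Split [day_start, day_end) into consecutive events of at most max_hours."""
    events = []
    step = timedelta(hours=max_hours)
    start = day_start
    while start < day_end:
        end = min(start + step, day_end)  # the last event may be shorter
        events.append((start, end))
        start = end
    return events


if __name__ == "__main__":
    # Hypothetical conference day, 09:00 to 18:00.
    day = datetime(2014, 2, 22)
    for start, end in split_events(day.replace(hour=9), day.replace(hour=18), 4):
        print(start.strftime("%H:%M"), "-", end.strftime("%H:%M"))
```

With a 4-hour limit this example day needs three events; with 2-hour events for in-browser editing it needs five, each with its own embed code.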

Our approach

We will investigate using GStreamer to capture video. We primarily need to figure out the GStreamer pipeline to do what we need, and then we need to learn how to work with GStreamer programmatically.

  1. Capture from RTSP
  2. Save the stream to file (record)
  3. Deal with audio -- (resample and re-encode)
  4. Multiplex the video stream with reencoded audio stream
  5. Put into appropriate transport stream
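
One way the five steps might fit together is sketched below as a GStreamer 1.0 pipeline description that records to a file and pushes to an RTMP server at the same time. Several things here are assumptions rather than findings: H.264 video and mu-law audio from the camera, FLV as the container (chosen because RTMP carries FLV), a tee after the muxer so the file and the live stream share one encode, and the hypothetical URLs and function name.

```python
def build_live_pipeline(rtsp_url, file_path, rtmp_url):
    """Return a gst-launch-1.0 style description covering steps 1 to 5."""
    return " ".join([
        # 1. Capture from RTSP.
        "rtspsrc location={} name=src".format(rtsp_url),
        # Video passes through untouched. ASSUMPTION: the camera sends H.264.
        "src. ! rtph264depay ! h264parse ! queue ! mux.",
        # 3. Decode mu-law audio, resample, and re-encode as AAC.
        "src. ! rtppcmudepay ! mulawdec ! audioconvert ! audioresample"
        " ! audio/x-raw,rate=44100 ! voaacenc ! queue ! mux.",
        # 4. Mux video and audio, then split the muxed stream.
        "flvmux name=mux streamable=true ! tee name=t",
        # 2. Save the stream to a file.
        "t. ! queue ! filesink location={}".format(file_path),
        # 5. Send the same stream out over RTMP.
        "t. ! queue ! rtmpsink location={}".format(rtmp_url),
    ])


if __name__ == "__main__":
    print(build_live_pipeline("rtsp://camera.example/stream",
                              "room.flv",
                              "rtmp://ingest.example/live/STREAM-KEY"))
```

Because the description is plain text, it can be tried with gst-launch-1.0 first and later handed to a programmatic wrapper if one turns out to be needed.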

We will also investigate gst-switch and speaker-track to see whether they might meet some of our needs.

System Requirement (to make sure everyone is using same version)

We are working with the Ubuntu distro, probably 12.04.

For GStreamer, we are starting with the GStreamer developer PPA to get 1.0. But if we go with gst-switch, there may be some other things to install from git. Standardization is a work in progress.

@irabinovitch asked about support for AAC encoding, since faac is not available. @mproctor13 said that voaacenc seems to work.

We are not sure yet whether we need to work with GStreamer programmatically; if we do, it might be good to use Python 3, as there are GST bindings.
