I asked two related questions on Stack Overflow before this one, but got very little help, so I thought I would ask one open question for posterity. I have spent time parsing the AWS SDK API docs and found very few direct answers to my needs. I have also posted on the AWS forums without getting a good response. A simple, comprehensive, step-by-step solution seems impossible to find.
What I have completed:
- Uploading with CarrierWave directly to S3. I followed Railscast #383 and adapted it to my needs.
- I am able to "retrieve" my files from my S3 bucket.
Details about what I've done so far:
I used CarrierWaveDirect to upload directly to S3 (it uses Fog to handle the direct upload). The upload is processed in a background job with Sidekiq. After the file is put in the bucket, I retrieve it by iterating through a user's uploads and calling each file by the upload's S3 URL.
Here is where I'm lost:
- I need to transcode videos with the Elastic Transcoder provided by AWS.
- I need to recall the uploaded/converted videos from the output bucket. How do I link to the URL from the "output-bucket"? Is it a new URL reference or does the URL stay the same as the original "upload URL"?
- I need to integrate the transcoded videos from transcoder to Cloudfront and display them using JWPlayer.
- How do I integrate the API code into my uploading process in the background?
Here is my code so far:
My uploader:
class VideoUploader < CarrierWave::Uploader::Base
include CarrierWaveDirect::Uploader
end
My initializer that handles the s3 details:
CarrierWave.configure do |config|
  config.fog_credentials = {
    provider:              'AWS',
    aws_access_key_id:     ENV['AWS_ACCESS_KEY_ID'],     # read credentials from the environment
    aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'],
    region:                'us-west-1'
  }
  config.fog_directory  = 'video-input'
  config.fog_public     = false # optional, defaults to true
  config.fog_attributes = { 'Cache-Control' => "max-age=#{365.day.to_i}" } # optional, defaults to {}
end
My Upload Model:
class Upload < ActiveRecord::Base
  belongs_to :user
  mount_uploader :video, VideoUploader

  after_save :enqueue_video

  def enqueue_video
    VideoWorker.perform_async(id, key) if key.present?
  end

  class VideoWorker
    include Sidekiq::Worker

    def perform(id, key)
      upload = Upload.find(id)
      upload.key = key
      upload.remote_video_url = upload.video.direct_fog_url(with_path: true)
      upload.save!
    end
  end
end
My view:
To show all user's uploads:
<% @user.uploads.each do |upload| %>
  <%= link_to upload.title, upload_url(upload) %>
<% end %>
To show the upload (right now it is only a download link):
<h1><%= @upload.title %></h1>
<p><%= link_to @upload.video_url %></p>
I don't think my schema or forms are relevant.
A similar sample of how I think the code might work:
I would add this code into the Sidekiq worker, but I'm not sure if I'm doing this right. I'm also uncertain about how I'm going to connect my "upload" to the "converted upload".
upload.update_column 'converted_video', File.basename(upload.video.path)

transcoder = AWS::ElasticTranscoder::Client.new
transcoder.create_job(
  pipeline_id: APP_CONFIG[Rails.env][:pipeline_id],
  input: {
    key:          upload.video.path,
    frame_rate:   'auto',
    resolution:   'auto',
    aspect_ratio: 'auto',
    interlaced:   'auto',
    container:    'auto'
  },
  output: {
    key:               upload.converted_video.path,
    preset_id:         WEB_M4_PRESET_ID,
    thumbnail_pattern: "",
    rotate:            '0'
  }
)
Links to a helpful article and the docs about the Elastic Transcoder:
http://www.techdarkside.com/getting-started-with-the-aws-elastic-transcoder-api-in-rails
http://docs.aws.amazon.com/sdkforruby/api/Aws/ElasticTranscoder.html
I feel like getting your file into S3 with CarrierWave is the hardest part, and you've already done it! The next part seems like you're climbing a mountain, but it's more like a stroll in the park :)... There's surprisingly little for you to actually do, because AWS is going to handle most of the work; really the only input from us (two if you count the initial upload) is the create_job request. Before we begin, I'll just say that from the modules referenced in your code it looks like you do indeed have gem 'aws-sdk' in your Gemfile, which is important when using AWS resources (we won't go into configuring the SDK, as that's outside the scope of the question, but you can follow the instructions in the GitHub repo). I'll also say that it is obvious you have been through some of these steps already, like creating a pipeline. I'm including those steps nonetheless for future readers who may stumble across this answer... Let's begin:
1. First things first: create an S3 bucket where your pipeline is going to put the transcoded files. You don't strictly need two buckets (you could just reuse the input bucket), but two makes things much cleaner, and having a bucket itself doesn't cost anything extra (you'll pay for the storage in it, though).
2. Create a CloudFront distribution to distribute your transcoded files. For the Origin Domain Name, click the input field and you'll get a dropdown list that includes your account's S3 buckets; select the OUTPUT bucket from step 1 (the one where you're putting your transcoded files) as the source for your distribution. Note the unique URL you receive after creating the distribution. It will be something like https://d111111abcdef8.cloudfront.net and that's where you're going to look for your files later on.
3. Before you can create a transcoding job, you need a pipeline. A pipeline is basically the queue that holds your transcoding jobs. When a user uploads a file, you'll add a transcoding job to the pipeline you create in this step. You only have to create the pipeline once, and you can add all of your transcoding jobs to it. You tell the pipeline which S3 bucket to get your jobs' input files from, and which S3 bucket to put the output files in. The output file can keep the same base name and will get the new extension, so if you upload myvideo.mp4 and transcode it to .avi format, the output file will be myvideo.avi (you could also change the name, but that complicates things and is outside the scope of your question). Since you know the filename from the job and you know the output bucket, you just put them together to get the URL for the file (you'll have to set the correct access permissions on the bucket for the file to be reachable). If my output file is myvideo.avi and I know it was output to a specific bucket that is part of my CloudFront distribution, I know I'll be able to access it at myCloudFrontURL/myvideo.avi, i.e. something like https://d111111abcdef8.cloudfront.net/myvideo.avi. Since I suspect this is going to be a "standard" process (i.e. all uploaded files are transcoded to the same format and your pipeline isn't going to change), I suggest you create the pipeline with the GUI. You can read how to do that here: http://docs.aws.amazon.com/elastictranscoder/latest/developerguide/creating-pipelines.html
4. Now that we have a pipeline, we need to create a job in it to transcode our uploaded file. You're on the right track creating the job from your worker; it's a plain vanilla hash: http://docs.aws.amazon.com/sdkforruby/api/Aws/ElasticTranscoder/Types/CreateJobRequest.html. Once the SDK posts that request, the job is added to your pipeline and transcoded in the order it was added.
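For reference, a minimal create_job call might look like the sketch below. Treat it as a sketch only: the region and the ENV-based pipeline and preset IDs are assumptions (substitute the values from your own AWS console; AWS also ships System presets you can pick from).

require 'aws-sdk'

transcoder = Aws::ElasticTranscoder::Client.new(region: 'us-west-1')

transcoder.create_job(
  pipeline_id: ENV['TRANSCODER_PIPELINE_ID'],   # assumption: your pipeline's ID, kept in ENV
  input: {
    key:          upload.video.path,            # key of the uploaded file in the input bucket
    frame_rate:   'auto',
    resolution:   'auto',
    aspect_ratio: 'auto',
    interlaced:   'auto',
    container:    'auto'
  },
  output: {
    key:               "#{File.basename(upload.video.path, '.*')}.avi", # same name, new extension
    preset_id:         ENV['TRANSCODER_PRESET_ID'], # assumption: the preset you chose
    thumbnail_pattern: '',                          # '' means no thumbnails
    rotate:            '0'
  }
)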
5. EDIT: Per the OP's comment on this answer, there are additional requirements: users can upload many videos, and you should be able to list all the videos a user has uploaded. Rails makes this super easy. You have a User model and an Upload model. You already have the belongs_to :user association on your Upload model, which is perfect. Since a user can have many uploads, you'll want to add the has_many :uploads association to your User model (ActiveRecord associations go two ways). I'm going to assume you used the Rails generator to create your Upload model; if you did, you'll notice it created a migration for you which created an uploads table in your database. I'm not clear what your schema looks like, but I'll assume you ran that migration as generated (i.e. just creating the associated table) and that your uploads table does not include the columns user_id or url. We'll add those by running rails g migration AddColumnsToUploadsTable from the terminal, then editing the new migration in the yourApp/db/migrate folder: add two lines to the change method, add_column :uploads, :url, :string and add_reference :uploads, :user, index: true. Then head back to the terminal, run rake db:migrate, and those columns are added to the uploads table in the database.
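In case it helps, the finished migration might look something like this (the class name follows from the generator command above):

class AddColumnsToUploadsTable < ActiveRecord::Migration
  def change
    add_column :uploads, :url, :string
    add_reference :uploads, :user, index: true
  end
end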
5.1 In step 3 we created a pipeline, and in step 4 we created a job for that pipeline. The pipeline requires us to tell it where to put the job's output file, so because we told it where to put that file, we know where it's going to be. Note that since we're using CloudFront to distribute that file, we're going to use our CloudFront URL rather than the S3 bucket location; if we weren't using CloudFront, we'd use the S3 bucket URL. Likewise, the job requires us to tell it which format to output, so since we told it to transcode to AVI, we know the output format will be AVI. Finally, since we know the name of the uploaded file, and we're not changing it, we know what the output file will be named. We supplied all of that information when telling Elastic Transcoder exactly what to do with the output file... Therefore we know the bucket location, the filename, and the extension, and we can very easily assemble the URL for the file: it's going to be https://<yourCloudFrontURL>/<videoName>.<extension>. We can turn the CloudFront URL, video name, and extension (all of which we know) into variables for use elsewhere:

urlname = "https://d111111abcdef8.cloudfront.net/"
filename = File.basename(params[:file].original_filename, ".*")
newextension = ".avi"
Persisting that to your database follows the standard Ruby/Rails process: instantiate a new object with video = Upload.new, set the user_id of the new object with video.user_id = current_user.id, and set its url with video.url = urlname + filename + newextension. We've set the user_id and the url of the record; all that's left is to save it with video.save. You can then access those records as required in the standard way... To retrieve a list of all the videos for a specific user, you could do something like @videos = Upload.where(user_id: current_user.id), which returns an array of all the Upload objects whose user_id matches the current user's id.
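Pulling those fragments together (the controller context and the params key are assumptions carried over from above):

urlname      = "https://d111111abcdef8.cloudfront.net/"
filename     = File.basename(params[:file].original_filename, ".*")
newextension = ".avi"

video         = Upload.new
video.user_id = current_user.id                    # associate the record with the uploader
video.url     = urlname + filename + newextension  # the predicted CloudFront URL of the output
video.save

@videos = Upload.where(user_id: current_user.id)   # all of this user's videos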
6. An optional next step is to have SNS send your app a notification when transcoding is complete. The notification is useful for knowing things like exactly when the new file is available. Whether you use SNS notifications or not, we've now completed the transcoding, and we didn't have to do much from a code perspective beyond creating the job that the SDK posted to the AWS resource.
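If you do wire up SNS, the notification arrives as an HTTPS POST whose body wraps the Elastic Transcoder job as JSON. A rough sketch of a receiving controller follows; the controller name, the route, and how you'd match the notification back to an Upload are all assumptions, and note that SNS first sends a SubscriptionConfirmation request you must confirm before notifications flow.

class TranscoderNotificationsController < ApplicationController
  skip_before_filter :verify_authenticity_token # SNS can't send a CSRF token
                                                # (use skip_before_action on Rails 4+)

  def create
    body    = JSON.parse(request.raw_post)
    message = JSON.parse(body['Message']) # SNS wraps the job JSON in a "Message" field
    if message['state'] == 'COMPLETED'
      # e.g. look up the matching Upload by its input key and mark it available
    end
    head :ok
  end
end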
7. Your next step is to distribute the output file via CloudFront. Since we already set the Origin for our distribution to the S3 bucket our transcoded files are output to, there's nothing else for you to do. Any new files added to the bucket will automatically be available through your distribution. Read more about how that works here: http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/AddingObjects.html.
8. To play a file distributed through your CloudFront distribution with JWPlayer, you do whatever setup the player itself requires, and then load the file from the CDN via its CloudFront URL (in this format: https://d111111abcdef8.cloudfront.net/myvideo.avi) when and where the player needs it.
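As a rough illustration (assuming the JW Player script is already included on the page and that @upload.url holds the CloudFront URL we stored in step 5.1), the view might contain something like:

<div id="player">Loading the player...</div>
<script type="text/javascript">
  // Point JW Player at the file on the CloudFront distribution
  jwplayer("player").setup({
    file: "<%= @upload.url %>", // e.g. https://d111111abcdef8.cloudfront.net/myvideo.avi
    width: 640,
    height: 360
  });
</script>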
9. To initiate the transcoding process (what you call integrating the API code with your uploading process), the only thing you need to do is make that simple create_job call from your Sidekiq worker... Once that happens, there's really not a whole lot more for you to do except grab a cuppa while you wait for the output file to show up in your CloudFront distribution.
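Concretely, that could mean extending your existing VideoWorker so the job is created right after the upload is saved. A sketch, reusing the create_job options from step 4 (the ENV-based IDs remain assumptions):

class VideoWorker
  include Sidekiq::Worker

  def perform(id, key)
    upload = Upload.find(id)
    upload.key = key
    upload.remote_video_url = upload.video.direct_fog_url(with_path: true)
    upload.save!

    # Kick off transcoding once the file is safely in the input bucket
    Aws::ElasticTranscoder::Client.new.create_job(
      pipeline_id: ENV['TRANSCODER_PIPELINE_ID'],
      input:       { key: upload.video.path },
      output:      { key: "#{File.basename(upload.video.path, '.*')}.avi",
                     preset_id: ENV['TRANSCODER_PRESET_ID'],
                     thumbnail_pattern: '' }
    )
  end
end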
That's it... You've created the output S3 bucket, created the distribution, created the pipeline, grabbed a file from S3, created a job with that file, and added that job to your ET pipeline. You've output the result of that job to a different S3 bucket, optionally received an SNS notification on job completion, distributed that file on the CloudFront CDN, and finally loaded it from your CloudFront distribution into JWPlayer in the browser.
Hope that helps!