How to find specific cache entries in firefox and

Published 2020-02-10 09:42

Question:

I have the following scenario:

A user can paste html content in a wysiwyg editor. When that pasted content contains images which are hosted on other domains, I want these to be uploaded to my server. Right now the only way of doing that is manually downloading via "save image as..." context menu, then uploading the image to the server via a form and updating the images in the editor.

I have to solve this client side.

I'm working on a Firefox add-on that can automate the process. Of course I could download these images, store them on the hard drive and then upload them with FormData or, better, the pupload, but this seems clumsy: since the content is displayed in the browser, it must already have been downloaded and reside somewhere in memory. I would like to grab the image files from memory and tell Firefox to upload them (being able to make a Blob of them would suffice, it seems).

However, I'm getting hopelessly lost in the API documentation for several different Caching systems on MDN and fail to find any example code of how to use them. I checked code of other addons that access the cache, but most is uncommented and still quite cryptic.

Can you point me to some sample code of what the recommended way would be to achieve this? The best possible solution would be if I can request the particular url from firefox so I can use it in FormData, and if it isn't in the cache firefox downloads to memory, but if it's already there I just get it directly.

Answer 1:

The master documentation for Mozilla's version 2 HTTP Cache is located here. Aside from the blurbs on this page, the only way I was able to make sense of this new scheme was by looking at the actual code for each object and back-referencing almost everything. Even though I wasn't able to get a 100% clear picture of what exactly was going on, I figured out enough to get it working. In my opinion, Mozilla should have taken the time to create some plain-language documentation before they went ahead and pushed out the new API. But we get what they give us, I suppose.

On to your problem. We're assuming that the users who want to upload an image already have this image saved in their cache somewhere. In order to be able to pull it out of the user's cache for upload, you must first be able to determine the URI of the image before it can be pulled explicitly from the cache. For the sake of brevity, I'm going to assume that you already have this part figured out.

An important thing to note about the new HTTP Cache is that although it's all based on callbacks, there can still only ever be a single writer for an entry at a time. While in your example it may not be necessary to write to the descriptor, you should still request write access, since that will prevent any other process (i.e. the browser) from altering or deleting the data until you are done with it. Another side note, and a source of a lot of pain for me, is that requesting a cache entry from the memory cache will ALWAYS create a new entry, overwriting any pre-existing entry. You shouldn't need the memory cache here, but if you do, you can reach its entries through the disk cache without that side effect (by Mozilla's logic, the disk cache covers both the physical disk and the memory cache).

Once the URI is in hand, you can make a request to pull it out of the cache. The new caching system is based completely on callbacks. There is one key object that we will need in order to fetch the cache entry's data: nsICacheEntryOpenCallback. This is a user-defined object that handles the response after a cache entry is requested. It must have two member functions: onCacheEntryCheck(entry, appcache) and onCacheEntryAvailable(descriptor, isnew, appcache, status).

Here is a cut-down example from my code of such an object:

var cacheWaiter = {
  //This function essentially tells the cache service whether or not we want
  //this cache descriptor. If ENTRY_WANTED is returned, the cache descriptor is
  //passed to onCacheEntryAvailable()
  onCacheEntryCheck: function( descriptor, appcache )
  {
    //First, we want to be sure the cache entry is not currently being written
    //so that we can be sure that the file is complete when we go to open it.
    //If predictedDataSize > dataSize, chances are it's still in the process of
    //being cached and we won't be able to get an exclusive lock on it and it
    //will be incomplete, so we don't want it right now.
    try{
      if( descriptor.dataSize < descriptor.predictedDataSize )
        //This tells the nsICacheService to call this function again once the
        //currently writing process is done writing the cache entry.
        return Components.interfaces.nsICacheEntryOpenCallback.RECHECK_AFTER_WRITE_FINISHED;
    }
    catch(e){
      //Also return the same value for any other error
      return Components.interfaces.nsICacheEntryOpenCallback.RECHECK_AFTER_WRITE_FINISHED;
    }
    //If no exceptions occurred and predictedDataSize == dataSize, tell the
    //nsICacheService to pass the descriptor to this.onCacheEntryAvailable()
    return Components.interfaces.nsICacheEntryOpenCallback.ENTRY_WANTED;
  },

  //Once we are certain we want to use this descriptor (i.e. it is done
  //downloading and we want to read it), it gets passed to this function
  //where we can do what we wish with it.
  //At this point we will have full control of the descriptor until this
  //function exits (or, I believe that's how it works)
  onCacheEntryAvailable: function( descriptor, isnew, appcache, status )
  {
    //In this function, you can read the cache entry and store its bytes in
    //a Blob() for upload. I haven't actually tested the code I put here,
    //modifications may be needed.
    //nsIInputStream.read() cannot be called from script, so the stream is
    //wrapped in an nsIBinaryInputStream to read raw bytes.
    var cacheentryinputstream = descriptor.openInputStream(0);
    var binarystream = Components.classes["@mozilla.org/binaryinputstream;1"]
      .createInstance(Components.interfaces.nsIBinaryInputStream);
    binarystream.setInputStream(cacheentryinputstream);
    var blobarray = [];

    //Read at most 1024 bytes per pass until the whole entry has been read.
    var remaining = descriptor.dataSize;
    while( remaining > 0 )
    {
      var chunksize = Math.min(1024, remaining);
      var bytes;
      try{
        bytes = binarystream.readByteArray(chunksize);
      }
      catch(e){
        //Nasty NS_ERROR_WOULD_BLOCK exceptions seem to happen to me
        //frequently. The Mozilla guys don't provide a way around this,
        //since they want a responsive UI at all costs. So, just keep
        //trying until it succeeds; any other error is passed on.
        if( e.result == Components.results.NS_BASE_STREAM_WOULD_BLOCK )
          continue;
        throw e;
      }
      blobarray.push(new Uint8Array(bytes));
      remaining -= chunksize;
    }
    binarystream.close();

    var theblob = new Blob(blobarray);
    //Do an AJAX POST request here.
  }
}
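The chunked read in onCacheEntryAvailable() is the part that is easiest to get subtly wrong, so here is a minimal sketch of just that arithmetic in plain JavaScript that can be run outside Firefox. The names readAllChunks and fakeReadChunk are hypothetical, and the fake reader merely stands in for the cache entry's input stream; nothing here touches the real cache API.

```javascript
// Hypothetical helper: pulls `totalSize` bytes out of a reader in pieces of
// at most `chunkSize` bytes. `readChunk(n)` stands in for the stream read
// call and is assumed to return an array-like of exactly n bytes.
function readAllChunks(readChunk, totalSize, chunkSize) {
  var parts = [];
  var remaining = totalSize;
  while (remaining > 0) {
    var n = Math.min(chunkSize, remaining); // last chunk may be shorter
    parts.push(Uint8Array.from(readChunk(n)));
    remaining -= n;
  }
  return parts;
}

// Fake 2500-byte "cache entry" whose bytes count upward modulo 256.
var offset = 0;
function fakeReadChunk(n) {
  var out = [];
  for (var k = 0; k < n; k++) out.push((offset + k) % 256);
  offset += n;
  return out;
}

var parts = readAllChunks(fakeReadChunk, 2500, 1024);
var sizes = parts.map(function (p) { return p.length; });
console.log(sizes.join(",")); // prints "1024,1024,452"
```

Counting down a `remaining` value and clamping each read with Math.min avoids the negative-index bookkeeping, and the resulting array of Uint8Array chunks can be handed straight to new Blob(parts).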

Now that the callback object is set up, we can actually do some requests for cache descriptors. Try something like this:

var theuri = "http://www.example.com/image.jpg";

//Load the Services helper (provides Services.io and Services.loadContextInfo)
Components.utils.import("resource://gre/modules/Services.jsm");

//Load the cache service
var cacheservice = Components.classes["@mozilla.org/netwerk/cache-storage-service;1"].getService(Components.interfaces.nsICacheStorageService);

//Select the default disk cache.
var hdcache = cacheservice.diskCacheStorage(Services.loadContextInfo.default, true);

//Request a cache entry for the URI. OPEN_NORMALLY requests write access.
hdcache.asyncOpenURI(Services.io.newURI(theuri, null, null), "", Components.interfaces.nsICacheStorage.OPEN_NORMALLY, cacheWaiter);
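To make the order of the two callbacks easier to follow, here is a toy simulation in plain JavaScript that runs outside Firefox. Everything in it is a stand-in: fakeAsyncOpen is a hypothetical function that only mimics the check-then-deliver sequence described in this answer, the constant values are placeholders for this sketch, and the appcache argument is left out for brevity. It is not how Mozilla's cache service is implemented.

```javascript
// Placeholder return codes for this sketch only.
var ENTRY_WANTED = 0;
var RECHECK_AFTER_WRITE_FINISHED = 1;

// Hypothetical stand-in for the cache service: asks the callback whether it
// wants the entry, re-asks after the (pretend) writer finishes, then
// delivers the entry.
function fakeAsyncOpen(entry, callback) {
  var result = callback.onCacheEntryCheck(entry);
  while (result === RECHECK_AFTER_WRITE_FINISHED) {
    entry.dataSize = entry.predictedDataSize; // pretend the write completed
    result = callback.onCacheEntryCheck(entry);
  }
  if (result === ENTRY_WANTED) {
    callback.onCacheEntryAvailable(entry, false, 0);
  }
}

var log = [];
var waiter = {
  onCacheEntryCheck: function (entry) {
    log.push("check " + entry.dataSize + "/" + entry.predictedDataSize);
    return entry.dataSize < entry.predictedDataSize
      ? RECHECK_AFTER_WRITE_FINISHED
      : ENTRY_WANTED;
  },
  onCacheEntryAvailable: function (entry, isnew, status) {
    log.push("available " + entry.dataSize);
  }
};

// An entry that is only one third written when first requested.
fakeAsyncOpen({ dataSize: 100, predictedDataSize: 300 }, waiter);
console.log(log.join("; "));
// prints "check 100/300; check 300/300; available 300"
```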

As far as actually getting the URI, you could provide a window for the user to drag and drop an image into, or perhaps just paste the URL of the image into. Then you could do an AJAX request to fetch the image, so that it ends up in the cache even if the user hasn't actually loaded it before. You could then use that URL to fetch the cache entry for upload. As an aesthetic touch, you could even show a preview of the image, but that's a bit out of scope of the question.
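For the upload step itself, a minimal sketch of wrapping the image bytes in a Blob and attaching it to a FormData might look like the following. The helper name buildUploadForm, the field name "image", and the endpoint in the commented-out request are all assumptions for illustration; the sample bytes are just the first four bytes of a typical JPEG header, not a real image.

```javascript
// Hypothetical helper: wraps raw image bytes in a Blob and attaches it to a
// FormData under the field name "image" (the field name is an assumption).
function buildUploadForm(bytes, filename, mimeType) {
  var blob = new Blob([new Uint8Array(bytes)], { type: mimeType });
  var form = new FormData();
  form.append("image", blob, filename);
  return form;
}

var form = buildUploadForm([0xff, 0xd8, 0xff, 0xe0], "image.jpg", "image/jpeg");
var file = form.get("image");
console.log(file.name, file.size, file.type);

// Sending it would then look roughly like this; the URL is a placeholder:
// var xhr = new XMLHttpRequest();
// xhr.open("POST", "https://example.com/upload");
// xhr.send(form);
```

Passing the filename as the third argument to append() lets the server see a sensible name instead of the default one assigned to unnamed blobs.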

If you need any more clarifications, please feel free to ask!