In the "How to develop an app using the Camera Remote API" toturial it states "The Camera Remote API uses JSON-RPC over HTTP. You can therefore use the Camera Remote APIs with any operating system, such as Android, IOS or Microsoft® Windows®." This stands to reason since the protocols are platform-agnostic. However, in the camera compatibility chart on this page:http://developer.sony.com/develop/cameras/ it states that the Sony Smart Remote Control App must be installed in order to "enable the use of the APIs." Since that app is only iOS and Android, does that mean that the APIs cannot be used on Windows?
I am keenly interested in developing a remote control app for Windows 8 tablets, and later for Windows Phone 8. But if I cannot control the A5000, A7R, A7, NEX-6, NEX-5R, or NEX-5T, it becomes far less interesting.
Is it possible to control those cameras with plain HTTP/JSON communication?
Thank you
Thank you for your inquiry.
For the A5000, A7R, A7, NEX-6, NEX-5T and NEX-5R cameras, install the app below: https://www.playmemoriescameraapps.com/portal/usbdetail.php?eid=IS9104-NPIA09014_00-F00002 This app is installed in the camera itself and must be started there.
Once it is running, you can use the "Camera Remote API" to control the above cameras from any OS.
I don't know if you have solved your problem, but I had the same issue and managed to make it work with C++. It took me some time to figure out what I had to do; I had never done any HTTP work, let alone written plug and play drivers, so I will explain how I did it step by step, the way I wish it had been explained to me.
At the end of this message I have given a link to my entire file; feel free to try it.
I am using the boost asio library for everything network related, and more (everything asynchronous really; it is a great library, but very hard to grasp for ignorant people like me...). Most of my functions are partially copy-pasted from the examples in the documentation, which explains why my code is awkward in places. Here is my main function, nothing fancy: I instantiate an asio::io_service, create my object (which I wrongly named multicast_manager) and then run the service.
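In outline it looks like this (a sketch rather than the exact file; multicast_manager is my own class and its constructor is simplified here):

    #include <boost/asio.hpp>

    #include "multicast_manager.hpp" // my own class, declared elsewhere

    int main()
    {
        boost::asio::io_service io_service;

        // The manager owns the sockets and chains all the asynchronous steps:
        // SSDP discovery, fetching the description file, sending the commands.
        multicast_manager manager(io_service);

        // Nothing happens until the service runs: every completion handler is
        // dispatched from inside this call.
        io_service.run();

        return 0;
    }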
Discovering the camera over SSDP
First, we have to connect to the camera using its UPnP (Universal Plug and Play) feature. The principle is that every UPnP device listens on the multicast address 239.255.255.250:1900 for M-SEARCH requests. This means that if you send the proper message to this address, the device will answer, telling you it exists and giving you the information needed to use it. The proper message is given in the documentation. I ran into two pitfalls doing this: first, I forgot to add the final empty line at the end of my message, as required by the HTTP standard. The message you want to send can be built like this:
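(The ST search target below is the one given for the Camera Remote API, quoted from memory; the important parts are the \r\n line endings and the extra empty line at the end.)

    // Build the M-SEARCH request. Every line ends with \r\n, and the whole
    // message ends with an additional empty line.
    std::string msearch =
        "M-SEARCH * HTTP/1.1\r\n"
        "HOST: 239.255.255.250:1900\r\n"
        "MAN: \"ssdp:discover\"\r\n"
        "MX: 1\r\n"
        "ST: urn:schemas-sony-com:service:ScalarWebAPI:1\r\n"
        "\r\n";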
The second important thing in this part is to check that the message is sent out through the right network interface. In my case it went out through my ethernet card (even when that interface was disabled) until I changed the right option on the socket. I solved this issue with code along these lines:
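(local_address is an illustrative name for the IPv4 address of the WiFi interface connected to the camera.)

    // Open the UDP socket and force multicast traffic out through the WiFi
    // interface that is connected to the camera, not the ethernet card.
    boost::asio::ip::udp::socket socket(io_service);
    socket.open(boost::asio::ip::udp::v4());
    socket.set_option(
        boost::asio::ip::multicast::outbound_interface(local_address));

    // Send the M-SEARCH message to the SSDP multicast group.
    boost::asio::ip::udp::endpoint multicast_endpoint(
        boost::asio::ip::address::from_string("239.255.255.250"), 1900);
    socket.send_to(boost::asio::buffer(msearch), multicast_endpoint);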
Now we listen. Listen from where, you might ask, if you are like me? Which port, which address? Well, we don't care. When we sent our message, we defined a destination IP and port (in the endpoint constructor). We did not need to define any local address, it is our own IP address (as a matter of fact we did define it, but only so that the socket would know which network interface to use); and we did not define any local port, one was chosen automatically (by the OS, I suppose). Anyway, the important part is that anyone listening to the multicast group will get our message, know its source, and respond directly to the correct IP and port. So there is no need to specify anything here and no need to create a new socket: we just listen on the same socket from which we sent our message in a bottle.
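(reply_buffer and handle_ssdp_reply are placeholder names; in my real code the buffer and the endpoint are members of multicast_manager, so they outlive the asynchronous call.)

    // Listen on the very same socket: the camera replies directly to the
    // source address and port of our M-SEARCH request.
    boost::asio::ip::udp::endpoint sender_endpoint;
    std::array<char, 4096> reply_buffer;
    socket.async_receive_from(
        boost::asio::buffer(reply_buffer), sender_endpoint,
        [&](const boost::system::error_code& error, std::size_t bytes_received)
        {
            if (!error)
                handle_ssdp_reply(std::string(reply_buffer.data(), bytes_received));
        });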
If everything goes right, the answer goes along these lines:
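(Reproduced from memory, so the exact headers may differ a little; the LOCATION line is the one that matters.)

    HTTP/1.1 200 OK
    CACHE-CONTROL: max-age=1800
    LOCATION: http://10.0.0.1:64321/DmsRmtDesc.xml
    ST: urn:schemas-sony-com:service:ScalarWebAPI:1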
To parse this message, I reused the parsing from the boost HTTP client example, except that I did it in one go because I couldn't do an async_read_until with a UDP socket (which makes sense, since async_read_until works on streams and UDP is datagram based). Anyway, the important part is that the camera received our message; the other important part is the LOCATION header, which gives us the address of the description file DmsRmtDesc.xml.
Retrieving and reading the description file
We need to get DmsRmtDesc.xml. This time we send a GET request directly to the camera, at the IP address and port given in the LOCATION header. The request is something like this:
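(The path and port come from the LOCATION line above; the request ends with an empty line.)

    GET /DmsRmtDesc.xml HTTP/1.1
    Host: 10.0.0.1:64321
    Connection: close
    Accept: */*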
Don't forget the extra empty line at the end. Connection: close tells the server to close the connection once it has sent its response. The Accept line specifies which content types we accept in the answer; here we take anything. I got the file using the boost HTTP client example: basically I open a socket to 10.0.0.1:64321 and receive the HTTP header, which is followed by the content of the file. Now we have an XML file containing the address of the web service we want to use. Let's parse it using boost again; we want to retrieve the camera service address, and maybe the liveview stream address.
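Something like this does the trick with boost property_tree (parse_camera_service_url is just an illustrative helper, and the element names are the ones I found in my camera's file, so double check them against yours):

    #include <sstream>
    #include <string>

    #include <boost/property_tree/ptree.hpp>
    #include <boost/property_tree/xml_parser.hpp>

    // Extract the "camera" service URL from the content of DmsRmtDesc.xml.
    std::string parse_camera_service_url(const std::string& xml_content)
    {
        std::stringstream stream(xml_content);
        boost::property_tree::ptree tree;
        boost::property_tree::read_xml(stream, tree);

        std::string camera_service_url;
        const auto& services = tree.get_child(
            "root.device.av:X_ScalarWebAPI_DeviceInfo.av:X_ScalarWebAPI_ServiceList");
        for (const auto& child : services)
        {
            // Each service entry gives a type ("camera", "guide", ...) and the
            // base URL where that service accepts the JSON commands.
            if (child.first != "av:X_ScalarWebAPI_Service")
                continue;
            if (child.second.get<std::string>("av:X_ScalarWebAPI_ServiceType") == "camera")
                camera_service_url =
                    child.second.get<std::string>("av:X_ScalarWebAPI_ActionList_URL");
        }
        return camera_service_url;
    }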
Once this is done, we can start sending actual commands to the camera, and using the API.
Sending a command to the camera
The idea is quite simple: we build our command using the JSON format given in the documentation, and we send it with a POST HTTP request to the camera service. We will start the liveview mode, so we send our POST request (we will eventually want to use boost property_tree to build the JSON string, but here I did it manually):
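(As far as I understand, the /sony/camera path is the ActionList URL from the description file with the camera service name appended; the Content-Length must match the body exactly.)

    POST /sony/camera HTTP/1.1
    Host: 10.0.0.1:10000
    Content-Type: application/json
    Content-Length: 61

    {"method":"startLiveview","params":[],"id":1,"version":"1.0"}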
We send it to 10.0.0.1:10000 and wait for the answer:
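The body of the answer is a small JSON document whose result array holds the liveview URL; the path and query string after port 60152 are camera specific, so the one below is only a placeholder:

    {"result":["http://10.0.0.1:60152/liveviewstream"],"id":1}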
We get the liveview URL a second time here (it also appears in the description file); I don't know which one is better, they are identical...
Anyway, now we know how to send a command to the camera and retrieve its answer; we still have to fetch the image stream.
Fetching an image from the liveview stream
We have the liveview URL, and we have the specification in the API reference guide. First things first, we ask the camera to send us the stream, so we send a GET request to 10.0.0.1:60152:
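(The path is whatever the liveview URL contains; /liveviewstream here is just the placeholder from above, and again the final empty line matters.)

    GET /liveviewstream HTTP/1.1
    Host: 10.0.0.1:60152
    Accept: */*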
And we wait for the answer, which should not take long. The answer begins with the usual HTTP header:
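(I did not keep the exact header list, so this is only the general shape; the 200 status line is the part that matters before the binary stream starts.)

    HTTP/1.1 200 OK
    Content-Type: image/jpeg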
According to the documentation, this should be directly followed by the liveview data stream, which in theory consists of:
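From my reading of the reference guide, each frame looks roughly like this (sizes quoted from memory, so check them against the guide):

- a common header of 8 bytes: a start byte (0xFF), a payload type byte, a 2-byte sequence number and a 4-byte timestamp;
- a payload header of 128 bytes: a 4-byte start code, 3 bytes giving the size of the JPEG data, 1 byte giving the padding size, and reserved bytes for the rest;
- the JPEG data itself, of the size announced just before, followed by the padding bytes.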
And then we get the common header again, indefinitely until we close the socket.
In my case, the common header started with "88\r\n", which I had to discard, and the jpg data was followed by 10 extra bytes before the next frame started, so I had to take that into account. I also had to detect the start of the jpg image automatically, because the data started with some text containing a number whose meaning I don't know. Most probably these errors are due to something I did wrong, or to something I don't understand about the technologies I am using here. (Looking back, this smells like HTTP chunked transfer encoding: "88" would then be a hexadecimal chunk size and the extra bytes the chunk delimiters, but I have not verified this.)
My code works right now, but the last bits are very ad hoc and it definitely needs some better checking.
It also needs a lot of refactoring to be usable, but it shows how each step works, I guess...
Here is the entire file if you want to try it out. And here is a working VS project on GitHub.