I want to extract all of the URLs from the .json bookmark backup that Firefox creates and write them to a .txt file.
Here's a sample of one of the objects located in the file:
{"index":1,"title":"Bookmarks Toolbar","id":3,"parent":1,"dateAdded":1219177758531250,"lastModified":1288873459187000,"annos":[{"name":"bookmarkProperties/description","flags":0,"expires":4,"mimeType":null,"type":3,"value":"Add bookmarks to this folder to see them displayed on the Bookmarks Toolbar"}],"type":"text/x-moz-place-container","root":"toolbarFolder","children":[{"title":"","id":25,"parent":3,"dateAdded":1224693644437500,"lastModified":1236888979406250,"annos":[{"name":"placesInternal/GUID","flags":0,"expires":4,"mimeType":null,"type":3,"value":"{f6066e21-10ff-46a2-af7a-2891f8dca345}0"}],"type":"text/x-moz-place","uri":"http://www.google.com/"}
These objects are comma-separated, and each bookmark object should contain at least one member whose string value is the URL of the bookmark.
Here's a sample of what the .txt file would have in it:
http://www.google.com
http://www.yahoo.com
http://www.etc.com
Ideally, I'm interested in seeing if this can be pulled off using any scripting tools available within a generic Windows XP "environment".
If Windows can't cut it, what would be the quickest & easiest solution to this?
Is there a website or program that can do pattern matching or a regex search & replace on the file, before I go install something like ActivePerl or Strawberry Perl and write a script for it?
If you have Excel, it's probably easy to do a text-to-columns split (http://office.microsoft.com/en-us/excel-help/split-names-by-using-convert-text-to-columns-HA001149851.aspx) on the " character. Given that the format (order of fields) is always the same, the URLs should end up somewhere near the last column.
Another way I found is the method at the following site:
http://forums.mozillazine.org/viewtopic.php?f=38&t=1057265&sid=66d981cc79d1ff63644e0cdd5b665a37
Basically you do the following:
(1) Create a Firefox bookmark with the following as the location:
javascript:(function(){var E=document.getElementsByTagName('PRE')[0],T=E.innerHTML,i=0,t=[],re=/"uri":"([^"]*)"/g,m,r2;while((m=re.exec(T))){r2=m[1];if(/^https?:/.test(r2)){t[i++]='['+i+']:<a href="'+r2+'">'+r2+'<\/a>';}}with(window.open().document){for(i=0;t[i];i++)write(t[i]+'<br>');close();}})();
(2) Open a blank Firefox tab.
(3) Drag your Firefox JSON file into the blank tab; this should open the JSON file.
(4) Go to the bookmark you created in step 1.
(5) You should now have a list of clickable URLs for all your bookmarks.
I haven't tested this.
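The core of that bookmarklet is a regex scan for "uri":"..." pairs, which can be sketched as a standalone function. This is only an illustration under assumptions: the name extractUris and the sample string are made up here, and it keeps only http/https values, the same filter the bookmarklet applies.

```javascript
// Hypothetical helper (not part of the original bookmarklet): collect every
// http(s) value that appears as "uri":"..." in the raw backup text.
function extractUris(text) {
  var re = /"uri":"([^"]*)"/g; // one regex object, so lastIndex advances between calls
  var uris = [];
  var match;
  while ((match = re.exec(text)) !== null) {
    // Skip internal entries such as place: queries; keep web URLs only.
    if (/^https?:/.test(match[1])) {
      uris.push(match[1]);
    }
  }
  return uris;
}

// Made-up sample in the same shape as the backup objects.
var sample = '{"title":"","type":"text/x-moz-place","uri":"http://www.google.com/"},' +
             '{"title":"x","type":"text/x-moz-place","uri":"place:sort=8"}';
console.log(extractUris(sample).join("\n"));
```

Joining the result with newlines gives the one-URL-per-line text the question asks for; how that text gets saved to a .txt file depends on where the script runs.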
NOTE: Verify/correct all below file paths to match your system.
@Echo Off
Rem FFExportBookmarks.bat
SetLocal EnableDelayedExpansion
Set JSONFile="%APPDATA%\Mozilla\Firefox\Profiles\xyz42pdq.default\bookmarkbackups\Bookmarks.json"
Set FavOut="%USERPROFILE%\My Documents\FFBookmarks.txt"
Set JSONTemp="%Temp%\JSONTemp.txt"
Echo.> %JSONTemp%
Set JSONTemp1="%Temp%\JSONTemp1.txt"
Echo.> %JSONTemp1%
Rem With UseBackQ, a command must be enclosed in backquotes, not single quotes
For /f "UseBackQ Delims=" %%N In (`Type %JSONFile%`) Do (
Set JSONInput=%%N
Rem Filter double " and other delimiters
Set JSONInput=!JSONInput:"=!
Set JSONInput=!JSONInput: =!
Set JSONInput=!JSONInput:^,= !
Set JSONInput=!JSONInput:[= !
Set JSONInput=!JSONInput:]= !
Set JSONInput=!JSONInput:{= !
Set JSONInput=!JSONInput:}= !
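Rem Tokens=1* keeps everything after the first colon, so the colon in http: does not cut the URL short
For %%K In (!JSONInput!) Do For /f "Tokens=1* Delims=:" %%X In ("%%K") Do (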
If /i "%%X"=="uri" Echo %%Y >> %FavOut%
)
)
Start "" %FavOut%
It wasn't very quick, but it's plenty dirty!
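For comparison, where a JavaScript engine with JSON.parse is available (an assumption; this is not the batch approach above), the same extraction can be sketched by parsing the backup and walking the nested "children" arrays instead of stripping delimiters. The function name collectUris and the sample data are hypothetical.

```javascript
// Hypothetical helper: walk a parsed bookmark tree depth-first and collect
// the "uri" member of every node that has one.
function collectUris(node, out) {
  out = out || [];
  if (node && typeof node.uri === "string") {
    out.push(node.uri);
  }
  var children = (node && node.children) || [];
  for (var i = 0; i < children.length; i++) {
    collectUris(children[i], out);
  }
  return out;
}

// Made-up sample shaped like the object shown in the question.
var sampleJson = '{"title":"Bookmarks Toolbar","type":"text/x-moz-place-container",' +
                 '"children":[{"title":"","type":"text/x-moz-place","uri":"http://www.google.com/"}]}';
console.log(collectUris(JSON.parse(sampleJson)).join("\n"));
```

Reading the real backup file and writing the .txt output are left out, since they depend on the host the script runs in.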