Thursday, June 5, 2014

Inserting/Pasting HTML into OpenOffice Calc via Basic macro

Its easy to manually copy data from a browser window and paste into an existing OO Calc doc.  If tables are used in the html, they paste nicely into rows and columns.

I needed to automate this process via a macro, but it wasn't as simple as I at first thought.

Its easy to use a filter with loadComponentFromURL, to create a NEW Calc doc.  But I wanted to place it specifically into an existing doc.

I then found ability to link to External Data via macro using AreaLinks.insertAtPosition.  This worked wonderfully, but I didn't like that it retained links.. I just wanted a dump of the data, no strings attached.

I moved to try doing copy and paste via a Shell command, wget | xclip, and .uno:Paste , but the paste was always "unformatted text" and thus pasted raw html.

I went back to AreaLinks.insertAtPosition .. figured if I can insert a link, and then remove the link (retaining the data), I've essentially achieved my objective.  So this is the macro I ended up with (works with OO 3.2 and LO 4.1):

Sub InsertWebLinkAtSelection(url As String, section As String)
   Dim oDoc As Object
   Dim oSelection
   Dim oAddress As new
      oDoc = ThisComponent
   oSelection = oDoc.getCurrentController().getSelection()
   if oSelection.supportsService("") then
      oAddress = oSelection.getCellAddress()
      oDoc.AreaLinks.insertAtPosition(oAddress,url,"HTML_" +
section,"calc_HTML_WebQuery","0 0")
End Sub
The key was that last line... oDoc.AreaLinks.removeByIndex(getCount())  .. doesn't affect any exist links in document.  So it creates a link, which inserts the data at current cell address, and then removes the link.

Note:  The "section" argument is either "all" or "1", "2", "3",.. etc representing which html table element you want.

Note 2:  I also found that this method doesn't respect the http://user@pw:URL method, so I did have to use : Shell ("bash",0,"-c 'wget -O /tmp/file.html http://user@pw:URL....'", true) to get the results to a temp file, and then used InsertWebLinkAtSelection("file:///tmp/file.html","all")