Thursday, June 5, 2014

Inserting/Pasting HTML into OpenOffice Calc via Basic macro

Its easy to manually copy data from a browser window and paste into an existing OO Calc doc.  If tables are used in the html, they paste nicely into rows and columns.

I needed to automate this process via a macro, but it wasn't as simple as I at first thought.

Its easy to use a filter with loadComponentFromURL, to create a NEW Calc doc.  But I wanted to place it specifically into an existing doc.

I then found ability to link to External Data via macro using AreaLinks.insertAtPosition.  This worked wonderfully, but I didn't like that it retained links.. I just wanted a dump of the data, no strings attached.

I moved to try doing copy and paste via a Shell command, wget | xclip, and .uno:Paste , but the paste was always "unformatted text" and thus pasted raw html.

I went back to AreaLinks.insertAtPosition .. figured if I can insert a link, and then remove the link (retaining the data), I've essentially achieved my objective.  So this is the macro I ended up with (works with OO 3.2 and LO 4.1):

Sub InsertWebLinkAtSelection(url As String, section As String)
   Dim oDoc As Object
   Dim oSelection
   Dim oAddress As new com.sun.star.table.CellAddress
      oDoc = ThisComponent
   oSelection = oDoc.getCurrentController().getSelection()
   if oSelection.supportsService("com.sun.star.sheet.SheetCell") then
      oAddress = oSelection.getCellAddress()
      oDoc.AreaLinks.insertAtPosition(oAddress,url,"HTML_" +
section,"calc_HTML_WebQuery","0 0")
      oDoc.AreaLinks.removeByIndex(getCount())
   endif
End Sub
The key was that last line... oDoc.AreaLinks.removeByIndex(getCount())  .. doesn't affect any exist links in document.  So it creates a link, which inserts the data at current cell address, and then removes the link.

Note:  The "section" argument is either "all" or "1", "2", "3",.. etc representing which html table element you want.

Note 2:  I also found that this method doesn't respect the http://user@pw:URL method, so I did have to use : Shell ("bash",0,"-c 'wget -O /tmp/file.html http://user@pw:URL....'", true) to get the results to a temp file, and then used InsertWebLinkAtSelection("file:///tmp/file.html","all")

2 comments:

  1. This is just too good and works perfectly and fast....with only one problem. How to access data from a login protected area on the website? I haven't found any way in open office to store credentials, there is one area which show used credentials, but have no idea how they get there....

    Any Idea?

    ReplyDelete
  2. See Note 2. You need to pass the credentials in the URL. Not the best method, but its the only way I could get it to work.

    ReplyDelete