htmltab: Assemble Data Frames from HTML Tables
HTML tables are a valuable data source, but extracting and recasting
these data into a useful format can be tedious. This package collects
structured information from HTML tables. It is similar to readHTMLTable()
of the XML package but provides three major advantages. First, the function
automatically expands row and column spans in the header and body cells.
Second, users are given more control over the identification of header and body
rows that will end up in the R table, including semantic header information
that appears throughout the body. Third, the function preprocesses table code,
corrects common types of malformations, and removes unneeded parts, which helps
to reduce the need for manual post-processing.
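
As a rough illustration of the span expansion described above, the sketch below parses a small inline table and passes it to htmltab(). The table content is a made-up toy example, and the sketch assumes that both the htmltab and XML packages are installed; it is not meant as a complete tour of the function's arguments.

```r
# Sketch only: assumes the htmltab and XML packages are installed.
library(XML)
library(htmltab)

# Hypothetical toy table: the "Austria" cell spans two body rows.
html <- '<table>
  <tr><th>Country</th><th>Year</th><th>Value</th></tr>
  <tr><td rowspan="2">Austria</td><td>2001</td><td>1</td></tr>
  <tr><td>2002</td><td>2</td></tr>
</table>'

# Parse the string as HTML text, then extract the first table in the document.
doc <- htmlParse(html, asText = TRUE)
df <- htmltab(doc = doc, which = 1)

# The spanned cell should be repeated in each row it covers.
print(df)
```

In practice the doc argument is more often a URL or a file path, and the header and body arguments can be used when the automatic identification of header and body rows needs to be overridden.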