Data From HTML

Data from Html function is used to extract data from Html File:

Sheetkraft contains the data from html option which can be accessed by the Import From Button present in the SheetKraft Toolbar ( See figure)

Importing Data From a File:

The Following Guidelines show how data can be imported from a file using Data from HTML. Click on the Import From Button and select the HTML option to open this dialogue box. Given below is preview of Data From HTML Function.

The Data From HTML Function Contains Following Parameters:

File_Location:- The file_Location is the path from where the data is to be imported.

Width:- Width is the number to columns required in the result.

Selector:- Selector is the tag which is used to select the data from the HTML file. Selector is defined as $(‘tag’). In the below Screenshot if we use option as selector it must be defined as $(‘option’).

For eg: In the Below Figure we use “OPTION” as a Selector to fetch data inside the option tag.

Pattern:- Pattern is used to fetch the data inside the selector.

Pattern is defined as attr(‘attribute name’).

In the above fig if we want to fetch data for attribute “value” it is defined as attr(‘value’).

The below screenshot shows the overall preview of how Data from Html works.

Below is an example of extracting data from html that is already in table tags

The second argument (Width) should be set to the number of columns to be extracted. Specifying this explicitly instead of relying on the html is a good idea as it avoids errors if the table has more columns than expected.

The third and later arguments can be left blank. Alternatively, any selector expression that returns one or more tables can be used. The default selector expression (in case the third argument is left blank or an empty string is provided) is $('table'). This simply selects all tables in the html.

Subsequent arguments (4th onwards) are interpreted as patterns of values to be matched against the values in the table. The first table that matches the pattern will be used for data extraction. While it is possible to use multiple patterns.

Let’s understand with an example where there are two data tables, how can we extract data from individual tables using patterns:

HTML Code:

DatafromHtml.SK function to extract data from only ‘Teacher table’

=DataFromHtml.SK(“filepath”,width,” {"ID","Teacher Name","Teaching Subject"})

In case only first two arguments are provided it will extract data from the first table (Student table) of the HTML code

DataFromHtml can also be used to extract non-tabular data. This requires understanding the selectors and functions available.

The third argument to DataFromHtml accepts a selector expression called the root selector expression. 

Here is an example:

$('form#login input') selects all input elements inside a form element with an id of login. It can be used to target html like:

...

<form id="login" ...>

...<input ...></input>

...

...<input ...></input>

</form>

The 4th argument takes a horizontal array of selector expressions that are applied to each element selected by the root selector expression to create one row of the results of DataFromHtml.

For example, the formula 

=DataFromHtml.SK(A1,2,"$('form#login input')",{"attr('name')","val()"}) 

Will return the name and value of each input element in the html.

The root selector expression can be:

$('any_valid_selector') to select all elements matching the selector.


Here is a list of functions that return a new selection. These functions are appropriate for use in the root selector expression. Note that selector expressions are case sensitive and that arguments to functions must always be enclosed in single quotes. Square brackets indicate optional arguments.



These functions can also be used in the selector expressions in the fourth argument. But they would normally be followed by functions that extract a value from an element or selection. Here is a list of such value returning functions: