Data From HTML
Data from Html function is used to extract data from Html File:
Sheetkraft contains the data from html option which can be accessed by the Import From Button present in the SheetKraft Toolbar ( See figure)
Importing Data From a File:
The Following Guidelines show how data can be imported from a file using Data from HTML. Click on the Import From Button and select the HTML option to open this dialogue box. Given below is preview of Data From HTML Function.
The Data From HTML Function Contains Following Parameters:
● File_Location:- The file_Location is the path from where the data is to be imported.
● Width:- Width is the number to columns required in the result.
● Selector:- Selector is the tag which is used to select the data from the HTML file. Selector is defined as $(‘tag’). In the below Screenshot if we use option as selector it must be defined as $(‘option’).
For eg: In the Below Figure we use “OPTION” as a Selector to fetch data inside the option tag.
● Pattern:- Pattern is used to fetch the data inside the selector.
Pattern is defined as attr(‘attribute name’).
In the above fig if we want to fetch data for attribute “value” it is defined as attr(‘value’).
The below screenshot shows the overall preview of how Data from Html works.
Below is an example of extracting data from html that is already in table tags
The second argument (Width) should be set to the number of columns to be extracted. Specifying this explicitly instead of relying on the html is a good idea as it avoids errors if the table has more columns than expected.
The third and later arguments can be left blank. Alternatively, any selector expression that returns one or more tables can be used. The default selector expression (in case the third argument is left blank or an empty string is provided) is $('table'). This simply selects all tables in the html.
Subsequent arguments (4th onwards) are interpreted as patterns of values to be matched against the values in the table. The first table that matches the pattern will be used for data extraction. While it is possible to use multiple patterns.
Let’s understand with an example where there are two data tables, how can we extract data from individual tables using patterns:
HTML Code:
● DatafromHtml.SK function to extract data from only ‘Teacher table’
=DataFromHtml.SK(“filepath”,width,” {"ID","Teacher Name","Teaching Subject"})
In case only first two arguments are provided it will extract data from the first table (Student table) of the HTML code
DataFromHtml can also be used to extract non-tabular data. This requires understanding the selectors and functions available.
The third argument to DataFromHtml accepts a selector expression called the root selector expression.
Here is an example:
$('form#login input') selects all input elements inside a form element with an id of login. It can be used to target html like:
...
<form id="login" ...>
...<input ...></input>
...
...<input ...></input>
</form>
The 4th argument takes a horizontal array of selector expressions that are applied to each element selected by the root selector expression to create one row of the results of DataFromHtml.
For example, the formula
=DataFromHtml.SK(A1,2,"$('form#login input')",{"attr('name')","val()"})
Will return the name and value of each input element in the html.
The root selector expression can be:
Just $ or jQuery to refer to the root element(s) in the html. Supported in releases after 2.13.6.
$('any_valid_selector') to select all elements matching the selector.
A chain of function calls like $('table').last() to get the last table or $('table').not('.layout').last() to get the last table that does not have the class layout.
Here is a list of functions that return a new selection. These functions are appropriate for use in the root selector expression. Note that selector expressions are case sensitive and that arguments to functions must always be enclosed in single quotes. Square brackets indicate optional arguments.
children([selector]): Gets the children of each each element optionally filtered by the specified selector.
closest([selector]): Get the first ancestor element that matches the selector, beginning at the current element.
contents(): Similar to children but also returns text and comments.
eq(index): Reduces the selection to the element at the index (zero based).
filter(selector): Reduces the selection to elements matching selector.
find(selector): Gets the descendents of each element that match the specified selector.
first(): Reduces the selection to the first element.
has(selector): Reduce the selection to elements that have a descendant that matches selector.
last(): Reduces the selection to the last element.
next([selector]): Returns the immediately following sibling of each element in the selection. If a selector is specified, the sibling is returned only if it matches the selector.
nextAll([selector]): Get all following siblings of each element in the selection, optionally filtered by a selector.
nextUntil([selector], [filter]): Get all following siblings of each element in the selection up to but not including the element matched by the selector, optionally filtered by filter.
not(selector): Reduce the selection to elements that do not match the selector.
parent([selector]): Returns the immediate parent of each element in the selection, only if it matches the selector if specified.
parents([selector]): Gets the ancestors of each element in the selection, optionally filtered by a selector.
parentsUntil([selector], [filter]): Get the ancestors of each element in the selection, up to but not including the element matched by the selector, optionally filtered by filter.
prev([selector]): Gets the immediately preceding sibling of each element in the selection, only if it matches the selector if specified.
prevAll([selector]): Gets all the preceding siblings of each element in the selection, optionally filtered by a selector.
prevsUntil([selector], [filter]): Gets all preceding siblings of each element up to but not including the element matched by the selector, optionally filtered by filter.
siblings([selector]): Gets the siblings of each element in the selection optionally filtered by a selector.
slice(start, [end]): Reduces the selection to a subset specified by a range of indices (zero based)
These functions can also be used in the selector expressions in the fourth argument. But they would normally be followed by functions that extract a value from an element or selection. Here is a list of such value returning functions:
attr(name): Gets the value of the specified attribute for the first element of the selector.
css(name): Gets the value of a the specified property from the style.
data(key): Gets the value of the data-{key} attribute.
hasClass(className): Returns true if any of the elements in the selection have the specified class name.
html(): Gets the inner html of the first element in the selection.
index(selector?): Returns the position of the current element (zero based) in its parent element or in a new selection corresponding to selector if specified.
is(selector): Returns true if any of the elements in the selection match the selector.
length(): Returns the number of elements in the selection.
prop(name): Returns true if the specified property is set for the first element in the selection.
text(): Gets the combined text contents of each element in the selection including their descendants.
val(): Get the current value of the first element in the set of matched elements.