Scraping Configuration

📘

Helpful documentation

This documentation will supplement some context here.

Instead of using an OG tag or a liftigniter-metadata tag, you might want our script to scrape from an arbitrary DOM element that you pick with a CSS selector, or from a JavaScript variable that is accessible through the global context (window).

For example, suppose a page contains the following HTML:

<div id="item"> Delicious Pasta </div>
<a href="www.cooking123.com">Author's blog</a>
<a rel="author" href="/italian/Chef-James/">Chef James</a>

You can scrape an element's text or one of its attributes by defining a features array under inventory in the config. For example:

var customConfig = {
  config: {
    inventory: {
      features: [
        {
          name: "title",
          selector: "div[id=item]",
          type: "text",
          transform: function(value) {
            return value.trim();
          }
        },
        {
          name: "authorBlog",
          selector: "a[rel=author]",
          type: "attribute",
          attribute: "href"
        }
      ]
    }
  }
};

$p("init","{JS_KEY}", customConfig);

selector is the CSS selector of the DOM element that contains the data you want. name is the field name under which the scraped value is stored in your item metadata. type specifies what to scrape: if it is text, the script grabs the textContent of the matched element; if it is attribute, it scrapes the value of the attribute named in attribute (for example, the src of an image).
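To make the difference between the text and attribute types concrete, here is a minimal sketch of how one feature could be resolved against a page. The helper name scrapeRawValue is hypothetical and is not part of the LiftIgniter script; it only assumes a root object that exposes querySelector, such as document in a browser.

```javascript
// Hypothetical helper (not part of the LiftIgniter SDK): resolves one feature
// against a root that exposes querySelector, e.g. `document` in a browser.
function scrapeRawValue(feature, root) {
  const element = root.querySelector(feature.selector);
  if (!element) {
    return undefined; // nothing on the page matched the selector
  }
  if (feature.type === "text") {
    return element.textContent; // raw text, including surrounding whitespace
  }
  if (feature.type === "attribute") {
    return element.getAttribute(feature.attribute); // e.g. href or src
  }
  return undefined; // other types are out of scope for this sketch
}
```

With the sample markup above and document as the root, this sketch would return " Delicious Pasta " (whitespace included) for the title feature and "/italian/Chef-James/" for the authorBlog feature.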

transform is a function that takes the scraped value and lets you convert it into the output you want before the item is shipped to our data store.
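As an illustration, the sketch below defines two transform functions and attaches them to features. The function names and the BASE_URL value are assumptions made for this example, and it assumes transform is applied to attribute values in the same way as to text values.

```javascript
// Illustrative transform functions; names and BASE_URL are assumptions.
const BASE_URL = "https://www.example.com"; // placeholder site origin

// Collapse runs of whitespace and trim both ends of the scraped text.
function normalizeWhitespace(value) {
  return String(value).replace(/\s+/g, " ").trim();
}

// Turn a relative href such as "/italian/Chef-James/" into an absolute URL.
function toAbsoluteUrl(value) {
  return new URL(value, BASE_URL).href;
}

const features = [
  {
    name: "title",
    selector: "div[id=item]",
    type: "text",
    transform: normalizeWhitespace
  },
  {
    name: "authorBlog",
    selector: "a[rel=author]",
    type: "attribute",
    attribute: "href",
    transform: toAbsoluteUrl
  }
];
```

The features array can then be placed under config.inventory in the object passed to $p("init", ...). Keeping each transform as a small named function makes it easy to test outside the page.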

Lastly, if you wish to scrape a variable on window, define a feature as follows:

// Take the value of window.metadata.pageUrl
var customConfig = {
  config: {
    inventory: {
      features: [
        {
          name: "url",
          type: "var",
          variable: "metadata.pageUrl" // scrapes window.metadata.pageUrl
        }
      ]
    }
  }
};

$p("init","{JS_KEY}", customConfig);
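The dotted path in variable can be read as a lookup that starts at window and follows one property per segment. The sketch below shows that idea; resolvePath is a hypothetical helper written for illustration, and the actual script may resolve the path differently.

```javascript
// Hypothetical helper: walks a dotted path such as "metadata.pageUrl"
// starting from a root object (window in a browser).
function resolvePath(root, path) {
  return path.split(".").reduce(function (current, key) {
    // Stop descending as soon as an intermediate value is missing.
    return current == null ? undefined : current[key];
  }, root);
}

// Stand-in for window, so the sketch also runs outside a browser.
const fakeWindow = { metadata: { pageUrl: "https://www.example.com/pasta" } };
const pageUrl = resolvePath(fakeWindow, "metadata.pageUrl");
```

In this sketch a missing intermediate object yields undefined instead of throwing, so it is worth making sure the variable is defined on window before the init call runs.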

The feature object can be summarized as:

| Name | Type | Description |
| --- | --- | --- |
| name | String | Name of the feature to be scraped. |
| type | String | Type of feature to be scraped: we support text, attribute, url, and var. |
| selector | String | CSS selector of the DOM element that you want our script to scrape. Specifying the type will |
| attribute | String | Attribute of the DOM element that you would like to parse. |
| variable | String | Path of the variable on window to scrape when type is var (for example, metadata.pageUrl). |
| transform | Function (String) -> Any | Transformation function applied to the scraped value. |
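Since a typo in a feature definition is easy to miss, a small pre-flight check can be run before calling init. The sketch below is hypothetical and not part of the LiftIgniter script; which fields are required for each type is an assumption inferred from the examples on this page, and no rule is applied to the url type because it is not described here.

```javascript
// Hypothetical pre-flight check; the required-field rules are assumptions
// inferred from the examples on this page, not a documented contract.
const SUPPORTED_TYPES = ["text", "attribute", "url", "var"];

function validateFeature(feature) {
  const problems = [];
  if (typeof feature.name !== "string" || feature.name === "") {
    problems.push("name must be a non-empty string");
  }
  if (!SUPPORTED_TYPES.includes(feature.type)) {
    problems.push("type must be one of: " + SUPPORTED_TYPES.join(", "));
  }
  const needsSelector = feature.type === "text" || feature.type === "attribute";
  if (needsSelector && typeof feature.selector !== "string") {
    problems.push("selector is expected for type " + feature.type);
  }
  if (feature.type === "attribute" && typeof feature.attribute !== "string") {
    problems.push("attribute is expected for type attribute");
  }
  if (feature.type === "var" && typeof feature.variable !== "string") {
    problems.push("variable is expected for type var");
  }
  if (feature.transform !== undefined && typeof feature.transform !== "function") {
    problems.push("transform must be a function when provided");
  }
  return problems; // an empty array means no problems were found
}
```

For example, validateFeature({ name: "authorBlog", type: "attribute", selector: "a[rel=author]" }) reports the missing attribute field, while the feature objects shown earlier on this page pass without problems.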