Example of parsing an online store using the iDatica extension

This article walks step by step through parsing data from an online store with the iDatica extension.

  • We will parse a product listing page on the Amazon website;
  • First, install the extension in your Google Chrome or Microsoft Edge browser.

Open the site in the browser (this article uses Google Chrome, but the steps are the same in Microsoft Edge), open the developer tools (F12), and dock the extension window to the side:

Now create the first data column; let it be the “Product Name” field. Click “+” in the extension:

A panel with the column settings will appear. Name the field:

The column settings contain:

  • A selector type switch: XPath or CSS;
  • A field for the query that leads to the required data. This path tells the program which data to collect from the site and place in this column. How to write queries is described in a separate article;
  • A button that finds the path to the data, so the query is generated automatically;
  • A button that shows which elements on the page match the entered query;
  • A button that shows which values the query returns on the page and how many there are.

Let’s get the product names. To do this, right-click a product name and select “Inspect” from the context menu:

The developer tools will jump to the place in the page code where the product title is located:

We see an h2 heading that contains a link (a), and the link contains a span element that holds the title text. Since h2 headings on this page are used only for product names, we will build the XPath query starting from the h2:
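As a minimal sketch of the h2 → a → span chain described above, a query of the form //h2/a/span can be tried outside the browser. The markup below is hypothetical and heavily simplified (it is not Amazon's real HTML), and Python's standard xml.etree.ElementTree module, which supports only a subset of XPath, stands in for the extension's own query engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup for two product titles (not Amazon's real HTML).
SAMPLE = """
<div>
  <h2><a href="/item-1"><span>Wireless Mouse</span></a></h2>
  <h2><a href="/item-2"><span>USB-C Cable</span></a></h2>
</div>
"""

root = ET.fromstring(SAMPLE)

# ElementTree equivalent of the XPath query //h2/a/span:
# every span that sits inside a link that sits inside an h2.
titles = [span.text for span in root.findall(".//h2/a/span")]
print(titles)
```

Each matched span yields one value for the “Product Name” column.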


Choose XPath as the selector type and enter the query in the corresponding field. Click the magnifying glass icon to check what the parser finds on the page (the browser window with the site must be active); the matched values will be highlighted:

The next step is to create a column that will collect the price. Please note that not all products have a price:

In the current configuration, the parser “does not know” where the data for one product begins and ends. If you parse the names and prices this way, the values in the exported file will simply follow one another, with no empty cells left for products that have no price:
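The effect can be reproduced with two plain lists. This is only an illustration with made-up values, assuming three products of which the second has no price; it is not how the extension stores data internally.

```python
from itertools import zip_longest

# Hypothetical page: three products, but "Cable" has no price on the page.
names = ["Mouse", "Cable", "Keyboard"]
prices = ["$19.99", "$34.50"]  # collected page-wide, so the gap is lost

# Pairing the two columns by position shifts every price after the gap.
rows = list(zip_longest(names, prices, fillvalue=""))
for row in rows:
    print(row)
```

Here “Cable” receives the price that belongs to “Keyboard”, and “Keyboard” ends up with an empty cell, which is exactly the misalignment described above.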

To keep the data for one product in a single row, you need to tell the parser where the block for one product begins and ends. To find this block, move up through the code in the element inspector while hovering the cursor over the code; the corresponding blocks will be highlighted on the page:

Our task is to find the outermost block that corresponds to a single product. Such blocks follow one another, and hovering over one highlights the whole product card. Let’s write a query for this block; I decided to use the class .s-widget-spacing-small. In the data block path setting, select CSS and enter this class:

Let’s click on the magnifying glass icon and check that the parser correctly identifies blocks with product cards:
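The idea of scoping every column query to one product block can be sketched as follows. The markup is hypothetical and simplified, the price class name is invented for the example, and ElementTree's exact class match stands in for the CSS selector .s-widget-spacing-small (real pages usually put several classes on one element, which an exact match would not handle).

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified product cards; the second one has no price.
SAMPLE = """
<div>
  <div class="s-widget-spacing-small">
    <h2><a><span>Mouse</span></a></h2>
    <span class="price">$19.99</span>
  </div>
  <div class="s-widget-spacing-small">
    <h2><a><span>Cable</span></a></h2>
  </div>
  <div class="s-widget-spacing-small">
    <h2><a><span>Keyboard</span></a></h2>
    <span class="price">$34.50</span>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)
rows = []
# Each card is one row; column queries run inside the card only.
for card in root.findall(".//div[@class='s-widget-spacing-small']"):
    name = card.findtext(".//h2/a/span", default="")
    price = card.findtext(".//span[@class='price']", default="")
    rows.append((name, price))

for row in rows:
    print(row)
```

Because each query is evaluated inside its own card, a missing price leaves an empty cell in that row instead of shifting the values below it.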

The parser highlighted the product cards, which means the data will be grouped as we need. Next, let’s find the price block. Note that an Amazon product card can contain several prices, so you need to select the block with the correct one and go down to the desired element:

It turns out like this: we look for a div with the class s-price-instructions-style and go down to the desired element. Because some products have two span elements under the a element (the second one is the crossed-out price), we need to tell the parser to take the first one, span[1]; otherwise it will collect data from both span elements. The price is also duplicated in the code, so we specify that the parser takes it from the span element with aria-hidden="true".
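A query built from these three conditions might look like the sketch below. The exact path is an assumption reconstructed from the description, the markup is hypothetical and simplified, and ElementTree's XPath subset is used in place of the extension.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: each price span holds the value twice,
# and the first product also has a second (crossed-out) price span.
SAMPLE = """
<div>
  <div class="s-price-instructions-style">
    <a>
      <span><span>$19.99</span><span aria-hidden="true">$19.99</span></span>
      <span><span>$24.99</span><span aria-hidden="true">$24.99</span></span>
    </a>
  </div>
  <div class="s-price-instructions-style">
    <a>
      <span><span>$5.00</span><span aria-hidden="true">$5.00</span></span>
    </a>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)
BLOCK = ".//div[@class='s-price-instructions-style']//a"

# Without the [1] index, the crossed-out price is collected as well.
both = [e.text for e in root.findall(BLOCK + "/span/span[@aria-hidden='true']")]

# With span[1], only the first span under the link is used.
prices = [e.text for e in root.findall(BLOCK + "/span[1]/span[@aria-hidden='true']")]

print(both)
print(prices)
```

The aria-hidden condition removes the duplicate inside each price span, and the [1] index removes the crossed-out price.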


Let’s see what data the parser finds for this query. Click the play button icon (the browser window with the site must be active); a window will open that displays the found values:

Yes, these are our prices, and there are 10 of them on the page, which is correct.

Next, we’ll add the old price. As we’ve already found out, it is in the adjacent span element, so we just change the index from 1 to 2:
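The change of index can be sketched with a small helper. The helper name price_query, the path it builds, and the markup are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; only the first product has an old price.
SAMPLE = """
<div>
  <div class="s-price-instructions-style">
    <a>
      <span><span aria-hidden="true">$19.99</span></span>
      <span><span aria-hidden="true">$24.99</span></span>
    </a>
  </div>
  <div class="s-price-instructions-style">
    <a>
      <span><span aria-hidden="true">$5.00</span></span>
    </a>
  </div>
</div>
"""

def price_query(index):
    """Build the path to the index-th span under the link (hypothetical helper)."""
    return f".//a/span[{index}]/span[@aria-hidden='true']"

root = ET.fromstring(SAMPLE)
rows = []
for block in root.findall(".//div[@class='s-price-instructions-style']"):
    current = block.findtext(price_query(1), default="")  # current price
    old = block.findtext(price_query(2), default="")      # crossed-out price
    rows.append((current, old))

print(rows)
```

Products without a crossed-out price simply get an empty value in the old price column.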


Next, we get the image: from the div with the class s-product-image-container, we go down to the img element.
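As a rough sketch, going from the container down to img and reading the image address could look like this. The markup and the example.com URLs are hypothetical, and whether the extension returns the src attribute or something else is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup for two product image containers.
SAMPLE = """
<div>
  <div class="s-product-image-container">
    <a href="/item-1"><img src="https://example.com/img/mouse.jpg" alt="Mouse"/></a>
  </div>
  <div class="s-product-image-container">
    <a href="/item-2"><img src="https://example.com/img/cable.jpg" alt="Cable"/></a>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)

# Find every img anywhere below an image container, then read its src attribute.
images = root.findall(".//div[@class='s-product-image-container']//img")
image_urls = [img.get("src") for img in images]
print(image_urls)
```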

Let’s also add the user rating; here we just specify the CSS class:

.a-icon--small

Now let’s add the selector for the “Next” button: set the page transition mode to “Next Button” and enter the CSS class of this button:
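Conceptually, the “Next Button” mode amounts to a loop that collects data from a page, looks for the next-page link, and stops when there is none. The sketch below simulates this with three in-memory pages; the class name next-button, the page paths, and the max_pages limit are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical site: three listing pages held in memory, keyed by URL path.
PAGES = {
    "/page1": '<div><h2>Mouse</h2><a class="next-button" href="/page2">Next</a></div>',
    "/page2": '<div><h2>Cable</h2><a class="next-button" href="/page3">Next</a></div>',
    "/page3": '<div><h2>Keyboard</h2></div>',  # last page: no next button
}

def crawl(start, max_pages=10):
    """Collect h2 texts page by page, following the next button until it disappears."""
    names, url, visited = [], start, 0
    while url is not None and visited < max_pages:
        root = ET.fromstring(PAGES[url])
        names.extend(h2.text for h2 in root.findall(".//h2"))
        next_link = root.find(".//a[@class='next-button']")
        url = next_link.get("href") if next_link is not None else None
        visited += 1
    return names

all_names = crawl("/page1")
limited = crawl("/page1", max_pages=2)
print(all_names)
print(limited)
```

The max_pages guard plays the role of stopping the parser manually after the required number of pages.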

At this point, we will assume that this is all the data we need, and we can launch the parser. Click play, parse the required number of pages, then stop the parser or wait until it reaches the end of the catalog, and click the button to download the final file.