How to Configure CSS Selectors for Parsing Content in Telematic.pro

Introduction

When you set up a source in Telematic.pro, the CSS selectors determine what gets parsed. They let you:

  • Extract article links from a website’s main page.
  • Locate and parse the main content within each article.
  • Exclude unnecessary elements (e.g., ads, footers, side menus).

To find the right selectors, use the DevTools in your browser. This article walks through the process.

Using DevTools to Find Selectors

1. Opening DevTools

In Google Chrome or Firefox, press F12 or Ctrl + Shift + I (on Mac: Cmd + Option + I) to open the Developer Tools (DevTools).

2. Identifying Link Selectors

Suppose we want to extract links to articles from the main page of the blog "https://example.com/news".

  1. Open DevTools and go to the Elements tab.
  2. Right-click an article title and choose Inspect.
  3. In the HTML, find the closest <a> tag that contains the article link.
  4. Check its classes, ID, or other attributes.
  5. To get a starting point, right-click the element and select Copy → Copy selector. The copied selector usually targets that single element, so generalize it (for example, to class names) until it matches every article link.

Example HTML code:

<article class="news-item">
    <a href="/news/article-123" class="news-link">Article Title</a>
</article>

This means the CSS selector for links will be:

.news-item .news-link

In Telematic.pro, enter this selector in the Link CSS Selector field.
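
Before saving, it can help to preview what the selector matches. The following is a minimal sketch for the DevTools console, not part of Telematic.pro: `collectLinks` is a hypothetical helper name, and resolving relative paths and dropping duplicates are assumptions about what is useful to see, not a description of how Telematic.pro processes links.

```javascript
// Sketch: list the article URLs a link selector would yield.
// `collectLinks` is a hypothetical helper, not a Telematic.pro API.
function collectLinks(root, selector, baseUrl) {
  const urls = [];
  for (const el of root.querySelectorAll(selector)) {
    const href = el.getAttribute("href");
    if (!href) continue; // skip matches that carry no link
    // Resolve relative paths such as /news/article-123 against the page URL.
    urls.push(new URL(href, baseUrl).href);
  }
  // Drop duplicates while keeping page order.
  return [...new Set(urls)];
}

// In the DevTools console on the listing page:
if (typeof document !== "undefined") {
  console.log(collectLinks(document, ".news-item .news-link", location.href));
}
```

If the list is empty or includes navigation links, adjust the selector and run the snippet again.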

3. Identifying Content Selectors

After collecting the links, the system opens each article and extracts its content. The content selector tells it where that content is located.

  1. Open DevTools on the article page.
  2. Find the main container with the text (usually a <div>, <article>, or <section>).
  3. Check its classes or ID.
  4. Right-click the container and select Copy → Copy selector to quickly get its CSS selector.

Example HTML code:

<article class="article-content">
    <h1>Article Title</h1>
    <p>Article text...</p>
</article>

CSS selector for content:

.article-content

Enter this in Telematic.pro under Content CSS Selector.
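
A quick console check can confirm that the content selector points at the right container. This is a sketch under assumptions: `checkContentSelector` is a hypothetical helper name, and the 120-character preview length is arbitrary. It reports how many elements match and shows the start of the first match's text.

```javascript
// Sketch: report how many elements a content selector matches
// and preview the text of the first one.
// `checkContentSelector` is a hypothetical helper, not a Telematic.pro API.
function checkContentSelector(root, selector) {
  const matches = root.querySelectorAll(selector);
  const first = matches[0];
  return {
    count: matches.length,
    // Collapse whitespace so the preview fits on one line.
    preview: first
      ? first.textContent.trim().replace(/\s+/g, " ").slice(0, 120)
      : "",
  };
}

// In the DevTools console on an article page:
if (typeof document !== "undefined") {
  console.log(checkContentSelector(document, ".article-content"));
}
```

A count of 0 means the selector matches nothing on this page. A count above 1 suggests the selector is too broad for a single article body.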

4. Excluding Unnecessary Elements

Sometimes, articles contain unwanted elements such as ads, social media buttons, or links to other materials.

  1. Use DevTools to find the HTML code of unwanted elements.
  2. Identify their classes or IDs.
  3. Right-click the element and select Copy → Copy selector to obtain its CSS selector.
  4. Test the selector in DevTools: open the Console tab and enter the following, replacing 'selector' with your own:
    document.querySelectorAll('selector')
    
    The result lists every matching element. If it includes elements you want to keep, refine the selector.

Example HTML code with unwanted elements:

<article class="article-content">
    <h1>Article Title</h1>
    <p>Article text...</p>
    <div class="related-posts">Other articles</div>
    <div class="ad-banner">Advertisement</div>
</article>

Excluding CSS selectors:

.related-posts, .ad-banner

Add these to the Excluding CSS Selectors field in Telematic.pro.
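
To check a whole exclusion list at once, a console sketch like the one below may help. `countExcluded` is a hypothetical helper name. Splitting the list on commas is a simplification that assumes plain selectors such as the ones above; it would mis-split selectors that contain commas themselves, for example inside `:is(...)`.

```javascript
// Sketch: count how many elements each exclusion selector matches
// inside the content container.
// `countExcluded` is a hypothetical helper, not a Telematic.pro API.
function countExcluded(root, contentSelector, excludeList) {
  const container = root.querySelector(contentSelector);
  const counts = {};
  // Check each comma-separated entry on its own (simple selectors only).
  const selectors = excludeList
    .split(",")
    .map((s) => s.trim())
    .filter(Boolean);
  for (const selector of selectors) {
    counts[selector] = container
      ? container.querySelectorAll(selector).length
      : 0;
  }
  return counts;
}

// In the DevTools console on an article page:
if (typeof document !== "undefined") {
  console.log(
    countExcluded(document, ".article-content", ".related-posts, .ad-banner")
  );
}
```

A count of 0 for an entry usually points to a typo or to an element that sits outside the content container.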

Full Configuration Example in Telematic.pro

Let's say we are configuring a source for a blog with the URL https://example.com/news. The settings in Telematic.pro would be:

  • Link CSS Selector: .news-item .news-link
  • Content CSS Selector: .article-content
  • Excluding CSS Selectors: .related-posts, .ad-banner
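
To preview roughly what text these settings leave behind, the content and exclusion selectors can be combined in the console. This is only an approximation under assumptions: `previewCleanText` is a hypothetical helper name, and Telematic.pro's actual extraction may differ. The sketch copies the content container, removes the excluded elements from the copy, and returns the remaining text.

```javascript
// Sketch: approximate the text left after applying the content
// and exclusion selectors.
// `previewCleanText` is a hypothetical helper, not a Telematic.pro API.
function previewCleanText(root, contentSelector, excludeSelector) {
  const container = root.querySelector(contentSelector);
  if (!container) return null; // content selector matched nothing
  // Work on a deep copy so the live page stays untouched.
  const copy = container.cloneNode(true);
  for (const el of copy.querySelectorAll(excludeSelector)) {
    el.remove();
  }
  // Collapse whitespace for a readable one-line preview.
  return copy.textContent.replace(/\s+/g, " ").trim();
}

// In the DevTools console on an article page:
if (typeof document !== "undefined") {
  console.log(
    previewCleanText(document, ".article-content", ".related-posts, .ad-banner")
  );
}
```

If the output still contains ad or related-post text, add the responsible element's selector to the exclusion list and run the preview again.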

Conclusion

Configuring CSS selectors allows you to precisely extract the necessary data while excluding unwanted elements. Use DevTools to find the correct selectors, copy them via the context menu, and test them in the console before adding them to Telematic.pro.