Simplify BOM Building: Script Your Way to Scrape, Copy, and Paste

Is that you, Tampermonster?

Imagine the convenience of scraping (copying) multiple pieces of component properties from a product page on a vendor’s website and pasting them directly into your Excel bill of materials (BOM) spreadsheet – all in one go.

Behold! A Tampermonkey script for doing just that, saving you from the tedious task of copying and pasting each data point individually. Once the script is set up, this can be your new workflow:

  1. Open a product page on a vendor’s website.
  2. Press the magic hotkey sequence.
  3. Select the first Excel cell that should receive the data. Paste.

That’s it. Each property such as part number, description, and price, gets pasted into separate, successive columns.

This is one of those tools that’ll make you say, “I wish I did this sooner!” It’s not an Excel trick but it feels like one. It is a game-changer for anyone tasked with compiling bills of materials from various vendor websites. It promises to save time and reduce hassle for all you engineers and designers.

This JavaScript script is scripted intended for use on Windows with the Tampermonkey browser extension. Using this browser extension lets us use a hotkey to invoke the script instead of opening the developer console and pasting the script there. If you are using another operating system or have another script manager already installed such as Violentmonkey or Greasemonkey, you’re probably way ahead of me and can make any necessary adaptations to the provided script easy-peasy.

Let’s dig in.

Table of Contents

Installation

Customizing the Script

About Webpage Selectors

Resources

Conclusion

The Script


Installation

Downloading and Installing Tampermonkey

First things first, let’s get Tampermonkey on your browser. Head to Tampermonkey’s official website (https://www.tampermonkey.net/) and select the version compatible with your browser. Follow the instructions to add the extension.

Note: Descriptions in this article are based on using the Chrome browser.

Installing the Script

Once Tampermonkey is up and running, it’s time to install the script.

  1. Open the Tampermonkey Dashboard. See the Customizing the Script section or Tampermonkey documentation if you can’t find the Dashboard.
  2. Create a new script (look for the + icon next to the Installed Userscripts tab).
  3. Copy and paste the script provided here into your new script
  4. Select File > Save or CTRL+S.

Try it out

Do the following to try out the script:

  1. Open a product page on a vendor website, such as https://www.mcmaster.com/76155A29/. I mean, who doesn’t need nuclear-grade duct tape?
    Note: You need to be on a page for a single product, not the home page, not a category page.
  2. Press CTRL+SHIFT+C. Remember this hotkey combo by thinking of it like CTRL+C to copy, but on steroids.
  3. In the destination Excel file, select the first cell that should receive the data.
  4. Press CTRL+V to paste, or use whatever your favorite method of pasting may be.

Customizing the Script

To edit the script, click the icon for the Tampermonkey browser extension in your browser. If you have not pinned this extension, go to where you can browse installed extensions (in Chrome, this is a puzzle piece icon to the right of the address bar) and pin Tampermonkey.

Chrome browser extension listing after clicking the puzzle piece icon
Chrome browser extension listing after clicking the puzzle piece icon

After pinning the Tampermonkey extension, it should be visible on the browser’s toolbar. Click the icon, click the arrow next to the name of the script (“Scrape part data”), then click Edit.

The Tampermonkey browser extension menu
The Tampermonkey browser extension menu

If you don’t see the name of the script, select Dashboard and edit the script from there.

Note: after making any changes to the script you want to apply, remember to:

  • Save the script
  • Refresh the page from which you’re trying to scrape data so that it gets the most recent version of the script.

Changing the Hotkey

By default, the script uses CTRL+SHIFT+C to activate. To change this, Edit the script and find the following line:

if (e.ctrlKey && e.shiftKey && (e.key === "C" || e.key === "c")) {

This line means, “if the CTRL key AND the SHIFT key AND either upper OR lower case C are all pressed”. You can change it to whatever combination you want. You can add the ALT key to the combination by adding e.altKey to the if statement.

Changing the order of fields to match your columns

Find the “// Build the row to paste” comment

Change the order of the variables to suit your needs. Just make sure the variables remain separated by commas and there is no comma after the last variable.

Adding Vendor Sites

The script is versatile and currently works with the vendor websites I use most, including McMaster-Carr, AutomationDirect, Zoro, Grainger, and RS Online. But I suppose you’d like it to work with your favorite vendors, huh? Fine.

Copy the text from one

} else if (hostname.includes

to the closing curly bracket ( } ) for that “else if” and paste it after the last “else if” section. Make sure you end up with matching brackets, no missing or extra brackets.

Modify the hostname in the “else if” you pasted to match the domain of the vendor you’re adding.

Modify the script to find each piece of data. The next section is a deeper dive into how you can make these modifications.

About Webpage Selectors

This is not meant as a comprehensive guide to webpage selectors. It is meant only as an introduction so you have an idea of what topics to research. And those topics are going to be JavaScript and CSS selectors. Or skip to the TL;DR section.

The script finds the data to scrape from the jumble of code that comprises a web page using selectors. Selectors are essentially the names used to identify elements on a web page, like buttons, text fields, or prices. To find these, you’ll need to use your browser’s Inspect tool. Again, this article is geared towards the Chrome browser. The process is as follows:

  1. Right-click on the element you want to copy (e.g., the part number) on the vendor’s product page and select “Inspect”. This opens the DevTools panel, showing the HTML code for that element.
  2. Look for unique identifiers like id or class attributes in the code linked to the element. These are what you’ll use in the script to pinpoint data.
  3. Once you’ve identified a selector, you can test it directly in the browser’s console. Open the Console tab in DevTools and use commands like document.querySelector('your-selector') to see if it selects the right element.

Example

  1. Go to the Nuclear-Grade Duct Tape page (https://www.mcmaster.com/76155A29/).
  2. Right-click the part number (76155A29) on that page and select Inspect from the popup. The DevTools will open and
    <div class="bk attrComp productDetailATOCopy" data-mcm-attr-comp-itm-ids="">76155A29</div>
    will be highlighted on the Elements tab.
  3. Let’s posit that the bk attrComp productDetailATOCopy class is the selector we need. Copy this text from the DevTools Elements tab.
  4. Switch from the Elements tab to the Console tab along the top of the window.
  5. Paste document.querySelector('.bk.attrComp.productDetailATOCopy') into the console and press Enter. It should return
    <div class="bk attrComp productDetailATOCopy" data-mcm-attr-comp-itm-ids="">76155A29</div>.
    That is indeed the element we want. We just want its contents, however, not all the surrounding HTML code.
  6. Now try pasting
    document.querySelector('.bk.attrComp.productDetailATOCopy')?.innerText
    into the Console and press Enter. It should return 76155A29. Now we know what to put in the script to read the part number from a McMaster-Carr page.

Experiment with different selectors to find the most reliable one for each piece of data you need to copy. Be aware that sometimes a selector does not work or does not work as expected because it is used elsewhere in the web page’s code and therefore not unique (Remember honey, you’re unique just like everyone else). You may also need to use different selectors for different product types on the same website.

You may also need to update the script as the vendor websites change. I’ve already had to update the script twice in the few months I’ve been using it. Aw, I forgive, you mcmaster.com; all vendor sites should strive to have your drill-down product filtering capabilities.

TL;DR

If you’d like a shortcut for all of this, try pasting a snippet of the HTML surrounding the piece or pieces of data you want to scrape into ChatGPT or another AI chat and ask it for JavaScript code to extract the desired values from that snippet. For example, “This is a snippet of HTML. What JavaScript code will extract 76155A29 from this snippet?” Remember, you can get the HTML snippet by right-clicking the element you want in the browser and selecting Inspect.

Integrating your changes

After identifying the right selectors, you’ll need to integrate them into the script. Replace the existing selectors in the section of the script you copied with the ones you’ve identified for the new website.

Remember, you must Refresh (F5) the vendor website you’re targeting after saving changes to the script for the changes to take effect.

Bernie reminds you to Refresh the vendor page after modifying the script.

Test the script on the new site to ensure it’s copying the data correctly. You might need a few tries to get it right, so don’t be discouraged.

Resources

If you’re new to HTML and CSS selectors, consider exploring resources like Mozilla Developer Network (MDN) for a comprehensive guide. For a more interactive learning experience, websites like Codecademy offer hands-on tutorials. I’ve seen w3schools.com get a bad rap, but I’ve personally used that site for learning CSS and Javascript.

Conclusion

This script is more than just a convenience; it’s a significant efficiency booster that is bound to make your life easier. Give it a try. In the Comments, let others know which vendor sites you’ve added to your version of the script, or make requests for vendor sites to be added to the script provided here.

The Script

// ==UserScript==
// @name         Scrape part data
// @namespace    https://CTRLfreak.io
// @version      0.1
// @description  Copy part data from vendor site to clipboard for pasting into a row in Excel
// @author       CTRLfreak.io
// @icon         http://tinyurl.com/bdzmr45d
// @match        *://*/*
// @grant        GM_setClipboard
// @require      https://code.jquery.com/jquery-3.6.0.min.js
// ==/UserScript==

// REMEMBER TO REFRESH THE SOURCE (VENDOR) PAGE TO UPDATE THIS SCRIPT ON THAT PAGE AFTER MAKING AND SAVING ANY CHANGES HERE

/* global $ */
(function() {
    'use strict';

    $(document).keydown(function(e) {
        // Trigger the sequence with a hotkey, for example, Ctrl+Shift+C
        if (e.ctrlKey && e.shiftKey && (e.key === "C" || e.key === "c")) {
            // Prevent the default browser behavior when the hotkey is used so this script is executed instead
            e.preventDefault();

            // Function to copy text to clipboard using Tampermonkey's GM_setClipboard
            function copyTextToClipboard(text) {
                GM_setClipboard(text, 'text');
                console.log(`Copied to clipboard: ${text}`);
            }

            // Function to extract numerical price
            function getNumericalPrice(priceText) {
                const match = priceText.match(/\d+(\.\d+)?/);
                return match ? match[0] : 'N/A';
            }

            // Get the hostname of the current URL to determine which site we're on
            const hostname = window.location.hostname;
            // Get today's date
            const date = new Date();
            const formattedDate = date.toLocaleDateString("en-US", {
                month: "2-digit",
                day: "2-digit",
                year: "numeric"
            });

            let partNumber, description, priceText, price, mfgPartNumber,vendor, brand;

            if (hostname.includes('mcmaster.com')) {
                vendor = 'McMaster-Carr';
                partNumber = document.querySelector('.bk.attrComp.productDetailATOCopy')?.innerText || 'N/A';
                // Select the primary description by targeting class names that contain the stable portion of the class name
                const primaryDescriptionSelector = '[class*="Headers_product-detail-header-primary__"]';
                const primaryDescription = document.querySelector(primaryDescriptionSelector)?.innerText || '';
                // Select the secondary description in a similar manner
                const secondaryDescriptionSelector = '[class*="Headers_product-detail-header-secondary__"]';
                const secondaryDescription = document.querySelector(secondaryDescriptionSelector)?.innerText || '';
                // Combine primary and secondary descriptions
                description = primaryDescription + (secondaryDescription ? ', ' + secondaryDescription : '');
                priceText = document.querySelector('.PrceTxt, td.InLnOrdWebPartLayoutExpdView_prceLvlCell')?.innerText || 'N/A';
                mfgPartNumber = 'N/A';
                brand = '-';
            } else if (hostname.includes('automationdirect.com')) {
                vendor = 'AutomationDirect';
                partNumber = document.querySelector('#specsSection')?.innerText.split('Specifications')[0].trim() || 'N/A';
                description = document.querySelector('#itemdescription')?.innerText || 'N/A';
                mfgPartNumber = description.includes('Manufacturer part number:') ? description.split('Manufacturer part number:')[1].trim().replace(/[,.]$/, '') : partNumber;
                // First, check for the price per foot
                const allAdcGreenElements = Array.from(document.querySelectorAll('div.adc-green'));
                const pricePerFootElement = allAdcGreenElements.find(el => el.textContent.includes('/ ft'));
                let matches;
                if (pricePerFootElement) {
                    matches = pricePerFootElement.textContent.match(/\$\d+(\.\d+)?/);
                    priceText = matches ? matches[0] : 'N/A'; // Extracts the price
                } else {
                    const priceLabel = Array.from(document.querySelectorAll('div')).find(div => div.textContent.trim() === 'Price:');
                    priceText = priceLabel ? priceLabel.nextElementSibling.innerText : 'N/A';
                }
                // Extract the brand from the table
                const brandRow = Array.from(document.querySelectorAll('#taTable tr')).find(tr => tr.innerText.includes('Brand'));
                brand = brandRow ? brandRow.cells[brandRow.cells.length - 1].innerText.trim() : 'N/A';
                console.log('brand = ',brand);
            } else if (hostname.includes('zoro.com')) {
                vendor = 'Zoro';
                partNumber = document.querySelector('span[data-za="PDPZoroNo"]')?.innerText || 'N/A';
                description = document.querySelector('h1[data-za="product-name"]')?.innerText.trim() || 'N/A';
                priceText = document.querySelector('div.price-main')?.textContent.trim().match(/\d+(\.\d+)?/)?.[0] || 'N/A';
                mfgPartNumber = document.querySelector('span[data-za="PDPMfrNo"]')?.innerText.trim() || 'N/A';
                brand = document.querySelector('a[data-za="specifications-brand-link"]')?.innerText.trim() || 'N/A';
            } else if (hostname.includes('grainger.com')) {
                vendor = 'Grainger';
                partNumber = document.querySelector('dd[data-testid*="product-item-number-"]')?.innerText || 'N/A';
                description = document.querySelector('h1.lypQpT')?.innerText || 'N/A';
                mfgPartNumber = document.querySelector('dd:not([data-testid*="product-item-number-"])')?.innerText || 'N/A';
                priceText = document.querySelector('span.rbqU0E')?.innerText || 'N/A';
                const brandDt = Array.from(document.querySelectorAll('.ZTNukB dt')).find(dt => dt.textContent.trim() === 'Brand');
                brand = brandDt ? brandDt.nextElementSibling.innerText.trim() : 'N/A';
            } else if (hostname.includes('us.rs-online.com')) {
                vendor = 'RS';
                partNumber = document.querySelector('.allied-stock-number .OneLinkNoTx')?.innerText || 'N/A';
                const productName = document.querySelector('.product-display-name.OneLinkNoTx')?.innerText || '';
                const productDescription = document.querySelector('.product-display-description')?.innerText || '';
                description = `${productName} ${productDescription}`.trim();
                mfgPartNumber = document.querySelector('.manufacturer-part-number .OneLinkNoTx')?.innerText || 'N/A';
                priceText = document.querySelector('#unitPrice')?.innerText || 'N/A';
                let brandScript = Array.from(document.scripts).find(script => script.textContent.trim().includes('pdpPageData'));
                brand = 'N/A';
                if (brandScript) {
                    let pdpPageDataMatch = brandScript.textContent.match(/manufacturerName: '([^']+)'/);
                    brand = pdpPageDataMatch ? pdpPageDataMatch[1] : 'N/A';
                }
            }
            price = getNumericalPrice(priceText);
            const url = window.location.href;

            // Build the row to paste
            const dataRow = [
                partNumber,
                url,
                description,
                brand,
                mfgPartNumber,
                vendor,
                price,
                formattedDate
            ].join('\t');

            // Copy the concatenated data row to clipboard
            copyTextToClipboard(dataRow);
        }
    });
})();

Related Posts

Leave a Reply

Your email address will not be published. Required fields are marked *