digikey parts slurper

fetch www.digikey.com/product-search/en?FV=

grep for catfilterlink

remove beginning of line to inclusive

remove end of line from inclusive
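A minimal sketch of the grep-and-trim step, in Python rather than grep. The notes do not say which text the trimming is anchored on, so this assumes each matching line carries an ordinary href attribute and keeps only its value. The sample markup and the `extract_category_links` name are hypothetical.

```python
import re

# Assumption: a category link looks roughly like
#   <a class="catfilterlink" href="/product-search/en/...">Name</a>
# The href value may be quoted or bare.
HREF_RE = re.compile(r'href=["\']?([^"\'\s>]+)')


def extract_category_links(html):
    """Return the href of every line that mentions catfilterlink."""
    links = []
    for line in html.splitlines():
        if "catfilterlink" not in line:
            continue  # same effect as `grep catfilterlink`
        match = HREF_RE.search(line)
        if match:
            # Dropping everything before and after the href value
            links.append(match.group(1))
    return links


# Hypothetical sample input, not real digikey markup
sample = (
    '<li><a class="catfilterlink" '
    'href="/product-search/en/example-category/example-family/123">Example</a></li>\n'
    '<li><a class="other" href="/ignored">Ignored</a></li>'
)
print(extract_category_links(sample))
```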

produces following info
grabbing FV's

We need the FV values to crawl each subsection. Grab all of the above URLs, making sure Results per Page = 500. The CSV download is capped at 500 results per fetch, so there is no point in increasing this value.

  • <input type=hidden name=FV value=fff40000,fff80000>

also grab the total page count

  • <a class="Last" href="/product-search/en/undefined-category/undefined-family/0/page/8">Last</a>

The 8 in page/8 is the total page count; pages start from 1.

grab the FV value and page count, and store them for each of the above URLs
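A rough sketch of pulling both values out of a fetched page with regular expressions, using the two snippets above as input. The `parse_listing` name and the returned dict layout are made up for illustration. Treating a page without a Last link as a single page is an assumption, not something these notes confirm.

```python
import re

# Attribute values may be quoted or bare, as in the FV input shown above.
FV_RE = re.compile(
    r'<input[^>]*\bname=["\']?FV["\']?[^>]*\bvalue=["\']?([^"\'\s>]+)', re.I
)
LAST_RE = re.compile(
    r'<a[^>]*\bclass=["\']?Last["\']?[^>]*\bhref=["\']?([^"\'\s>]+)', re.I
)


def parse_listing(html):
    """Return the FV value, the Last link href, and the total page count."""
    fv = FV_RE.search(html)
    last = LAST_RE.search(html)
    last_href = last.group(1) if last else None
    pages = 1  # assumption: no Last link means a single page of results
    if last_href:
        count = re.search(r"/page/(\d+)$", last_href)
        if count:
            pages = int(count.group(1))
    return {"fv": fv.group(1) if fv else None, "last_href": last_href, "pages": pages}


html = (
    "<input type=hidden name=FV value=fff40000,fff80000>\n"
    '<a class="Last" href="/product-search/en/undefined-category/'
    'undefined-family/0/page/8">Last</a>'
)
print(parse_listing(html))
```

Storing one such dict per category URL gives the crawl step everything it needs.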

crawl individual pages

curl with a valid user agent. I used --user-agent "Chrome/1.0", but vary it to avoid rate limiters.
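One way the crawl could be driven is sketched below. It only builds the curl command lines and does not run them. Several things here are assumptions: the http:// scheme, passing FV as a query string on each page URL (the notes do not say how FV and the page number are combined), and the extra user-agent strings. The function names are hypothetical.

```python
import itertools
import re

# Only "Chrome/1.0" comes from the notes; the others are illustrative.
USER_AGENTS = ["Chrome/1.0", "Mozilla/5.0", "Opera/9.80"]


def page_urls(last_href, fv, base="http://www.digikey.com"):
    """Expand a Last link ending in /page/N into URLs for pages 1..N."""
    match = re.match(r"(.*/page/)(\d+)$", last_href)
    if not match:
        return [f"{base}{last_href}?FV={fv}"]
    prefix, total = match.group(1), int(match.group(2))
    # Assumption: FV is passed as a query parameter on every page URL
    return [f"{base}{prefix}{n}?FV={fv}" for n in range(1, total + 1)]


def curl_commands(urls, agents=USER_AGENTS):
    """Pair each URL with the next user agent, cycling through the list."""
    agent_cycle = itertools.cycle(agents)
    return [
        ["curl", "--silent", "--user-agent", next(agent_cycle), url]
        for url in urls
    ]


urls = page_urls(
    "/product-search/en/undefined-category/undefined-family/0/page/8",
    "fff40000,fff80000",
)
for cmd in curl_commands(urls):
    print(" ".join(cmd))
```

Each command list can be handed to `subprocess.run(cmd, capture_output=True)`; a short pause between fetches would probably help with rate limiting as well.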

projects/digikey_partsdb.1381593057.txt.gz · Last modified: 2013/10/12 08:50 by charliex