I'm going to scrape these sites with or without their consent.
I just upped my game scraping Polymarket. They're a little locked down. I got around it using Puppeteer: I load one of their embed URLs in a headless browser, and instead of reading the DOM, which is obfuscated, I make a call from inside that page against the API the page itself was calling. By some magic I don't feel like dissecting, that API can only be called from a Polymarket page. This is a fast, brainless way to get at some data without figuring out what the magic headers and cookies are.
But this was a little slow, sometimes taking a second and a half.
What I just did is write a function and a helper that together return a cached promise for the loaded embed page. Client code awaits that promise to get the page, then makes the API call from there. Now different API calls can all go through the same page, which cuts the number of requests from my IP way down, gets much faster times, lets me start issuing multiple requests right away, and lets me hit it with enough frequency to start modeling with this data and get faster, more clearly resolved threshold events sent to my phone.
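The cached-promise pattern above can be sketched like this. The helper is generic: it memoizes an async loader so every caller shares the same in-flight promise. The Puppeteer usage below it is illustrative, not the actual code; the URL, `networkidle2` wait, and `callApi` names are my assumptions about how it would be wired up.

```javascript
// Generic cached-promise helper: memoize an async loader so that all
// callers share one in-flight promise instead of re-running the loader.
function cachedPromise(loader) {
  let promise = null;
  return () => {
    if (!promise) {
      promise = loader();
      // If the load fails, clear the cache so the next caller retries.
      promise.catch(() => { promise = null; });
    }
    return promise;
  };
}

// Hypothetical wiring with Puppeteer (names and URL are illustrative):
//
// const puppeteer = require('puppeteer');
// const getEmbedPage = cachedPromise(async () => {
//   const browser = await puppeteer.launch({ headless: true });
//   const page = await browser.newPage();
//   await page.goto(EMBED_URL, { waitUntil: 'networkidle2' });
//   return page;
// });
//
// // Every API call runs inside the page via page.evaluate, so the
// // request carries whatever headers/cookies the page's own calls do.
// async function callApi(path) {
//   const page = await getEmbedPage();
//   return page.evaluate(p => fetch(p).then(r => r.json()), path);
// }
```

The key property is that concurrent callers don't each trigger a page load: the first call kicks off the loader, and everyone after that awaits the same promise.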
The idea isn't that unusual or impressive. I only mention it because I want to call myself the scrapist.
I'm going to scraaaaaaaaaaaaape!