Much of what we're doing now is scraping web pages for the required information. Doing this without declaring who we are is arguably poor practice.
We need to settle on a user-agent choice. Some websites even block the default user agents sent by common scraping libraries (see the Mattermost thread in the references). Two solutions come to mind:
- One consistent and custom useragent
- A list of browser user agents
Either approach may have benefits, but I'm not sure which to prefer.
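To make the two options concrete, here is a minimal sketch of both strategies. All names, the bot URL, and the browser user-agent strings are hypothetical placeholders, not part of any existing codebase:

```python
import random

# Option 1: one consistent, custom user agent.
# The project name and contact URL below are hypothetical placeholders.
CUSTOM_UA = "OurProjectBot/1.0 (+https://example.org/bot-info)"

# Option 2: rotate among a small list of common browser user agents.
BROWSER_UAS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]

def headers_for(strategy: str = "custom") -> dict:
    """Build request headers using the chosen user-agent strategy."""
    if strategy == "custom":
        ua = CUSTOM_UA
    else:
        ua = random.choice(BROWSER_UAS)
    return {"User-Agent": ua}
```

Either dict could then be passed to whatever HTTP client we use, e.g. `requests.get(url, headers=headers_for("custom"))`. The custom option is more transparent to site operators; the rotation option may be needed for sites that block anything non-browser-like.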
References:
https://forum.mattermost.org/t/mattermost-website-returning-403-when-headers-contain-the-word-python/11412