Our preferred format, and the one most clients choose, is XML because of its robustness. We can also deliver CSV and JSON.
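As a rough illustration of how the same records look in each of the three formats, here is a minimal sketch using only the Python standard library. The field names (`title`, `price`), the element names (`records`, `record`) and the helper functions are hypothetical and do not describe our actual delivery schema.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def to_json(records):
    """Serialize a list of dicts as a JSON array."""
    return json.dumps(records, ensure_ascii=False)


def to_csv(records):
    """Serialize a list of dicts as CSV, using the first record's keys as the header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_xml(records):
    """Serialize a list of dicts as XML, one <record> element per dict."""
    root = ET.Element("records")  # hypothetical root element name
    for rec in records:
        item = ET.SubElement(root, "record")
        for key, value in rec.items():
            # Keys must be valid XML element names in this simple sketch.
            ET.SubElement(item, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample record, for illustration only.
records = [{"title": "Widget", "price": "9.99"}]
```

Calling `to_xml(records)`, `to_csv(records)` or `to_json(records)` on the sample list returns the corresponding string, which could then be written to a file.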
What is the maximum frequency you can scrape data at?
The frequency depends on your specific requirements. We can extract data at intervals ranging from every few minutes to once a month.
Can you crawl sites which require login?
We can extract data that is behind a login, provided the client supplies the login credentials. However, we will not be able to help if the site uses a CAPTCHA or legally prohibits automated login.
Could you set up a workflow and harvest content (prices and reviews) according to our needs if we provide the sources?
Sure, that is possible. In fact, that was our first offering; we have added more over time.
Can your platform perform multi-lingual (non-English) crawls too?
Yes. To date, we have crawled sites in German, Danish, Norwegian, Chinese, Japanese, Hebrew, Spanish, French and Finnish.
If we provide you with a list of URLs, can you crawl those and deliver in a format we specify?
Yes. We normally discover the relevant pages to crawl ourselves, so an existing list makes things even easier, as long as the sites involved allow bots.
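One common way to check whether a site allows bots is to consult its robots.txt file. The sketch below filters a URL list against robots.txt rules using Python's standard `urllib.robotparser`. It is an illustration only, not a description of our internal pipeline: the function name `filter_allowed`, the user agent `example-bot`, and the choice to treat a host with no supplied robots.txt as allowed are all assumptions.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def filter_allowed(urls, robots_by_host, user_agent="example-bot"):
    """Return the URLs that each host's robots.txt permits for user_agent.

    robots_by_host maps a host name to the text of its robots.txt.
    Hosts missing from the mapping are treated as having no rules
    (an assumption made for this sketch).
    """
    parsers = {}
    allowed = []
    for url in urls:
        host = urlparse(url).netloc
        if host not in parsers:
            parser = RobotFileParser()
            # parse() accepts the robots.txt content as a list of lines.
            parser.parse(robots_by_host.get(host, "").splitlines())
            parsers[host] = parser
        if parsers[host].can_fetch(user_agent, url):
            allowed.append(url)
    return allowed


# Hypothetical input, for illustration only.
robots = {"example.com": "User-agent: *\nDisallow: /private/"}
urls = [
    "https://example.com/products/1",
    "https://example.com/private/page",
]
crawlable = filter_allowed(urls, robots)
```

Here the robots.txt text is passed in directly so the example runs without network access. In practice, `RobotFileParser.set_url()` followed by `read()` would fetch the file from the site instead.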