# skrape-it

Christian Dräger

11/18/2020, 4:32 PM
Hey, currently not. It should be technically possible somehow, but since the library is first and foremost focused on parsing HTML, performing clicks or similar interactions may or may not come as a feature in the future. Regarding XPath, it's also not supported currently, but it sounds like a suitable feature for the library that should be implemented in the near future.

savrov

11/18/2020, 8:20 PM
Thank you for the response. I was just playing around with the tool and noticed that it cannot parse the eToro page. Can I ask you to check whether it's me, or whether the website has better protection? I receive only a big list of `<script/>` tags (`BrowserFetcher` does not seem to work for it).
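
The symptom described above (a response that is mostly `<script>` tags) can be checked with a rough heuristic: count the script tags and measure how much text remains once scripts and markup are stripped. This is a minimal standard-library sketch, not part of skrape{it}; `inspect`, `looksLikeJsShell`, the 50-character threshold, and the sample HTML are all hypothetical.

```kotlin
// Heuristic sketch (not a skrape{it} API): guess whether fetched HTML is only a JavaScript shell.

private val selfClosingScript = Regex("<script\\b[^>]*/>", RegexOption.IGNORE_CASE)
private val scriptBlock = Regex(
    "<script\\b[^>]*>.*?</script>",
    setOf(RegexOption.IGNORE_CASE, RegexOption.DOT_MATCHES_ALL)
)
private val scriptOpen = Regex("<script\\b", RegexOption.IGNORE_CASE)
private val anyTag = Regex("<[^>]+>")
private val whitespace = Regex("\\s+")

data class ShellReport(val scriptTags: Int, val visibleChars: Int)

fun inspect(html: String): ShellReport {
    val scriptTags = scriptOpen.findAll(html).count()
    // Drop self-closing scripts first so they are not mistaken for an opening tag.
    val withoutScripts = html.replace(selfClosingScript, " ").replace(scriptBlock, " ")
    // Strip the remaining markup and collapse whitespace to approximate the visible text.
    val text = withoutScripts.replace(anyTag, " ").replace(whitespace, " ").trim()
    return ShellReport(scriptTags, text.length)
}

// Assumption: fewer than `minVisibleChars` characters of text means "nothing was rendered".
fun looksLikeJsShell(html: String, minVisibleChars: Int = 50): Boolean {
    val report = inspect(html)
    return report.scriptTags > 0 && report.visibleChars < minVisibleChars
}

fun main() {
    // Hypothetical sample, not the real eToro markup.
    val sample = """<html><head><script src="a.js"></script><script src="b.js"/></head><body><app-root></app-root></body></html>"""
    println(inspect(sample))
    println(looksLikeJsShell(sample))
}
```

If a check like this flags a page, the content is most likely rendered client-side, and a plain HTTP fetch will not see it.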

Christian Dräger

11/18/2020, 8:32 PM
Sure, I'll have a look tomorrow. Looks like they make heavy use of JS on eToro. Pages like that are sometimes hard to skrape.
🙂 1
OK, I see what you mean. Crazy stuff. It looks like they have a modular frontend where (maybe) different teams deliver parts in the form of JS files, and the main page is only there to load them all as part of their Angular app and to do other stuff like tracking, SEO, SEM, etc. I think you will only have luck with a real browser there. I will add a `ChromeFetcher` tomorrow that will use a real headless Chrome browser behind the scenes. I think it's a useful feature anyway.
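
Until something like that exists, one possible workaround is to let a locally installed Chrome render the page and print the resulting DOM, using Chrome's `--headless` and `--dump-dom` command-line flags. The sketch below is not how the planned `ChromeFetcher` works (its design is not described here). The helper names, the `google-chrome` binary name, and the timeout are assumptions, and `ProcessBuilder.Redirect.DISCARD` requires Java 9 or newer.

```kotlin
import java.io.File
import java.util.concurrent.TimeUnit

// Builds the command line. `chromeBinary` is an assumption: the executable name varies by platform.
fun headlessDumpCommand(url: String, chromeBinary: String = "google-chrome"): List<String> =
    listOf(chromeBinary, "--headless", "--disable-gpu", "--dump-dom", url)

// Runs Chrome, waits up to `timeoutSeconds`, and returns the serialized DOM it printed.
fun fetchRenderedHtml(url: String, timeoutSeconds: Long = 30): String {
    val output = File.createTempFile("rendered", ".html")
    try {
        val process = ProcessBuilder(headlessDumpCommand(url))
            .redirectOutput(output)                         // stdout goes to a file so waitFor cannot block on a full pipe
            .redirectError(ProcessBuilder.Redirect.DISCARD) // ignore Chrome's log output
            .start()
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly()
            error("Chrome did not exit within $timeoutSeconds seconds")
        }
        check(process.exitValue() == 0) { "Chrome exited with code ${process.exitValue()}" }
        return output.readText()
    } finally {
        output.delete()
    }
}

fun main(args: Array<String>) {
    val url = args.firstOrNull() ?: "https://example.com"
    println(headlessDumpCommand(url).joinToString(" "))
    // Fails with an exception if Chrome is not installed under the assumed name.
    runCatching { fetchRenderedHtml(url) }
        .onSuccess { println("Rendered ${it.length} characters of HTML") }
        .onFailure { println("Could not render page: ${it.message}") }
}
```

The returned string is ordinary HTML after scripts have run, so it can be handed to any HTML parser.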

savrov

11/19/2020, 4:26 PM
Thank you for the analysis. Yeah, they have done it really well :)

Christian Dräger

11/19/2020, 4:32 PM
They also do
<body onload="setTimeout(function() {window.scrollTo(0, 1)}, 100)">
a scroll 100 ms after document ready (this is the point at which skrape{it} takes the page source). I mean, this seems useless, but maybe it is triggering another script or funky stuff. Who knows 💁‍♂️