So how do I use this package to get all the img sr...
# skrape-it
g
So how do I use this package to get all the img src urls off a page assuming I don't need the client? This is confusing:
val reportPage = client.get<String>(report.link).toString()
var images: List<String> = emptyList()
htmlDocument(reportPage) {
    images = img() { findAll { withAttribute }}
}
println(images)

val reportPage = client.get<String>(report.link).toString()
var images: List<String> = emptyList()
htmlDocument(reportPage) {
    images = img() { findAll { eachAttribute("src") }}
}
println(images)
Better start... Now trying to figure out how to get all the images or, even better, just the main picture from a news article on the common outlets: CNN, Fox, NBC, etc.
c
Hey, sorry for the late response. To get all links you could do the following (using version 1.0.0-alpha6, which I highly recommend; it's already very close to the final version that will be published soon):
Sorry, I need to make a screenshot. Somehow Slack won't let me paste the code properly.
In your case you would do the same, but for img tags and their corresponding src attribute, like this:
message has been deleted
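The pasted code is not preserved in this log. As a stand-in, here is a minimal sketch of the img/src extraction being described, built on the `eachAttribute("src")` call from the snippet earlier in the thread. The import paths, the assumption that `htmlDocument` returns the value of its lambda, and the helper name `extractImageSources` are not taken from the thread and should be checked against the skrape{it} version in use.

```kotlin
import it.skrape.core.htmlDocument
import it.skrape.selects.eachAttribute
import it.skrape.selects.html5.img

// Hypothetical helper (not part of skrape{it}): collects the src value
// of every <img> element found in an HTML string.
fun extractImageSources(html: String): List<String> =
    htmlDocument(html) {
        img {
            // Same call as in the thread: read "src" from each matched element.
            findAll { eachAttribute("src") }
        }
    }

fun main() {
    // Inline sample markup stands in for a fetched page, so no HTTP client is needed.
    val html = """
        <html><body>
          <img src="https://example.com/a.png">
          <img src="https://example.com/b.jpg">
        </body></html>
    """.trimIndent()
    println(extractImageSources(html))
}
```

Because the parser only needs a String, the HTML can come from any source: a file, a test fixture, or whatever HTTP client is already in the project.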
g
Thanks man... I'm starting to figure it out. The New York Times site keeps causing problems when I try to scrape imgs: I see the appropriate meta tags on news articles, but skrape{it} isn't finding them. Might be a security thing or something. Regardless, my other test sites are working well, so I am starting to figure this out. Thanks man.
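For the "main picture" case, many news sites put the lead image URL in an Open Graph `og:image` meta tag. Below is a rough sketch of reading that tag with the `withAttribute` selector property. The helper name `extractMainImage` is hypothetical and the import paths are assumed. It does not explain the New York Times behaviour: if the HTML that was actually fetched lacks the tag (for example, because the site served a different page to the client), the lookup has nothing to find.

```kotlin
import it.skrape.core.htmlDocument
import it.skrape.selects.html5.meta

// Hypothetical helper (not part of skrape{it}): returns the content of
// <meta property="og:image" ...>, assumed here to hold the lead image URL.
fun extractMainImage(html: String): String =
    htmlDocument(html) {
        meta {
            // Restrict the match to the Open Graph image tag.
            withAttribute = "property" to "og:image"
            findFirst { attribute("content") }
        }
    }

fun main() {
    // Sample markup only; real article pages will carry many more tags.
    val html = """
        <html><head>
          <meta property="og:title" content="Sample headline">
          <meta property="og:image" content="https://example.com/lead.jpg">
        </head><body></body></html>
    """.trimIndent()
    println(extractMainImage(html))
}
```

Printing the raw fetched HTML and searching it for `og:image` is a quick way to tell whether the tag is missing from the response or the selector is at fault.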
👍 1