# advent-of-code
n
I'm trying to write a template generator that, among other things, grabs the input from the website when it's available. I'm terrible at IO, and have never done any web scraping at all. I'm trying to follow Java StackExchange tips but my efforts result in a 400 Bad Request code. Eric said that he'd be blocking people who don't change their User Agent property but I'm getting the error even before I set the property. I'm clearly in over my head. Does anyone use such a thing, and if so can I look at your repo?
e
you get a 400 when you make an unauthenticated request. session cookie is required to get input because it differs by user
n
@ephemient Excellent point, and one that I had missed. But it doesn't steer me clear of the shoals just yet. I've got other issues. The java standard library seems to require an open connection before setting request properties, but the 400 error occurs on opening a connection. But undoubtedly I'm misunderstanding something, which is why I'd hope to look at someone else's code.
j
Someone I know created some scripts to automatically download samples and create templates. It's for Python, though. Could still be useful to you: https://gitlab.com/mpsijm/advent-of-code
e
my input fetchers: https://github.com/ephemient/aoc2021/blob/main/get-inputs.main.kts (last year's, Kotlin but not timezone aware and doesn't set user-agent) and https://github.com/ephemient/aoc2022/blob/main/get-inputs (this year's, Python but timezone-aware and sets user-agent)
in any case, `.openConnection(): URLConnection` is a bit of a confusing name. it doesn't mean that a connection is open yet; you still have the opportunity to set properties before calling `.connect()`. partly this is because `java.net` is designed around many types of URLs and connections, some of which aren't HTTP-like (such as `JarURLConnection`)
if you're willing to use other dependencies, there's plenty of better alternatives to `HttpURLConnection`
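For anyone following along in Java, here is a minimal sketch of the order of operations described above: create the connection object, set the request properties, and only then trigger the actual request. The class name `InputFetcher`, the `AOC_SESSION` environment variable, and the User-Agent text are all made up for illustration, and the sketch assumes the session cookie is sent as `session=<value>` and that Java 9+ is available.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; none of these names come from the conversation.
final class InputFetcher {

    // Builds the request without sending it. openConnection() only creates
    // the connection object, so request properties can still be set here.
    static HttpURLConnection prepare(int year, int day, String session, String userAgent)
            throws IOException {
        URI uri = URI.create("https://adventofcode.com/" + year + "/day/" + day + "/input");
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        // Assumption: the site expects the login cookie under the name "session".
        conn.setRequestProperty("Cookie", "session=" + session);
        conn.setRequestProperty("User-Agent", userAgent);
        return conn;
    }

    // Sends the prepared request and returns the body as text.
    static String fetch(HttpURLConnection conn) throws IOException {
        int status = conn.getResponseCode(); // this call connects implicitly
        if (status != HttpURLConnection.HTTP_OK) {
            throw new IOException("Unexpected HTTP status " + status);
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        String session = System.getenv("AOC_SESSION"); // assumed variable name
        if (session == null) {
            System.err.println("Set AOC_SESSION to your session cookie value first.");
            return;
        }
        // Placeholder contact details; replace with your own.
        String userAgent = "aoc-template-generator (contact: you@example.com)";
        System.out.print(fetch(prepare(2022, 1, session, userAgent)));
    }
}
```

Keeping `prepare` separate from `fetch` makes it easy to check the headers without touching the network.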
n
Thanks! I will look through it. Yeah, it looks like the Exception was triggered by setRequestProperty, not openConnection.
@ephemient Hey, it worked! Thanks so much for your help!
m
Aha I just wrote myself a bash script that uses curl and hooked it to a toolbar button in intellij. You need to set the cookie in an env variable. super simple if anyone is curious I’ll share
e
reminder: Eric is requesting that all automated requests include contact information in User-Agent https://old.reddit.com/r/adventofcode/comments/z9dhtd/please_include_your_contact_info_in_the_useragent/
m
aha ok I’ll add that thanks
n
Awesome! Made a script to generate folders/template for each day. Then the template looks for an input.txt file in that folder. If it doesn't find it, it adds it from the AoC website. Thanks again for your help, @ephemient!
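The "look for input.txt, download only if it's missing" step might look roughly like this in Java. This is only a sketch of the idea, not the actual script mentioned above: `InputCache`, `load`, and the `downloader` callback are hypothetical names, and it assumes Java 11+ for `Files.readString` and `Files.writeString`.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Callable;

// Hypothetical helper illustrating a "fetch once, then reuse the file" pattern.
final class InputCache {

    // Returns the contents of dayDir/input.txt, calling the downloader
    // only when that file does not exist yet.
    static String load(Path dayDir, Callable<String> downloader) throws Exception {
        Path input = dayDir.resolve("input.txt");
        if (Files.exists(input)) {
            return Files.readString(input); // cached copy, no request made
        }
        String text = downloader.call();
        Files.createDirectories(dayDir);    // create the day's folder if needed
        Files.writeString(input, text);
        return text;
    }
}
```

Passing the download step in as a callback keeps the site from being hit more than once per input, which also fits the request to keep automated traffic low.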
n
my "get input" is a 2-liner shell script:
day=${1:-$(TZ=America/New_York date '+%-d')}
https -d -q -o "input/Day$day" --session=2022 "https://adventofcode.com/2022/day/$day/input"
e
mine is complicated mostly by having it wait for midnight if necessary, with output while waiting
I've got a GitHub action set up as well which downloads whichever ones are missing, with caching https://github.com/ephemient/aoc2022/blob/main/.github/workflows/get-inputs.yml
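The timezone handling and "wait for midnight" logic mentioned in this thread could be sketched in Java along these lines. It is not taken from any of the linked scripts: `Unlock`, `currentDay`, and `untilUnlock` are hypothetical names, and the sketch assumes puzzles unlock at midnight in `America/New_York`, the zone used in the shell snippet above.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical helper; assumes a December puzzle unlocks at midnight US Eastern.
final class Unlock {
    static final ZoneId EASTERN = ZoneId.of("America/New_York");

    // Day of the month in US Eastern time, like `TZ=America/New_York date +%-d`.
    static int currentDay(Instant now) {
        return now.atZone(EASTERN).getDayOfMonth();
    }

    // Time left until the given day's puzzle unlocks, or zero if it already has.
    static Duration untilUnlock(int year, int day, Instant now) {
        ZonedDateTime unlock = LocalDate.of(year, 12, day).atStartOfDay(EASTERN);
        Duration wait = Duration.between(now, unlock.toInstant());
        return wait.isNegative() ? Duration.ZERO : wait;
    }
}
```

A fetcher could sleep for `untilUnlock(...)` before making its single request, printing the remaining time while it waits.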
n
yeah, understood. I am not trying to solve quickly (am not good enough anyway, but also sometimes still have calls at 9pm), so was not worried about "get as fast as possible".
e
I've definitely got stuff in there that is not super useful to others, yep. but it does mean I can trigger to run and publish benchmarks like this: https://github.com/ephemient/aoc2022/actions/runs/3596772734 produces https://ephemient.github.io/aoc2022/jmh-visualizer/index.html 🙂