Brendan Weinstein

01/12/2020, 2:05 AM
What is the best library for resizing and piping images on the backend in jvm-land?

codebijan

01/12/2020, 2:52 AM
Bytes

tipsy

01/12/2020, 10:27 AM
what do you mean by best? fastest? easiest to use? just anything that works?

Brendan Weinstein

01/12/2020, 3:46 PM
Something comparable to https://github.com/lovell/sharp
I've done a few quick searches on this and found a hodgepodge of libraries that seem to be of questionable maintenance health. SO has answers mentioning that you need to take a lot of care to avoid quality loss when using Java's built-in libraries. I am just wondering if there is something simple (that I am missing) that I can plug and play in under an hour to take a 2000x2000 image, transform it into a 1000x1000 image, and pipe the bytes to S3.

tipsy

01/12/2020, 5:20 PM
i've been using https://github.com/coobird/thumbnailator, but my needs are very limited

Brendan Weinstein

01/12/2020, 5:22 PM
Are you temporarily storing files on an application server or piping them to something like S3? I have thumbnailator bookmarked. A little bit concerned by the lack of activity recently, but that could just be an artifact of the library not needing maintenance.

tipsy

01/12/2020, 6:07 PM
Piping them somewhere. It's been a while since I worked on that project or used the library, but if I recall correctly it just wraps the java std lib.
👍 1
this should be easy enough to make a library for though 🤔

codebijan

01/12/2020, 7:20 PM
actually ez enough that you don't need a library, just make an extension function, BOOM done.
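For illustration, here is a minimal sketch of what such an extension function might look like using only the JDK. The name `resized` is hypothetical, and the choice of bicubic interpolation is an assumption about what gives acceptable quality; it is not taken from any library mentioned in this thread.

```kotlin
import java.awt.RenderingHints
import java.awt.image.BufferedImage

// Hypothetical extension function (not part of any library): returns a scaled copy.
// The result is TYPE_INT_RGB, so any alpha channel in the source is dropped.
fun BufferedImage.resized(targetWidth: Int, targetHeight: Int): BufferedImage {
    require(targetWidth > 0 && targetHeight > 0) { "target size must be positive" }
    val result = BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB)
    val g = result.createGraphics()
    try {
        // Bicubic interpolation is slower than the default but usually looks better.
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BICUBIC)
        g.setRenderingHint(RenderingHints.KEY_RENDERING, RenderingHints.VALUE_RENDER_QUALITY)
        g.drawImage(this, 0, 0, targetWidth, targetHeight, null)
    } finally {
        g.dispose()
    }
    return result
}

fun main() {
    val source = BufferedImage(2000, 2000, BufferedImage.TYPE_INT_RGB)
    val small = source.resized(1000, 1000)
    println("${small.width}x${small.height}") // prints 1000x1000
}
```

Whether a single bicubic pass is good enough depends on the images and the scale factor, so treat this as a starting point rather than a replacement for a tuned native library.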

Brendan Weinstein

01/12/2020, 7:43 PM
SO folks report issues with image quality from standard Java libs

codebijan

01/12/2020, 8:48 PM
That's because people rely on the out-of-the-box compression algos. If you wanted to, you could easily implement your own, given that you put in X amount of time, where X depends on your aptitude for picking up such a task.
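Short of implementing a compression algorithm, one way to avoid relying on the out-of-the-box settings is to set the JPEG quality explicitly through `javax.imageio`. The sketch below assumes an image without an alpha channel; the helper name `toJpegBytes` and the default quality of 0.9 are hypothetical choices for illustration.

```kotlin
import java.awt.image.BufferedImage
import java.io.ByteArrayOutputStream
import javax.imageio.IIOImage
import javax.imageio.ImageIO
import javax.imageio.ImageWriteParam

// Hypothetical helper: encode as JPEG with an explicit quality between 0.0 and 1.0.
// Assumes the image has no alpha channel (for example TYPE_INT_RGB).
fun BufferedImage.toJpegBytes(quality: Float = 0.9f): ByteArray {
    require(quality in 0f..1f) { "quality must be between 0 and 1" }
    val writer = ImageIO.getImageWritersByFormatName("jpeg").next()
    val params = writer.defaultWriteParam.apply {
        // Without MODE_EXPLICIT the writer uses its own default quality.
        compressionMode = ImageWriteParam.MODE_EXPLICIT
        compressionQuality = quality
    }
    val buffer = ByteArrayOutputStream()
    ImageIO.createImageOutputStream(buffer).use { stream ->
        writer.output = stream
        try {
            writer.write(null, IIOImage(this, null, null), params)
        } finally {
            writer.dispose()
        }
    }
    return buffer.toByteArray()
}
```

The resulting byte array can then be handed to whatever client uploads to S3.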

tipsy

01/12/2020, 9:52 PM
that sort of goes for anything, he just wants a library that he can plug in
👍 1

Brendan Weinstein

01/12/2020, 10:30 PM
I am asking from the context of migrating away from a node monolith to a ktor monolith for a small business. If I have endpoints that work fine on the node monolith, it's a hard sell to sink hours into reinventing the wheel in kotlin/java-land. The node library is just a wrapper around some native code; to get equivalent performance I'd probably want to copy android's bitmap manipulation code. I'd make a strong bet android java bitmap scaling code is a wrapper around the JNI. Any work with the JNI is non-trivial.

rrva

01/13/2020, 9:30 AM
libvips is excellent, however you need a JNI wrapper

Ian White

01/16/2020, 5:16 PM
i use imagemagick with the im4java wrapper library - just make sure you are not using one of the old versions! https://github.com/stardogventures/starwizard/blob/master/starwizard-services/src/main/java/io/stardog/starwizard/services/media/resizers/ImagemagickResizer.java
but the resizing algorithms are very high quality

jimn

01/28/2020, 9:53 AM
why is the jvm a value here? at the end of the day you can't do much better than ffmpeg as the de facto point of gravity for image and media codec/filtering development. the value of a commandline wrapper in kotlin is questionable, you can generally write a good commandline shell template string for things that get you c-library performance and stability. even though you could access libavutil directly, you shouldn't bother with that unless you have some kind of analytical direct memory requirement

Brendan Weinstein

04/05/2020, 8:48 PM
Thank you for the ffmpeg+shell recommendation. It's taken me a while to revisit this, but I was able to write a pipeline that resizes images and pipes the output directly to S3. I am searching online for some literature on how to protect an endpoint that executes commands from being hijacked. I am all ears for suggested reading or methods.

jimn

04/09/2020, 4:39 PM
you are saying you pipe commands through something to execute remotely? chances are you are not saying you feel insecure with ssh. with ssh you can create a remote shell. the simplest way to explain this is, with linux or git-bash in windows:
1. run ssh-keygen -t rsa -b 2048
2. ssh user@remotehostname
3. mkdir .ssh; chmod 700 .ssh
4. exit
5. scp .ssh/id_rsa.pub user@remotehostname:.ssh/authorized_keys
6. ssh user@remotehostname
7. chmod 400 .ssh/authorized_keys

Brendan Weinstein

04/09/2020, 4:42 PM
I am talking about the use-case of having a user on a website uploading a file via a form and having a backend service that resizes the image and pipes the output to S3.

jimn

04/09/2020, 4:48 PM
most cloud providers have serverless ffmpeg access via message queues if you are trying to scale up. you may have to set params through a rest api instead of the ffmpeg syntax. it is otherwise easy to build a daemon thread in main waiting for new files with json model details of what work needs to be performed and then run ProcessBuilder commands same as commandline
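A rough sketch of the daemon-thread idea, under some assumptions: `ResizeJob` is a hypothetical model standing in for the JSON job description, jobs arrive on an in-memory queue rather than from a watched directory, and the ffmpeg arguments mirror the `-vf scale=` form used elsewhere in this thread.

```kotlin
import java.util.concurrent.LinkedBlockingQueue
import kotlin.concurrent.thread

// Hypothetical job model; in practice this might be parsed from a JSON file.
data class ResizeJob(val inputPath: String, val outputPath: String, val width: Int)

// Build the ffmpeg argument list. No shell is involved, so the paths are passed
// as plain arguments rather than interpolated into a command string.
fun ffmpegCommand(job: ResizeJob): List<String> {
    require(job.width > 0) { "width must be positive" }
    return listOf("ffmpeg", "-y", "-i", job.inputPath, "-vf", "scale=${job.width}:-1", job.outputPath)
}

// Run a command, sharing this process's stdout/stderr, and return its exit code.
fun runCommand(command: List<String>): Int =
    ProcessBuilder(command).inheritIO().start().waitFor()

// Start a daemon thread that takes jobs off the queue and runs them one at a time.
// `run` is injectable so the loop can be exercised without ffmpeg installed.
fun startWorker(queue: LinkedBlockingQueue<ResizeJob>, run: (List<String>) -> Int = ::runCommand) =
    thread(isDaemon = true, name = "resize-worker") {
        while (true) {
            val job = queue.take() // blocks until a job is available
            val exit = run(ffmpegCommand(job))
            if (exit != 0) System.err.println("ffmpeg exited with $exit for ${job.inputPath}")
        }
    }
```

From `main`, this could be used by creating the queue, calling `startWorker(queue)`, and then calling `queue.put(ResizeJob("in.jpg", "out.jpg", 1000))` whenever an upload has been written to disk.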
i tend to craft my own to get 100% perfect codec results and control the binaries and codecs built into ffmpeg

Brendan Weinstein

04/09/2020, 4:51 PM
it's a very niche product, so scaling up is not a big concern.

jimn

04/09/2020, 4:53 PM
the endgame for online video, (and it works for audio too) is to work out the ffmpeg DASH profiles you want to serve from the ingested content and you can build 3, 7, 12 geometries and codec bitrates for adaptive streaming clients

Brendan Weinstein

04/09/2020, 4:55 PM
From the reading I've done, if you are just piping data to ffmpeg, there isn't a best practice for ensuring that that data piped in cannot execute some other arbitrary commands

jimn

04/09/2020, 4:55 PM
one of your options is to lay the files down as segmented mp4 and webm files which can be downloaded media (about 5% overhead) but also support adaptive streaming

Brendan Weinstein

04/09/2020, 4:56 PM
I might be missing something with your thoughts on adaptive streaming, but my end goal is to just resize an image the user uploads and store it in S3

jimn

04/09/2020, 4:56 PM
i would not use pipe operator; that's what i was saying about a piece of json describing the input files to process.
iirc you can just tell s3 to serve said new geometry with a GET query param

Brendan Weinstein

04/09/2020, 4:57 PM
Ah sorry, miscommunication, I am not using the pipe operator, I am using the word pipe to describe transforming the data as it comes through and then sending the transformed data to S3
ie executing this command
ffmpeg -i pipe:0 -vf scale=$width:-1 -f image2pipe pipe:1
where the data is coming from an inputstream from a web form
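On the hijacking concern, one commonly used approach is to start the process with an argument list instead of a shell string. With `ProcessBuilder`, each list element is passed to the program as-is and no shell parses it, so the uploaded bytes on stdin are only ever data for ffmpeg to decode. The sketch below assumes that setup; `scaleCommand` and `pipeThrough` are hypothetical names, and the upper bound on width is an arbitrary illustrative limit. It does not protect against bugs in ffmpeg's own decoders.

```kotlin
import java.io.IOException
import java.io.InputStream
import java.io.OutputStream
import kotlin.concurrent.thread

// Build the ffmpeg argument list. Width is an Int, so it cannot carry shell
// syntax; the 10_000 upper bound is an arbitrary limit chosen for this sketch.
fun scaleCommand(width: Int): List<String> {
    require(width in 1..10_000) { "width out of range" }
    return listOf("ffmpeg", "-i", "pipe:0", "-vf", "scale=$width:-1", "-f", "image2pipe", "pipe:1")
}

// Stream `input` through an external command and copy its stdout to `output`.
// Returns the process exit code.
fun pipeThrough(command: List<String>, input: InputStream, output: OutputStream): Int {
    val process = ProcessBuilder(command)
        .redirectError(ProcessBuilder.Redirect.INHERIT) // keep stderr from filling up and blocking
        .start()
    // Feed stdin on a separate thread so a full stdout pipe cannot deadlock us.
    val feeder = thread {
        try {
            process.outputStream.use { stdin -> input.copyTo(stdin) }
        } catch (e: IOException) {
            // The process may exit before reading all input; the exit code reports that.
        }
    }
    process.inputStream.use { stdout -> stdout.copyTo(output) }
    feeder.join()
    return process.waitFor()
}
```

A request handler might call `pipeThrough(scaleCommand(1000), uploadStream, s3UploadStream)`, where both stream names are placeholders for whatever the web framework and S3 client provide.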

jimn

04/09/2020, 5:02 PM
I would use discrete file name references for input and output in ffmpeg and perform the work in discrete steps, almost like having a makefile; there is so much less synchronization to go wrong. serverless image rescaling from AWS ... https://github.com/amazon-archives/serverless-image-resizing
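The discrete-steps approach could be sketched roughly as follows. Everything here is an assumption for illustration: the helper name `resizeViaFiles`, the temp-file naming, and the default `.png` output suffix. The command itself is supplied by the caller, so no particular ffmpeg invocation is implied.

```kotlin
import java.io.InputStream
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardCopyOption

// Hypothetical helper: step 1 saves the upload to a named file, step 2 runs a
// command that reads and writes named files, step 3 returns the output path.
fun resizeViaFiles(
    upload: InputStream,
    workDir: Path,
    outputSuffix: String = ".png",
    buildCommand: (input: Path, output: Path) -> List<String>
): Path {
    val input = Files.createTempFile(workDir, "upload-", ".bin")
    val output = workDir.resolve(input.fileName.toString().removeSuffix(".bin") + "-resized" + outputSuffix)
    // Step 1: persist the upload so the exact bytes can be inspected later.
    Files.copy(upload, input, StandardCopyOption.REPLACE_EXISTING)
    // Step 2: run the external command on the two named files.
    val exit = ProcessBuilder(buildCommand(input, output)).inheritIO().start().waitFor()
    // On failure the input file is deliberately left in place for diagnosis.
    check(exit == 0) { "command failed with exit code $exit; input kept at $input" }
    // Step 3: the caller uploads this file and then deletes both files.
    return output
}
```

With ffmpeg, `buildCommand` might return something like `listOf("ffmpeg", "-y", "-i", input.toString(), "-vf", "scale=1000:-1", output.toString())`. The trade-off against streaming is extra disk I/O in exchange for inputs that can be replayed when something fails.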

Brendan Weinstein

04/09/2020, 5:04 PM
what would be an example of synchronization gone wrong?

jimn

04/09/2020, 5:05 PM
well, you can tell when things don't finish...but you may not be able to diagnose a bad input stream and reproduce failures as easily. maybe you can control the source of media but if it's user submission you are up against thousands of vendors.
lately, i have had strangeness in images that look correct in mature image viewers but are upside down or sideways when encoded and .... just saying variables

Brendan Weinstein

04/09/2020, 5:07 PM
that sounds like an EXIF issue

jimn

04/09/2020, 5:12 PM
i guess the point is having the files that fail is key to building the pipeline to be robust, however failure occurs. ffmpeg treats images as 1-frame of video fwiw. not the other way around.
ffmpeg -i in.jpg -s $width:-1 -f -y out.pnm is slightly simpler fwiw. scale is a dedicated parameter
https://jnorthrup.github.io/ffblockly/ is my hack to link together filters fwiw
👍 1

rrva

04/11/2020, 10:59 AM
isn't there a cost to fork a new copy of ffmpeg on each request? For video this is such a small part of the total runtime, but for images…
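One way to put a number on the per-request spawn cost is simply to measure it. The sketch below times how long it takes to start a command and wait for it to exit, averaged over several runs. The helper name `averageSpawnMillis` is hypothetical, and `ProcessBuilder.Redirect.DISCARD` assumes Java 9 or newer.

```kotlin
import kotlin.system.measureNanoTime

// Average wall-clock time, in milliseconds, to spawn `command` and wait for it to exit.
fun averageSpawnMillis(command: List<String>, runs: Int = 20): Double {
    require(runs > 0) { "runs must be positive" }
    var totalNanos = 0L
    repeat(runs) {
        totalNanos += measureNanoTime {
            ProcessBuilder(command)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD) // ignore the command's output
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start()
                .waitFor()
        }
    }
    return totalNanos / 1e6 / runs
}

fun main() {
    // `ffmpeg -version` exits right after startup, so this approximates
    // process creation plus ffmpeg initialisation, not any decoding work.
    println(averageSpawnMillis(listOf("ffmpeg", "-version")))
}
```

Comparing that figure with the time for a full resize of a typical image gives a rough idea of how much of each request is startup overhead.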

jimn

04/13/2020, 1:10 PM
how slow do you presume the ffmpeg c executable is at parsing options and executing codecs?
you can reduce the executable size at build time by excluding what you don't use. if you exclude certain libs you disable video or audio codecs ...
i love seeing the adjectives light-weight, high performance, minimalist in self-assigned summaries. you can stick them in front of js, php, ruby, or python, and it means the same to the authors as hand optimized machine code.