minerva's akasha - how I handled storing and rendering large files in a client-side web application

For about three months now, I've been working on a research data collection and analysis tool. it's called minerva's akasha, and it is the most complicated personal project I've ever worked on. I've learned so much stuff about how browsers work, how javascript works, and how to design a good user interface. I've also learned about a lot of things that I've never really had to think about in web development before, like multithreading, space complexity of algorithms, performance optimization, windowing systems, and much more.

loading an audio file into a minerva's akasha shard.

one of the major things that I needed to figure out was how to store and render large files with no perceptible delay to the user. when I first implemented file drag and drop into minerva's akasha, the entire ui would hang while the browser loaded the file into memory on the main thread.

I think users are accustomed to things happening fairly smoothly. For example, if you click on an image file on your computer, you expect to be able to do other stuff while you're waiting for the file to open. even if the file is huge and takes time to load, you still have that expectation, right? it's a reasonable expectation, and vital to user experience, so it's something that was necessary to build into this software. I'll tell you how.

a file entry in minerva's indexeddb

files have always been stored in minerva's akasha using massive base64 strings in an indexeddb database. that's fine for storage, but when you load multiple 10mb+ base64 strings into a tab's memory, things start to get a little bit choppy. in fact, before I fixed this problem, here's what I was doing:

- when a file needs to be rendered, look in indexeddb for the correct entry using a file id
- then, take the base64 string and attach it as a src attribute to the audio, img, or video element
- do this on the main thread, preventing the user from doing literally anything else within the application
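
a minimal sketch of that old flow, in case it helps to see it written down. the store name, the record shape ({ id, data }), and the function names are all assumptions for illustration, and I'm assuming the stored string is a full base64 data url that can be dropped straight into src.

```javascript
// Hypothetical object store name; the real schema isn't shown in this post.
const FILE_STORE = "files";

// Look up one file record by id. `db` is an already-open IDBDatabase.
function getFileRecord(db, fileId) {
  return new Promise((resolve, reject) => {
    const request = db
      .transaction(FILE_STORE, "readonly")
      .objectStore(FILE_STORE)
      .get(fileId);
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// The old approach: hand the whole base64 data url to the media element.
// `element` is an <audio>, <img>, or <video> element.
async function renderFileOldWay(db, fileId, element) {
  const record = await getFileRecord(db, fileId);
  // Assumed record shape: { id, data }, where `data` is "data:<mime>;base64,...".
  element.src = record.data; // the multi-megabyte string lands on the main thread here
}
```

the indexeddb read itself is asynchronous; the expensive part is everything that happens to that giant string once it reaches the element.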

there are a few major problems with this approach. if you've ever seen a base64 string version of a file, you'll know the first problem: base64 is huge, always at least a third larger than the file that it represents. just take a look above at the database entry for that audio file: the actual file is only seven megabytes, but the base64 is eighteen megabytes.
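
the size overhead is easy to work out: base64 turns every 3 bytes into 4 characters, padding the last group. here's a small sketch of that arithmetic (base64Length is just a name I made up for illustration, and I'm treating a megabyte as a million bytes).

```javascript
// Number of base64 characters (with padding) needed to encode `byteCount` bytes.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

const fileBytes = 7 * 1000 * 1000;           // a seven megabyte file
const base64Chars = base64Length(fileBytes); // roughly 9.3 million characters

console.log(base64Chars, (base64Chars / fileBytes).toFixed(2));
```

that accounts for about nine megabytes of characters, not eighteen. my guess, which I haven't verified, is that the rest comes from the string being held at two bytes per character.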

the next problem is that using base64 as an element src causes a lot of performance issues, especially with files that are already large, like a high-quality .flac file or a long hd video.

so what's the best solution? I'm a long-time web developer but a newbie to building complex software, so multithreading wasn't the first thing that came to mind. after a couple of days' research, though, that was the solution I arrived at: using web workers in conjunction with object urls.

the audio component takes in a base64 string

I simply took the base64 string (the src variable in the image above) and passed it to a web worker, seen in the image below. that way, the main thread doesn't have to do the heavy lifting of dealing with the eighteen megabyte string.

a worker meant for loading a base64 string and turning it into an object url. why is the function async? I don't remember, to be honest.
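
since the screenshot doesn't travel well, here's a rough sketch of what a worker like this can look like. it is not a copy of my actual worker: the file name, the message shape ({ id, dataUrl }), and the dataUrlToBlob helper are assumptions for illustration, and it assumes the incoming string is a base64 data url.

```javascript
// file-loader.worker.js (hypothetical file name)

// Decode a base64 data url ("data:<mime>;base64,<payload>") into a Blob.
function dataUrlToBlob(dataUrl) {
  const comma = dataUrl.indexOf(",");
  const header = dataUrl.slice(0, comma); // e.g. "data:audio/flac;base64"
  const mimeType = header.slice(5).split(";")[0] || "application/octet-stream";
  const binary = atob(dataUrl.slice(comma + 1)); // base64 -> binary string
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type: mimeType });
}

// Only attach the handler when this script is really running inside a worker.
if (typeof WorkerGlobalScope !== "undefined" && self instanceof WorkerGlobalScope) {
  self.onmessage = (event) => {
    const { id, dataUrl } = event.data; // assumed message shape
    const objectUrl = URL.createObjectURL(dataUrlToBlob(dataUrl));
    // The reply is a short "blob:..." string instead of megabytes of base64.
    self.postMessage({ id, objectUrl });
  };
}
```

one thing to keep in mind: as far as I understand it, an object url created inside a worker only stays valid while that worker is alive, so the worker shouldn't be terminated while the media is still in use.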

so, no more base64 media sources meant instant, quantifiable, massive performance gains. I wish I had a before / after performance analysis screenshot, but I didn't take any screenshots while working on this (and I don't want to go back and unfix it just to get some), so here's the after shot (chrome dev tools performance tab):

very little work is being done on the main thread now, and none of it is interaction-blocking.

using small, lightweight object urls was incredibly performant, and that was the end of the problems affecting file loading and storage.

using blob urls as media sources.
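
on the main thread side, the component only has to send the string off and wait for a short url to come back. a sketch of that, assuming a worker that replies with { id, objectUrl } (requestObjectUrl and setMediaSource are made-up names, not something from my codebase):

```javascript
let nextRequestId = 0;

// Send a base64 data url to the worker and resolve with the object url it returns.
function requestObjectUrl(worker, dataUrl) {
  const id = nextRequestId++;
  return new Promise((resolve, reject) => {
    const onMessage = (event) => {
      if (event.data.id !== id) return; // reply to a different request
      cleanup();
      resolve(event.data.objectUrl);
    };
    const onError = (event) => {
      cleanup();
      reject(new Error(event.message || "worker failed"));
    };
    const cleanup = () => {
      worker.removeEventListener("message", onMessage);
      worker.removeEventListener("error", onError);
    };
    worker.addEventListener("message", onMessage);
    worker.addEventListener("error", onError);
    worker.postMessage({ id, dataUrl });
  });
}

// Point an <audio>, <img>, or <video> element at the object url.
// Returns a function that releases the url when the element is done with it.
async function setMediaSource(worker, element, dataUrl) {
  const objectUrl = await requestObjectUrl(worker, dataUrl);
  element.src = objectUrl;
  return () => URL.revokeObjectURL(objectUrl);
}
```

usage would be something like `const release = await setMediaSource(worker, audioElement, src)`, with `release()` called when the component unmounts so the blob doesn't sit in memory forever.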

here's the entry I made in my solved issues log on the day I fixed this problem.

thanks for reading! if you're working on an application with a complex ui that needs a performance boost, I hope this article gave you some ideas! otherwise, hope you learned something interesting. I'll be back on saturday with another album of the week or other music-related post, and then next week with a technical post on something interesting.

currently listening to:

- silent finale by the musmus
- q.t. rush by t+pazolite
- creep u by black dresses
- sunny by lucy van pelt / advantage lucy
- emil / ultimate weapon no. 7 by keiichi okabe
- the last page by arforest


this article by leo fabrikant really helped me out back when I was working on this feature!