
Teracube 2e

The Teracube 2e is a budget phone that promises to be more eco-friendly than your typical phone by committing to 3 years of updates as well as affordable repairs for 4 years (which is also the duration of the standard warranty), while also promising features and hardware that will remain relevant for that time. All of this for the low price of $199.

I've found that it generally lives up to its promises, but it makes several (sometimes bizarre) compromises to get there.

I picked this up primarily to be a device I'm okay accidentally chucking in a river when I'm out traveling internationally, and one where I can slot in a second, local SIM for local data rates (though now that there are several eSIM options, I might do this with another phone in the future, particularly one with a better camera).

Things I wish I had known before buying the 2e


There are a couple of things I really wish I had known before buying the 2e, which might have affected my decision to get it.

  • It does not support Power Delivery, which in effect means you have to use a USB-A to USB-C cable to charge it.
    • This is maddening! All of my chargers are basically PD ports, so I have to carry a whole separate cable just to charge the phone.
  • Verizon is not supported, and this includes all of their MVNOs, evidently due to Verizon's certification process.
    • I wanted to toss my V1 Ting card in there, and I had to use the X3 card instead - not a huge deal but did impact my plans.
    • There are a few reports on the forums of people getting Visible to work, but I've not been able to.

In a similar category (though this one I did realize before buying): there's no IP rating, so I wouldn't recommend using this in a downpour, or maybe in any moist conditions at all.

For a $200 USD phone, you really shouldn't be expecting much in the way of "great", but the 2e actually does an admirable job in several areas. The first thing I'd like to call out is the display. It has a 720x1560 IPS HD+ display, which actually looks pretty great for the price point. The refresh rate is only 60Hz, but asking for anything more at this price point is unreasonable.

The battery life on the 4,000 mAh battery is also quite good; I'm easily able to get a full day out of moderate usage, which is not something I can say for moderate usage of my primary phone. One of the other major selling points is that the battery is replaceable, so you can pop off the back and swap it out if you need to, which should prolong the life of the device. Plus, there are settings to prevent overcharging: you can tell the phone to only charge up to, say, 80% and then stop. Given the good battery life already, that's an easy trade-off and should prolong the battery's lifespan even further.

The placement and the speed of the fingerprint sensor are a nice change of pace from the in-display sensors I've gotten accustomed to. It's been able to read my prints every time and unlock the phone quickly.

NFC Support! Which means it does work with Google Pay. I've not actually added any cards to it, but I have tested it as a general NFC reader and it works perfectly fine for that.

Dual SIM card support is another big draw; not too many phones have that. On the Samsung Note 20 Ultra you can use two SIMs, but then you give up the extra SD card storage - with the Teracube 2e, you don't have to choose: you get two SIM slots and a microSD slot. It's wonderful.

Generally working custom ROM support can also be found on the Teracube forums, and the folks at Teracube are super cool about people flashing custom ROMs to the phone.

The big stand-out problem here is the camera. It's serviceable, but with the stock camera app it takes forever to focus on an item - HDR mode is so slow it's basically unusable for moving targets. I didn't expect much from the camera... but I expected more than whatever this is. What I wanted was to be able to snap photos on the go while traveling and get a semi-decent picture out (if I wanted it to be great I'd just carry my primary phone). Not being able to do that definitely makes me question using this as a travel phone.

Likewise, the processor is a little sluggish at odd times. In general, performance is reasonably good and on par with the Moto Play line of phones, but sometimes it'll chug just a little - enough to miss a letter while typing or just stuttery enough to be annoying but not super disruptive. I expect this will get better after some updates or via a custom ROM - but it does make me wonder how the phone will handle Android 11 or 12. This phone's supposed to get updates for 3 years, remember?

So, if you're getting this phone you shouldn't expect that you'll be able to play modern phone games - but it can be nice to sit down and play something if it's the only device you've got.

The short answer is that any modern 3D game is going to have a really hard time, with some interesting exceptions. I've tried the following games and can report on how well they work:

  • Genshin Impact
    • Totally unplayable - the framerate is bad enough that gameplay is impossible. You can manage your inventory and collect daily things, but it won't be a fun experience.
  • Battlechasers: Night War
    • Mostly Unplayable - it stutters a lot when you try to move around the map.
  • Black Desert Mobile
    • Borderline - So this game is a weird one in that the performance is adequate, and you could play it if you really really wanted to... but the quality level is potato. It's going to be blurry. This is probably fine if you just want to hop on, sell some stuff, do dailies, etc - but I wouldn't want to be playing it for any length of time.
  • League of Legends: Wild Rift
    • Reasonably Playable - this surprised me, I managed to play a full match at a solid 30 fps without any noticeable slowdown once I was in the game. Load times are noticeably slower though, so you might be the last person to fully load the game, but otherwise it works just fine.
  • Love Letter; One Deck Dungeon; Reigns
    • Plays great - these are basically digital card games, and they play really well on the device.
  • Out There Omega
    • Plays well - a little slow at points, but all in all it runs great
  • Game Dev Tycoon
    • Plays great - I'm pretty sure this game could run on a literal potato, but it does do just fine.
  • Doors: Awakening
    • Mostly Playable - This is a nice little puzzle game, and it generally works fine, but the load times are noticeably slow and sometimes interactions are too.
  • Polytopia
    • Extremely Playable - It's a nice little civ-like mini game. No issues, runs very well.

I ran a few benchmarks against it and unsurprisingly it didn't do so well:

  • 3DMark Sling Shot - 708
  • 3DMark Sling Shot Extreme - 378
  • Geekbench CPU - Single-Core 141, Multi-Core 825
  • Geekbench Compute - 0 (it crashed)

At the time of this writing, you cannot install apps to the SD card out of the box. You have to first enable developer settings and then turn on the option there that allows installing apps to external storage. Afterwards, it works just fine. It'd be nice if it were enabled in the stock image, and maybe it will be after their first upgrade.

Use a different camera app. It doesn't make the camera much better, but it does speed up how quickly the auto-focus takes effect. The forums recommend Open Camera, which seems to be working fine for me (even though I kinda hate the UI).

So bottom line, I got this as a secondary phone - not to be my daily driver and I do not think it would be a good daily driver for me. I'd probably survive but I wouldn't enjoy it, especially since I like playing some fairly intensive games on the go or from the couch.

That said, if you're the kind of person who is already considering something like a Moto G line phone or one of the budget Samsung phones (like an A52) - and you don't need much from the camera - this is a solid contender at a price point that's difficult to beat. Once you factor in the warranty, the accidental damage repairs, and the promise of a more eco-friendly phone, that might just push you over the top.

ROG Ally

I don’t have a handheld gaming problem, I swear.

Okay. Maybe I have a bit of a handheld gaming problem. See, when we last talked about this, one of the key things that I outlined as a requirement for my gaming handheld was the ability to connect an external GPU to the thing so that I could play it on a desktop with decent performance.

While I could eventually do that with the Ayaneo 2, the process was extremely fiddly and prone to breaking any time there was a software update. Likewise, you had to plug the eGPU in during a particular part of the boot sequence so that it would properly detect and use the eGPU. Related problem: if you allowed the system to go to sleep, it would not properly function when it awoke, requiring you to force restart the machine and do the little plug-it-in-ritual all over again. It was extremely frustrating.

So, doing the thing that I seem to always do, I went completely overboard and looked for a different solution. When Asus decided to put out their own handheld, complete with a mobile eGPU solution (which uses laptop graphics cards - we’ll get to that in a minute), I decided that was the way to go. I purchased the ROG Ally Extreme at launch (which, at the time, did not carry the “Extreme” moniker - that came later when Asus made a less-performant but cheaper model) along with one of their XG Mobile eGPUs (the 3080 version). The price was eye-watering, and for the eGPU - overpriced. What it promised, however, was a better eGPU experience, and I was willing to pay a premium if that worked. More on that later.

The Handheld Experience compared to my other handhelds


On the whole, the Ally is a great experience as a handheld. The actual comfort of the device while it is in use ranks #2 out of my entire collection, just behind the Steam Deck. It’s really comfortable to use for extended periods of time (in my hands, which are rather…large), and it even has back paddle buttons which you can remap to do what you want with them.

The screen is also a delight - it has a 120Hz refresh rate (which you’ll be hard-pressed to actually hit on anything but retro or indie games) and nice color quality for an LCD. Asus’s software also offers several visual tweaks you can apply to make the colors more vivid, should you desire that. It’s also a 1080p display, which is just much better than the Steam Deck, and just behind the 1200p of the Ayaneo 2. It’s not as vivid as the OLED on the Ayaneo Air, but it’s probably my favorite screen of the bunch.

Sidebar: The Ayaneos both use displays whose native resolution is vertical and then rely on software to rotate the image to horizontal - this causes an issue in some games, notably Phasmophobia, where the resolutions only appear in the vertical orientation.

Performance-wise, it’s definitely more capable than the Steam Deck for sure. There are a ton of reviews out there that actually crunch the numbers, but my “feels” test for undocked performance, in 15W and 30W power profiles, is that the Ally is the most capable device I possess. Second place goes to the Ayaneo 2, but its previous-gen Ryzen processor still struggles with some games.

The size is a bit wider than my Ayaneo 2, but smaller than the Steam Deck overall, so it’s really the perfect size for couch gaming - I still default to the Ayaneo Air for travel though, it’s really hard to beat its size profile. If you want to see a comparison, Rock Paper Shotgun has several comparison photos.

Overall: This is the device I pick up most often, both when playing on the couch and when playing docked.

ROG Ally docked to the XG Mobile

Remember earlier how I said Asus promised a better docked experience than the Ayaneo 2? Well, that only partially materialized. For one thing, when you plug in the XG Mobile, it recognizes that you’ve done so, and it prompts you to switch the graphics driver from the internal to the external - giving you a prompt to restart the programs in question. So far, so good.

That works about 80% of the time now (when I first got it, it was maybe 60% of the time, so hey, improvement!). When it doesn’t work, you either have to run the script several times to try to get it to stick - or you need to reboot the device and try it again. So while I’m happy that I don’t have to do the little cable dance ritual anymore, I still have to deal with a non-functional script sometimes.

Likewise, this combo also has the problem where if it doesn’t output to the screen for a while (either by falling asleep or you switching monitor inputs), it will refuse to display on that screen again until you suspend / resume it. That’s better than a full reboot by far, but it’s still annoying. For a laptop eGPU that cost nearly $1500 at the time of purchase, I feel like the experience should be better.

Maybe I’m being nitpicky here, but that’s a premium price and I don’t think it’s super unreasonable to expect a premium experience. To Asus’s credit though, it’s been getting steadily better with software updates, so maybe it’ll get to that coveted 99% (nothing ever works 100% of the time with computers).

Cracking it open and upgrading its innards


After a couple of months with the stock drive, I upgraded the SSD inside of the Ally from the 512 GB it came with to a 1 TB SSD, and the actual installation process was a breeze. The case was simple to get open, and the SSD was readily accessible without having to move anything out of the way. Asus has also produced a handy guide that covers the process. It’s a snap.

I chose to reinstall the OS instead of copying the old SSD over, and … that was a bit of a process. On the good side, Asus includes a utility in the BIOS to re-flash the SSD with everything you need, Windows included. On the bad side, when I did it, my version of the BIOS had a bug that messed up the system clock, so I had to search around the internet until I found a post detailing that fact. The fix was simple: Set the system time to an accurate time yourself, and then you can proceed. HOWEVER! Even with that fix, I had to attempt this install a grand total of 4 times before I got a functional Windows install. It did eventually work, but it was several hours’ worth of “try, wait, and try again”.

It took them a while, but Asus did finally admit that the ROG Ally has an appetite for eating delicious SD cards, metaphorically at least. Apparently, the card reader gets a little too hot during operation and that can cause it to malfunction and completely break the SD card inside.

While I’m happy to report my unit has not destroyed any SD cards yet, the write speed on it is atrocious. Takes absolutely forever to install anything but the smallest games. As a result, I barely use it for anything that’s not tiny. Which is a shame, since I have filled up the SSD a few times now and have to regularly prune games in order to keep relevant stuff installed. Either that or resort to game streaming. Glad I have a good internet connection.

Look, I really like the ROG Ally - it’s become my daily driver. I even wound up giving the Steam Deck to a friend, despite it having the vastly superior control scheme. Perhaps I’d be happier with the Steam Deck OLED’s upgraded display, but I don’t feel compelled to spend any more money when the Ally fits my needs for the most part. The fact that I dropped $2000 on it all told has also made me more fond of keeping it alive and relevant for as long as possible.

That said, the handheld gaming scene keeps evolving - who knows what we’ll see in the next couple of years.

Let's do some "Game Development"

Shutterstock vector art of some hands typing in front of a monitor showing an animation editor
Kit8.net @ Shutterstock #1498637465

The other day, a friend asked me a complicated question wrapped in what seems like a simple question: “How long would it take to make some simple mini-games that you could slap on a web page?”

My answer, as many seasoned developers will be familiar with (and of course it’s coming out of the architect’s mouth) was “It depends on the game and the developer. I could probably churn out something pretty good in a few days, but it’d take someone more junior longer”.

This set off the part of my brain that really wants to test out just how fast I could do a simple game in engines that I’m not familiar with. Thus, I decided to ignore my other responsibilities and do that instead! Mostly kidding about the responsibilities thing, I’m ahead of my writing goals for Nix Noctis, so I had a couple of hours to spare (and a free evening).

How long would it take to make a simple “Simon” game in GDevelop: a “no code” game engine? I wanted to start with GDevelop for a few reasons:

  1. I wanted to simulate the experience someone with little coding experience would encounter, while acknowledging that I’ve internalized some concepts that would make the experience easier (or, in some cases, harder) for me.
  2. You can run the editor in a web browser, and that’s bonkers.
  3. I want to “pretend” to be a beginner and only use basic, easily searchable features.
    1. I mean, I am a beginner at game dev, but I do understand several of the concepts.

With those things in mind, I set out to remake Simon. Here were my design constraints:

  • Four Arrows, controllable by clicking them or via the keyboard.
  • Configurable number of “beeps” in the sequence
    • The game would not increase by one each time (though it easily could)
  • Three Difficulty levels which control the speed that the sequence is shown
  • Bonus points for sound effects

REMEMBER! I’m a beginner at GDevelop. You’re likely going to see something and say “hey that’s dumb, you should have done it another way”. Yes. Exactly.

The editor experience in GDevelop is actually really nice, especially since you can get into it just by opening it in a browser. I found adding elements to the page very intuitive. Sprites are simple, and adding animation states to them is effortless. Creating the overall UI took me probably 20 to 30 minutes to iteratively build out a structure I was happy with - it was fast. Another fun thing I discovered was that they have JFXR built into the editor, and that was a delight.

Screenshot of GDevelop's UI Editor

What was not so quick was wiring up the game logic to the elements on the page. I’ve looked at some GDevelop tutorials before, and if you’re treading a path that’s covered by one of their “Extensions”, you’re going to have a great time. A 2D platformer will be a breeze because you can simply attach a behavior to the sprites in your game and go. There are a bunch of tutorials on making a parallax background for really cool looking level design. Simple!

What is not so simple is if you fall outside those behaviors and need to start interacting with the “no code” editor. On one hand, the no code editor is nice! The events and conditionals are intuitive if you’re approaching it in certain ways. They even let you add JS directly if you know what you’re doing (though they recommend against it). On the other hand, I can see this getting quickly messy. In my limited experience with the engine, I could not find a good way to reuse code blocks. This will come up later.

Sidebar, dear reader, I believe this is where they would say “you should make a custom behavior to control this”. I’m not sure a beginner would think to do this, but I thought about it and said, “I’ll just duplicate the blocks”.

Screenshot of GDevelop's Code editor

As I worked through this process, I ran into a number of weird stumbling blocks that slowed down my progress while I tried different things.

Things that were surprisingly straightforward


GDevelop has many layers of variables - Global, Scene, Instance, and so forth. They’re easy to understand and fairly easy to access and edit.

Were I making this game in pure JS, generating the sequence would be a pretty “simple” one-liner (It’s not that simple, but hey, spread operator + map is fun! I’d expand this in a real program to be easier to understand):

// Fill a new array of size number_of_beeps with a digit between 0 and 3 to represent arrow directions
let sequence = [...Array(number_of_beeps)].map(() => Math.floor(Math.random() * 4));
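
For what it’s worth, here’s roughly how I’d expand that one-liner for readability (just a sketch - the function and variable names are my own, not anything GDevelop gives you):

// Same idea as the one-liner above, spelled out step by step.
function generateSequence(numberOfBeeps) {
  const sequence = [];
  for (let i = 0; i < numberOfBeeps; i++) {
    // 0-3 maps to the four arrow directions
    sequence.push(Math.floor(Math.random() * 4));
  }
  return sequence;
}

let beeps = generateSequence(5); // e.g. [2, 0, 3, 3, 1]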

Generating the sequence was remarkably easy once I figured out how to do loops in GDevelop; they’re hidden in a right-click menu (or an interface item), but the “Repeat x Times” condition is precisely what we needed.

Screenshot of the Repeat x Times condition

Likewise, doing the animation of the arrows was pretty direct. All you need to do is change the animation of the arrow, use a “wait” command, and then turn it back. Easy!

Screenshot of GDevelop's editor

Turns out it’s not actually that easy. The engine (as near as I can tell) is using timeouts under the hood which means they’re not fully blocking execution of other tasks in sibling blocks while this is happening. Which means…

Wait for x Seconds is weird / doesn’t work right


Okay, so when you’re playing Simon, the device will beep at you, light up for a non-zero number of seconds, and then dim. It should do that for each light. If you’ve never experienced this wonder of my childhood, watch this explanatory video from a random store:

Now that we know how that works, we want to emulate that in GDevelop. The first thing I tried was to simply put the “Wait” block at the end of a For...In loop. Yeah, remember what I said about timeouts? Those don’t block. The loop would just continue and totally ignore the wait. I think that’s a major pitfall for new devs - they’re not going to understand the nuance of how those wait commands function under the hood.
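
To make that concrete, here’s a rough JavaScript analogy of the pitfall (this is not GDevelop’s actual code, and lightUp / dimArrow are hypothetical helpers standing in for the animation actions):

// What the "wait at the end of the loop" approach effectively does:
// the wait is scheduled like a timeout, so the loop never pauses.
for (const arrow of sequence) {
  lightUp(arrow);
  setTimeout(() => dimArrow(arrow), 500); // schedules the dim for later...
  // ...but the loop marches on immediately, so every arrow lights up at once.
}

// What a new dev probably expects is something closer to this:
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function playSequence(sequence) {
  for (const arrow of sequence) {
    lightUp(arrow);
    await sleep(500); // actually pauses this function before moving on
    dimArrow(arrow);
    await sleep(250); // small gap so repeated arrows read as separate beeps
  }
}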

The second thing I tried is the “Repeat Every X Seconds” extension to do the same thing. I couldn’t get it to even fire, and I still don’t know why.

Anyhow, I settled on using a timer to do the dirty work. Here’s how our “Play the sequence loop” wound up looking at the end:

Screenshot of Play Sequence Loop
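
In plain JavaScript terms, the timer-driven version works out to something like this (again an analogy, reusing the same hypothetical helpers as above - GDevelop expresses this as event blocks checking a timer, not code):

function playSequenceWithTimer(sequence, stepMs) {
  let index = 0;
  const timer = setInterval(() => {
    if (index > 0) dimArrow(sequence[index - 1]); // dim the previous arrow
    if (index >= sequence.length) {
      clearInterval(timer); // sequence finished, stop ticking
      return;
    }
    lightUp(sequence[index]);
    index++;
  }, stepMs);
}

playSequenceWithTimer(sequence, 800); // stepMs maps to the difficulty setting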

Conditionals / Keyboard Input Combined with Mouse Clicks


There’s another thing I could not figure out. I wanted to have both mouse clicks and keyboard input control the “guess” portion of the code, so I sensibly (imo) attempted to combine those into a single conditional. That wound up being…very weird, and there’s still a bug related to it.

First off, there are AND and OR conditionals. It took me a bit to find them, but they do exist. So, with a single OR conditional and a nested AND conditional, I set out with this:

Screenshot of the Nested Conditionals

This mostly works. However, for whatever reason, if you use the keyboard to input your guess, the arrow animation does not play. I do not know why. It works if you click. I can prove the conditional triggers in either case. It’s just that the animation does. Not. Work. Maybe one day I’ll figure it out, but I chose not to.
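
For what it’s worth, here’s roughly the logic that conditional expresses, written as plain JavaScript (every helper here is a hypothetical stand-in for a GDevelop condition or action, not real engine API):

function handleGuess(arrow) {
  const clickedThisArrow = mouseButtonReleased() && cursorIsOn(arrow); // the nested AND
  const pressedMatchingKey = keyReleasedFor(arrow);                    // e.g. the matching arrow key
  if (clickedThisArrow || pressedMatchingKey) {                        // the outer OR
    playPressAnimation(arrow); // the step that silently refuses to run on keyboard input
    recordGuess(arrow);
  }
}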

Struggling past some of those hurdles, it took me about 5 hours to meet my original design goals in GDevelop. Not terrible for not knowing anything about GDevelop besides that it exists.

Here’s a video of the final product:

Okay! We’re done here. Post over.

Nah, you all know me, I couldn’t stop there.

After I’d done the initial experiment, I was curious if I could work faster in a game engine that has real scripting support.

The short answer is “yes”. It took me about 2.5 hours to complete the same task in Godot (with some visual discrepancies). I think the primary speed gain came from the fact that the actual game logic was much more intuitive to me, and the ability to wire up “signals” from one element to the main script made some of the tasks I was fighting in GDevelop much faster.

Godot also has an await keyword which blocks execution like you’d expect it to, which is outstanding.

I did run into one major issue that I had to do a fair amount of research to solve:

AnimatedSprite2D “click” tracking is surprisingly difficult


The only issue I had was that when I needed to determine if the user had clicked on an arrow, I had to jump through some interesting hoops to detect if the mouse was over the arrow’s bounding box.

While regular sprites have a helper function get_rect() which allows you to figure out their Rect2D dimensions, AnimatedSprite2D very much does not (you have to first dig into the animations property, then grab the frame it’s currently on, and then get its coordinates and make your own Rect2D. Gosh, I’d have loved a helper function there).

I think the expectation is you’d have a KinematicBody2D wrapping the element, but as the arrows are essentially UI, that didn’t make any sense to me. I’ll need to dig a bit further into how Godot expects you to build a UI to do all of that, but hey, I got it working relatively quickly.

Changing the text of everything in the scene was really bizarre due to how it’s abstracted via a “Theme” object that you attach to all the UI elements? Still haven’t quite figured that out. It was really easy in GDevelop. Not so much in Godot.

Yeah, so, I liked working in Godot more because it was easier to make the behaviors work, and I was getting exhausted at the clunkiness of the visual editor. Here’s the final product:

For me, working in both of these engines for fun was a positive experience and I can see myself using GDevelop for some quick prototyping, but personally, I like Godot’s approach to the actual scripting portions of the engine. Because I have a lot of software development experience, it’s much easier for me to just write a few lines of code than to navigate the quirks of a visual interface.

I think GDevelop is perfectly serviceable, though. It looks like everything in the engine does have a JS equivalent, so you really could just write JS if you wanted to. If they exposed that more cleanly, I think it’d be pretty great for many 2D needs.

But I’m not a game dev, this is just me tinkering around and giving some impressions. Go try them out for yourself, they’re both easy to get started with!

Own Your Content

Shutterstock vector art of some computer magic
Andrey Suslov @ Shutterstock #1199480788

Not too long ago Anil Dash wrote a piece for Rolling Stone titled “The Internet Is About To Get Weird Again” and it’s been living rent free in my mind ever since I read it. As the weeks drag on, more and more content is being slurped up by the big tech companies in their ever-growing thirst for human created and curated content to feed generative AI systems. Automattic recently announced that they’re entering deals to pump content from Tumblr and Wordpress.com to OpenAI. Reddit, too, has entered into a deal with Google. If you want a startling list, go have a look at Vox’s article on the subject. Almost every single one of these deals is “opt-out” rather than “opt-in” because they are counting on people to not opt-out, and they know that the percentage of users who would opt-in without some sort of compensation is minimal.

Lest you think this is a rant about feeding the AI hype machine, it’s not (though you may get one of those soon enough). This is more of a lament over the last several decades of big social media companies first convincing us that they are the best way to reach and maintain an audience (by being the intermediary), then taking the content that countless creators have written for them, and then disconnecting those creators from their audience.

Every bit of content you’ve created on these platforms, whether it’s a comment or a blog post for your friends or audience, is being monetized without offering you anything in return (except the privilege of feeding the social media company, I guess). Even worse, getting your stuff back out of some of these platforms is becoming increasingly difficult. I’ve seen many of my communities move entirely to Discord, using their forums feature. However, unlike traditional website forums, you cannot get your forums back out of Discord. There’s no way to back up or restore that content.

I’ve personally witnessed a community lose all of its backup content due to a leaked token and an upset spammer. It was tragic and I still mourn (but hey, we’re still there).

In one way, this is the culmination of monetizing views. As Ed Zitron argues in Software has Eaten the Media, the trend at many of these social media companies has been “more views good, content doesn’t matter”. We’ve seen this show before: Google has been in a shadow war with SEO optimizers for over a decade, and they might have lost. The “pivot to video” Facebook pushed was a massive lie, and we collectively fell for it.

So what do we do about this? One thing I’m excited to see, which Mr. Dash rightly points out, is a renewed trend of being more directly connected to the folks consuming your content.

Own a blog! Link to it! Block AI bots from reading it if you’re so inclined. Use social media to link to it! Don’t write your screed directly on LinkedIn - Don’t give them that content. Own it. Do what you want to with it. Monetize it however you want, or not at all! Own it. Scott Hanselman said this well over a decade ago. Own it!

Recently, there was a Substack Exodus after they were caught gleefully profiting off of literal Nazis. Many folks decided to go to self-hosted Ghost instead of letting another company control the decision making. Molly White of Citation Needed (who does a lovely recap of Crypto nonsense) even wrote about how she did it. Wresting control away from centralized stacks and back to the web of the 90s is definitely my jam.

Speaking of Decentralization, we’ve also got Mastodon and Bluesky that have federation protocols (Bluesky just opened up AT to beta, which is pretty cool) allowing you to run your own single-user account instances but still interact with an audience (which is what I do).

Right, anyhow, this rant is brought to you by the hope that we’re standing on the edge of reclaiming some of what the weird web lost to social media companies of yore.

Edit: 03-10-2024: Turns out there's a name for this concept! POSSE: https://indieweb.org/POSSE. My internet pal Zach has a fun 1-minute intro on the concept: https://www.youtube.com/watch?v=X3SrZuH00GQ&t=835s. Go watch it!

Writing Tablet Showdown

It even does custom screen savers

I’ve searched long and hard for the perfect writing tablet for me, and I’ve found it in the Supernote. It’s the perfect blend of “write-on-able”, readability, and features. It often shows up in blog posts over on Cthonic Studios because it also makes it super easy to put together character sheets, do some quick doodles, and more.

But, I want to spend a bit of time talking about the other things I’ve tried and wound up walking away from (I’ve provided Amazon Affiliate links for convenience if you happen to want one of these alternatives, meaning I’ll get a small commission should you choose to buy something).

It’s worth noting that I am primarily approaching this from an “is the writing experience good?” angle first, followed by the reading experience, followed by the extra features - so those are the criteria I’m judging by.

Other bit: I purchased all of these, no one is paying me to write about them.

All of these options support Templates and Layers for their notebooks, to varying degrees — these let you set custom backgrounds on your documents, so you can have more than just the basic lines / dots / etc. Where they fall short on features, I’ll call it out.

Most of the devices also have a “paper-like” feel to them, except for the Kobo Sage.

Basically, any eInk device is going to have issues with image-heavy PDFs; I tried them on all of these devices and haven’t really had a great experience on any of them.

Kindle Scribe

My first foray into having a dedicated tablet for writing actually started as an attempt to get my wife something that was useful for marking up PDFs. At the time, her job involved a lot of scientific paper review, and she needed something that would be easy on her eyes but transfer the markup. The Scribe was on a very deep discount at the time, and it promised document markup, so we decided to give it a shot.

Reader’s sidebar: this became a moot point as organizational policy prevented her from actually using the device, but hey, we don’t create eWaste in this household willy-nilly.

Dear reader, the PDF markup capabilities of the Kindle Scribe leave a lot to be desired. You can only annotate PDFs that you have sent to your scribe via their Email conversion service, not via side-loading them by USB cable. I don’t really want to hassle with sending things through Amazon’s servers just to enable annotation, so this was a hard stop for the both of us. You see, I also like to annotate PDFs to review stuff I’m working on, for unpublished games / adventures and I don’t want that going through Amazon’s servers either.

The annotation problem also extends to Kindle books, only certain Kindle books in a particular format allow for direct annotation. Many of them default to the old kindle style annotation where you highlight a passage, and then you can make a text/handwriting annotation on that text. That’s way more annoying than just writing on the text. There is a growing body of Kindle books that are directly annotatable (along with several puzzle books you can buy), but the selection is still fairly small at the time I’m posting this.

The writing experience itself is actually fantastic, the pen feels nice on the device, and it feels pretty close to writing on paper. The input delay is minimal, I don’t really notice it. There are a pretty solid set of marker shapes and sizes to choose from, though switching between them takes a bit of time (this will be a common theme, throughout).

I appreciate the inputs on the pen. It features a side button which defaults to the highlighter and an “eraser” on the back, much like a real pencil. It feels pleasant in the hand, though if you’ve got wide hands like me, it’s going to feel a little small.

There’s not a dedicated template feature on the Kindle Scribe, you have to make a PDF with your template and then annotate the PDF directly. Annoying, but it works.

The one complaint I have about the pen is the need to replace the nibs periodically. The Supernote features a pen that has a permanent tip (the film on the device is self-healing so they can make the nib harder), but in practice I only changed the nib on the Kindle once.

Accessing your notebooks from other devices is a bit of a crapshoot. You used to be able to view them online via a specialized link (which as far as I can tell, they never advertised), but they removed that link.

You can access those notebooks via the Kindle mobile app in read-only mode. That’s about the only sync option you’ve got.

When exporting notes, you have two options:

  • Email a PDF
  • Convert to Text and Email the Text

I’ve tried both; the text recognition is “okay” but not great. Then again, my penmanship isn’t great either, so this will be another common theme throughout.

This is about what you’d expect from a Kindle in 2024: the screen is clear and crisp. There’s a backlight that can do cold and warm lighting with plenty of adjustable brightness. This is impressive because one of Supernote’s core claims about why they don’t have a backlight is that it would degrade the writing experience. Maybe a little, but for me, I can’t really tell (and I kind of wish the Supernote had a backlight).

The biggest obvious advantage here is that if you’re already in the Kindle ecosystem and have given Amazon piles of money for books, you can easily access your library on the device because at its core it’s just a Kindle.

You can also still side-load DRM free eBooks in various formats. Picture-heavy PDFs still have some performance problems, an issue I’ve seen with other eReaders.

The Scribe is probably my second favorite of the tablets I’ll be talking about today, and also generally, the easiest to acquire. For me, it’s more of a “reading tablet that happens to have writing capabilities” (See also the Kobo Sage)—and while its writing experience is workable, it’s not ideal.

Its current price is $419 USD for the 64 GB model with the premium pen included, though it’s frequently on sale. You can also pick up a refurbished model for as low as $309.

Purchase Link

Remarkable 2

The second tablet we tried, after finding out about the PDF limitations on the Scribe, was the Remarkable 2, which was a fair amount pricier, but hey, at least it supports PDF annotation via side-loading out of the box.

Like the Kindle, you can transfer files over USB. It does this via an embedded web server and a specialized website that’s served at an IP address over the USB interface. It’s a little weird; I expected it to mount as a drive like the Kindle does.

Remarkable has the Remarkable Cloud Service (which has a cost associated with it), allowing you to sync notebooks between your devices via their cloud. It also offers Google Drive, Dropbox, and OneDrive integrations, which let you sync your stuff to those clouds even without the subscription.

Finally, you can sync files via the Remarkable apps available on mobile and desktop over Wi-Fi.

The Remarkable 2 writing experience is quite nice as well; it feels like you’re writing on paper. I went back and forth between the Remarkable and the Scribe when we first got both of them (right around the same time, as you might recall) and it was difficult to declare a clear winner between the two when it comes to writing experience and accuracy.

The Remarkable suffers from the same problem as the Kindle, switching between pen types requires a few taps each time you want to do it. That slows me down a touch when I’m taking notes and want to call out stuff like headers. Like the Kindle, there are many pen types to choose from and switch between.

The pen is also fairly nice (we have the Marker Plus pen, but there is also a “basic” pen), but compared to the Kindle and Supernote Pens, I have two major complaints:

  1. The eraser side requires more pressure than I would like to use
  2. There’s no highlighter button (which is also a thing with the Supernote pen)

Likewise, the pen nibs are intended to be replaced on the Remarkable, and we tend to go through them quickly. There’s also a very fun quirk: if the pen tip gets too worn down, it will start to write when the pen is not in contact with the Remarkable. On the one hand, that’s a great visual indicator that you need to replace the nib. On the other, it’s pretty annoying if it starts happening when you’re nowhere near your replacement nibs.

Remarkable also has a Keyboard Folio which allows you to type notes directly, but I’ve never used it, so I cannot comment on the experience.

(I don't have one of these handy yet as I need to borrow it from the wife in order to upload it sooo... soon?)

Your basic options are the same as the Kindle Scribe. You can either export as a PDF or you can convert it to text first. The difference here is how you get it to other devices. You can:

  • Sync with Remarkable Cloud
  • Sync to Dropbox, Google Drive, or OneDrive
  • Copy via USB
  • Email

The reading experience is pretty good, but not as convenient as the Kindle.

There are no built-in services to connect to, so you have to transfer your eBooks manually via any of the mechanisms mentioned above. It can handle reading PDFs very well, but like other eInk devices, it struggles with image-heavy PDFs.

The ePub experience is pretty good, it provides a few different ways to navigate the content, either via a full page view or via the quick browse feature they recently added.

There is no backlight, so you will want to have a suitably lit environment if you intend to read eBooks on the Remarkable 2.

Compared to the Kindle, this is a tablet that was designed for writing first, and reading second. The writing experience overall is better, especially when it comes to getting your content from the device. I, personally, don’t want everything going through anyone else’s servers, and Remarkable makes that possible.

It also comes in at a fair amount pricier than the Kindle Scribe, at $549 for the bundle with the cover and Marker Plus pen.

Purchase Link

Kobo Sage

Who in this house needs better control over the lighting in here? It's me!

I wanted to like the Sage. I really did. A while ago, I stopped buying books through Kindle and instead started borrowing more books from my local library and when I couldn’t do that, getting them through Kobo. So, I have a fair amount of content in the Kobo Ecosystem now.

The Sage + Pen combo was one of my first forays into the “can I go paperless” experiment, well before the Remarkable 2 and Scribe experiments (before the Scribe existed, even). The major thing I liked about it is the size. It’s compact and easy to carry around. If I could read on it and scribble some notes, all the better.

The battery life isn’t nearly as good as I’d expect from an eReader, which is probably why Kobo sells a special case that charges the device, but it’s not terrible.

I still use it for reading, but after giving it several shots, I’ve not used it for writing again.

If I had to summarize this in a single word: Bad.

The first problem with the writing experience is that the surface of the device is not grippy like the other devices in this list. The pen does not counter this in the slightest, so when you write the nib slides around a lot more than I would like. It’s very similar to the iPad pencil tips in that regard. Some people are okay with this, but I want my writing experience to feel more paper-like.

The second problem is that sometimes it doesn’t register pen strokes. Lines wind up broken, and drawings look weird. I think this also has something to do with how the Sage’s case functions with magnets. I’ve discovered that if you try to write when the Sage is in contact with metal, like my back porch’s table, it gets much worse. Dunno why that is, but it makes it basically unusable.

There is some noticeable lag between drawing a line and it appearing on the screen, and occasionally, it’ll just write before you tap. Yeah. Not great.

The stylus is “fine” (I have the original stylus, not the 2) but it rattles as you’re using it, so I didn’t really like using it. It has two buttons on the side, one for erasing, one for highlighting.

Notebooks come in two styles: Basic and Advanced. Advanced notebooks use Nebo under the hood, and it behaves basically exactly like that app. You can make nice diagrams, move things around, and numerous other things. Nebo’s also available on Android and iOS if you want to give it a shot there.

Your export options on the Kobo are very similar to the Kindle’s, with the exception that you can do it over USB or via Dropbox.

Depending on the notebook, you can also export it as a .docx file, HTML, Text, or a few image files.

Compared to the writing experience, the reading experience is good! Like modern Kindles, there is a good backlight with an adjustable temperature for warm and cold light. The device is pretty snappy, and you can side-load other eBooks onto the device.

Furthermore, like the Kindle, there are integrations with OverDrive for loading library books and finding them directly from the device.

Like other eReaders, it struggles with image-heavy PDFs.

Much like the Kindle Scribe, this device was built for reading first, and writing second. Unlike the Kindle Scribe, that is a very distant second. For me, it’s borderline unusable.

The Sage is the cheapest option on this list at $270 without the pen. The stylus costs an additional $36.

Purchase Link

Supernote A5X and Nomad

Okay! I’ve saved my favorites for last. I like this device so much that I’ve actually got two of them. Not only that, but I picked up the A5X near the end of its lifecycle, and the Nomad as soon as it came out.

Of the devices we’re looking at today, this one has the most flexibility when it comes to getting content onto it. It’s running Android under the hood (though it’s way more locked down than something like an Onyx Boox), so you can also mount the filesystem via a program like OpenMTP to transfer files back and forth. I have had an issue on newer Macs where the “Allow USB connection” dialog disappears too quickly to let the Supernote connect, however. Still not sure what’s up with that. It only happens on the Nomad, not the A5X.

It also has a nice feature where you can turn on the ability to transfer files over Wi-Fi. Turning that on starts up a web server on the device, allowing you to drop files onto it. The downside is that this server is discoverable on the network, so I’d only do this at home or on other networks you control / trust. Don’t do it on public Wi-Fi unless you want someone doing shenanigans.

They also allow you to sync files back and forth via Dropbox, Google Drive, OneDrive, or Supernote Cloud as well as a companion Mobile app that works over Wi-Fi.

Like the Remarkable 2, it has the option to connect a keyboard; this time you can use any bluetooth keyboard you’ve got lying around. That’s a cool feature, but the refresh time makes the typing experience suboptimal. If you can put up with a little input lag, it’s not terrible.

Small sidebar, the way the Nomad attaches to its case via magnets is A+, love it.

Unsurprisingly, for me, the Supernote is the best writing experience of the bunch. It feels the closest to paper for me, and it has some convenience functions that make it much quicker to do certain kinds of writing.

One of the things you might have noticed that I mention on basically all the prior devices: switching between pens is kinda slow and awkward. Supernote solves this problem by letting you set custom hotkeys on the sidebar so that you can rapidly switch between the kinds of pens you use.

Supernote Nomad showcasing quick switch

There are also a number of multi-tap and sidebar gestures for undo, redo, change the toolbar, and the like. It’s really solidly done.

The other thing I really appreciate is the ease of using custom Templates. You can use both images and PDFs and then easily set those as a background. Here’s an example of me using the Ironsworn Character Sheet PDF as a background:

Supernote Ironsworn Character Sheet

The only negative I’ve had with the Nomad (which I don’t think happens on the A5X) is that sometimes tapping the pen will not produce a dot, so I have to be careful to properly dot my i’s and j’s.

I have the premium Heart of Metal 2 Pens, which feature the ceramic nib that never needs to be replaced (which is awesome). It doesn’t feature any buttons, but the gesture controls make that almost unnecessary (but it would be nice). There is a secondary pen option which has a side button that activates either the lasso or eraser lasso tool.

The Supernote gives you the ability to export in PNG, PDF, TXT, and .docx formats, and it lets you sync via any mechanism you can use to get content on the device (so, USB, Supernote Cloud, Google Drive, Dropbox, OneDrive, over Wi-Fi).

Additionally, it has the ability to “Share via QR Code” which uploads the note to Supernote and generates a URL that is good for 24 hours, where you can download the note via a web browser.

Likewise, the reading experience is almost perfect. As I mentioned above, there is no backlight, which I sometimes miss. It supports all the usual suspects, format-wise, including PDFs. The processors are strong enough that RPG PDFs do tend to work okay, though they still struggle with the image-heavy ones (which is sadly most Pathfinder adventure paths).

Because this is an Android-based device, you also have some access to Android apps. The “Supernote Store” on the devices allows you to install the Kindle app, providing you with access to your entire Kindle Library. It’s possible to side load applications on it if you can access the debugger, but I’ve not personally done so. As of the latest updates for the Nomad, they’ve added a “side-loading” button which enables debugging. I’d love to add the Kobo app, but I’ve not attempted to do so yet.

Otherwise, there’s not much else to say — navigating around books and PDFs is quite pleasant, and it’s got a nice feature for PDF bookmarks, letting you skip around the bookmarks quickly rather than having to scroll through them.

This is my favorite eInk writing tablet I own, so much so that I bought it in two different sizes.

Believe it or not, the Nomad is not the most expensive item on this list (nor was the A5X!) by itself. The device clocks in at $299, but when you start adding accessories, it adds up. The Heart of Metal 2 Pen is $75 and the folio is $49, so a full package is $423 - just over the Scribe’s price (and the gap grows if you get a refurbished Scribe or grab one during a periodic Kindle sale).

Purchase Link (this one is not an affiliate link).

I’ve tried to do some writing on my iPad, and it works out reasonably well, but the default pen tips suffer from the same problem of sliding around and not feeling good to write with. I have tried the Pentips 2+, but they are egregiously expensive, and I’ve had some quality control issues with them. I’m still rocking one of those for doing drawing / creating the OSRaon Icon Set, but it’s not great for writing.

Boox makes a ton of eInk tablets that are running “Basically stock Android”, which is wild to me. I’ve been debating picking one up. They have color eInk variants too, in various sizes. I’m a little worried that the writing experience won’t be great, but the temptation is strong.

If I wind up doing so, I’ll post a follow-up review.

If you made it to the end here, I hope this was entertaining and informative if you’re in the market for one of these things. Til next time, friends.

How I’m approaching Generative AI

The Duality of AI
Lidiia Lohinova @ Shutterstock #2425460383

Also known as the “Plausible Sentence Generator” and “Art Approximator”

This post is only about Generative AI. There are plenty of other Machine Learning models, and some of them are really useful; we’re not talking about those today.

I feel like every single day I see some new startup or post about how Generative AI is the future of everything and how we’re right on the cusp of Artificial General Intelligence (AGI) and soon everything from writing to art to music to making pop tarts will be controlled by this amazing new technology.

In other words, this is the biggest tech hype cycle I’ve personally witnessed. Blockchain, NFTs, and the like come close (remember companies adding “blockchain” to their products just to get investment in the last bubble?) and maybe the dotcom bubble, but I think this “AI” cycle is even bigger than them all.

There are a lot of reasons for that, which I’m going to get into as part of this … probably very lengthy post about Generative AI in general, where I find it useful, where I don’t, where my personal ethics land on the various elements of GenAI (and I’ll be sure to treat LLMs and Diffusion models differently). So, by the end of this, if I’ve done my job right, you’re going to understand a bit more about why I think there’s a lot of hype and not a lot of substance here — and how we’re going to do a lot of damage in the meantime.

Never you worry friends, I’m going to link to a lot of sources for this one.

If you’ve been living deep in a cave with no access to the news you might not have heard about Generative AI. If you are one of those people and are reading this, I envy you - please take me with you. I’m going to go ahead and define AI for the purposes of this article because the industry has gone and overloaded the term “AI” once again.

I’m going to constrain myself to “Generative AI”, also known as “GenAI”, of two categories: Large Language Models (LLMs) and Diffusion Models (like Dall-E and Stable Diffusion). The way they work is a little bit different, but the way they are used is similar. You give them a “prompt” and they give you some output. For the former, this is text, and for the latter this is an image (or video, in the case of Sora). Sometimes we slap them together. Sometimes we slap them together 6 times.

Examples of LLMs: ChatGPT, Claude, Gemini (they might rename it again after this post goes live because Google gonna Google).

I’m going to take my best crack at summarizing how this works, but I’ll link to more in-depth resources at the end of the section. In its most basic terms, an LLM takes the prompt that you entered and then uses statistical analysis to predict the next “token” in the sequence. So, if you give it the sentence “Cats are excellent”, the LLM might have correlated “hunters” as the next token in the sequence with a 60% likelihood. The word “pets” might be at 20%. And so on. It’s essentially “autocomplete with a ton of data fed to it”.

Sidebar: a token is not necessarily a full word. It could be a “.”, it could be a syllable, it could be a suffix, and so on. But for the purposes of the example you can think of them as words.

What the LLM does that makes it “magical” and able to generate “novel” text is that sometimes it won’t pick the statistically most likely next token. It’ll pick a different one (based on sampling parameters like Temperature and Top-P), which then sends it down a different path (because the token chain is now different). This is what enables it to give you a haiku about your grandma. It’s also what makes it generate “alternative facts”, also known as “hallucinations”.
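
As a toy illustration (the numbers below are invented for the example, not from any real model, and real models work over logits and enormous vocabularies), weighted next-token sampling with a temperature knob looks roughly like this:

// Made-up probabilities for the next token after "Cats are excellent".
const nextTokenProbs = { hunters: 0.6, pets: 0.2, climbers: 0.1, liars: 0.1 };

// Temperature reshapes the distribution: low values make "hunters" dominate,
// high values flatten things out so the unlikely tokens get picked more often.
function sampleNextToken(probs, temperature = 1.0) {
  const weighted = Object.entries(probs).map(([token, p]) => [token, Math.pow(p, 1 / temperature)]);
  const total = weighted.reduce((sum, [, w]) => sum + w, 0);
  let roll = Math.random() * total;
  for (const [token, weight] of weighted) {
    roll -= weight;
    if (roll <= 0) return token;
  }
  return weighted[weighted.length - 1][0]; // guard against floating point drift
}

console.log(sampleNextToken(nextTokenProbs, 0.7)); // usually "hunters", but not always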

This is a feature.

You see, the LLM has no concept of what a “fact” is. It only “understands” statistical associations between the words that have been fed to it as part of its dataset. So, when it makes up court cases, or claims public figures have died when they’re very much still alive, this is what’s happening. OpenAI, Microsoft, and others are attempting to rein this in with various techniques (which I’ll cover later), but ultimately the “bullshit generation” is a core function of how an LLM works.

This is a problem if you want an LLM to be useful as a search engine, or in any domain that relies on factual information, because invariably it will make fictions up by design. Remember that, because it’s going to come up over and over again.

  1. Stephen Wolfram talks about how ChatGPT works.
  2. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜

I don’t understand diffusion models as well as I understand language models (much like I understand the craft of writing more than I do art), so this is going to be a little “fuzzier”.

Examples of Diffusion Models: Dall-E 3 (Bing Image Creator), Stable Diffusion, Midjourney

Basically, a Diffusion model is the answer to the question “what happens if you train a neural network on tagged images and then introduce progressively more random noise”. The process works (massively simplified) like this:

  1. The model is given an image labeled “cat”
  2. A bit of random noise (or static) is introduced into the image.
  3. Do Step 2 over and over again until the image is totally unrecognizable as a cat.
  4. Congrats! You now know how to turn a “Cat” into random noise.

But the question then becomes “can we reverse the process?”. Turns out, yes, you can. To go from a prompt of “Give me an image that looks like a cat” to an actual image, the diffusion model essentially runs the process in reverse (there’s a rough sketch of both loops after this list):

  1. We generate an image that is nothing but random noise.
  2. The model uses its training data to “remove” that random noise, just a bit
  3. Repeat step 2 over and over again
  4. Finally, you have an image that looks something akin to a cat
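
Here’s a very hand-wavy sketch of those two loops in code - “image” is just an array of pixel values, the step counts and noise amounts are arbitrary, and denoiseStep is a stand-in for the trained model doing the actual work:

// Forward process (training side): keep noising the "cat" until it's pure static.
function addNoise(image, amount) {
  return image.map((px) => px + (Math.random() - 0.5) * amount);
}

function destroyImage(image, steps) {
  let noisy = image;
  for (let i = 0; i < steps; i++) {
    noisy = addNoise(noisy, 0.1);
  }
  return noisy;
}

// Reverse process (generation side): start from random static and repeatedly ask
// the model to remove "a bit" of noise, guided by the prompt.
function generateImage(size, steps, denoiseStep) {
  let image = Array.from({ length: size }, () => Math.random()); // pure noise
  for (let i = 0; i < steps; i++) {
    image = denoiseStep(image, "cat"); // the learned part happens in here
  }
  return image;
}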

Now, on this other side, your model might not have generated a great cat. It doesn’t know what a cat is. So, it asks another model: “Hey, is this an acceptable cat?”. Said model will either say “nope, try again”, or it will respond with “heck yes! That’s a cat, do more like that”.

This is Reinforcement Learning - this is going to come up again later.

So, at its most “basic” representation, the things that are making “AI Art” are essentially random noise de-noiserators. Which, at a technical level, is super cool! Who would have thought you could give a model random noise garbage and get a semi-coherent image out the other end?

  1. Step by Step Visual Introduction to Diffusion Models by Kemal Erdem
  2. How Diffusion Models Work by Leo Isikdogan (video)

These things are energy efficient and cheap to run, right?

Permalink to “These things are energy efficient and cheap to run, right?”

I mean, it’s $20/mo for an OpenAI ChatGPT pro subscription, how expensive could it be?

My friends, this whole industry is propped up by a massive amount of speculative VC / Private Equity funding. OpenAI is nowhere near profitable. Their burn rate is enormous (partly due to server costs, but also because training foundational models is expensive). Sam Altman is seeking $7 trillion for AI chips. Moore’s law is Dead, so we can’t count on the cost of compute getting ever smaller.

Let’s also talk about the environmental impact of some of these larger models. Training them requires a lot of water. Using them requires far less (about as much as running a power-hungry GPU would), but the overall lifecycle of a large foundational GenAI model isn’t exactly sustainable in a world of impending climate crisis.

One thing that’s also interesting is there are a number of smaller, useable-ish models that can run on commodity hardware. I’m going to talk about those later.

I think part of what’s fueling the hype here is that only a few companies on the planet can currently field and develop these large foundational models, and no research institution can. If you can roll out “AI” to every person who uses a computer, your potential and addressable markets are enormous.

Because there are only a few players in the space, they’re essentially doing what Amazon is an expert at: Subsidize the product to levels that are unsustainable (that $20/mo, for example) and then jack up the price later once you’ve got a captive market that has no choice in the matter anymore.

Go have a watch of Jon Stewart’s interview with FTC Chair Lina Khan, it’s a good one and touches on this near the end.

We’re already seeing them capture a lot of the market here, too, because a ton of startups are building features which simply ask you, the audience, to provide an OpenAI API key. Or, they subsidize the cost of API access to OpenAI through other subscription fees. Ultimately, a very small number of players under the hood control access and cost… which is going to be very, very “fun” for a lot of businesses later.

I do think OpenAI is chasing AGI… for some definition of AGI, but I don’t think it’s likely they’re going to get there with LLMs. I think they think they’ll get there, but they’re now chasing profit. They’re incentivized to say they’ve got AGI even if they don’t.

Cool! So we modeled this on the human brain?

Permalink to “Cool! So we modeled this on the human brain?”

I’m getting pretty sick of hearing this one, because the concept of a computer Neural Network is pretty neat, but every time someone says “and this is how a human brain works” it drives me a little bit closer to throwing my laptop in a river.

It’s not. Artificial Neural Networks (ANNs) date back to the 1950s and 60s, and were modeled after a portion of how we thought our brains might work at the time. Since then, we’ve made advances with things like Convolutional Neural Networks (CNNs) starting in the 1980s, and most recently Transformers (which is what ChatGPT uses). None of these ANN architectures actually model what the human brain is doing. We don’t understand how the human brain works in the first place, and the entire field of neuroscience is constantly making discoveries.

Did Transformer architecture stumble upon how the human brain works? Unlikely, but, hey, who knows. Let’s throw trillions of dollars at the problem until we get sentient clippy.

Look, I could get into a lengthy discussion about whether free will exists or not but I’m gonna spare you that one.

Wikipedia covers this better than I could, so go have a read on the history of ANNs.

AI will not just keep getting better the more data we put in

Permalink to “AI will not just keep getting better the more data we put in”

A couple of things here: it’s really hard to measure how well a generative AI tool is doing on benchmarks. Pay attention to the various studies that have been released (peer reviewing OpenAI’s studies has been hard, it turns out). You’re not getting linear growth with more data. You’re not getting exponential growth (which I suspect is what the investors are wanting).

You’re getting small incremental improvements simply from adding more data. There are some things the AI companies are doing to improve performance for certain queries (human reinforcement, as well as some “safety” models and mechanisms), but the idea that you can just keep feeding a foundational model more data and it suddenly becomes much better is a logical fallacy, and there’s not a lot of evidence for it.

Oh, oh gods, no. It can be very biased. It was trained on content curated from the internet.

I cannot describe it any better than this Bloomberg article - it's amazing, and it covers how image generators tend to racially code professions.

I'm skeptical that's even possible. Google tried and wound up making "racially diverse WWII German soldiers". AKA Nazis.

Right, so how did they train these things?

Permalink to “Right, so how did they train these things?”

The shortest answer is “a whole bunch of copyrighted content that a non-profit scraped from the internet”. The longer answer is “we don’t actually fully know because OpenAI will not disclose what’s in their datasets”.

One of the datasets, by the way, is Common Crawl - you can block its scraper if you desire. That dataset is available for anyone to download.

If you’re an artist who had a publicly accessible site, art on DeviantArt, or really anywhere else one of the bots can scrape, your art has probably been used to train one of these models. Now, they didn’t train the models on “the entire internet”; Common Crawl’s dataset is around 90 TB compressed, and most of that is… well, garbage. You don’t want that going into a model. Either way, it’s a lot of data.

If you were a company who wanted to get billions of dollars in investment by hyping up your machine learning model, you might say “this is just how a human learns to do art! They look at art, and they use that as inspiration! Exactly the same.”

I don’t buy that. An algorithm isn’t learning, it’s taking pieces of its training set and reproducing it like a facsimile. It’s not making anything new.

I struggle with this a bit too. One of my favorite art series is Marcel Duchamp’s Readymades - because it makes you question “what is art, really?”. Does putting a urinal on its side make it art? For me, yes, because Duchamp making you question the art is the art. Is “Hey Midjourney give me Batman if he were a rodeo clown” art? Nah.

Thus, OpenAI is willing to go to court to make a fair use argument in order to continue to concentrate the research dollars in their pockets and they’re willing to spend the lobbying dollars to ask forgiveness rather than waiting to ask permission. There’s a decent chance they’ll succeed. They’ll have profited off of all of our labor, but are they contributing back in a meaningful way?

Let’s explore.

Part 2, or “how useful are these things actually”?

Permalink to “Part 2, or “how useful are these things actually”?”

Recycling truck
Paul Vasarhelyi @ Shutterstock #78378802

Let’s start with LLMs, which the AI companies claim to be a replacement for writing of all sorts or (in the case of Microsoft) the cusp of a brilliant Artificial General Intelligence which will solve climate change (yeahhhhh no).

Remember above how LLMs take statistically likely tokens and start spitting them out in an attempt to “complete” what you’ve put into the prompt? How are the AI companies suggesting we use this best?

Well, the top things I see being pushed boil down to:

  1. Replace your Developers with the AI that can do the grunt work for you
  2. Generate a bunch of text from some data, like a sales report or other thing you need “summarized”
  3. Replace search engines, because they all kind of suck now.
  4. Writing assistant of all kinds (or, if you’re an aspiring grifter, Book generation machine)
  5. Make API calls by giving the LLM the ability to execute code.
  6. Chatbots! Clippy has Risen again!

There are countless others that rely on the Illusion that LLMs can think, but we’re going to stay away from those. We’re talking about what I think is useful here.

The Elephant in the Software Community: Do you need developers?

Permalink to “The Elephant in the Software Community: Do you need developers?”

Okay, there are so many ways I can refute this claim it’s hard to pick the best one. First off, “prompt engineering” has emerged as a prime job, and it’s really just typing various things into the LLM to try and get the best results (again, manipulating the statistics engine into giving you output you want. Non-deterministic output). That is essentially a development job, you’re using natural language to try to get the machine to do what you want. Because it has a propensity to not do that, though, it’s not the same as a programming language where it does exactly what you tell it to, every time. Devs write bugs, to be sure, but what the code says is what you’re going to get out the other end. With a carefully crafted prompt you will probably get what you want out the other end, but not always (this is a feature, remember?)

The folks who are financially motivated to sell you ever increasing complexity engines are incentivized to tell you that you can cut costs and just let the LLM do the “boring stuff” leaving your most high-value workers free to do more important work.

And you know what, because these LLMs were trained on a bunch of structured code, yeah, you probably can get it to semi-reliably produce working code. It’s pretty decent at that, turns out. You can get it to “explain” some code to you and it’ll do an okay (but often subtly wrong) job. You can feed it some code, tell it to make modifications, or write tests, and it’ll do it.

Even if it’s wrong, we’ve built up a lot of tooling over the years to catch mistakes. Paired with a solid IDE, you can find errors in the LLM’s code more readily than by just reading it yourself. Neat!

I actually tried this recently when revamping the GW2 Assistant app. I’ll be doing a post on my experiment doing this soonish, but in the meantime let me summarize my thoughts, which boil down to two points:

An experienced developer knows when the LLM has produced unsustainable or dangerous code, and if they’re on guard for that and critically examine the output they probably will be more efficient than they were before.

Inexperienced developers will not be able to do that due to unfamiliarity and will likely just let the code go if it “works”. If it doesn’t work, they’re liable to get stuck for far longer than pair programming with a human.

Devin, the AI Agent that claims to be the first AI software engineer, looks pretty impressive! Time for all software devs to take up pottery or something. I want you to keep an eye on those demos and what the human is typing into the prompt engine. One thing I noticed in the headline demo is that the engineer had to tell Devin 3 or 4 times (I kinda lost count) that it was using the wrong model and to “be sure to use the right model”. There were also several occasions where he had to nudge it using specialized knowledge that the average person is simply not going to have. Really, go check it out.

Okay, so, we’re safe for a little bit right?

Well… no. I’m going to link to an article by Baldur Bjarnason: The one about the web developer job market. It’s pretty depressing, but it also summarizes my feelings well. Regardless of the merits of these AI systems (and I have a sneaking suspicion that the bubble’s going to pop sooner rather than later due to the intensity of the hype), CTOs and CEOs that are focused on cutting costs are going to reduce headcount as a money-saving measure, especially in industries that view software as a Cost Center. Hell, if Jensen Huang says we don’t need to train developers, we can be assured that the career is dead.

I think this is a long-term tactical mistake for a few reasons:

  1. I think a lot of the hype is smoke-and-mirrors, and there’s no guarantee that it’s going to be orders-of-magnitude better.
  2. We’ll make our developer talent pool much smaller, and have little to no environment for Juniors to learn and grow, aside from using AI assistants to do work.
  3. Once the cost of using AI tools increases, we’ll be scrambling either to rehire devs at deflated salaries or to wrangle less power-hungry models into doing more development work.

Neat.

This LLM wish fulfillment strategy is essentially “I don’t have time to crunch this data myself, can I get the AI to do it for me and extract only the most important bits?”. The shortest answer is “maybe, to some degree of accuracy”. If you feed it a document, for example, and ask it to summarize, odds are decent that it’ll give you a relatively accurate summary (because feeding it the document increases the odds that it produces tokens that actually appear in it) that will also contain some factual errors. Sometimes there will be zero factual errors. Sometimes there will be many. Whether those matter depends entirely on the context.

Knowing the difference would require you to read the whole document and decide for yourself. But we’re here to save time and be more productive, remember? So you’re not going to do that, you’re going to trust that the LLM has accurately summarized the data in the text you’re giving it.

BTW, by itself an LLM can’t do math. OpenAI is trying to overcome this limitation by allowing it to run Python code or connect to Wolfram Alpha but there are still some interesting quirks.

So, you trust that info, and you take it to a board presentation. You’re showcasing your summarized data and it clearly shows that your Star Wars Action Figure sales have skyrocketed. Problem is you’re an oil and gas company and you do not sell Star Wars action figures. Next thing you know, you’re looking like an idiot in front of the board of directors and they’re asking for your resignation. Or, worse, the Judge is asking you to produce the case law your LLM fabricated, and now you’re being disbarred. Neat!

Remember, the making shit up is a feature, not a bug.

But wait! We can technology our way out of this problem! We’ll have the LLM search its dataset to fact check itself! Dear reader, this is Retrieval Augmented Generation (RAG). Based on nothing but my own observations, the most common technique I’ve seen for this is doing a search first for the prompt, injecting those results into the context window, and then having it cite its sources. That can increase the accuracy by nudging the statistics in the right direction by giving it more text. Problem is, it doesn’t always work. Sometimes it’ll still cite fake resources. You can pile more and more stuff on top (like doing another check to see if the text from the source appears in the summary) in an ever-increasing race to keep the LLM honest but ultimately:

LLMs have no connection to “truth” or “fact” - all the text they generate is functionally equivalent based on statistics

RAG and Semantic Search are related concepts - you might use a semantic search engine (which attempts to search on what the user meant, not necessarily what they asked) to retrieve the documents you inject into the system.
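In code form, that retrieve-then-inject pattern looks roughly like the sketch below. The keyword retriever and the `llm` callable are placeholders I’ve made up for illustration (a real system would use a semantic/vector search and an actual model API); the point is that “RAG” is string assembly around the same token predictor.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    title: str
    text: str

def keyword_search(docs, question, limit=3):
    """Stand-in retriever: rank docs by crude word overlap with the question."""
    q_words = set(question.lower().split())
    return sorted(docs, key=lambda d: -len(q_words & set(d.text.lower().split())))[:limit]

def answer_with_rag(question, docs, llm, top_k=3):
    # 1. Retrieve documents that look relevant to the prompt.
    retrieved = keyword_search(docs, question, limit=top_k)
    # 2. Inject them into the context window and ask for citations.
    context = "\n\n".join(f"[{i + 1}] {d.title}: {d.text}" for i, d in enumerate(retrieved))
    prompt = (
        "Answer the question using ONLY the sources below, citing them by number.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    # 3. Generate. The injected text nudges the statistics toward the sources;
    #    it does not make the output true, and fake citations can still appear.
    return llm(prompt)
```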

The other technique we really need to talk about briefly is Reinforcement Learning from Human Feedback (RLHF). This is “we have the algorithm produce a thing, have a human rate it, and then use that human feedback to retrain / refine the model”.
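Mechanically, the “use that human feedback” part usually means collecting pairwise ratings and fitting a reward model so the human-preferred answer scores higher; that reward model then steers further fine-tuning. Here’s a toy illustration of just the preference-scoring piece (my own sketch, not any lab’s actual pipeline):

```python
import math

# One human rating: a prompt, two model outputs, and which one the rater preferred.
preference = {
    "prompt": "Summarize this court filing",
    "chosen": "a summary that sticks to the filing",
    "rejected": "a summary citing a case that does not exist",
}

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise (Bradley-Terry style) loss: small when the reward model scores
    the human-preferred output higher than the rejected one."""
    return -math.log(1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected))))

print(preference_loss(2.0, -1.0))  # agrees with the rater: ~0.05
print(preference_loss(-1.0, 2.0))  # disagrees with the rater: ~3.05
```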

Two major problems with this:

  1. It only works on the topics you decide to do it on, namely “stuff that winds up in the news and we pinky swear to ‘fix’ it”.
  2. It’s done by an army of underpaid contractors.

You’d be surprised just how much of our AI infrastructure is actually Mechanical Turks. Take Amazon’s Just Walk Out, for example.

What we wind up doing is just making the toolchain ever more complicated trying to get spicy autocomplete to stop making up “facts”, and it might have just not been worth the effort in the first place.

But that’s harder to do these days because:

Google’s been fighting a losing battle against “SEO Optimized” garbage sites for well over a decade at this point. Trying to get relevant search results amidst the detritus and paid search results has gotten harder over time. So, some companies have thought “hey! Generative AI can help with this - just ask the bot (see point #6) your question and it’ll give you the information directly”.

Cool, well, this has a couple of direct impacts, even if it works. Remember those hallucinations? They tend to sneak in places where they’re hard to notice, and its corpus of data is really skewed towards English language results. So, still potentially disconnected from reality (but usually augmented via RAG), but how would you know? It’s replaced your search engine - so are you going to now take the extra time to go to the primary source? Nah.

Buuuut, because Generative AI can generate even more of this SEO garbage at a record pace (usually in an effort to get ad revenue), we’re going to see more and more of the garbage web showing up in search. What happens if we’re using RAG on the general internet? Well, it’s an Ouroboros of garbage, or, as some folks theorize, Model Collapse.

The other issue is that if people just take the results the chat bot gives them and do not visit those primary sources, ad revenue and traffic to the primary sources will go down. This disincentivizes those sources from writing more content. The Generative AI needs content to live. Maybe it’ll starve itself. I dunno.

But it’ll help me elevate my writing and be a really good author right?

Permalink to “But it’ll help me elevate my writing and be a really good author right?”

I’ve been too cynical this whole time. I’m going to give this one a “maybe”. If you’re using it to augment your own writing, having it rephrase certain passages, or call out to you where there are grammar mistakes, or any of that “kind” of idea, more power to you.

I don’t do any of that for two reasons, one is practical, the other highlights where I think there’s an ethical line:

  1. I’m not comfortable having a computer wholesale rewrite what I’ve done. I’d rather be shown places that can improve, see some other examples, and then rewrite it myself.
  2. There’s a pretty good chance that the content it regurgitates is copyrighted, and we’re still years out from knowing the legal precedent.

The AI industry has come up with a nice word for “model regurgitates the training data verbatim”. Where we might call it “plagiarism”, they call it “overfitting”.

Look, I don’t want to be a moral purist here, but my preferred workflow is to write the thing, do an editing pass myself, and then toss the whole thing into a grammar checker because my stupid brain freaking loves commas. Like, really, really, loves them. Comma.

I do this with a particular tool: Pro Writing Aid. It’s got a bunch of nice reports which will do things like “highlight every phrase I’ve repeated in this piece” so that I can see them and then decide what to do with them. Same deal with the grammar. I ignore its suggestions frequently because if I don’t, the piece will lose my “voice” - and you’ll be able to tell.

They, like everyone else, have started injecting Gen AI stuff into their product, but for me it’s been absolutely useless. The rephrase feature hits the same bad points I mentioned earlier. They’ve also got a “critique” function which always issues the same tired platitudes (gotta try it to understand it, folks).

This raises another interesting point about the people investing heavily in Generative AI. One of those companies is Microsoft. A company that makes a word processor. The parent of Clippy themselves. They could have integrated better grammar tools into their product. They could have invested more in “please show me all the places where I repeated the word ‘bagel’”. They didn’t do this.

That makes me think they didn’t see the business case in “writing assistants”, which is also why Clippy died a slow death.

Suddenly, though, they have a thing that can approximate human writing, and now there’s a case and a demand for “let this thing help you write”. I feel like they’re grasping at use cases here. We stumbled upon this thing, it’s definitely the “future”, but we don’t… quite… know… how.

I want to take a second here to talk about a lot of what I’m seeing in the business world’s potential use cases. “Use this to summarize meetings!” or “Use this to write a long email from short content” or “Here, help make a presentation”.

After all, one third of meetings are pointless, and could be an email! I want to also contend that many emails are pointless.

Essentially what you’re seeing is a “hey, busywork sucks, let’s automate the busywork”. Instead of doing that, why not just…not do the busywork? If you can’t be bothered to write the thing, does it actually have any value?

I’m not talking about documentation, which is often very important (and should be curated rather than generated), but all those little things that you didn’t really need to say.

If you’re going to type a bulleted list into an LLM to generate an email, and the person on the other end is just going to use an LLM to summarize it (lossily, I might add), why didn’t you just send the bulleted list?

You’re making more work for yourself. Just… don’t do that?

Let’s give it the ability to make API Calls

Permalink to “Let’s give it the ability to make API Calls”

Right, so one of the fun things OpenAI has done for some of their GPT-4 products is to give it the ability to make function calls, so that you can have it do things like:

  • Book a flight
  • Ask what the next Guild Wars 2 World Boss is
  • Call your coffee maker and make it start
  • Get the latest news
  • Tie your shoes (not really)

And so on. Anything you can make a function call out to, you can have the LLM do!

It does this by being fed a function signature, so it “knows” how to structure the function call, and then runs it through an interpreter to actually make the call (cause that seems safe).

Here’s the…minor problem. It can still hallucinate when it makes that API call. So, say you have a function that looks like this: buyMeAFlight(destination, maxBudget) and you say to the chatbot “Hey, buy me a flight to Rio under $200”. What the LLM might do is this: buyMeAFlight("Rio de Janeiro", 20000). Congrats, unless you have it confirm what you’re doing you just bought a flight that’s well over your budget.
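If you do wire an LLM up to real functions, the bare minimum is validating the generated arguments against what the user actually said before executing anything. A hypothetical sketch (the JSON shape, `handle_tool_call`, and the budget check are my own illustration, not OpenAI’s actual API):

```python
import json

def handle_tool_call(model_output: str, user_max_budget_usd: float) -> str:
    """model_output is assumed to be the LLM's tool call serialized as JSON, e.g.
    {"name": "buyMeAFlight",
     "arguments": {"destination": "Rio de Janeiro", "maxBudget": 20000}}"""
    call = json.loads(model_output)
    if call["name"] != "buyMeAFlight":
        raise ValueError(f"Unexpected function: {call['name']}")

    args = call["arguments"]
    budget = float(args["maxBudget"])
    # Never trust generated arguments more than the user's own words:
    # confirm anything expensive or irreversible before acting on it.
    if budget > user_max_budget_usd:
        return f"Refusing: model asked for ${budget:,.0f}, user said ${user_max_budget_usd:,.0f}"
    return f"Ready to book {args['destination']} for under ${budget:,.0f}, pending confirmation"

print(handle_tool_call(
    '{"name": "buyMeAFlight", "arguments": {"destination": "Rio de Janeiro", "maxBudget": 20000}}',
    user_max_budget_usd=200,
))
```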

Now, like all other Generative AI things, there are techniques you can use to increase the accuracy: making just the perfect prompt, having it repeat output back to you, asking “are you sure”, telling it that it’s a character on Star Trek. You know, normal stuff.

Alternatively you could just... use something deterministic, like, I don’t know, a web form or any of the existing chat agent software we already had.

Sidebar: Apparently OpenAI has introduced a “deterministic” mode in beta, where you provide a seed to the conversation to get it to reliably reproduce the same text every time. Are you convinced this is a random number generator yet?

And on that note, let’s talk about our obsession with chatbots.

So the “killer” application we’ve come up with, over and over, is “let’s type our question in natural language and it does a thing.” I honestly don’t understand this on a personal level - because I don’t really like talking to chatbots. I don’t want to say “Please book me a flight on Friday to New York” and then forget about it. I want to have control over when I’m going to fly.

Do large swaths of people want executive assistants to do important things like cross-country travel?

Not coincidentally, I really struggle with that kind of delegation and have never really made use of an executive assistant personally.

We’ve decided that the best interface for doing work is “ask the chatbot to do things for you” in the agent format. This is exactly the premise of the Rabbit R1 and the Humane Ai Pin. Why use your phone when you can shout into a thing strapped to you and it’ll do…whatever you ask. Perhaps it’ll shout trivia answers at you.

But guess what, my phone can already do that. Siri’s existed for years and like, I hardly use it. It’s not because it’s not useful. It’s because I can do what I want without shouting at it. In public. For some reason.

We do need to talk about accessibility. One of the things that AI agents would be legitimately useful for is for those folks who cannot access interfaces normally whether that’s situationally (driving a car), or temporarily / permanently (blind, disabled).

If we can use LLMs to get better accessibility tech that is reliable, I’m all for it. Problem is that the companies pushing the technology have a mixed track record on doing accessibility work, and I’m concerned that we’ve decided that LLMs being able to generate text means we can abdicate responsibility for doing actual accessibility work.

Like many other things in the space, we’ve decided that “AI” is magic, and will make things accessible without having to do the work. I mean, no. That’s not how it works.

Remember back to the beginning of this article where I talked about other Machine Learning Models? I think that’s the space where we’re going to make more accessibility advances, like the Atom Limb which uses a non-generative model to interpret individual muscle signals.

Still with me?

If I had to summarize my thoughts on all of the above it’s that we’ve stumbled upon something really cool - we’ve got an algorithm that can create convincing looking text.

The companies that have the resources to push this tech seem to be scrambling for the killer use-case. Many companies are clamoring for things that let them reduce labor costs. Those two things are going to result in bad outcomes for everyone.

I don’t think there’s a silver bullet use case here. There are better tools already for every use case I’ve seen put forward (with some minor exceptions), but we’re shoving LLMs into everything because that’s where the money is. We’re chasing a super-intelligent god that can “solve the climate crisis for us” by making the climate crisis worse in the meantime.

If you were holding NVDA stock, something something TO THE MOON. They’ve been making bank off of every bubble that needs GPUs to function.

This feels exactly like the Blockchain and Web3 bubbles. Lots of hype, not a lot of substance. We’re tying ourselves in knots to get it to not “hallucinate”, but like I’ve repeated over and over again in this piece the bullshit is a feature, not a bug. I recommend reading this piece by Cory Doctorow: What Kind of Bubble is AI? It’ll give you warm fuzzies. But it won’t.

Midjourney, Sora, all those things that can fake voices and make music. We’ve got a big category of things that are, charitably “art generators”, but more realistically “plagiarism engines”.

This section is going to be a lot shorter. Let me summarize my feelings:

  • If you’re using one of these things for personal reasons, making character art for your home D&D game, or other things that you’re not trying to profit from - go for it. I don’t care. I’d rather you not give these companies money but I don’t have moral authority here.
    • I’ve used it for this too! I’m not exempt from this statement.
  • If you’re using AI “art” in a commercial product, you don’t have an ethical defense here (but we’ll talk about business risk in a sec). The majority of these models were trained on copyrighted content without consent and the humans who put the work in are not compensated for it.

I personally don’t find the existing AI creations all that inspiring, other than how neat it is that we’ve gotten a neural network to approximate images in its training set. Some of the things it spits out are “cool” and “workable”, but I just don’t like it.

Hey, I do empathize with the diffusion models a bit though. Hands are hard.

As I mentioned earlier in the post, as far as we can tell, the art diffusion models were trained on publicly viewable, but still copyrighted content.

If for some reason you’re a business and you’re reading this post: that’s a lot of business risk you’d be shouldering. There are multiple different lawsuits happening right now, many of them on different grounds, and we don’t actually know how they’re going to go. Relatedly, AI Art is not copyrightable, so that’s… probably a problem for your business, especially if you’re making a book or other art-heavy product. The best you can do is treat it like stock art, where you don’t own the exclusive rights to the art, and hope you don’t get slapped with liability in the future.

So, if you’re using an AI Art model in your commercial work, these are all things you have to worry about.

This is where, and I cannot believe I am saying this, I think Adobe is playing it smart. They’ve trained Firefly on Art they’ve licensed from their Adobe Stock art platform and are (marginally) compensating artists for the privilege. They have also gone so far as to offer to guarantee legal assistance to enterprise customers. If you’re a risk averse business, that’s a pretty sweet deal (and less ethically concerning - though the artists are getting pennies).

The rest of them? You’re carrying that risk on your business.

But what if your business happens to be “crime”?

AI companies seem hell bent on both automating the act of creation (devaluing artistry and creativity in the process) and also making it startlingly easy for fraudsters to do their thing.

Lemme just… link some articles.

So, you create things that can A) Clone a person’s voice, B) Imitate their Likeness, and C) Make them say whatever you want.

WHAT THE HELL DID YOU THINK WAS GOING TO HAPPEN? WHAT PUBLIC GOOD DOES THAT SERVE?

I dunno about y’all, but I’m okay not practicing Digital Necromancy (just regular, artisanal necromancy).

The commercial businesses for these categories of generative AI are flat-out fraud engines. OF COURSE criminals are going to use this to defraud people and influence elections. You’ve made their lives so much easier.

Hey, I guess we can take solace in the fact that the fraud mills can do this with fewer employees now. Neat.

But Netflix Canceled My Favorite Show and I want to Revive it

Permalink to “But Netflix Canceled My Favorite Show and I want to Revive it”

This is also known as the “democratizing art” argument. First thing I would like to point out is that art is already democratized? It’s a skill. That you can learn. All you need to do is put in the time. It’s not a mystical talent that only a select few possess.

Artists are not gurus who live in the woods and produce art from nothing, and the rest of us are mere drones who are incapable of making art.

So in this case “democratization” really means “can make things without putting in the effort”. The result of that winds up being about as tepid as you might imagine.

Now, there’s a question in there: if a person has to work all the time simply to live, won’t this enable them to “make art”? There’s another way to fix that, by reducing the amount of work they need to do to simply exist, but nah, we’re gonna automate the fun parts.

Hey, awesome artists who are making things with AI tools but are using them as a process augmenter - all good. I’m not talking to you. I’m talking to the Willy Wonka Fraud Experience “entrepreneurs”.

But you know what, I’m not even really that concerned with people who want to make stuff on their own that is for their own enjoyment. I don’t think the result is going to be very good, and I’d rather have more people creating good stuff than fewer, but hey more power to ya.

Another aside: I really do not want to verbally talk to NPCs in games. I play single-player games to not talk to people. I don’t want to be subjected to that in the name of “more realistic background dialog”.

It’s just not going to work out like you think it will. What’ll actually happen is:

The primary thing I suspect is going to happen with the AI “art” is back to the cost-cutting efforts. Where you might have used stock art before, or a junior artist, you’re going to replace that with Dall-E.

For marketing efforts, that’s not an immediate impact. Marketing content is designed to be churned out quickly, and shotgunned into people’s feeds in an effort to get you to buy something or to feel some way, etc. I don’t think those campaigns are going to be as effective as the best campaigns ever, but eh, we’ll see I guess.

The most concerning uses are going to be the media companies that are going to replace assets in video games and movies. Fewer employees, lower budgets, and … dare I say … lower quality.

You see, diffusion models don’t let you tweak them, yet (although, who knows, maybe if we start doing deterministic “seeds” again we’ll get somewhere with how Sora functions). They also have a propensity to not give you what you asked for, so, yeah, let’s spend billions of dollars trying to fix that.

So, at risk of trying to predict the future (which I’m definitely bad at), I think we’re going to gut a swath of creatives, devalue their work, and then realize that “oh, no one wants this generative crap”. We’ll rehire the artists at lower rates and we’ll consolidate capital into the hands of a few people.

Meanwhile, we’ve eliminated the positions where people would traditionally learn skills, so you won’t be able to have a career.

Because…

We live in a society that requires money to function. Most of us sell our labor for the money we need to live.

The goal of companies is to replace as many of their workers as they can with these flawed AI tools. Especially in the environment we find ourselves in, where VC money needs to make a return on investment now that the Zero Interest Rate Policy (ZIRP) era is finished.

Now, “every time technology has displaced jobs we’ve made more jobs” is the common adage. And, generally, that’s true. However, the main fear here isn’t that we won’t be working, it’s that it’ll have a deflationary impact on wages and increase income inequality. CEO compensation compared to the average salary has increased by 1,460% since the 1970s, after all.

What I think is different from previous technological advances (but hey, the Luddites were facing similar social problems) is that we’re in a situation where capital is being concentrated in the hands of a few companies, and only a very few of them have the resources to control this new technology. I don’t think they’re altruists.

This is not a post-scarcity Star Trek future we’re living in. I wish we were. I’m sorry.

Right. Uh. I’m not sure I can, but here are some things I’d like to see:

  • There are a number of really small models that can run on commodity hardware, and with enough tuning you can get them to give you comparable results to what you’re getting on some of the larger models. Those don’t require an ocean of water to train or use, and run locally.
  • We’re going to see more AI chips, that’s inevitable, but the non-generative models are going to benefit from that too. There’s a lot of interesting work happening out there.
    • I’m also pretty cool with DLSS and RSR for upscaling video game graphics for lower-powered hardware. That’s great.

I honestly hope I’m wrong and that the fantastical claims about AI solving climate change are real… but the odds of that are really bad.

This is the longest post I think I’ve ever written, we’re well over 7,000 words. I have so many thoughts it’s hard to make them coherent.

Perhaps unsurprisingly, I’ve not used Generative AI (or AI of any kind) for this post. I’ve barely even edited it. You’re getting the entire stream of consciousness word vomit from me.

Call me a doomer if you want, but hey, if you do, I’ve got a Blockchain to sell you.