TrickJarrett.com

Posts Tagged: programming

Morning glowbug debugging

Some morning debugging, as I noticed the normal automated post from last night did not run. I quickly realized my introduction of related links for blog posts broke it. But, in fixing that, I discovered another bug, this time in running the script for a specific date rather than the day it is being executed (i.e., the code which lets me backfill a post for a given day).

Tracked down the error to a boneheaded variable usage mistake after a bit of puzzling out what was going wrong.

I do quite enjoy morning challenges for my brain; they feel like going to the gym for my hippocampus.

Tags: glowbug, programming

Testing Related Links

There are times when something is going on and I have multiple links I want to reference. Usually this is news, and I want to capture links to multiple news stories.

I've now added the ability to attach multiple links to an entry, beyond just embedding them in the body of the post. It lets me standardize their display, and potentially add more functionality in the future.

Related Links:
Tags: programming, glowbug

40 Gig file to check even/odd of a number

Simply marvelous. This is the sort of thing the Internet was created for.

Tags: programming, humor

Tags: markdown, programming

Small Glowbug Updates

I've been making some small updates to Glowbug (the engine under this blog.) Nothing major, but a few small things recently.

First, I made it so the system can automatically give image uploads a random name without me having to do anything before uploading. This also works for the system where I have it able to download images from the Internet for local hosting.

Second thing, which I did today, was that I modified the Markdown parsing to apply the caption text for images. Not only as the accessibility 'alt' text for images, but also so that the text will appear when images are moused over.
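As a rough sketch of both updates (Glowbug's own code isn't shown here, so this is Python for illustration only): the function names, the 16-character hex name, and the use of the HTML `title` attribute for the mouse-over text are all assumptions.

```python
import html
import re
import secrets
from pathlib import PurePosixPath


def random_image_name(original_name: str) -> str:
    """Return a random file name that keeps the original extension."""
    ext = PurePosixPath(original_name).suffix.lower()
    # 8 random bytes -> 16 hex characters; the length is an arbitrary choice
    return secrets.token_hex(8) + ext


# Matches Markdown images of the form ![caption](url)
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def render_images(markdown: str) -> str:
    """Turn Markdown images into <img> tags, using the caption as alt and title."""

    def to_img(match: re.Match) -> str:
        caption = html.escape(match.group(1), quote=True)
        url = html.escape(match.group(2), quote=True)
        # alt serves screen readers; title is what browsers show on hover
        return f'<img src="{url}" alt="{caption}" title="{caption}">'

    return IMAGE_PATTERN.sub(to_img, markdown)


print(random_image_name("Holiday Photo.JPG"))
print(render_images("![A cat on a fence](cat.png)"))
```

The same naming function would cover both the upload path and the download-for-local-hosting path, since it only needs the original file name.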

Tags: programming, glowbug, blog

"Pipe Dreams: The life and times of Yahoo Pipes"

I loved Yahoo Pipes and used it for a number of things back in the day. One way I remember using it was for filtering RSS feeds: I could use a feed as an input and have it filter out posts with specific keywords. As I recall, I did it to avoid spoilers for movies and shows.

It was very neat and robust, allowing for programming functionality without having to do a lot of the boring parts. Looking forward to diving into this write-up about it.
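The keyword filter described above is small enough to sketch by hand. This is not how Pipes worked internally, just a minimal Python illustration that assumes a plain RSS 2.0 feed and only checks each item's title and description; the function name is made up.

```python
import xml.etree.ElementTree as ET


def filter_feed(rss_xml: str, blocked_keywords: list[str]) -> str:
    """Return the feed with any item mentioning a blocked keyword removed."""
    root = ET.fromstring(rss_xml)
    blocked = [keyword.lower() for keyword in blocked_keywords]
    for channel in root.iter("channel"):
        # Copy the list first, because items are removed while looping
        for item in list(channel.findall("item")):
            text = " ".join(
                (item.findtext(tag) or "") for tag in ("title", "description")
            ).lower()
            if any(keyword in text for keyword in blocked):
                channel.remove(item)
    return ET.tostring(root, encoding="unicode")
```

Feeding the result back out as a feed is what made this handy: the feed reader never sees the filtered posts.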

Tags: technology, yahoo, programming

Miller Shuffle Algorithm

Everyone loves 'shuffle' on their music player. I never really thought about how it worked to ensure that it randomly shuffled through a playlist without repeats, but this guy has, and he has improved on it.

From the linked instructables page:

With the case of an MP3 player, or any play-list shuffle, one might and apparently some do (even on big name electronics and streaming services), simply use an operation like songIndex=random(NumOfSongs). This use of a PRNG will give you a good mathematical randomness to what is played. But you will get many premature repetitions of song selection. With a population of 1000 to choose from you will get ~37% repeats out of 1000 plays. From the start of a listening session on average a song will repeat within ~26 songs. The % of repeats is the same regardless of the selection population size.

The accepted goal of a “Shuffle” algorithm is herein defined as providing means to reorder a range of items in a random like manner, where each item is referenced once, and only once, when going through the range (# of items). So for 1-52 (think card deck) you get 52 #s in a pseudo random order, or similarly for a song list of 1000s. Re-shuffling gives a very different mix.

The Fisher-Yates (aka Knuth) algorithm has been a solution that fixes this unwanted repetition. The 1000 songs play in a 'random' order without repeating. The issue this algorithm does come with is the added burden of an array in RAM memory of 2 times the maximum number of songs (for up to 65,000 songs 128KB of RAM is needed) being dedicated to shuffled indexes for the duration that access to additional items from the shuffle are desired (so as to not give repeats).
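The ~37% figure in the excerpt can be sanity-checked with a little arithmetic. A sketch, assuming each play is an independent uniform pick and that the number of plays equals the number of songs:

```python
def expected_repeat_fraction(n: int) -> float:
    """Expected share of n uniform random picks (from n songs) that are repeats."""
    # A given song is missed by all n picks with probability (1 - 1/n)**n,
    # so the expected number of distinct songs heard is n * (1 - (1 - 1/n)**n).
    distinct = n * (1 - (1 - 1 / n) ** n)
    # Every pick that is not the first play of some song is a repeat.
    return (n - distinct) / n


for size in (52, 1000, 65000):
    print(size, round(expected_repeat_fraction(size), 3))
```

The fraction simplifies to (1 - 1/n)^n, which approaches 1/e (about 0.368) as n grows. That is why the percentage barely depends on the size of the playlist.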

After reading it and trying to read the code (written in C, I believe), I also found a link to a GitHub repo with more iterations on the algorithm. A few excerpts:

The way the algorithm works its magic is by utilizing multiple curated computations which are ‘symmetrical’, in that the range of values which go in are the same values which come out albeit in a different order. Conceptually each computation {e.g. si=(i+K) mod N } stirs or scatters about the values within its pot (aka: range 0 to N-1) in a different way such that the combined result is a well randomized shuffle of the values within the range. This is achieved without the processing of intermediate “candidates” which are redundant or out of range values (unlike with the use of a PRNG or LFSR) which would cause a geometrically increasing inefficiency, due to the overhead of retries.

So basically you can query the algorithm, providing the "deck" of things, and what your location in the query is, the seed info (for randomness), and then it can tell you what is next in the shuffle without actually having to move things around.

For example, let's say we had a deck with five cards in it: [a, b, c, d, e] - We just tell the algorithm three things:

  1. What our current index is in the deck
  2. A "shuffle ID" which is the seed for the randomizing
  3. How many entries are in the deck

So if we're just starting it, we'd say we're at position -1, which doesn't exist. We give it the random shuffle ID of "123" and lastly we tell it there are 5 entries. It then calculates that the next position to start playing is 4th in the queue. Then when it is time to play the next song the algorithm is fed "4", "123" and "5" to then return 2, etc.
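To make the "no list in memory" idea concrete, here is a toy stand-in. To be clear, this is not the Miller Shuffle: it is a single affine step of the `(i+K) mod N` kind the excerpt mentions, so the order it produces is far more regular than a real shuffle. It also assumes the index is a running play counter starting at 0, and the function name is made up.

```python
from math import gcd


def shuffled_index(i: int, shuffle_id: int, n: int) -> int:
    """Map play position i (0..n-1) to an item index, hitting each item once."""
    # Pick a multiplier sharing no factor with n, so the map is a permutation.
    a = shuffle_id % n or 1
    while gcd(a, n) != 1:
        a += 1
    b = (shuffle_id // n) % n  # offset, also derived from the shuffle ID
    return (a * i + b) % n


deck = ["a", "b", "c", "d", "e"]
order = [deck[shuffled_index(i, 123, len(deck))] for i in range(len(deck))]
print(order)
```

Nothing is stored between calls: the position, the shuffle ID, and the deck size are enough to recompute any step of the order.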

The more common way of randomizing a playlist is to take the indexes for each item in the deck and shuffle them. So it might create a separate list of the deck positions, [2,5,1,3,4], which you then have to keep in memory. For desktop computers, obviously that is not a problem. But what if you're making a tiny computer using a simple board like an Arduino and you have memory limitations? For that, this is a big step forward.
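For comparison, the list-based approach looks roughly like this. It is a standard Fisher-Yates shuffle in Python; the function name and the use of a seed are illustrative choices.

```python
import random


def shuffled_order(n: int, seed: int) -> list[int]:
    """Return the indexes 0..n-1 in a shuffled order (Fisher-Yates)."""
    rng = random.Random(seed)
    order = list(range(n))  # this list is the memory cost being discussed
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # pick from the not-yet-fixed part, inclusive
        order[i], order[j] = order[j], order[i]
    return order


print(shuffled_order(5, 123))
```

The whole `order` list has to stay around for as long as the playlist is playing, which is the cost the stateless approach avoids.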

Will it change the world? Doubtful. But still interesting to learn about.

Tags: algorithm, programming

Adding the Writing Log to End of Day Automated Entries

As you'll see in today's end of day post, the auto inclusion for tracking my writing log is done. At least for a first draft.

The output is very rudimentary: if I have a writing log entry for the day, it will insert a header for "Writing Log" and then a sentence giving context to what I did.

Example:

Trick wrote 996 words over 40 min. with an average writing speed of 24.9 words per minute.

It has some additional functionality. I can theoretically track work on multiple projects at once, so if I do that it'll add a note for how many projects I worked on. Additionally, it will track my daily streak total if I've written for at least 2 days in a row.
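A rough Python sketch of how a sentence like the example could be assembled, along with the streak count. None of this is Glowbug's code: the function names, the wording of the multi-project note, and the way the log is stored (a set of dates) are assumptions.

```python
from datetime import date, timedelta


def writing_summary(name: str, words: int, minutes: int, projects: int = 1) -> str:
    """Build the one-sentence summary for a day's writing log."""
    speed = words / minutes  # assumes minutes is greater than zero
    sentence = (
        f"{name} wrote {words} words over {minutes} min. "
        f"with an average writing speed of {speed:.1f} words per minute."
    )
    if projects > 1:
        # Hypothetical wording for the multi-project note
        sentence += f" This was spread across {projects} projects."
    return sentence


def current_streak(days_written: set[date], today: date) -> int:
    """Count consecutive days with a log entry, ending today."""
    streak = 0
    day = today
    while day in days_written:
        streak += 1
        day -= timedelta(days=1)
    return streak


print(writing_summary("Trick", 996, 40))
```

Per the rule above, the streak would only be mentioned in the post when `current_streak` returns 2 or more.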

I was thinking about doing a weekly summary every Sunday, but I haven't written that yet. I'll add it to the backlog.

Tags: glowbug, programming, writing

Writing Tracker Coding

Spent a bit of time last night and this morning coding for Glowbug (this blog).

For my larger novel writing efforts I've been using a Google sheet to track progress after each writing session. Overall, it works great and functionality-wise, there's no reason to change it.

Except... I decided I wanted to fold that tracking into my blog so that my writing updates can be integrated with the automated end of day posts.

This morning I finished implementing the basic tracking, after spending an infuriatingly long time tracking down a stupid typo which caused bugs. Next up is adding the graphs for tracking, adding more admin functionality, and then integrating it into the end-of-day posting. I feel like I should be able to knock most of that out tonight.

Tags: programming, glowbug

Python is coming to Excel

I don't know how much I'll actually use it, but it could prove very interesting.

Tags: data analysis, spreadsheets, programming

"The ancient technology keeping space missions alive"

Designed to fly in formation to investigate the interaction between charged particles from the Sun – the solar wind ­– and the magnetic bubble surrounding the Earth, known as the magnetosphere, Cluster II ranks as one of the most successful and long-lasting science missions ever flown. The satellites (named Rumba, Salsa, Samba and Tango, since you ask) have just celebrated 23 years in orbit.

I just love the name of the four satellites. But the article is a great read overall, highly recommend.

Tags: science, space, programming, technology

Discussion around XHTML and its impacts on HTML

A great walk down the journey of the evolution of HTML through XHTML to today's HTML 5. Specifically the author comes at it discussing how XHTML led people to self closing some html tags, such as:

<img src="[URL]" />

The /> at the end is the part in question and he explains why it isn't required anymore; something I didn't realize and will happily begin ditching.

Tags: html, programming

This is a test of making posts from my blog to both Bluesky and Mastodon.

Edit:

Success!

I intend to give myself control over which to post to, but right now it defaults to both. I'm not bringing the bot posts over to Bluesky because I feel strongly about being able to delete them automatically, but we're on our way.

The base PHP came from James Cridland's blog. It's rudimentary for just posting text for now, but we'll get there for the rest.
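For a sense of what a text-only cross-post involves, here is a Python sketch rather than the PHP mentioned above. It only builds the two HTTP requests and does not send them. The endpoints are the public Mastodon (`/api/v1/statuses`) and Bluesky (`com.atproto.repo.createRecord`) ones; the function names, instance name, and tokens are placeholders, and the Bluesky access token is assumed to come from an earlier `com.atproto.server.createSession` call.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlencode
from urllib.request import Request


def mastodon_request(instance: str, token: str, text: str) -> Request:
    """Build (but do not send) a request that posts a text status to Mastodon."""
    return Request(
        f"https://{instance}/api/v1/statuses",
        data=urlencode({"status": text}).encode(),
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


def bluesky_request(access_jwt: str, did: str, text: str) -> Request:
    """Build (but do not send) a request that creates a text post on Bluesky."""
    created = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
    body = {
        "repo": did,  # the account's DID, returned when the session is created
        "collection": "app.bsky.feed.post",
        "record": {"$type": "app.bsky.feed.post", "text": text, "createdAt": created},
    }
    return Request(
        "https://bsky.social/xrpc/com.atproto.repo.createRecord",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {access_jwt}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing either request to `urllib.request.urlopen` would send it. Choosing which network to post to then comes down to which of the two requests gets built.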

Tags: testing, programming, bluesky, mastodon

Testing Bluesky embeds

It's a lovely simple structure, if it works. I had a post for this before but pulled it down while triaging some of the site oddities this morning.

<iframe src="https://bsky.link/?url=[Post URL]"></iframe>
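Generating that tag from a post URL is about one line of work. A Python sketch, where the function name is made up and percent-encoding the post URL is an assumption (it is the safe default for a URL inside a query string, but bsky.link may not require it):

```python
import html
from urllib.parse import quote


def bluesky_embed(post_url: str) -> str:
    """Return an iframe tag that embeds the given Bluesky post via bsky.link."""
    src = "https://bsky.link/?url=" + quote(post_url, safe="")
    return f'<iframe src="{html.escape(src, quote=True)}"></iframe>'


# Hypothetical post URL, for illustration only
print(bluesky_embed("https://bsky.app/profile/example/post/abc"))
```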

Tags: programming, glowbug, social media, bluesky

Sometimes the correct answer is to not touch it anymore

Yesterday I did a surprising amount of coding on Glowbug. Most of it entirely invisible to you all. The quick overview:

  1. Bluesky embeds (we'll come back to this)
  2. Updating my css editor page in the admin to be able to edit any of the template files
  3. Updated my image management page in the admin to have pagination
  4. Simple CSS updates

Okay, so let's come back to Bluesky.

I managed to get invited to it yesterday, as noted when I shared my (current) account name. It's fine. It feels like basic Twitter. What I loved though was seeing how simple generating an embed URL can be. It's just an iframe to a path which includes the desired post.

Great.

Except that took me down a rabbit hole of trying to understand why my blog's Markdown generation continually failed to handle embedded html: I wanted to include an example iframe to show how clean the embed code is. Embedding html in posts is currently one of the things on my blog which is broken, and which I haven't really fixed because I never do it. Except for yesterday when I wanted to do it.

It's broken in a few ways:

  1. When I go back to edit that post, the embedded html is processed as html into the in-browser editor, which often leads to it breaking itself.
  2. The code tag delineations are no help here. Even wrapping the iframe in HTML's 'pre' or 'code' tags didn't stop it from rendering as html. And I have no idea why.

So, this means once I submit a post with embedded html, I can't touch it again except directly through the database.
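One common cause of the first symptom is writing the stored post body into the edit form without escaping it, so the browser parses the embedded html instead of showing it as text. Whether that is what happens in Glowbug is a guess; the sketch below (Python, with made-up names) only shows the usual fix.

```python
import html


def editor_textarea(stored_body: str) -> str:
    """Render a post body inside a textarea so embedded html stays literal text."""
    # Escaping turns < > & and quotes into entities; the browser shows them
    # as the original characters and submits the unescaped text back.
    return f'<textarea name="body">{html.escape(stored_body)}</textarea>'


print(editor_textarea('<iframe src="https://bsky.link/?url=[Post URL]"></iframe>'))
```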

I'll figure this out eventually, but I didn't last night.

And as I did all this coding, I apparently broke something. The automated end-of-day post didn't happen. The code that generates it wasn't even something I worked on yesterday, but because of how the blog is coded it's quite possible I accidentally messed it up.

So I tried to figure it out this morning. In doing so, I also noticed the newsletter generation wasn't working. So I worked on that some.

None of these, by the way, have satisfactory "I fixed it!" resolutions. The newsletter eventually sent, though I don't know why. We'll see if the end of day post runs tonight or not.

Sometimes, when you run and use your own code, if it suddenly works - the correct answer is to not touch it anymore.

Tags: javascript, php, glowbug, programming

Sometimes your blog breaks for no reason

I don't know why.

A few weeks ago, I rewrote how the blog generates what I call the "link tails" which is when a link is followed by the (domain.com) information. It was originally part of the publication of the blog and thus embedded in the html files directly. But then I decided I wanted to move it to be javascript.

Honestly... I'm not sure why. I had a reason. I'm sure of it.

In any case, it wasn't working today. I'm not sure what happened, but it appears to be an issue with how I was filtering the links. Not every link on the blog gets a tail. Only ones which go off the blog (so no trickjarrett.com tails) as well as no links in the sidebar.

And something stopped working with that such that no links were getting tails. I did the filtering a different way and it's all fixed.
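The filtering rule itself is simple to state in code. The blog does this in JavaScript and that code isn't shown, so this is only a Python sketch of the rule as described: the function name, the `in_sidebar` flag, and dropping a leading "www." are assumptions.

```python
from urllib.parse import urlparse


def link_tail(href: str, site_host: str = "trickjarrett.com", in_sidebar: bool = False):
    """Return the "(domain.com)" tail for an off-site link, or None for no tail."""
    if in_sidebar:
        return None
    host = (urlparse(href).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Relative links have no host; on-site links match the blog's own domain
    if not host or host == site_host or host.endswith("." + site_host):
        return None
    return f"({host})"


print(link_tail("https://www.example.com/some/post"))
```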

Also it looks like there was a reversion of the CSS file. I'm not sure why, but something got lost and I had to recreate a bit of CSS. My best guess is that I accidentally overwrote the site's CSS file with a local copy which was not the most recent, as I sometimes tweak the CSS file on the server. Oops.

In any case, it's fixed now.

Edit: Regarding Wikindle project

It finished running overnight. I spent a little while coding a cleanup function which scans downloaded files and removes ones I don't want. This is one of the features I mentioned wanting. So I've got that figured out. And it blacklists articles so I don't re-download them in the future.
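In outline, a cleanup pass with a blacklist could look like the sketch below. It is not the actual Wikindle code: it assumes one `.md` file per article, a plain-text blacklist with one title per line, and a caller-supplied `unwanted` test.

```python
from pathlib import Path


def clean_up(article_dir: Path, blacklist_file: Path, unwanted) -> list[str]:
    """Delete unwanted article files and record their titles in the blacklist."""
    removed = []
    for path in sorted(article_dir.glob("*.md")):
        if unwanted(path.stem, path.read_text(encoding="utf-8")):
            path.unlink()
            removed.append(path.stem)
    if removed:
        with blacklist_file.open("a", encoding="utf-8") as handle:
            handle.writelines(title + "\n" for title in removed)
    return removed


def is_blacklisted(title: str, blacklist_file: Path) -> bool:
    """Check a title before downloading, so removed articles stay removed."""
    if not blacklist_file.exists():
        return False
    return title in blacklist_file.read_text(encoding="utf-8").splitlines()
```

The downloader would call `is_blacklisted` before fetching each title, which is what keeps removed articles from coming back on the next run.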

I started working on having it generate cross linking between articles, but there were enough bugs that I stopped and decided to come back to it.

For now, I consider the project done and stable. There are improvements, but it's time for me to move back to working on the blog and on Behemoth.

Tags: glowbug, blog, programming, project

End of night update: The script is running, it's currently almost done with articles starting with O.

It's been running for probably 15 hours now.

We are up to 927 megs of text data.

I am estimating it will be in U/V when I wake up. We'll see.

Tags: programming, wikindle

New Book Review Function

The coding bug continues. I've been checking out bookmarks.reviews, a book review website from LitHub.

I'm not sure yet how it will translate for RSS and E-mail. As of now, the answer is poorly. But I'm going to work on it.

Here's what it currently looks like. I'm working on increasing the thumbnail size.

For those of you visiting in the browser, at least currently, you'll see the actual embedded code implementation:

Monsters by Claire Dederer
Tags: programming, glowbug

And so the weekend begins

After a good productive workday the wife and I headed to our local plant business and bought some new plants for the yard, both flowering and fruiting. From there we came home and did some gardening before turning to some other chores.

I gave the car a light cleanout and moved some stuff around, and am now taking a breather before starting dinner soon.

I've also been fiddling more with Wikindle. I solved the issue of needing to find new articles to download. First off, it now can take in a list of page categories and pull all articles in that category. The goal is not to recreate Wikipedia on my local machine, but I do want my corpus of articles to be large enough that it covers the "normal" things people look up. I also don't want bad articles, so I'm currently limiting all categories to be ones which are maintained for quality by Wikipedia.

As I write this, it's in the process of making the pull. We've ballooned from the 8000 this morning, to pulling almost 55,000.

Currently it is pulling from four categories to get that number (well, aside from the extra 100 it is pulling for being popular.)
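For reference, the MediaWiki API lists a category's pages through `list=categorymembers`. The sketch below only builds the request URL. The function name and the example category are illustrative (the post doesn't name the four categories), and whether the script uses this exact query is an assumption.

```python
from typing import Optional
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def category_members_url(category: str, continue_token: Optional[str] = None) -> str:
    """Build the API URL for one page of a category's article listing."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmnamespace": 0,  # main namespace only, i.e. articles
        "cmlimit": 500,
        "format": "json",
    }
    if continue_token:
        # Returned as continue.cmcontinue whenever more results remain
        params["cmcontinue"] = continue_token
    return f"{API_URL}?{urlencode(params)}"


print(category_members_url("Featured articles"))
```

Each response holds the titles under `query.categorymembers`; repeating the request with the returned `cmcontinue` value walks through a large category page by page.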

The download process still has work to be done. I'm still not getting images from articles and I know some things are not translating smoothly, especially in the math sections.

The next action items as I see them:

First, figure out images. I'm not sure where they are being filtered out of the text, and then I need to be able to pull them down and convert the tag to work with the modern day markdown encoding for it.

Second, I need to dig into other conversions from html to markdown and look for other articles or issues with import.

Third, I want to also identify categories of articles I don't want. For example, I'm not going to go to this document for information about state roads in New Jersey (which is currently in the corpus.) So I'll need to add document filtering and a blacklist of articles so it doesn't get re-added.

Fourth, re-add cross linking via markdown/wiki text for articles which exist in my Wikindle.

And lastly, once this is all figured out I will need to figure out the whole "putting it on the kindle" or some other similar long-lasting device. The real nerdy thing would be building my own e-ink device or something. We'll see.

Tags: life, programming

Update on last night's 'Wikindle' code

Code ran without issue, though the formatting was slightly off.

This morning I hammered out the code to grab the top 100 most popular entries from yesterday and add them to the archive. I'm not sure how useful that will actually be, a lot of those entries are pop culture (am I really going to need an entry about the new Kraven Marvel series?) But we'll see. It isn't like this is a major space hog.
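One way to get such a list is the Wikimedia Pageviews REST API, which serves a "top" ranking per day. Whether the script uses this endpoint is an assumption, and the function name is made up; the sketch only builds the URL for a given day.

```python
from datetime import date, timedelta


def top_pageviews_url(day: date, project: str = "en.wikipedia") -> str:
    """Build the Pageviews API URL listing the most viewed pages for one day."""
    return (
        "https://wikimedia.org/api/rest_v1/metrics/pageviews/top/"
        f"{project}/all-access/{day:%Y/%m/%d}"
    )


yesterday = date.today() - timedelta(days=1)
print(top_pageviews_url(yesterday))
```

The response ranks pages under `items[0].articles`, so taking the first 100 entries (after skipping non-article rows such as the main page) gives the day's top 100.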

Last night snagged roughly 8,000 entries and it took up 250 megs. Plenty of space.

One thing which is lost in this process is any cross linking. I'd love to go through and add that back in, or even better figure out how to best avoid that being stripped out from the start. We'll see. In any case, a fun diversion to distract me.

Tags: programming, project, wikipedia, python

Wikipedia on my Shelf Progress

At this moment my PC is downloading nearly 10,000 entries from Wikipedia as part of my idea of a locally hosted version of the encyclopedia. I'm making use of the Vital articles project, specifically level 4, which is roughly ten thousand entries on various topics.

I cobbled together a python script to pull from the API, parse the HTML to markdown, and strip any lingering tags (such as spans, abbrs, etc.)
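The last step, stripping leftover tags, can be sketched with a regular expression. This is an illustration rather than the actual script: the function name is made up, and a regex is only reasonable here because the tags are simple inline wrappers whose inner text should be kept.

```python
import re


def strip_leftover_tags(markdown: str, tags=("span", "abbr")) -> str:
    """Remove the named opening and closing tags but keep the text inside them."""
    names = "|".join(re.escape(tag) for tag in tags)
    pattern = re.compile(rf"</?(?:{names})\b[^>]*>", re.IGNORECASE)
    return pattern.sub("", markdown)


print(strip_leftover_tags('The <span class="nowrap">quick</span> <abbr title="x">fox</abbr>'))
```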

It isn't perfect, no images from the entries are brought over. I'll work on that further in a future iteration.

Not sure what to call this project. I called the folder "Wikindle" as a smerging of "wikipedia" and "kindle" but I don't love that name. I'll play around with it.

I also have an idea for this to be a "living" archive. Where perhaps it is a cron script which runs nightly for that day's top X entries, and snags them, accruing more and more notable content over time. Obviously quality will vary, but we'll see how it goes. Then, every few months, I update the Kindle.

Lastly, my observation as I work on code this evening. It remains comforting that ChatGPT struggles with some very basic coding concepts. I know lots of people worry LLMs will lead to the end of programmers but that simply isn't true as far as I can see. At least, not without more massive steps forward.

Tags: programming, project, wikipedia, python

Behemoth morning code

I woke up at 5 this morning and so I decided to finally start implementing stuff on Behemoth, rather than just getting it framed and basically styled.

So, this morning, I figured out login and registration in Laravel. It was overall pleasantly easy. Though I made my life harder as the default identifying field is 'name' with the framework. Because I changed it to be 'username' in the table it didn't work right away. I had to trace through code and update it which took a little longer. That said, I had it all sorted and working in roughly 40 minutes.

I suppose I could have just reverted the table's column name but that breaks the naming structure I have defined so I'd have to go through and update table details.

So, it's fixed and working. Now the real fun can begin.

Tags: programming, behemoth, php, laravel

Behemoth Coding Update

The biggest obstacle to me starting on Behemoth, my collection management software, has been the database. I've had the schema concept for over a year, but now that I have finally sat down to code it, I've begun to find issues with it. They are, thus far, all addressable and largely just oversights from the simplistic schema. One example is that I had not fleshed out how exactly I'd track the collection-specific entry details in such a way that would enable the flexibility of the system I envisioned.

I am convinced the idea is sound, it just is proving to require a bit more nuance than I had originally conceived.

As I write more, I am more and more excited to be learning Laravel. Its power has been immediately obvious. Compared with the web apps I've coded entirely on my own, its handling of pathing and models, as well as its implementation of database management for deployments, is just obviously powerful and robust.

Additionally, I'm learning Tailwind CSS. It's an interesting concept which styles elements entirely by assigning small utility classes to them, rather than adopting the semantic CSS concept where you define larger classes, etc. I will say I'm not yet convinced by it, the way I am with Laravel. But we'll see. Perhaps the simplicity of it will convince me.

Tags: programming, behemoth, laravel, php

I'm quite enjoying delving into Laravel for app development as I work on Behemoth. We'll see how this goes.

Tags: programming, php, laravel

Behemoth has begun

Last night, I finally truly began work on what I've come to call 'Behemoth.' It is the generalized collection management app that I've been thinking about for a few years now. Originally I was going to call it CeMeNT for "Collection ManagemeNt Tracker," but the more I thought about this project and how big it would be, the more I wanted to give it a more impressive sounding name.

The idea centers around a flexible abstract database design which would allow a single centralized system to manage collections of all different types. The two most notable for Katie and me are her PEZ and my Magic: The Gathering cards. (Disclosure: I work for the makers of Magic, Wizards of the Coast.) But we could also track our vinyls, books, boardgames, and metal lunch boxes in a system like this.

Elwood woke me up at 3:30 this morning and I sat down to re-diagram the database schema. I came up with the initial schema design back in December of 2021:

And at the time, when I first shared it, I framed it as a concept that I wouldn't write because I knew it would be a very large project. So I sat on it for the past 18 months; occasionally it would come back to mind and I'd think more about it. Over that time I'd refined the concept a bit mentally, but I had never sat back down to redo the schema.

Well, with this morning's wakeup, I found my brain iterating on the Behemoth project and thinking about the database.

So, I sat down and rebuilt the schema into a somewhat more robust version 2:

The biggest change from V1 to V2 is that I formalized a design where you have the 'library' which is the "master list" of possible things in a collection. This simplifies a few things in the overall management, though it introduces its own issues.

A lot of the "version" details are tracked in the library rather than the individual collection. This makes sense as you'll have new items which refer to the library. But, collectors love to collect random things. What if your collection has something which isn't part of the larger library, whether by obscurity, or maybe something like an artist signature - how is that tracked? I'm not sure yet. I'm letting my subconscious chew it over right now.
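The library/collection split can be pictured with two small record types. The real design is a MySQL schema that isn't reproduced here, so every name and field below is hypothetical. It only illustrates the idea that shared "version" details live on the library item while each collection entry points at one.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryItem:
    """One entry in the master list of things that can be collected."""
    id: int
    name: str
    version_details: dict = field(default_factory=dict)  # shared by every copy


@dataclass
class CollectionEntry:
    """One owned copy, pointing back at its library item."""
    owner: str
    library_item_id: int
    copy_details: dict = field(default_factory=dict)  # specific to this copy


# Made-up sample data
library = {1: LibraryItem(1, "Example card", {"set": "Example Set", "year": 1999})}
entry = CollectionEntry("Trick", 1, {"condition": "played"})

item = library[entry.library_item_id]
print(item.name, item.version_details, entry.copy_details)
```

The open question above shows up directly in this shape: an item with no library record has nothing for `library_item_id` to point at.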

Another thing is that collections come in two varieties. I have come to think of them as the vertical and the horizontal. Verticals are the ones which are "easy." They have a very clear backbone, whether it is "Magic: The Gathering cards" or "PEZ dispensers." Horizontal collections might be for an IP or fandom, where it is much more broad in varieties and details. Think of a "Star Trek" collection which might include DVDs, books, clothes, toys, etc. Currently Behemoth is well suited for vertical collections, and so I'm trying to figure out how best to handle horizontal collections.

To that end, I'm still trying to come up with different collections, and find corner cases which might be out there that this structure can't handle.

Even as I seek more unknowns, there are a few known things I need to add and account for. That said, I am quite happy with the concept V2 currently represents.

Two of the things I know I still need to account for:

  1. Commenting / discussion - Allowing additional conversations on the collection.
  2. Transaction tracking - Tracking when we buy or trade for things in the collection. Tracking the acquisition price, where we got it, etc.

While the database comes into focus and crystallizes, I'm also preparing to actually start coding.

I'm going to build this in PHP using the Laravel framework. I'm still deciding on the CSS, probably Tailwind. The database will be MySQL. I debated if I should use MongoDB to allow it to be more flexible, but the truth is I don't know MongoDB well enough to schema it properly. So I'm sticking with what I know, MySQL.

We'll see how this goes.

My intent is to open source the project and allow others to contribute to it. In theory. If it works and I'm not embarrassed at the quality of my code.

Tags: programming, project