bentobean 5 days ago [-]
> As we all know, the foundation of Western diatonic music theory is ¹²√2, the ratio between the frequencies of successive semitones.
Nods knowingly. Yes, of course. I definitely know this.
anamexis 21 hours ago [-]
In a nutshell:
An octave (for example from a C to the next C) is a doubling in frequency. In Western music, there are 12 notes per octave (C, C#, D, D#, E, F, F#, G, G#, A, A#, B). Notes are "evenly spaced" within the octave - every note has the same ratio between its frequency and the frequency of the next note. Twelve equal steps have to multiply out to 2, hence that ratio is ¹²√2.
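To make that concrete, here is a minimal Python sketch that builds one octave by repeatedly multiplying by ¹²√2. The starting frequency (middle C at roughly 261.63 Hz) and the function name are assumptions for illustration, not something from the article.

```python
# The twelfth root of 2: the frequency ratio between adjacent semitones in 12-TET.
SEMITONE = 2 ** (1 / 12)

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def equal_tempered_octave(start_hz):
    """Return 13 frequencies: the 12 notes from start_hz plus the octave above."""
    return [start_hz * SEMITONE ** n for n in range(13)]


# Assumption: middle C is taken to be about 261.63 Hz.
freqs = equal_tempered_octave(261.63)
for name, freq in zip(NOTE_NAMES + ["C"], freqs):
    print(f"{name:2s} {freq:8.2f} Hz")
```

The last line printed should be double the first, since twelve steps of ¹²√2 multiply out to 2.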
n4r9 11 hours ago [-]
I always knew this but used to wonder why 12?, as I'm sure a lot of people did. It turns out of course to be a human choice, but the convenience of that choice can be codified mathematically.
On a piano you move up an octave by going up 7 white keys (landing on the 8th, hence "octave"), or 12 semitones (white and black keys). Going up by 4 semitones is called a "major third", which in its pure form multiplies the frequency of the note by 5/4. If you stack three major thirds on a piano you get an octave. However, notice that (5/4) multiplied by itself three times is 125/64, which is actually slightly less than 2.
In fact there is no way to tune a piano perfectly - there has to be a compromise in the intervals somewhere. The reason for this is exactly that no rational number (fraction) equals 2 when raised to an integer power greater than 1.
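The arithmetic above can be checked exactly with Python's `fractions` module. This is only a sketch of the mismatch described here; the variable names are my own.

```python
from fractions import Fraction
import math

just_major_third = Fraction(5, 4)

# Three pure major thirds stacked on top of each other.
three_thirds = just_major_third ** 3   # Fraction(125, 64)

# How far short of a true octave (ratio 2) the stack falls.
gap = Fraction(2) / three_thirds       # Fraction(128, 125)

# The same gap expressed in cents (1200 cents per octave).
gap_cents = 1200 * math.log2(gap)

print(three_thirds, float(three_thirds))
print(gap, f"{gap_cents:.2f} cents")

# The equal-tempered third, by contrast, cubes to 2 (up to float rounding),
# at the cost of not being exactly 5/4.
tempered_third = 2 ** (4 / 12)
print(tempered_third, tempered_third ** 3)
```

The leftover factor of 128/125 is the compromise that has to be absorbed somewhere; equal temperament spreads it evenly across all three thirds.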
soulofmischief 4 hours ago [-]
This is referred to as equal temperament, but one can also use just intonation. Each approach comes with tradeoffs and Western music mostly decided that equal temperament was worth it because instruments can play in any key without retuning.
Just intonation also suffers from harmonic issues when building certain chords, but the tradeoff is that there isn't "beating", or resonant pulsing due to frequency mismatches, since in equal temperament, the notes are slightly detuned in order to fit into the scale, as you've mentioned. Another benefit of just intonation is that it's been observed to be the instinctive intonation used by humans.
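A rough way to see the "beating" numerically: in a pure major third, the 5th harmonic of the root lines up with the 4th harmonic of the third, and in equal temperament those two harmonics are slightly apart. The sketch below assumes a root of about 261.63 Hz (middle C) and only looks at that one pair of harmonics, so treat it as an illustration rather than a model of what a real instrument does.

```python
SEMITONE = 2 ** (1 / 12)


def third_beat_rate(root_hz, third_ratio):
    """Beat rate in Hz between the root's 5th harmonic and the third's 4th harmonic."""
    third_hz = root_hz * third_ratio
    return abs(4 * third_hz - 5 * root_hz)


root = 261.63  # assumption: middle C

just_beats = third_beat_rate(root, 5 / 4)               # pure 5/4 third
tempered_beats = third_beat_rate(root, SEMITONE ** 4)   # 4 equal-tempered semitones

print(f"just third:     {just_beats:.2f} beats per second")
print(f"tempered third: {tempered_beats:.2f} beats per second")
```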
This is the equal-temperament tuning, which is used a lot now because it's simple and consistent. Other tunings were common in the past, where they'd set notes to simple fractions of other notes like 3/2 of the note 5 below, idk the actual numbers they used. 3/2 isn't any power of the 12th root of 2 so nothing in this new system is actually as harmonic as it should be, but nothing is particularly dissonant either and the system is simple.
stavros 11 hours ago [-]
Oh, that's the twelfth root of 2? That does make sense when you explain it like that, thank you. That's 12-TET, then?
anamexis 5 hours ago [-]
Yes, exactly. 12-TET stands for 12 tone equal temperament, where "temperament" is a tuning system.
There are other tuning systems, which I intentionally avoided discussing because it starts involving LOTS more music theory very quickly. But to quickly describe it: pleasant sounding combinations of notes generally occur at simple ratios. If you look at a simple major chord, like C-E-G, the E would be at 5/4 the frequency of C, and the G would be at 3/2 the frequency of C. However if you tuned a piano like this, it would be specifically anchored to the root note of C, as that's what we're referencing those ratios from. (This would be "the key of C major.") It just so happens that in 12-TET tuning, we get ratios that are very close to these simple fractions, and since the tuning is "equal", it works for any key/root note.
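To put a number on "very close", the sketch below compares the two simple ratios mentioned here with their 12-TET counterparts, measured in cents (1200 cents per octave, so 100 cents per equal-tempered semitone). The dictionary and function names are made up for the example.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents = one octave)."""
    return 1200 * math.log2(ratio)


# (just ratio, number of 12-TET semitones) for the C-E-G major chord.
JUST_INTERVALS = {
    "major third (C to E)": (5 / 4, 4),
    "perfect fifth (C to G)": (3 / 2, 7),
}

deviations = {}
for name, (just_ratio, semitones) in JUST_INTERVALS.items():
    tempered_ratio = 2 ** (semitones / 12)
    # Positive means the 12-TET interval is wider (sharper) than the pure one.
    deviations[name] = cents(tempered_ratio) - cents(just_ratio)
    print(f"{name}: just {cents(just_ratio):.2f}, "
          f"12-TET {cents(tempered_ratio):.2f}, "
          f"difference {deviations[name]:+.2f} cents")
```

The fifth comes out within about two cents of 3/2, while the third is off by noticeably more, which is why thirds are usually where the compromise of equal temperament is most audible.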
> I ignore other temperaments; they are all close enough to 12-TET.
As any reasonable person would.
neon_diogenes 5 days ago [-]
I'm building some music playback software and am currently struggling with the implementation of a spectrum analyzer to visualize the music.
This is incredible stuff and I learned a lot. Well done sir.
Ps, also mourning the loss of Fable! It sorted out a 3 month bug hunt odyssey in 3 days. For a somewhat novel problem in a pretty niche area (DSD DoP audio crackle problems during certain playback edge cases).
matheusmoreira 20 hours ago [-]
Fable was quite relentless, it was fun watching it work. I described my lisp interpreter project's short term plans and long term roadmap, Fable thought for like 20 minutes then just told me it was all "inevitable" and started working on the stuff. Ever since then I started to picture Fable as some kind of Terminator.
Left me that code and a massive code review that unfortunately didn't contain any of the I/O and memory safety hardening I wanted. I haven't fully reviewed the code yet. I get a little sad when I read it. Not a US citizen so I'm not sure I'll ever get to use a state of the art model again.
Jblx2 18 hours ago [-]
>I get a little sad when I read it. Not a US citizen so I'm not sure I'll ever get to use a state of the art model again.
That's great news. I was going to cancel my subscription because of the export restrictions.
arcticbull 17 hours ago [-]
Also, note, everyone assumed it was going to be limited to citizens, but that's not what the government said. They put it under ITAR/EAR's US person definition which includes citizens, permanent residents, refugees and asylees. Basically only visa holders in the US were banned, and of course, people in foreign countries.
contrary to some people's intuitions, most people have never lived in or been to the united states.
arcticbull 32 minutes ago [-]
... did you think that's what I was implying? I'm not even American my guy. I was adding clarification because the commenter I responded to claimed they had to be a US citizen to ride.
matheusmoreira 14 hours ago [-]
I am one of the banned foreigners.
krackers 20 hours ago [-]
> thought for like 20 minutes then just told me it was all "inevitable"
I have in mind an image of ASI as something that's able to seamlessly work across time as if it was weaving cloth. Reasoning about not just first or second order effects, but able to richly play with the nature of causality itself. In the limit, it effects change far into the distant future simply by making only the most minute change in the present then sitting back and waiting for things to play out.
For an AI that can do this, things like "managing subagents" or "context compaction" become child's play. Perhaps we'll know if we're getting close by seeing how well models do at prediction markets.
actionfromafar 18 hours ago [-]
Predicting the future by making it.
mortenjorck 5 days ago [-]
I was not expecting the part where Fable produces a passable 3Blue1Brown-style explainer video of the algorithms it just implemented that sounds like it's narrated by a character from Dora the Explorer.
What a strange era we now live in.
kingstnap 21 hours ago [-]
Relevant YouTube video about content farming channels creating AI generated math explainers.
I actually had no idea Fable is able to generate videos from scratch like that. I guess it shouldn’t surprise me. But it never occurred to me.
nerdsniper 11 hours ago [-]
As long as the video can be created from code (TTS + animation libraries). It won't do diffusion for a 10-minute long video, obviously.
sigmarule 2 days ago [-]
You mean what a strange era an opaque set of administration-approved companies live in...
MisterKent 5 days ago [-]
That was my experience with Fable as well. It pulled off my extremely complex project, one that I could squint and see was possible, and actually put mathematical concreteness to things in a way I could only intuit.
On the flip side, visualizers have always fascinated me. I love this one, but one build off I've always wanted to see: analyze the entire file a priori, and then generate the visuals. Sort of like a normalization pass, but getting longer form structures decoded ahead of time could be pretty neat.
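A bare-bones sketch of the "analyze the whole file first" idea: one pass measures the loudness of every frame, then the levels are normalized against the loudest frame in the file, which a realtime visualizer cannot know in advance. Everything here is hypothetical (function names, frame size, and a synthetic signal standing in for a decoded audio file), and it only covers loudness, not longer-form musical structure.

```python
import math


def frame_rms(samples, frame_size):
    """First pass: RMS level of each consecutive frame of the whole signal."""
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]
    return [math.sqrt(sum(x * x for x in frame) / len(frame)) for frame in frames]


def normalized_levels(samples, frame_size=1024):
    """Second pass: scale every frame level by the loudest frame in the file."""
    levels = frame_rms(samples, frame_size)
    peak = max(levels, default=0.0)
    if peak == 0.0:
        return [0.0 for _ in levels]
    return [level / peak for level in levels]


# Synthetic stand-in for a decoded file: one quiet second, then one loud second.
SAMPLE_RATE = 8000
quiet = [0.1 * math.sin(2 * math.pi * 220 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
loud = [0.8 * math.sin(2 * math.pi * 220 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]

levels = normalized_levels(quiet + loud, frame_size=1000)
print([round(level, 2) for level in levels])
```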
carb 18 hours ago [-]
The video game Audiosurf[1] did this, as it has to generate the game track ahead of time so that it would be playable!
I've been planning exactly what you describe in that second paragraph for creating videos for the music I make. It's a lot easier than doing it realtime, and because I make the music, I'm planning on doing it multi-track so I can put individual stems in.
tkgally 18 hours ago [-]
My Fable example is not nearly as cool but still (to me) impressive.
Last year, I would occasionally test the latest models by vibe-coding in-browser music generators using only HTML, CSS, and JS. Here’s one made in July by Gemini:
It’s still a long way from creating music I would want to listen to, though.
nl 17 hours ago [-]
I love these kinds of things - we forget so quickly how much these have improved.
neynt 17 hours ago [-]
These are lovely. That the Fable one is one-shot is shocking.
tkgally 14 hours ago [-]
Just for reference, here is the metaprompt I first gave to Opus:
“I want to ask Claude Code to write a browser-based synthesizer for me. Please prepare a prompt for it that I can give to it for it to write the synthesizer. The synthesizer should automatically create interesting polyphonic music in which the various voices play off against each other in both harmony and contrast. The controls will affect the tone, rhythmic patterns, number of voices, complexity and randomness of the melodies, and other features. The controlled features should be original—not just standard synthesizer functions—and encourage creative explorations even by naive users. So write a prompt that I can give to Claude Code to create that synthesizer.”
I then gave the prompt produced by Opus to Fable in Claude Code.
stavros 11 hours ago [-]
You'd probably get better results giving the metaprompt to Fable. Information doesn't magically get created, running something through a dumb model to give to a smart model gets worse results than if you just give the smart model the prompt directly.
tkgally 10 hours ago [-]
That occurred to me, too, when I posted that metaprompt above. But that was only the second or third prompt I gave to Fable during the couple of days I had access to it, and I used the metaprompting strategy that had worked well with earlier models. If and when I get access to Fable again, I will try giving it a similar short prompt directly.
I should mention that Fable also did an impressive job on a couple of major project-redesign tasks I gave it. Those aren't things I can share here, though.
stavros 10 hours ago [-]
The general way I think about it is that it's OK (and usually encouraged) to have a smarter model prompt a dumber model, but never the other way around.
reverius42 10 hours ago [-]
Amazing how quickly things are moving that you can call Opus a "dumb model"!
stavros 10 hours ago [-]
That is actually true and amazing to think about. It was mostly a figure of speech, but also kind of true when compared to Fable?
reverius42 9 hours ago [-]
So, how many months from now before we start calling Fable a "dumb model"? I give it maybe 6?
stavros 9 hours ago [-]
Well, if Mythos gets rolled out generally, zero!
monk_grilla 18 hours ago [-]
That generated video was eye opening for me. I've been using Opus in Claude Code for studying and at work, but it never occurred to me to use 3b1b's excellent python library for generating maths visualisations to let it generate such good graphical demonstrations.
recursive 21 hours ago [-]
This is amazing.
One of the weaknesses of the video is that there are artifacts in the narration of passing through a text layer. "Bass" is pronounced as the fish at one point. "Wound" is pronounced as the injury. It's clear that these are homographs of what was actually intended by the script.
nerdsniper 11 hours ago [-]
Those are the things that can be most clearly pointed out but the (wrong) meaning is also constantly asserted by the TTS voice using the wrong inflection, intonation, and emphasis.
It honestly makes my ears bleed. To me, it sounds like an extremely unintelligent person reading a teleprompter. Absolutely nothing going on between the ears.
nl 17 hours ago [-]
> This model doesn't shy away from drawing upon all its knowledge. It casually refers to alpha premultiplication and fundamental frequencies in the same breath. It is fond of acronyms.
Yes - I had Fable tackle some long-standing bugs in some code I had and I quickly lost track of what it was talking about and had to ask a lot of clarifying questions.
It killed my bugs like they were nothing though. Opus and even GPT5.5 had churned on these same things for ages, but even with my manual help we made no progress.
It felt like they weren't the slightest bit challenging for Fable. So glad to see it back!
mohamedkoubaa 20 hours ago [-]
>The writing is also literary. It draws an analogy between the 12 musical pitch classes and the 12 markings on a clock. Noise lingers. Material surges off the rim.
I absolutely hate this revolting writing style by LLMs
JHonaker 10 minutes ago [-]
I physically cringed at both the quote and the surrounding section. The idea of this project is cool, but the amount of LLM glazing is bizarre.
aetherspawn 19 hours ago [-]
It’s like an angsty book written by a teenager with only a middle-school level understanding of the world: throwing a few completely random words in there to sound smart.
Hard to believe that something that writes so terribly is so good at mathematics, given that writing non-slop must be at least some part formulaic.
oooyay 19 hours ago [-]
This was really cool. I would love to play some Dave Tipper on this to see what it looks like.
beepbooptheory 21 hours ago [-]
Kinda interesting how it's just like an FFT chart in a circle, but perhaps the author is not aware that is the case. Would be curious to know what things were "implementation details" for the fancy AI and what weren't.
I could be wrong but milkdrop already would do light FFT analysis for effects right?
recursive 21 hours ago [-]
Pretty sure the author is aware. I think the interesting part is that the frequency is logarithmic and one rotation = 2x. This means you can make musical observations about chord qualities from the plot. That's not generally true for FFT plots.
beepbooptheory 20 hours ago [-]
You are right, it is a cool idea in general. But, idk if we are seeing the same thing, in practice it ends up being kinda mushy looking, right? In part, I think, because it's not anchored to a given root note, so at best we end up seeing constantly rotating, slightly different clock arrangements. Even in ideal conditions, anything like, e.g., Cmaj7 to Em is going to look almost the same, which feels off given the perceived harmonic change. I don't know if the arrangements being the same after transposition is as much a feature as a bug, I guess.
recursive 19 hours ago [-]
Yes, it doesn't seem like it actually works super well. Love the idea though. I would probably pay money for a thing that did this accurately.
Personally, I think it's a feature that Em looks substantially similar to CM7. If this was all working fully as intended, I suppose you might get a clue from the darker-colored bass note.
A bass player probably has a different perspective, but as a keyboard player, it's pretty much always fine to play an Em over a CM7. It's just a "voicing choice".
RugnirViking 3 hours ago [-]
I suggest you watch the explainer video the AI made, it's pretty awesome, but yeah, that's exactly what it is, with some depth to how exactly it uses FFT, and solving some problems with getting good resolution on different frequencies
neynt 18 hours ago [-]
I may have been overselling the AI's initiative in the article -- it still required a fair bit of steering. I put most of the prompts involved here: https://saltblock.neynt.ca/waveloop-prompts.md
Wrapping FFT in a log2(freq) % 1 spiral was part of the human direction :)
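For anyone wondering what log2(freq) % 1 buys you, here is a small sketch of the mapping. The reference frequency (A at 440 Hz as "12 o'clock"), the radius formula, and the function names are all assumptions for illustration; the actual project may place things differently.

```python
import math


def spiral_position(freq_hz, ref_hz=440.0):
    """Map a frequency to (turn, angle, radius) on a log-frequency spiral.

    turn is log2(freq / ref) % 1, so frequencies an octave apart share a turn.
    """
    octaves = math.log2(freq_hz / ref_hz)
    turn = octaves % 1.0               # fraction of one rotation, in [0, 1)
    angle = 2 * math.pi * turn         # the same thing in radians
    radius = 1.0 + 0.25 * octaves      # hypothetical: drift outward per octave
    return turn, angle, radius


def pitch_class(freq_hz, ref_hz=440.0):
    """Nearest of the 12 clock positions, counted in semitones above ref_hz."""
    turn, _, _ = spiral_position(freq_hz, ref_hz)
    return round(turn * 12) % 12


for freq in (220.0, 440.0, 880.0, 660.0):
    turn, angle, radius = spiral_position(freq)
    print(f"{freq:6.1f} Hz  turn={turn:.3f}  radius={radius:.2f}  "
          f"pitch class={pitch_class(freq)}")
```

Dividing by a reference before taking the log only rotates the whole picture; the key property is that doubling the frequency adds exactly 1 to log2, so every octave of the same note lands on the same spoke.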
_jx 20 hours ago [-]
I'm also curious about the implementation details. The result is visually beautiful, but the code could be interesting too, at least as a 'Fable historical artifact'. Is it visible on GitHub?
yes fable left you a gift. i wish you can continue to further develop the project when things return to normal, if they ever will. or at least branch and use an alternative model.
as fable came out, the first thing i did was asking it to analyze some of my projects and ideas and write plans and suggestions, more than implementations. nothing incredibly revolutionary came out but i still see these as its 'last will'.
i'm sure some of you here have a few fable's relics/stories that you consider special precisely because of its abrupt demise.
_jx 19 hours ago [-]
cool, yeah all comments are preserved, i'll read the source. thanks!
scubbo 21 hours ago [-]
> steady stream of promotions until they cap out at L5
Am I missing a joke? L5 is just a single promotion away from hiring-out-of-college, at least for the FAANG that I was at.
iLoveOncall 19 hours ago [-]
Only Amazon starts new grads at L4; the rest start them at L3, so L5 would be senior.
Not that 2 promotions is a "steady stream"...
manofmanysmiles 14 hours ago [-]
I love this!
tquinn 19 hours ago [-]
Very, very cool.
AIorNot 18 hours ago [-]
That Fable-generated video is something else... wow, I love it, along with the app.
Anyone who says LLMs can't reproduce intelligence, I mean really? Can you make this? It's not just a talking database, guys, or a stochastic parrot...
Too bad Fable was nerfed/gatekept by the Trump corruption selection committee, but the technology will not be silenced. We just need to get humans ready for these capabilities. The jury is out on the future of that.
opan 19 hours ago [-]
Thought this might be about the video game series, and even after seeing "Fable 5" my first reaction was "wow, up to 5 already?". It looks like they made it up to Fable III before rebooting the series with a 4th game (well, it's not out yet), so there is no Fable 5 yet.
TFA seems to be about some AI thing. Crazy how many words are actually just AI things now. Learning, reinforcement, language, model...
https://en.wikipedia.org/wiki/Just_intonation
...or tomorrow:
https://news.ycombinator.com/item?id=48740771
https://www.ecfr.gov/current/title-15/part-772/section-772.1...
https://youtu.be/mRO_QonhC2c
[1] https://en.wikipedia.org/wiki/Audiosurf
https://gally.net/temp/20250701synthesizer-gemini2/index.htm...
And one made in September by Claude:
https://gally.net/temp/20250917rhythmdrone/index.html
With Fable, I was able to one-shot something much more sophisticated:
https://gally.net/temp/20260610-fable-synthesizer/index.html