Quickly adding face properties to regions
20 September 2024 | 12:25 am

  • [2024-09-20 Fri]: Set the first frame of the animated GIF to a reasonable backup image.
  • [2024-09-20 Fri]: Add :init-value nil to the mode.
output-2024-09-20-13:09:17.gif
Figure 1: Screencast of modifying face properties

Sometimes I just want to make some text look a little fancier in the buffer so that I can make a thumbnail or display a message. This my-add-face-text-property function lets me select a region and temporarily change its height, make it bold, or do other things. It will work in text-mode or enriched-mode buffers (not Org Mode or programming buffers like *scratch*, as those do a lot of font-locking).

(defun my-add-face-text-property (start end attribute value)
  "Add face ATTRIBUTE with VALUE to the text between START and END.
Interactively, use the region and prompt for the attribute and value."
  (interactive
   (let ((attribute (intern
                     (completing-read
                      "Attribute: "
                      (mapcar (lambda (o) (symbol-name (car o)))
                              face-attribute-name-alist)))))
     (list (region-beginning)
           (region-end)
           attribute
           (read-face-attribute '(()) attribute))))
  (add-face-text-property start end (list attribute value)))
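
To see what this leaves behind on the text, here is a small sketch that calls add-face-text-property directly in a temporary buffer. The helper name my-demo-stacked-face is made up for illustration. My understanding is that repeated calls layer new attributes on top of the existing face value instead of replacing it.

```emacs-lisp
(defun my-demo-stacked-face ()
  "Return the `face' property after two `add-face-text-property' calls.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert "Thumbnail title")
    ;; Each call adds one more attribute on top of whatever is already there.
    (add-face-text-property (point-min) (point-max) (list :weight 'bold))
    (add-face-text-property (point-min) (point-max) (list :height 200))
    (get-text-property (point-min) 'face)))
```

Evaluating (my-demo-stacked-face) should return a list that contains both (:height 200) and (:weight bold).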

enriched-mode has some keyboard shortcuts for face attributes (M-o b for bold, M-o i for italic). I can add some keyboard shortcuts for other properties even if they can't be saved in text/enriched format.

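(defun my-face-text-larger (start end)
  "Make the text between START and END larger.
Add 50 (in units of 1/10 point) to the :height found at START,
defaulting to 100."
  (interactive "r")
  (add-face-text-property
   start end
   (list :height (floor (+ 50 (car (alist-get :height (get-text-property start 'face) '(100))))))))

(defun my-face-text-smaller (start end)
  "Make the text between START and END smaller.
Subtract 50 (in units of 1/10 point) from the :height found at START,
defaulting to 100."
  (interactive "r")
  (add-face-text-property
   start end
   (list :height (floor (- (car (alist-get :height (get-text-property start 'face) '(100))) 50)))))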
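
The face property can have more than one shape. After a single add-face-text-property call it appears to be one plist such as (:height 150), and after further calls it becomes a list of such plists. As far as I can tell, alist-get only finds :height in the second shape, so the first repeat may not change the size. Here is a minimal sketch of a reader that handles both shapes. The name my-demo-face-height-at is hypothetical, and it only looks at anonymous faces with integer heights, not named faces.

```emacs-lisp
(require 'seq)

(defun my-demo-face-height-at (pos)
  "Return the :height in the `face' property at POS, or 100 if none.
Hypothetical helper, for illustration only."
  (let ((face (get-text-property pos 'face)))
    (or (cond
         ;; A single anonymous face, e.g. (:height 150).
         ((keywordp (car-safe face))
          (plist-get face :height))
         ;; A list of faces, e.g. ((:height 200) (:height 150)).
         ((consp face)
          (seq-some (lambda (f)
                      (and (consp f)
                           (keywordp (car f))
                           (plist-get f :height)))
                    face)))
        100)))
```

The larger and smaller commands could then compute the new value as (+ 50 (my-demo-face-height-at start)) or (- (my-demo-face-height-at start) 50).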

What's an easy way to make these keyboard shortcuts available during the rare times I want them? I know, maybe I'll make a quick minor mode so I don't have to dedicate those keyboard shortcuts all the time. repeat-mode lets me change the size by repeating just the last keystroke.

(defvar-keymap my-face-text-property-mode-map
  "M-o p" #'my-add-face-text-property
  "M-o +" #'my-face-text-larger
  "M-o -" #'my-face-text-smaller)

(define-minor-mode my-face-text-property-mode
  "Make it easy to modify face properties."
  :init-value nil
  (repeat-mode 1))

(defvar-keymap my-face-text-property-mode-repeat-map
  :repeat t
  "+" #'my-face-text-larger
  "-" #'my-face-text-smaller)

(dolist (cmd '(my-face-text-larger my-face-text-smaller))
  (put cmd 'repeat-map 'my-face-text-property-mode-repeat-map))

This is part of my Emacs configuration.
View org source for this post

Archiving public toots on my blog
18 September 2024 | 2:40 pm

I want to compile my global microblog posts into weekly posts so that they're archived on my blog. It might make sense to make them list items so that I can move them around easily.

my-mastodon-insert-my-statuses-since
(defun my-mastodon-insert-my-statuses-since (date)
  (interactive (list (org-read-date "Since date: ")))
  (insert
   (format "#+begin_toot_archive\n%s\n#+end_toot_archive\n"
           (mapconcat
            (lambda (o)
              (format "- %s\n  #+begin_quote\n  #+begin_export html\n%s\n  #+end_export\n  #+end_quote\n\n"
                      (org-link-make-string (assoc-default 'url o) (assoc-default 'created_at o))
                      (org-ascii--indent-string (assoc-default 'content o) 2))
              ;; (format "#+begin_quote\n#+begin_export html\n%s\n#+end_export\n#+end_quote\n\n%s\n\n"
              ;;        (assoc-default 'content o)
              ;;        (org-link-make-string (assoc-default 'url o) (assoc-default 'created_at o)))
              )
            (seq-filter
             (lambda (o)
               (string= (assoc-default 'visibility o) "public"))
             (my-mastodon-fetch-posts-after
              (format "accounts/%s/statuses?count=40&exclude_reblogs=t&exclude_replies=t" (mastodon-auth--get-account-id))
              date))
            ""))))
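
To make the filtering and formatting steps easier to try without a Mastodon account, here is a rough sketch that works on plain alists. The my-demo-* names and the sample data in the usage note are made up, and the link is written out by hand instead of going through org-link-make-string, so it skips any escaping that function does.

```emacs-lisp
(require 'seq)

(defun my-demo-public-toots (statuses)
  "Return only the STATUSES whose visibility is \"public\".
Hypothetical helper, for illustration only."
  (seq-filter (lambda (o) (string= (alist-get 'visibility o) "public"))
              statuses))

(defun my-demo-toot-to-list-item (status)
  "Format STATUS as an Org list item with a quoted HTML export block.
STATUS is an alist with `url', `created_at', and `content' keys."
  (format "- [[%s][%s]]\n  #+begin_quote\n  #+begin_export html\n%s\n  #+end_export\n  #+end_quote\n"
          (alist-get 'url status)
          (alist-get 'created_at status)
          ;; Indent every line of the HTML so it stays inside the list item.
          (replace-regexp-in-string "^" "  " (alist-get 'content status))))
```

For example, (mapconcat #'my-demo-toot-to-list-item (my-demo-public-toots statuses) "\n") would produce the body of the archive block for a list of status alists.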

Here's a little thing I used to convert a two-level list into my collapsible sections:

my-org-convert-list-to-collapsible-details
(defun my-org-convert-list-to-collapsible-details ()
  (interactive)
  (let ((list (org-list-to-lisp t)))
    (mapc (lambda (o)
            (when (stringp (car o))
              (insert
               (format
                "#+begin_my_details %s :open t\n%s#+end_my_details\n"
                (car o)
                (mapconcat
                 (lambda (s)
                   (concat "- " (string-trim (org-ascii--indent-string (car s) 2)) "\n"))
                 (cdr (cadr o))
                 "")))))
          (cdr list))))
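
The function relies on the nested shape that org-list-to-lisp returns, roughly (TYPE (TITLE (TYPE (SUBITEM) ...)) ...). Here is a sketch of the same conversion on a literal list, so the structure is easier to see. The name my-demo-details-from-list is hypothetical, it returns a string instead of inserting it, and it skips the indentation step.

```emacs-lisp
(require 'subr-x)

(defun my-demo-details-from-list (list)
  "Return Org text with one my_details block per top-level item of LIST.
LIST is assumed to be a two-level list in the shape returned by
`org-list-to-lisp'.  Hypothetical helper, for illustration only."
  (mapconcat
   (lambda (item)
     (format "#+begin_my_details %s :open t\n%s#+end_my_details\n"
             ;; The first element of an item is its text (the section title).
             (car item)
             ;; The second element is the sublist; its cdr holds the subitems.
             (mapconcat (lambda (sub)
                          (concat "- " (string-trim (car sub)) "\n"))
                        (cdr (cadr item))
                        "")))
   (cdr list)
   ""))
```

For example, (my-demo-details-from-list '(unordered ("Emacs" (unordered ("first toot") ("second toot"))))) returns one my_details block titled Emacs with two list items inside it.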

And here are my toots from the past week, roughly categorized into collapsible sections:

EmacsConf
  • 2024-09-17T21:55:39.065Z CFP, draft schedule

    The #EmacsConf call for proposals (https://emacsconf.org/2024/cfp/) target date is this Friday (Sept 20), so I've started drafting the schedule. Thanks to my SVG schedule visualization code and function for checking availability constraints from last year ( https://sachachua.com/blog/2023/09/emacsconf-backstage-scheduling-with-svgs/), it took only about an hour to sketch this out: https://emacsconf.org/2024/organizers-notebook/#draft-schedule . We can run two tracks simultaneously and I can also slightly reduce the buffer between talks, so there's plenty of space for more #Emacs talks if people want to propose them or nudge people to propose them. =)

  • 2024-09-15T17:07:41.935Z diversity

    I feel complicated feelings about #EmacsConf and diversity. On one hand, yes, I would love to have a mix of speakers that reflects the mix of interesting stories and people I come across in the #Emacs community. (I wouldn't get rid of or discourage anyone; I just want more! :) )

    On the other hand, preparing and giving a presentation is a lot of work, and I have first-hand appreciation of how difficult it can be to find time to think - much less predict a specific time to have a conversation. (I'm only just beginning to be able to have some thinking time that isn't accompanied by the guilt of letting my kiddo binge-watch YouTube videos or the uncertainties of sacrificing my sleep, and I still rarely schedule anything for myself.)

    In addition, there are little risks that other people might not even have on their radar. All it takes is one person developing a parasocial relationship or fixation, or someone getting grumpy about someone's pronouns or personal characteristics or opinions, and then deciding to go and ruin someone's day (or life)... I'd hate to encourage someone to put themselves out there and end up with that happening to them, even if it's not at all their fault or mine.

    So yeah, it's a little hard for me to reach out. I can deal with impostor syndrome making people feel like they might not have much to say (share what you're learning! We're all figuring things out together), but I'm not so sure about the other concerns. While I'd like to think that in the Emacs community we often have a convivial atmosphere, sometimes it gets weird here too.

    I'm not sure what to do here aside from thinking out loud. I wish I could wave a magic wand and solve some structural issues that could make things more equitable, but that's waaay above my paygrade. I can keep working on figuring out how to make use of fragmented time, and maybe that will help other people too. I like working on the captions for EmacsConf; they help me a lot, too. I can experiment with workflows for sharing what I'm learning in a way that doesn't require a lot of focus time, speech fluency (I occasionally stutter and have to redo things), or a powerful computer. (Emacs is totally my nonlinear video editor.) I can make an indirect request for more people to consider proposing a talk for https://emacsconf.org/2024/cfp/ (target date is Sept 20, but I think the other organizers are considering extending it too), even with all the caveats my anxious brain suggests. (I know, I'm terrible at sales. :) ) And really, EmacsConf isn't important in the grand scheme of things, it's just a fun excuse to get together and connect with other people who like this stuff too. :)

    I wonder how this can be better. Thoughts?

Emacs
  • 2024-09-17T15:07:41.153Z consult-omni

    All right, I just got consult-omni and a Google custom search JSON API key set up so that I can call consult-omni-google, type keywords, pick the correct match, and insert it as an Org Mode link (or linkify the current region). I can think of more tweaks (embark-act on the current word or region to linkify it), but this is already pretty neat.

  • 2024-09-16T14:05:06.263Z - user-init-file

    Is there already an interactive #emacs command for opening user-init-file? I think that could be handy for newbies if we could just tell them to use "M-x visit-user-init-file" or even "Select 'Open init file' from the menu", although I suppose by the time we ask them to fiddle with the init file to add stuff to it, it's fine to encourage them to be comfortable with C-h v user-init-file and then maybe even teach them about M-x ffap at that point. Hmm...

  • 2024-09-16T13:51:12.091Z - casual-symbol-overlay

    Trying out casual-symbol-overlay (http://yummymelon.com/devnull/announcing-casual-symbol-overlay.html) by hooking it into embark-act, which I've bound to `C-.`:

    ```emacs-lisp
    (use-package casual-symbol-overlay
      :if my-laptop-p
      :init
      (with-eval-after-load 'embark
        (keymap-set embark-symbol-map "z" #'casual-symbol-overlay-tmenu)))
    ```

  • 2024-09-11T17:12:03.791Z no nested lists for Org Babel

    TIL that #OrgMode Babel only takes the top level of nested lists passed in via :var (https://orgmode.org/manual/Environment-of-a-Code-Block.html - Note that only the top-level list items are passed along. Nested list items are ignored.) When I try the manual example on my computer, I do indeed get only the top-level list items, unlike the nested data from https://mail.gnu.org/archive/html/emacs-orgmode/2020-10/msg00536.html . Of course, now that makes me want nested lists for both input and output...

Mastodon
Moving to P52
  • 2024-09-18T00:03:09.981Z WhisperX

    Now that I have word-level timestamps from WhisperX (https://sachachua.com/blog/2024/09/using-whisperx-to-get-word-level-timestamps-for-audio-editing-with-emacs-and-subed-record/), I think I'll be able to write an elisp equivalent of the merging/splitting strategies of https://github.com/jianfch/stable-ts?tab=readme-ov-file#regrouping-words to merge subtitles considering gap, duration, length, and maximum number of words.

  • 2024-09-15T17:08:17.757Z upgrade

    (reposting, forgot to make it public)

    I installed a 2TB Crucial T500 NVMe into my Lenovo P52 so that I can try dual-booting into Linux, since it was hard to figure out how I could get all my usual conveniences in WSL.

    A preliminary test with a fresh Kubuntu install showed that my 11ty static blog generation takes about the same time as it does on the X230T, which is a little surprising considering the newer processor and the faster SSD, but maybe I'll have to look for speed gains elsewhere there. I think whisper.cpp is a lot more usable on this computer though, so I'm looking forward to taking advantage of that. The P52 might also make video editing possible, and it might support more modern monitors. It is a fair bit larger and heavier, though. I might end up still using both.

    Anyway, I decided to redo the install by cloning my previous SSD. I want to see if I can skip the step of setting all those things up (although I'll need to redo the Syncthing config, of course). I don't have the extra parts that would let me install the 2.5" SSD from my X230T directly into the P52, but W- has a drive dock that works off USB 2.0. Slow and steady, but that's fine, I can run things overnight. I woke up today to find out that dd doesn't handle extended partitions and needs me to dd them one by one. That's cool, I'll just have that running in the background today.

    If the clone doesn't work or if it's too much trouble to take the clone and give it its own identity, I'll probably wipe it and do another install. Since the X230T is on Kubuntu, I think I'll keep it on Kubuntu as well, to minimize the things I need to keep in my head as I switch between computers. My home directory is in a separate partition, so I can keep it if I want to try something different.

    Now I just have to wait a few hours for these dd commands...

  • 2024-09-11T14:14:55.602Z static blog

    My "plugins is not iterable" issue got fixed when I downgraded `@11ty/eleventy` from `@beta` to `@2.0.1`. Yay, that's one thing off my list!

Other tech stuff
Parenting
  • 2024-09-17T12:28:29.082Z emotion check-in

    I appreciate my kiddo's grade 3 teacher. =) She's currently doing the morning check-in of emotions (how's everyone feeling) using 9 images of Grogu with different facial expressions, which gets the kids (1) laughing, (2) interpreting facial expressions that aren't explicitly labeled, and (3) figuring out what they're feeling.

  • 2024-09-12T12:50:02.014Z pull system

    The kiddo is 8 and I'm developing a better understanding of what "fiercely independent" means. One of the things I'm working on learning is how to shut up and trust the process. =) I've started thinking of it like the pull system of Lean manufacturing principles. Things work out better when I wait for her to ask a question (to pull from me) because at that point, she's ready to hear the answer.

As it turns out, org-list-to-org uses the Org export mechanism, so it quietly discards things like #+begin_export html blocks. I decided to hard-code assumptions about the list's structure instead, which works for now.


Using WhisperX to get word-level timestamps for audio editing with Emacs and subed-record
17 September 2024 | 4:34 pm

I'm gradually shifting more things to this Lenovo P52 to take advantage of its newer processor, 64 GB of RAM, and 2 TB drive. (Whee!) One of the things I'm curious about is how I can make better use of multimedia. I couldn't get whisper.cpp to work on my Lenovo X230T, so I mostly relied on the automatic transcripts from Google Recorder (with timestamps generated by aeneas) or cloud-based transcription services like Deepgram.

I have a lot of silences in my voice notes when I think out loud. whisper.cpp got stuck in loops during silent parts, but WhisperX handles them perfectly. WhisperX is also fast enough for me to handle audio files locally instead of relying on Deepgram. With the default model, I can process the files faster than real-time:

| File length | Transcription time |
|-------------+--------------------|
| 42s         | 17s                |
| 7m48s       | 1m41s              |
I used this command to get word-level timing data. It makes VTT and SRT files that underline the specific word:

~/vendor/whisperx/.venv/bin/whisperx --compute_type int8 --highlight_words True --print_progress True "$1"

The resulting VTT file looks like this:

WEBVTT

00:00.427 --> 00:00.507
<u>I</u> often need to... I sometimes need to replace or navigate by symbols.

00:00.507 --> 00:00.587
I often need to... I sometimes need to replace or navigate by symbols.

00:00.587 --> 00:00.887
I <u>often</u> need to... I sometimes need to replace or navigate by symbols.

00:00.887 --> 00:00.987
I often need to... I sometimes need to replace or navigate by symbols.

Sometimes I just want the text so that I can use an audio braindump as the starting point for a blog post or for notes. WhisperX is way more accurate than Google Recorder, so that will probably be easier once I update my workflow for that.

Sometimes I want to make an edited audio file that sounds smooth so that I can use it in a podcast, a video, or some audio notes. For that, I'd like word-level timing data so that I can cut out words or sections. Aeneas didn't give me word-level timestamps, but WhisperX does, so I can get the time information before I start editing. I can extract the word timestamps from the underlined text like this:

(defun my-subed-load-word-data-from-whisperx-highlights (file)
  "Return a list of word cues from FILE.
FILE should be a VTT or SRT file produced by whisperx with the
--highlight_words True option."
  (seq-keep (lambda (sub)
              (when (string-match "<u>\\(.+?\\)</u>" (elt sub 3))
                (setf (elt sub 3) (match-string 1 (elt sub 3)))
                sub))
            (subed-parse-file file)))

(defun my-subed-word-tsv-from-whisperx-highlights (file)
  (interactive "FVTT: ")
  (with-current-buffer (find-file-noselect (concat (file-name-nondirectory file) ".tsv"))
    (erase-buffer)
    (subed-tsv-mode)
    (subed-auto-insert)
    (mapc (lambda (sub) (apply #'subed-append-subtitle nil (cdr sub)))
          (my-subed-load-word-data-from-whisperx-highlights file))
    (switch-to-buffer (current-buffer))))
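
The extraction step can be tried without subed by working on plain lists. This is a minimal sketch that assumes each cue is a list of (ID START-MS STOP-MS TEXT), which is my reading of how the code above indexes the parsed subtitles. The name my-demo-highlighted-words is made up.

```emacs-lisp
(defun my-demo-highlighted-words (cues)
  "Return (START-MS STOP-MS WORD) for each of CUES with an underlined word.
Each cue is assumed to be a list (ID START-MS STOP-MS TEXT).
Hypothetical helper, for illustration only."
  (delq nil
        (mapcar
         (lambda (cue)
           (let ((text (nth 3 cue)))
             ;; Cues that fall between words have no <u>...</u> and are skipped.
             (when (string-match "<u>\\(.+?\\)</u>" text)
               (list (nth 1 cue) (nth 2 cue) (match-string 1 text)))))
         cues)))

(my-demo-highlighted-words
 '((1 427 507 "<u>I</u> often need to...")
   (2 507 587 "I often need to...")
   (3 587 887 "I <u>often</u> need to...")
   (4 887 987 "I often need to...")))
;; → ((427 507 "I") (587 887 "often"))
```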

I like to use the TSV format for this one because it's easy to scan down the right side. Incidentally, this format is compatible with Audacity labels, so I could import that there if I wanted. I like Emacs much more, though. I'm used to having all my keyboard shortcuts at hand.

0.427000	0.507000	I
0.587000	0.887000	often
0.987000	1.227000	need
1.267000	1.508000	to...
4.329000	4.429000	I
4.469000	4.869000	sometimes
4.950000	5.170000	need
5.210000	5.410000	to
5.530000	6.090000	replace
6.270000	6.370000	or
6.490000	6.971000	navigate

Once I've deleted the words I don't want to include, I can merge subtitles for phrases so that I can keep the pauses between words. A quick heuristic is to merge subtitles if they don't have much of a pause between them.

(defvar my-subed-merge-close-subtitles-threshold 500
  "Default gap in milliseconds below which adjacent subtitles are merged.")

(defun my-subed-merge-close-subtitles (threshold)
  "Merge subtitles with the following one if there is less than THRESHOLD msecs gap between them."
  (interactive (list (read-number "Threshold in msecs: " my-subed-merge-close-subtitles-threshold)))
  (while (not (eobp))
    (let ((end (subed-subtitle-msecs-stop))
          (next-start (save-excursion
                        (and (subed-forward-subtitle-time-start)
                             ;; The gap is measured to the next subtitle's start time.
                             (subed-subtitle-msecs-start)))))
      (if (and end next-start (< (- next-start end) threshold))
          (subed-merge-with-next)
        (or (subed-forward-subtitle-end) (goto-char (point-max)))))))
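
The same heuristic can be sketched on a plain list of (START-MS STOP-MS TEXT) cues, which makes it easy to test outside a subed buffer. The name my-demo-merge-close-cues is hypothetical, and joining the text with a space is my assumption, not necessarily what subed-merge-with-next does.

```emacs-lisp
(defun my-demo-merge-close-cues (cues threshold)
  "Merge adjacent CUES when the gap between them is under THRESHOLD msecs.
Each cue is a list (START-MS STOP-MS TEXT).
Hypothetical helper, for illustration only."
  (let (result)
    (dolist (cue cues)
      (let ((prev (car result)))
        (if (and prev (< (- (nth 0 cue) (nth 1 prev)) threshold))
            ;; Small gap: extend the previous cue instead of starting a new one.
            (setcar result (list (nth 0 prev)
                                 (nth 1 cue)
                                 (concat (nth 2 prev) " " (nth 2 cue))))
          (push cue result))))
    (nreverse result)))

(my-demo-merge-close-cues
 '((427 507 "I") (587 887 "often") (987 1227 "need") (1267 1508 "to...")
   (4329 4429 "I") (4469 4869 "sometimes"))
 500)
;; → ((427 1508 "I often need to...") (4329 4869 "I sometimes"))
```

The gap of almost three seconds between "to..." and the second "I" is the only one over the 500 ms threshold, so that is where the phrases split.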

Then I can use subed-waveform-show-all to tweak the start and end timestamps.

2024-09-17-12-06-12.svg
Figure 1: Screenshot of subed-waveform

After that, I can use subed-record to compile the audio into an .opus file that sounds reasonably smooth.

I sometimes need to replace or navigate by symbols. casual-symbol-overlay is a package that adds a transient menu so that I don't have to remember the keyboard shortcuts for them. I've added it to my embark-symbol-map so I can call it with embark-act. That way it's just a C-. z away.

I want to make lots of quick audio notes that I can shuffle and listen to in order to remember things I'm learning about Emacs (might even come up with some kind of spaced repetition system), and I'd like to make more videos someday too. I think WhisperX, subed, and Org Mode will be fun parts of my workflow.


