Creating Karaoke with Aegisub

From a missed opening theme to your own karaoke file — how Aegisub turns anime songs into syllable-perfect sing-alongs.

If you spend any time watching anime, you have probably caught yourself singing along with the opening, the ending, and the insert songs that play during the big scenes. When you watch a speedsub, the song lyrics are usually nowhere to be found, so you end up humming along from memory. With a qualitysub, though, the song sometimes comes with karaoke effects: syllables that spin, fade out, or catch fire as the vocalist hits them. If you have ever wanted that same effect for a song that nobody has subtitled yet, or wanted to build a full karaoke video of an opening for your own collection, this is the tool that makes it possible: Aegisub, a free and open-source subtitle editor that has been the de facto standard for anime karaoke work for years.

Sometimes you just want the lyrics written in Japanese, in the original kana, or you want an AMV (Anime Music Video) with the text floating on top of the action. In this guide, we will walk through what Aegisub is, how to install it, how to time lyrics syllable by syllable, how to add karaoke tags and Lua-driven effects, and how to export the finished result.

Screenshot of the Aegisub editor showing a loaded anime opening, the audio waveform panel, and several karaoke-timed subtitle lines

What is Aegisub?

Aegisub was originally built for subtitling anime, and it has grown into a full-featured karaoke production tool. The program works with the Advanced SubStation Alpha (ASS) format, which, unlike a plain SRT file, can store per-syllable timing, fonts, colors, movement, and complex per-character effects. Per-syllable timing is what makes Aegisub so popular for karaoke: one line of text can be split into dozens of syllables, each with its own duration.

Because this site does not support piracy, we will not be going into detail about pulling subtitles from licensed fansub releases. The focus here is on the legitimate side of the workflow: timing and styling karaoke for anime openings, endings, and insert songs you have legally obtained, and producing your own sing-along versions of them.

To make a karaoke video, you only need a video file and the lyrics. You can grab the lyrics from a lyrics site, type them out from a CD booklet, or write your own translation if you understand the language. Adding animations and transitions to the subtitles requires a working knowledge of the program's override tags and a little of the Lua scripting language, both of which we will cover below.

Aegisub runs on all major operating systems, and the installation is straightforward. Once it is installed, the first thing you will want to do is pick an opening, ending, or short clip from an anime and start timing it.

Advantages of Aegisub

For an open-source tool, Aegisub covers a surprising amount of ground. With it, you can:

  • Subtitle anime, films, series, and any other video you want to translate or annotate;
  • Place song lyrics on top of openings, endings, and full-length music videos with per-syllable timing;
  • Insert pre-made subtitle files into videos and tweak the timing or styling to fit;
  • Follow the audio waveform while you work, so you can line each syllable up against the actual vocal peaks;
  • Style subtitles with multiple fonts, colors, outlines, and shadow effects, and animate them with karaoke and override tags;
  • Automate repetitive work with Lua scripts, including the community-standard Karaoke Templater that handles most of the heavy lifting in modern fansub productions.

Few simpler editors or online captioning tools combine all of these features in one place, which is why Aegisub has stayed the karaoke editor of choice even as the fansub scene has shrunk.

How to install Aegisub

The official builds of Aegisub are hosted on the project's own site and mirrored on GitHub. There is no installer in the Windows Store or the Mac App Store, so you download a build directly and run it. On Windows, you grab the latest installer from the download page and run it as you would any other program. On macOS, you download the disk image and drag the Aegisub app into your Applications folder; if macOS shows a security prompt on the first launch, approve it so the app can open. On Linux, Aegisub is available in the community package repositories for most distributions, or you can build it from source if you prefer.

Whatever platform you are on, the first run is the same: pick a video file to test the program with, and confirm that the audio waveform panel lights up when you open the audio from that video. If it does, you are ready to start a real project.

Step-by-step workflow: making karaoke from an anime video

The first time you open Aegisub, the interface can feel busy. Once you know what each panel does, the workflow is logical. A typical first project goes like this.

1. Load the video. Go to the Video menu in the top bar, choose Open Video, and select the anime clip you want to subtitle. The video will appear in the preview panel.

2. Open the audio. Before you can time anything, you need the audio waveform. Go to Audio in the top bar and choose Open audio from video. The waveform panel will populate with a visual representation of the track, and you can zoom in on each syllable.

3. Start a new subtitle file. If you are creating a subtitle from scratch, go to File and select New Subtitles. If you are editing an existing karaoke file, open it here instead.

4. Type the lyric line. In the subtitle edit box, type the line of lyrics for the current moment in the song. Then set its timing on the waveform: mark where the line starts and where it ends in the audio display, play the selection back to check it, and commit the line. The start and end times are stored with the line, and you can adjust them again at any point.

5. Split into syllables with karaoke tags. This is the step that turns a plain subtitle into karaoke. The most common tag is \k, with a number that gives the duration of each syllable in centiseconds (hundredths of a second). For example, a three-syllable line timed for two seconds (200 cs) might look like this:

{\k67}na {\k67}ni {\k66}no

The first syllable (na) is highlighted for 67 centiseconds, the second (ni) for 67, and the third (no) for 66, so they fill the line evenly. Variants like \kf (a smooth fill that sweeps across the syllable from left to right over its duration) and \ko (the outline of the syllable stays hidden until the syllable starts) let you match the style of a specific song.
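The even split in the example above is simple arithmetic, and a few lines of Python make it explicit. This is only a sketch for checking numbers outside the editor; the function name build_k_line is ours, not part of Aegisub.

```python
def build_k_line(syllables, total_cs, sep=" "):
    """Spread total_cs centiseconds evenly over the syllables as \\k tags."""
    base, extra = divmod(total_cs, len(syllables))
    parts = []
    for i, syl in enumerate(syllables):
        # The first `extra` syllables absorb the remainder, one centisecond each.
        dur = base + (1 if i < extra else 0)
        parts.append("{\\k%d}%s" % (dur, syl))
    return sep.join(parts)


# Three syllables over 200 cs give 67 + 67 + 66.
print(build_k_line(["na", "ni", "no"], 200))
```

Real songs are rarely sung this evenly, so treat the output as a starting point and adjust each duration against the waveform.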

6. Repeat for the whole song. Go line by line, splitting the lyrics at natural syllable breaks and timing each one to the vocal. The waveform panel is your best friend here, because it shows exactly where each consonant and vowel lands.

7. Style and export. Once the timing is locked, you can apply fonts, colors, motion, and effects, and then export the ASS file for muxing with the video. We will cover both in the next sections.

Aegisub tags and Lua scripting for effects

Plain karaoke timing is only the start. The real visual flair comes from combining karaoke tags with override tags and, when the timing gets complicated, the built-in Lua scripting engine.

Override tags change how a line looks and behaves. The ones you will reach for most often are:

  • \move(x1,y1,x2,y2): moves the line from one point to another across its duration;
  • \t(t1,t2,tags): animates the given tags (typically a color, size, or rotation) gradually between two times, given in milliseconds from the start of the line; position is not animated with \t, so use \move for that;
  • \fad(t1,t2): fades the line in over t1 milliseconds and out over t2 milliseconds, useful for smooth entrances and exits;
  • \3c&HBBGGRR&: changes the outline color, with the hex digits in blue-green-red order, often used to make highlighted syllables pop;
  • \fn, \fs, \frx, \fry, \frz: control font name, size, and rotation around each axis, which is how you get the spinning, tilting, and shaking effects you see in modern fansub openings.

You can combine all of these on a single line. A line that has its syllables fill in color, spins slightly as it fills, and then slides off the screen on the last beat is still just one subtitle line, with maybe a dozen tags stacked together.
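To see how such a tag stack is put together, here is a small Python sketch that builds one override block from separate tags. The helper names ass_color and override_block are hypothetical, and the coordinates and lyric text are placeholders. The one detail worth noticing is that ASS color values list blue first and red last, the reverse of the usual RGB order.

```python
def ass_color(r, g, b):
    """Return an ASS color literal; ASS stores colors as blue-green-red hex."""
    return "&H%02X%02X%02X&" % (b, g, r)


def override_block(*tags):
    """Join already-formatted tags into one {...} override block."""
    return "{" + "".join(tags) + "}"


block = override_block(
    "\\fad(200,300)",                 # 200 ms fade-in, 300 ms fade-out
    "\\move(100,400,540,400)",        # slide horizontally along y=400
    "\\3c" + ass_color(255, 128, 0),  # orange outline
)
print(block + "la la la")
```

The printed line can be pasted into the text field of a subtitle line to preview the combined effect.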

For anything beyond simple effects, the Karaoke Templater — a Lua framework that ships with Aegisub — is the standard tool. The Templater runs a small Lua script for each karaoke line, reads the \k timings you have already entered, and generates the final tag stack. This is how high-end fansub groups produce hundreds of identically styled lines in minutes instead of typing the same effect by hand over and over. The official documentation walks through the templater syntax, and there are pre-made templates in the community repository for everything from simple syllable fades to character-by-character particle effects.
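The idea behind template-driven karaoke (read the \k timings, then generate tags from them) can be illustrated outside Aegisub. The following Python sketch is not the Karaoke Templater and does not use its syntax; it is a standalone illustration with hypothetical function names. It only handles plain \k tags and assumes a white base color that turns red as each syllable is sung.

```python
import re

# Matches a plain {\kNN} tag followed by the text it applies to.
K_TAG = re.compile(r"\{\\k(\d+)\}([^{]*)")


def syllables(text):
    """Yield (syllable, start_ms, end_ms) measured from the start of the line."""
    start_cs = 0
    for dur, syl in K_TAG.findall(text):
        end_cs = start_cs + int(dur)
        yield syl, start_cs * 10, end_cs * 10  # centiseconds to milliseconds
        start_cs = end_cs


def apply_effect(text, base="&HFFFFFF&", lit="&H0000FF&"):
    """Replace each \\k tag with a color reset plus a timed \\t transform."""
    out = []
    for syl, start, end in syllables(text):
        # Resetting \1c first keeps one syllable's transform from carrying
        # over into the next syllable.
        out.append("{\\1c%s\\t(%d,%d,\\1c%s)}%s" % (base, start, end, lit, syl))
    return "".join(out)


print(apply_effect("{\\k67}na {\\k67}ni {\\k66}no"))
```

Each syllable ends up with its own start and end time in milliseconds, which is the same information a template works from when it builds the final tag stack.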

You do not need to learn Lua to do basic karaoke. But if you want effects that go beyond \k highlights, spending an afternoon with the templater pays off quickly.

Export and muxing

Once timing and effects are done, you have an ASS file and a video file. The last step is to combine them into a single playable container. The standard tool is MKVToolNix, which contains the mkvmerge command and a graphical front end. Add the original video, mark the ASS file as a subtitle track, set the default-subtitle flag, and mux to MKV. MKVToolNix is free, cross-platform, and re-encodes nothing, so quality is preserved.
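If you prefer the command line to the graphical front end, the mux can be scripted. The sketch below only builds and prints an mkvmerge command; it does not run it. The file names, the track name, and the language code are placeholders. The default-track flag is left to the GUI here, because its command-line spelling may differ between MKVToolNix releases.

```python
import shlex


def mkvmerge_command(video, subs, output, fonts=()):
    """Build (but do not run) an mkvmerge argument list."""
    cmd = ["mkvmerge", "-o", output, video]
    # Options placed before a source file apply to that file; track 0 is the
    # only track in a standalone .ass file.
    cmd += ["--language", "0:jpn", "--track-name", "0:Karaoke", subs]
    for font in fonts:
        # Attached fonts let players render the styles without a local install.
        cmd += ["--attach-file", font]
    return cmd


cmd = mkvmerge_command("opening.mkv", "opening.ass", "opening_karaoke.mkv",
                       fonts=["KaraokeFont.ttf"])
print(shlex.join(cmd))
```

Attaching the fonts used by your styles is worth the extra step, since it avoids the missing-font problem on other machines.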

If you need an MP4 instead of an MKV (for example, to upload to a service that does not accept MKV), the usual pipeline is to mux to MKV first and then transcode to MP4 with HandBrake. HandBrake can burn the subtitles in (hardcode them into the video frames) or pass them through as a soft subtitle track, depending on the settings. Soft subs are searchable and can be turned off, so prefer them when the target is MKV. MP4 has no native ASS track type, though, so a soft track in an MP4 is converted to plain timed text and loses the karaoke styling; for MP4, burning in is the way to keep the effects.

For testing, MPC-HC on Windows and MPV and VLC on every platform all render ASS subtitles. If the karaoke effect looks wrong in one of them, try another before assuming the file is broken: some players use a VSFilter-based renderer and others use libass, and the two do not treat every tag identically.

Common issues and troubleshooting

Even with a clean workflow, a few things go wrong often enough to be worth knowing about.

Fonts are missing. Boxes in place of characters mean the font is not installed on the playback machine. Install it, or switch to a widely available font like Noto Sans CJK or Source Han Sans for Japanese and Chinese text.
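Before sending a file to someone else, it helps to know which fonts it actually asks for. The Python sketch below collects font names from the Style lines and from inline \fn tags. It assumes the standard field order in which the font name is the second field of a Style line; the function name and the sample text are ours.

```python
import re


def fonts_used(ass_text):
    """Collect font names from Style lines and inline \\fn tags."""
    fonts = set()
    for line in ass_text.splitlines():
        if line.startswith("Style:"):
            # Fields are comma-separated; the second one is the font name.
            fields = line[len("Style:"):].split(",")
            if len(fields) > 1:
                fonts.add(fields[1].strip())
        elif line.startswith("Dialogue:"):
            # \fn runs until the next tag or the end of the override block.
            fonts.update(n.strip() for n in re.findall(r"\\fn([^\\}]+)", line))
    return sorted(fonts)


sample = """[V4+ Styles]
Style: Default,Noto Sans CJK JP,48,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,1,2,10,10,10,1
[Events]
Dialogue: 0,0:00:01.00,0:00:03.00,Default,,0,0,0,,{\\fnSource Han Sans\\k67}na {\\k67}ni {\\k66}no
"""
print(fonts_used(sample))
```

Any font in the resulting list that is not installed on the playback machine, or attached to the MKV, is a candidate for the boxes you are seeing.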

Tags show up as text. If you see {\k67} in the rendered subtitle, the player is treating the file as plain text rather than as ASS. Check that the file still has its .ass extension and its header sections, then try MPV, a recent MPC-HC, or VLC.

Syllables do not line up with the vocals. Almost always a timing problem, not a tag problem. Re-open the file, zoom in on the waveform, and adjust the centisecond values of each \k tag until the highlight tracks the singer's delivery.
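One quick sanity check is whether the \k durations on a line add up to the line's own length. The sketch below compares the two for a single Dialogue line; the function names are hypothetical, and it assumes the usual ASS timestamp form H:MM:SS.cc and the standard ten-field Dialogue layout.

```python
import re


def ass_time_to_cs(stamp):
    """Convert an ASS timestamp (H:MM:SS.cc) to centiseconds."""
    hours, minutes, seconds = stamp.split(":")
    whole, frac = seconds.split(".")
    return ((int(hours) * 60 + int(minutes)) * 60 + int(whole)) * 100 + int(frac)


def k_drift(dialogue_line):
    """Return line duration minus the summed \\k, \\kf and \\ko durations (cs)."""
    fields = dialogue_line.split(",", 9)  # the text is the tenth field
    start, end, text = fields[1], fields[2], fields[9]
    k_total = sum(int(d) for d in re.findall(r"\\k[fo]?(\d+)", text))
    return ass_time_to_cs(end) - ass_time_to_cs(start) - k_total


line = "Dialogue: 0,0:00:01.00,0:00:03.00,Default,,0,0,0,,{\\k67}na {\\k67}ni {\\k66}no"
print(k_drift(line))
```

A result of zero means the syllables exactly fill the line. A positive value means the highlights finish before the line ends, and a negative value means they run past it.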

Effects look right in the editor but wrong in the player. Aegisub and the player may not be using the same renderer: depending on the build and its settings, the editor previews with libass or with VSFilter, and a few ASS features (per-character \t, some transforms) render differently between the two. When in doubt, test with MPV, which renders through libass.

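The exported MP4 has no subtitle track. MP4 cannot carry ASS natively, so HandBrake either converts the track to plain text or, with an unusual source container, drops it. Mux to MKV first with MKVToolNix, then transcode to MP4, burning the subtitles in if the karaoke effects need to survive. That path is the most reliable.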

Resources and community

The fastest way to learn modern Aegisub karaoke workflow is the project's own documentation, which has dedicated chapters on the Karaoke Templater, the override tag reference, and example Lua scripts. The Aegisub GitHub repository, currently maintained under the TypesettingTools organization, is where the active builds and issue tracker live, and it is also where the community posts updated templates and bug fixes.

For visual learning, fan-produced tutorials on YouTube walk through timing a full opening from scratch; audio quality varies, so pick a recent one with clear commentary. Once the basics feel natural, the exported ASS files of the Japanese-language fansub community are a great way to learn which tag combinations produce the effects you like.

Closing thoughts

Aegisub is a thirty-megabyte download that can do what most commercial subtitle editors still cannot: split a line of Japanese lyrics into individual syllables and time each one to the centisecond. The learning curve is real, and the first karaoke you finish will probably take longer than you expect, but the workflow is the same one the major fansub groups have used for two decades, and the tools have not changed much because they did not need to.

If you have ever watched a beautifully timed opening and wondered how it was made, the answer is almost always an ASS file, a karaoke template, and a long evening with the waveform panel.

About the author: Kevin Henrique

Specialist with more than 10 years of experience in Asian culture, focused on Japan, Korea, anime and games. Self-taught writer and traveler focused on teaching Japanese, travel tips and deep, engaging curiosities.
