Parsing Guitar Tab

Introduction

Standard musical notation is extremely expressive but has an incredibly steep learning curve. Although it can specify notes and durations precisely, it takes time and practice to learn how to convert what you see into what you play.

For reasons like this, it was long the nemesis of many a guitar player. Although some players have a solid grasp of musical theory, many players just want to, you know, play. This is where tablature comes in.

Tablature represents music not in terms of the musical notes, but in terms of how it is played. Take, for example, a transcription of part of the Ghostbusters theme.

Here the horizontal lines represent strings on a guitar. If the guitar were laid flat with the head on your left, the strings would correspond to the lines from top to bottom. The note each string is tuned to is indicated at the side of the tablature, here in standard EADGBe guitar tuning.

Reading the tablature from left to right, points where a note should be played are indicated by a number. This number represents the number of the fret that should be held down on the guitar and played.

In this case, the B string is played open (the zeroth fret) twice, then with the fourth fret held down, then open again, then with the second fret held, and finally the G string is played on the second fret.

This extremely basic format is relatively easy to parse. Parsing it is one step in a multi-step journey towards making an HTML5 tablature player, but here we’ll just look at what was needed to parse the text. We’ll start from a fairly low level, however, so all that is required is a basic understanding of JavaScript and HTML.

Problem

We know / assume several things about our input –

  • It is made up of several lines contained in a text file.
  • Some of the lines contain standard words and sentences, such as “Intro”, “Guitar”, or other write ups included by the maker of the tab.
  • Some of the lines contain tablature for instruments that we are not interested in, such as bass.
  • Some of the lines contain our target text. These lines will be in groups of six, one per guitar string. As such, we can discard any tab lines that are not part of a continuous group of six.
  • Tab lines may contain a note indicating the base note. This is not compulsory however as some tabs assume a standard tuning.
  • Further, basic tab lines contain the digits 0 – 9, as well as - and |. Other characters such as (, ), ^, / and \ may also appear, indicating more complicated playing methods such as mutes and slides.

Requirements

We have several clear requirements that we’ll need to satisfy in order to solve this problem.

  • We will need to read in a file uploaded by the user.
  • This file will need to be broken down into lines.
  • Each line will have to be checked to see if it is a valid line of tablature.
  • Lines which appear in groups of six will be kept, others will be discarded.

More requirements could be found for more elaborate parsing, but as an initial attempt this should be sufficient.

File Reading

First off, let’s see how we’ll read in a file from the user. Drag and drop is the future, so we’ll have a drag and drop interface where the user can hurl tablature at us with ease. To begin, we need a standard file input box that we can associate some actions with.

Next we’ll need to associate our own functions with the dragover and the drop events. jQuery’s .on() method was behaving strangely here, most likely because I’m using it incorrectly, so I went with plain JavaScript to attach the event listeners.

As you can see, we’ll be writing new handlers for the drag over event as well as the drop event. In the drag over event we want to prevent the event from propagating – that is, from being handled by event handlers further down the chain. We’ll also make the cursor a copy cursor so that the interface feels more comfortable.

The drop event is where we’ll actually access the file. We’ll prevent the event from propagating, then check how many files they are dropping. At the moment we’re only going to support one, but this could be extended in the future.

A new FileReader will allow us to read the file in from the user, and pass it on to the next step of our solution. We set it to call our tab parser once the file is loaded via our soon to be written parseTabFile(), and pass the FileReader the actual uploaded file.
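The wiring described above might be sketched as follows. The function name attachTabDropHandlers, the dropZone element and the onText callback are assumptions made for illustration, not the original handlers. The calls to preventDefault() are also an addition: browsers generally ignore a drop unless the dragover default is cancelled.

```javascript
// Hypothetical wiring: `dropZone` is any DOM element and `onText`
// receives the dropped file's contents as a string.
function attachTabDropHandlers(dropZone, onText) {
  dropZone.addEventListener('dragover', function (event) {
    event.stopPropagation();
    event.preventDefault();                  // allow the element to accept a drop
    event.dataTransfer.dropEffect = 'copy';  // show the copy cursor
  });

  dropZone.addEventListener('drop', function (event) {
    event.stopPropagation();
    event.preventDefault();                  // stop the browser opening the file itself

    var files = event.dataTransfer.files;
    if (files.length !== 1) {
      return;                                // only a single file is supported for now
    }

    var reader = new FileReader();
    reader.onload = function (loadEvent) {
      onText(loadEvent.target.result);       // e.g. hand the text to parseTabFile()
    };
    reader.readAsText(files[0]);
  });
}
```

In a page this could be called as attachTabDropHandlers(document.getElementById('dropzone'), handler), where the element id is again only a placeholder.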

Now that our file has been read, we can check that first requirement and start parsing the tablature contained therein. Onwards!

Parser

Parsing the text is remarkably simple with the help of regular expressions. We are looking for lines that potentially start with a note, then generally either ] or |, and finally have a long run consisting of the special characters mentioned earlier. The regular expression matching this is:
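A pattern consistent with that description, assembled from the pieces discussed in the sections below, might look like the following. It is a reconstruction rather than the exact original, the name tabLinePattern is hypothetical, and the character list only covers the symbols mentioned so far.

```javascript
// Reconstruction, assuming only the characters named in the text:
//   group 1: optional base note plus optional accidental
//   then   : the ] or | that opens the line
//   group 2: the run of tab characters
var tabLinePattern = /([A-Ga-g]{0,1}[#b]{0,1})[\|\]]([\-0-9\|\/\^\(\)\\]+)/;

var guitarMatch = 'e|--0--2--|'.match(tabLinePattern);
// guitarMatch[1] is 'e', guitarMatch[2] is '--0--2--|'

var proseMatch = 'Intro'.match(tabLinePattern);
// proseMatch is null
```

Tabs that mark hammer-ons or pull-offs with letters such as h and p would need those characters added to the final set.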

This line appears somewhat mangled at first glance, as many regexes do. Let’s have a look at each part to see what it does.

Base Note and Accidental

First off, we need either one or no base notes. Given that musical notes range from A to G and may be in lowercase (such as the high ‘e’ in the standard tuning previously), we want to pick up zero or one character in the ranges A – G and a – g. We need to handle a flat (b) and a sharp (#) as well, so let’s have a look at that in this section too.

This entire section is enclosed in standard round brackets, ( ... ). This allows us to extract this section later when we want to actually use this information instead of just verifying that we have a line of tablature. The square brackets indicate a set of characters that we wish to match, in this case ‘A’ through to ‘G’ and ‘a’ through to ‘g’: [A-Ga-g].

Braces (that is, { and }) allow us to specify a minimum and a maximum number of matches that we are looking for. As we are looking for either one or none, we specify 0 and 1, giving us a complete expression for one or no standard note of [A-Ga-g]{0,1}.

We do exactly the same thing for either one or no accidentals: [#b]{0,1}. So far so good, we have an optional base note indicator sorted. Next up, we need to detect one of two potential indicators of the start of the line.
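To see this fragment in isolation, here is a small sketch. The ^ anchor and the helper name readBaseNote are added for the demonstration and are not part of the pattern described above.

```javascript
// Optional note letter followed by an optional accidental, captured as one group.
// The leading ^ is an assumption so the demo only looks at the start of the line.
var baseNotePattern = /^([A-Ga-g]{0,1}[#b]{0,1})/;

function readBaseNote(line) {
  // Both parts are optional, so the pattern always matches (possibly an empty string).
  return line.match(baseNotePattern)[1];
}

readBaseNote('e|--0--');   // 'e'
readBaseNote('Bb|--1--');  // 'Bb'
readBaseNote('F#|--2--');  // 'F#'
readBaseNote('|--3--');    // ''
```

The quantifier {0,1} is equivalent to the shorter ?, so [A-Ga-g]?[#b]? would behave the same way.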

Start of the Line

Browsing through some tablature, I noted that they tend to begin with either a ] or a |. We’ll look for those as well, using the same range syntax as we used previously.

There is one additional complication in this case however – both of these characters are reserved characters for regular expressions (indicating the end of a range and a choice, respectively), so we need to show that we’re looking for the actual character rather than their special meaning.

To do this, we can escape the character using a backslash \. The pattern for either of these two characters is therefore [\|\]].
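A quick check of the escaped set, using hypothetical variable names:

```javascript
// Matches a literal | or a literal ] anywhere in the string.
var lineStartPattern = /[\|\]]/;

var pipeFound = lineStartPattern.test('e|--0--');    // true
var bracketFound = lineStartPattern.test('e]--0--'); // true
var proseFound = lineStartPattern.test('Intro');     // false

// Inside square brackets the pipe has no special meaning, so only the
// closing bracket strictly needs the backslash. This form is equivalent.
var shorterPattern = /[|\]]/;
var shorterFound = shorterPattern.test('e|--0--');   // true
```

Escaping both characters, as above, is harmless and arguably easier to read.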

Rest of the Line

The rest of the line consists of numbers indicating fret positions, dashes representing no note played, pipes | indicating approximate bars, as well as a range of other characters representing slides, bends, etc.

Although the expression looks complex, it is really just a list of the different options we have. The entire pattern is surrounded by round brackets, allowing us to retrieve the value later. The set of characters we are expecting is again surrounded by square brackets. Inside them, we specify all of the different characters, escaping them when necessary.
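One consequence of listing the allowed characters is worth seeing in a sketch: the match stops at the first character that is not in the list. The patterns below assume only the characters named so far, and the ^ and $ anchors and helper names are additions for the demonstration.

```javascript
// The list of allowed tab characters, captured as a group.
var restOfLinePattern = /^([\-0-9\|\/\^\(\)\\]+)/;

function readFretRun(text) {
  var match = text.match(restOfLinePattern);
  return match === null ? null : match[1];
}

readFretRun('--0--2--|');      // '--0--2--|'
readFretRun('--5/7--(3)--|');  // '--5/7--(3)--|'
readFretRun('--5h7--|');       // '--5'  (stops at the unlisted 'h')

// Adding $ makes the whole text conform or fail outright.
var strictRestPattern = /^([\-0-9\|\/\^\(\)\\]+)$/;
var strictResult = strictRestPattern.test('--5h7--|');  // false
```

Whether to truncate or reject such lines is a design choice. Extending the list with the extra symbols a given tab uses is usually the gentler option.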

We now have a regular expression that will allow us to verify that a line is a valid line of tablature (in most cases), as well as to extract the relevant pieces from the line in order to parse it. Next we’ll have a look at how this is implemented in JavaScript.

JavaScripting

JavaScript has a fairly robust RegEx API, although as with all APIs there is always room for improvement. Declaring a regex is as simple as surrounding it with / characters.

Once we have our pattern, we can either use the String .match() method or call patt.exec(String). For simplicity’s sake we’ll go with the match method in this case.
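For a pattern without the global flag, the two calls return the same information. A short sketch, reusing the reconstructed pattern (an assumption, as before) under the name patt:

```javascript
// A regex literal is written between forward slashes.
var patt = /([A-Ga-g]{0,1}[#b]{0,1})[\|\]]([\-0-9\|\/\^\(\)\\]+)/;
var sampleLine = 'G|--2--4--|';

var viaMatch = sampleLine.match(patt);  // String method
var viaExec = patt.exec(sampleLine);    // RegExp method

// Both results hold the full match at index 0 and the groups after it:
//   viaMatch[0] is 'G|--2--4--|', viaMatch[1] is 'G', viaMatch[2] is '--2--4--|'
// Both return null when the line does not match.
var noMatch = 'Verse 1'.match(patt);    // null
```

If the g flag were added, match() would instead return every match without the captured groups, so the flag is best left off here.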

Our parsing method is fairly simple – we’re going to take the input text, break it up on newlines, and then iterate through it looking for blocks of six lines of tablature. It’ll look something like this.

Our pattern for newlines, once you break it down, is simply an option between a carriage-return/newline, just a carriage return or just a newline.
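Written out, such a pattern could look like this. It is a sketch based on the description, and the variable names are placeholders.

```javascript
// CRLF must come first in the alternation, otherwise a Windows line ending
// would be split twice and leave an empty line in between.
var newlinePattern = /\r\n|\r|\n/;

var mixedText = 'e|--0--|\r\nB|--1--|\rG|--2--|\nD|--2--|';
var splitLines = mixedText.split(newlinePattern);
// splitLines is ['e|--0--|', 'B|--1--|', 'G|--2--|', 'D|--2--|']
```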

Recombining

We’ve finally broken our text down into what appear to be valid sections of guitar tablature. All that remains is to store them in some sort of convenient structure and convert them into values that we can work with.

In order to do this, I’ve created three objects: TabLine, TabPhrase and Tab. TabLine stores the base note and fret progressions for a single string. A TabPhrase stores six of these to represent a complete sequence of notes. Tab stores several TabPhrases in order to make up an entire song. The parseTabFile function is included as a method on the Tab object.

Let’s start at the top and then burrow deeper, beginning with the complete parseTabFile function.

Each time we find a valid line of tablature, we add it to the current tab phrase. Once we’ve added six lines, we add that phrase to our tab and begin a new one. The logic that governs this has been pulled into the tab phrase, as it’s in a better position to judge if it is complete or not.

If we encounter a non-valid line, we’ll abandon our current phrase and start a new one. Finally, once all lines have been parsed, we return our complete tablature. The rest of our Tab object is fairly basic.
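The grouping logic just described can be sketched as follows. The property and method names (phrases, lines, addLine, isComplete) and the pattern are assumptions. TabPhrase is reduced to a minimal stand-in here, and each line is kept as a plain object rather than a full TabLine.

```javascript
var TAB_LINE = /([A-Ga-g]{0,1}[#b]{0,1})[\|\]]([\-0-9\|\/\^\(\)\\]+)/;

// Minimal stand-in: six lines, one per guitar string.
function TabPhrase() {
  this.lines = [];
}
TabPhrase.prototype.addLine = function (baseNote, frets) {
  this.lines.push({ baseNote: baseNote, frets: frets });
};
TabPhrase.prototype.isComplete = function () {
  return this.lines.length === 6;
};

function Tab() {
  this.phrases = [];
}
Tab.prototype.parseTabFile = function (text) {
  var lines = text.split(/\r\n|\r|\n/);
  var phrase = new TabPhrase();
  for (var i = 0; i < lines.length; i++) {
    var match = lines[i].match(TAB_LINE);
    if (match === null) {
      phrase = new TabPhrase();   // not tablature: abandon any partial group
      continue;
    }
    phrase.addLine(match[1], match[2]);
    if (phrase.isComplete()) {
      this.phrases.push(phrase);  // six in a row: keep it and start afresh
      phrase = new TabPhrase();
    }
  }
  return this;
};
```

One limitation of this sketch is that a four-line bass block followed immediately by guitar lines, with no other text in between, would be grouped incorrectly.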

The TabPhrase object is similar, although we provide a method to check if the phrase is complete.

The TabLine object is slightly more interesting. This object has the responsibility of converting its note and sequence of frets into numbers that we can use.

For this initial parsing, we’re going to avoid more complicated issues such as hammer-ons, pull-offs etc. That means we’re interested only in the fret number, which makes parsing each line fairly simple. For each character in the line, we check if it is a number. If so, we use it as the fret number. We’ll use negative numbers to indicate special characters for now, as a negative would not make sense in terms of frets.
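A character-by-character conversion along these lines might look like the sketch below. The function name parseFrets and the choice of -1 for every non-digit (including dashes and pipes) are assumptions, since the text only says that negative numbers stand in for special characters.

```javascript
var NOT_A_FRET = -1;  // hypothetical marker for anything that is not a digit

function parseFrets(fretText) {
  return fretText.split('').map(function (ch) {
    return (ch >= '0' && ch <= '9') ? Number(ch) : NOT_A_FRET;
  });
}

parseFrets('--0-2|');  // [-1, -1, 0, -1, 2, -1]
```

Because each character is handled separately, a two-digit fret such as 12 would come out as 1 followed by 2. Handling that would mean reading runs of digits together.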

The getProgression method defaults to octave four, although this could be altered in the future. We’re working with the convention from the standard tuning where a lower-case letter indicates a higher note (such as the high e), and we increase the octave if that is the case.
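The octave rule on its own could be sketched like this. The function name baseOctave is hypothetical, and raising the octave by exactly one is an assumption, as the text does not say how far it is increased.

```javascript
var DEFAULT_OCTAVE = 4;

function baseOctave(baseNote) {
  var letter = baseNote.charAt(0);
  // True only for a lower-case letter; false for upper case and for an empty note.
  var isLowerCase = letter !== letter.toUpperCase();
  return isLowerCase ? DEFAULT_OCTAVE + 1 : DEFAULT_OCTAVE;
}

baseOctave('E');  // 4
baseOctave('e');  // 5
baseOctave('');   // 4 (no base note given)
```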

I’m using a NoteOctave object here that stores a note and an octave, as well as a Progression object that stores a progression of notes from a base note. These are part of my music JavaScript experiment, which can be read about in more detail as soon as it is relatively stable. The result of this parsing action could be used by any musical application however, as it is just the converted representation of what was a noisy text format. Now it’s just a noisy format.
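Those objects are not shown here, but the underlying idea, that each fret is one half step above the open string, can be illustrated with a small stand-in. The names NOTE_NAMES and shiftNote are hypothetical, octaves are ignored, and sharps are used throughout for simplicity.

```javascript
var NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

// Move a note up by a non-negative number of half steps, wrapping past B back to C.
function shiftNote(baseNote, halfSteps) {
  var start = NOTE_NAMES.indexOf(baseNote);
  return NOTE_NAMES[(start + halfSteps) % NOTE_NAMES.length];
}

// The Ghostbusters fragment from the introduction: frets 0, 0, 4, 0, 2 on the
// B string, then fret 2 on the G string.
var bStringNotes = [0, 0, 4, 0, 2].map(function (fret) {
  return shiftNote('B', fret);
});
// bStringNotes is ['B', 'B', 'D#', 'B', 'C#']

var gStringNote = shiftNote('G', 2);
// gStringNote is 'A'
```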

The result of this method is the creation of a Progression object with a base note and a series of half steps that the note should be shifted by. The value of the notes produced by the progression object for the line

is

 
