In this post I’m going to try to explain how to add custom tags that link to node pages to a Drupal 6 input format. That probably sounds a bit abstract, so I’ll attempt to clarify by using my own project as an example.
last.fm has this nifty feature where you can write [artist]Meshuggah[/artist] and it’ll turn into a link to that artist’s last.fm page. A set of custom BBCode tags, basically. The project I’m working on, got-djent.com, is also a website about music, and will contain a band/album database maintained by the users in a wiki-like fashion. On the front page, it will be possible to post news articles. I thought it would be nice if band and album pages could be linked from these articles easily, by using custom tags.
Installing the customfilter module
What I set out to implement was having <band>Meshuggah</band> turn into <a href="http://got-djent.com/node/2">Meshuggah</a>, i.e. a link to that band’s page. Something like this is best implemented as a “custom filter” in Drupal 6. To be able to do this, you have to install and enable the customfilter module first. You can get it here. Once you have uploaded it to a modules directory of your Drupal install (sites/all/modules/ is the conventional place for contributed modules), you can enable it under Administer > Site Building > Modules (admin/build/modules).
Creating the filter
Now we can create a custom filter. To do this, go to Administer > Site configuration > Custom filters (admin/settings/customfilter) and click Add filter. Fill out the fields (only the Name field is compulsory). I called my filter “Node links”. It is also a good idea to disable filter caching here (it is enabled by default), for the time being. This makes debugging a lot less frustrating, should any issues arise. Once you have verified that the filter works correctly, you can enable caching again.
To use the filter, you have to add it to an input format. I chose to be lazy and just add it to the built-in “Filtered HTML” input format, since that was already being used everywhere. To do this, go to Administer > Site configuration > Input formats (admin/settings/filters) and click the configure link for “Filtered HTML”. Under the Filters heading, tick the box next to your custom filter and click the Save configuration button.
To make sure the filter is applied at the right time, click configure again, and this time click the Rearrange link at the top. This will allow you to rearrange the order in which the filters are applied. With the default order my filter had no effect, most likely because the HTML filter strips unknown tags such as <band> before the custom filter gets to see them, so I had to change this. I dragged my custom filter all the way to the top. This is probably a bit overkill, but at least it works properly now.
Adding replacement rules
For the filter to actually do anything, you have to add replacement rules to it. In the Custom filters overview, click the filter’s name, then click Add rule at the top. Give the rule a name (my filter has two rules, called “bands” and “releases”). Now comes the interesting part: you have to specify a pattern and a replacement text.
The pattern is specified as a regular expression. More information about regular expressions can be found at regular-expressions.info. If you don’t know what they are, you might want to familiarise yourself with the concept before you continue reading. The pattern I used for the “bands” rule was relatively simple:
#<band>(.*?)</band>#i
This matches text between <band></band> tags. The # characters are the delimiters. The i modifier flag at the end makes the expression case insensitive. The parentheses form a capture group, which means that whatever is matched between them will be available when we construct the replacement text. Within the capture group, we use lazy matching (*? is the lazy version of *) to make sure that the filter works correctly when the tag is used multiple times. The * quantifier is greedy by default, which means that it attempts to consume as many characters as possible; <band>foo</band> <band>bar</band> would result in a single match, where (.*) matches foo</band> <band>bar! Adding the question mark prevents this.
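The difference between the greedy and the lazy version is easy to check outside Drupal. The following is a minimal sketch in plain PHP, using the same pattern and the example string from above; the variable names are only there for illustration.

```php
<?php
$text = '<band>foo</band> <band>bar</band>';

// Greedy: (.*) swallows everything up to the LAST </band> in the text.
preg_match_all('#<band>(.*)</band>#i', $text, $greedy);

// Lazy: (.*?) stops at the FIRST </band> it can reach.
preg_match_all('#<band>(.*?)</band>#i', $text, $lazy);

print_r($greedy[1]); // a single match: "foo</band> <band>bar"
print_r($lazy[1]);   // two matches: "foo" and "bar"
```

Index 1 of the result array holds the contents of the first capture group for every match, which is the part we care about here.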
Next, we have to specify the replacement text. We have two options here: the simplest is just entering some text, in which $1 or ${1} is included as a placeholder, which will be replaced by whatever was matched inside the first capture group of the regular expression. Note that you can also use $2, $3, … when you have multiple capture groups. $0 always contains the entire match. Unfortunately, this method isn’t suited for what we’re trying to do, because the link target has to be looked up in the database and can’t be derived from the matched text alone. Luckily, there is a more advanced option available as well: we can tick the PHP Code box, and then enter some custom PHP code to “compute” the replacement text instead.
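To get a feel for the placeholder syntax, here is a small sketch using PHP’s own preg_replace, which supports the same $1 notation. I’m assuming the plain-text replacement in customfilter behaves along these lines; the /band/… URL scheme is made up for the example and is not what the site uses.

```php
<?php
$text = 'Listening to <band>Meshuggah</band> right now.';

// $1 is replaced by the contents of the first capture group.
// The "/band/$1" URL scheme is hypothetical, purely for illustration.
$out = preg_replace('#<band>(.*?)</band>#i', '<a href="/band/$1">$1</a>', $text);

echo $out;
// Listening to <a href="/band/Meshuggah">Meshuggah</a> right now.
```

This only works when the link can be built from the matched text itself, which is exactly the limitation that makes the PHP Code option necessary.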
Computing the replacement text
The customfilter module now expects us to write some PHP code which computes the replacement text and stores it in the $result variable. It can access the regular expression matches through the $matches array. $matches[n] corresponds to $n, so the value we are interested in here is $matches[1]. We have to find a node with the band node type, whose title corresponds to this value.
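If it helps to picture how $matches and $result relate, the mechanism resembles PHP’s preg_replace_callback. The sketch below is only an analogy in plain PHP, not the actual customfilter implementation; example_rule_code is a hypothetical stand-in for the code you type into the rule.

```php
<?php
// Hypothetical stand-in for a rule's PHP code: it receives $matches
// and has to leave the replacement text in $result.
function example_rule_code($matches) {
  // $matches[0] is the entire match, $matches[1] the first capture group.
  $result = strtoupper($matches[1]);
  return $result;
}

$text = 'News about <band>meshuggah</band>.';
$out = preg_replace_callback('#<band>(.*?)</band>#i', 'example_rule_code', $text);

echo $out; // News about MESHUGGAH.
```

In the real filter, the strtoupper call is where the node lookup goes.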
The simplest way to do this is to use an alternative (and badly documented) syntax for Drupal’s node_load function:
$nd = node_load(array('title' => $matches[1], 'type' => 'band'));
The title matching is case insensitive, which is convenient. However, this only works properly if we can find an exact (case insensitive) match. Seeing as bands and albums can have pretty long names sometimes (like, say, Fredrik Thordendal’s Special Defects), it would be even more convenient if we could specify the name only partially. The solution is to use Drupal’s search functionality.
do_search is provided by the optional core Search module, so that module has to be enabled for this to work (and cron has to have indexed your nodes). Instead of immediately calling node_load, we run a search first. For this, we can use the do_search function. Note that we need some trickery here to limit the search results to certain node types (the 3rd and 4th arguments):
$res = do_search($matches[1], 'node', "INNER JOIN {node} n ON n.nid = i.sid", "(n.type = 'band')");
do_search only returns the 10 best results, which is fine because we only need a single one in this case. If you intend to use it for other purposes, this is something to take into account, though. The results are returned as an array. The sid field of a result corresponds to the node id of the matched node. This means that we can acquire the node object we need as follows:
$nd = node_load($res[0]->sid);
Now all we need to do is use the node information to create a link. Drupal provides a convenient function for this called l:
$result = l($nd->title, 'node/' . $nd->nid);
Now, we can enter <band>thordendal</band>, and it will be replaced by a correct link: <a href="http://got-djent.com/node/36">Fredrik Thordendal's Special Defects</a>. Nice. There are a few edge cases that we haven’t taken into account yet, though: what if there are no matches, for example? The nice thing to do is then to just return the text between the tags, without linking anywhere:
$res = do_search($matches[1], 'node', "INNER JOIN {node} n ON n.nid = i.sid", "(n.type = 'band')");
if (empty($res)) {
  // return the unchanged match if no node is found (the tags are removed)
  $result = $matches[1];
}
else {
  // get the node corresponding to the first (best) result
  $nd = node_load($res[0]->sid);
  // create a link
  $result = l($nd->title, 'node/' . $nd->nid);
}
There is one more problem with this: do_search complains if you give it a single keyword that is shorter than 3 characters. It would be nicer if this error were suppressed and the text were simply left alone instead:
if (strlen(trim($matches[1])) < 3) {
  // search keywords need to be at least 3 characters, so do nothing
  $result = $matches[1];
}
else {
  $res = do_search($matches[1], 'node', "INNER JOIN {node} n ON n.nid = i.sid", "(n.type = 'band')");
  if (empty($res)) {
    // return the unchanged match if no node is found (the tags are removed)
    $result = $matches[1];
  }
  else {
    // get the node corresponding to the first (best) result
    $nd = node_load($res[0]->sid);
    // create a link
    $result = l($nd->title, 'node/' . $nd->nid);
  }
}
All done! Setting up <release> tags for albums works in exactly the same way. Note that using do_search has another advantage in this case: when there are two albums with the same name, the artist can be specified to differentiate between them. This also enables linking to albums with names shorter than 3 characters. To link to the album “I” by the band “Xerath” (an awesome album by the way), you can enter <release>Xerath I</release>. To link to the album “I” by the band “Meshuggah” (also a brilliant album), enter <release>Meshuggah I</release>.
Stuff that I forgot to mention earlier
To experiment with PHP code using the Drupal API, you can use the method described here. In short: enable the PHP filter optional core module, stuff your code into the body field of a new node, set the input format to “PHP code” and preview it.
It is also interesting to note that this works great with the pathauto module; the links inserted are the aliased versions, if they exist. The l function takes care of this: it builds the URL with url(), which looks up path aliases. This means that <band>Meshuggah</band> will link to http://got-djent.com/band/meshuggah, rather than to http://got-djent.com/node/2. Awesome.
A downside of this method is that it isn’t very DRY. The same code has to be added for each content type. I guess it could be interesting to turn this into a full-fledged Drupal module, but I don’t have the time to write and maintain it at the moment. If you think this is useful and would like to implement it, by all means, go ahead.
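Short of a full module, the repetition could probably be reduced by moving the lookup into one helper function that takes the node type as a parameter. What follows is an untested sketch, not something I’m running on the site: example_node_link is a hypothetical name, and I’m assuming it would live in a small custom module so that the rule code can call it. It passes the type through do_search’s fifth argument (the placeholder arguments for the WHERE clause) instead of hard-coding it. The stubs at the top are stand-ins with made-up data, so the sketch can be tried outside Drupal; inside Drupal they are skipped and the real functions are used.

```php
<?php
// Stand-ins so the sketch runs outside Drupal. Inside Drupal 6 the real
// do_search(), node_load() and l() exist and this block is skipped.
if (!function_exists('do_search')) {
  function do_search($keywords, $type, $join1 = '', $where1 = '1 = 1', $arguments1 = array()) {
    // Pretend that exactly one band node has been indexed.
    $title = "Fredrik Thordendal's Special Defects";
    if ($arguments1 == array('band') && stripos($title, $keywords) !== FALSE) {
      return array((object) array('sid' => 36));
    }
    return array();
  }
  function node_load($nid) {
    return (object) array('nid' => $nid, 'title' => "Fredrik Thordendal's Special Defects");
  }
  function l($text, $path) {
    return '<a href="/' . $path . '">' . htmlspecialchars($text, ENT_QUOTES) . '</a>';
  }
}

// Hypothetical helper: returns a link to the best matching node of the
// given type, or the (trimmed) keywords unchanged when nothing is found.
function example_node_link($keywords, $type) {
  $keywords = trim($keywords);
  if (strlen($keywords) < 3) {
    // search keywords need to be at least 3 characters, so do nothing
    return $keywords;
  }
  $res = do_search($keywords, 'node', 'INNER JOIN {node} n ON n.nid = i.sid', "n.type = '%s'", array($type));
  if (empty($res)) {
    return $keywords;
  }
  $nd = node_load($res[0]->sid);
  return l($nd->title, 'node/' . $nd->nid);
}

echo example_node_link('thordendal', 'band');
```

With such a helper in place, the PHP code of the “bands” rule would shrink to $result = example_node_link($matches[1], 'band'); and the “releases” rule would differ only in the type it passes along.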