
Better disc searching

Something that I have yet to see on any operating system is a satisfactorily comprehensive GUI disc search utility, offering a means of supplying complex search criteria when needed. Illustrated on this page is a theoretical interface which offers the level of features I would like to see.

Addendum (2004/04/06): The new Find facility in Mac OS X Panther comes pretty close to what I describe here, I’ve discovered (see screenshot). You are limited to one search dialog at once, and you cannot save and recall searches, but it is a huge improvement over everything that I have seen until now. Things are looking up.


The interface

The interface design was initially based on Apple’s Sherlock utility from Mac OS 8.5 to 8.6:

With all the enhancements added, the redesigned window would look something like the following:

Shown above is the search criteria window in its initial state, defaulting to “Name contains”. The window elements are:

Search criteria

The more complex view

There are two aspects of how criteria are added to the search – criteria categories (such as “Name”, “Creation date”), and specific criteria (such as “starts with”, “ends with”, “contains”). To add a criteria category, use the Add button:

When a category is selected from the Add menu, the default (or next available) criterion within that category is placed into the search criteria box. For example, were you to add the Size category, you’d get the “Size” label and ‘−’ button, the criteria pop-up menu (as seen below defaulting to “less than”) and one or more controls for providing a value for the selected criterion. To select a different criterion within the Size category, choose it from the pop-up menu.

To remove a category, click the ‘−’ button to the left of its label. This button is disabled when there is only one category present, as you cannot have no categories selected. You may also notice that under the Add menu, the entry for Name is disabled. This is because the Name category is already present in the search criteria box.

To add a further criterion from a category (such as “Name contains”), use the ‘+’ menu button in the category, as illustrated below:

The example picture shows a second criterion added to the Size category; you can add as many criteria from a category as makes sense – you cannot add both “Name ends with” and “Name is” for example. To remove a criterion, click the ‘−’ button to its right. Removing the last criterion will also remove the category (except when it’s the only criterion in the only category).
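The enabling rules described above could be sketched roughly as follows. This is only an illustration in Python: the list of Name criteria, the function names, and the exact exclusion rules (each criterion at most once, and “is” incompatible with every other Name criterion) are assumptions, not part of the design as specified.

```python
# Hypothetical Name criteria; the exclusion rules below are assumptions.
NAME_CRITERIA = ["contains", "does not contain", "starts with", "ends with", "is"]

def enabled_items(present):
    """Return {criterion: enabled?} for the Name category's '+' menu.

    Assumed rules: each criterion may appear once, and "is" (an exact
    name) cannot be combined with any other Name criterion.
    """
    enabled = {}
    for criterion in NAME_CRITERIA:
        if criterion in present or "is" in present:
            enabled[criterion] = False
        elif criterion == "is":
            enabled[criterion] = not present  # only when nothing else is set
        else:
            enabled[criterion] = True
    return enabled

def can_remove(total_criteria):
    """The remove button is disabled for the only criterion in the only category."""
    return total_criteria > 1
```

With “ends with” already present, `enabled_items({"ends with"})` leaves “contains” available but disables both “ends with” and “is”, which matches the example of not allowing “Name ends with” alongside “Name is”.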

The simpler view

An alternative approach to the above is to merge the Add button and the ‘+’ buttons together, such that you can select the category and then the criterion using a single, hierarchical menu on the Add button. This is illustrated below:

This simpler design is a new alternative (whereas the previous design was devised some time ago) – quite which is preferable, I am not sure.

Saving and recalling searches

Like Sherlock, this design offers a means to save and recall previous searches. However, instead of saving searches to files on disc (which means you have to go create a folder to store them in) the searches are stored by the application in its preferences.

The save search dialog allows you to specify which aspects of the search to save, and to name the search. Once named, the search would be written out as something like:

 <search name="medium PNGs" location="all volumes">
  <criterion category="name" name="ends with" value=".png"/>
  <criterion category="size" name="less than" value="500" extra="kB"/>
  <criterion category="size" name="greater than" value="100" extra="kB"/>
 </search>

The above XML applies to the more complex search as shown under The more complex view; it might not follow conventions as far as Mac OS X property list XML goes, of course.
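As a rough illustration of recalling such a search, the following Python sketch reads a saved search back into a list of criteria. It assumes a well-formed variant of the XML above (self-closing criterion elements and a closing search tag); the function name `load_search` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the saved search (an assumption: criterion
# elements are self-closing and the search element is closed).
SAVED = """
<search name="medium PNGs" location="all volumes">
  <criterion category="name" name="ends with" value=".png"/>
  <criterion category="size" name="less than" value="500" extra="kB"/>
  <criterion category="size" name="greater than" value="100" extra="kB"/>
</search>
"""

def load_search(xml_text):
    """Return (name, location, criteria) from a saved-search document."""
    root = ET.fromstring(xml_text.strip())
    # Each criterion becomes a plain dict of its attributes.
    criteria = [dict(c.attrib) for c in root.findall("criterion")]
    return root.get("name"), root.get("location"), criteria

name, location, criteria = load_search(SAVED)
```

Each entry in `criteria` carries enough information (category, criterion name, value and optional unit) to rebuild one row of the search criteria box.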

To recall a search, just click the Recall button and select a previous search:

More advanced facilities

Fuzzy matching

A number of search engines (including the Verity engine shipped with ColdFusion, which I have used) already offer fuzzy matching of English text, so that, for example, a search for “make” might return matches containing “made” and “making”. It would be interesting to see this feature applied to disc searches too; it sounds particularly useful when you know what a document was about but forget quite how you worded the filename.

A similar idea could be applied to some of the other search criteria. For example, if you cannot remember quite how big a file was, being restricted to searching for a certain size or under, or a certain size or over, isn’t as helpful as it could be if you don’t know which side of your guess the actual size falls. An alternative approach would be to enter your best guess at the size, and the search results would each be rated according to proximity to that guess. The same idea could be applied to dates. Quite how the results for such items would be displayed is uncertain – would they be sorted by the absolute difference between the guessed value and the found value (e.g. 0, +1, −2, +3…), or would they be sorted by the found values with the closest match in the middle and lower values above and higher values below (e.g. −2, −1, 0, +1, +2)?
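The two possible orderings can be compared with a small Python sketch. The file sizes and function names here are made up for illustration.

```python
def by_closeness(sizes, guess):
    """Closest match first: order by absolute difference from the guess."""
    return sorted(sizes, key=lambda s: abs(s - guess))

def by_value(sizes):
    """Plain ascending order, so the closest match sits within the list."""
    return sorted(sizes)

def offsets(sizes, guess):
    """Signed difference of each size from the guess, in the given order."""
    return [s - guess for s in sizes]

sizes = [98, 103, 100, 101]  # hypothetical file sizes in kB
guess = 100

closest_first = offsets(by_closeness(sizes, guess), guess)  # [0, 1, -2, 3]
in_value_order = offsets(by_value(sizes), guess)            # [-2, 0, 1, 3]
```

The first ordering puts the best match at the top but interleaves smaller and larger files; the second keeps the list monotonic, at the cost of the user having to find where the best match falls.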

Text search modes

There are various approaches in use by different systems for text searching. Search engines like Google will search for any or all words specified, in any order, in a case-insensitive manner. Sherlock in Mac OS will search for the exact string specified, although its search is not case-sensitive. I have not had enough experience with Windows search to ever be able to remember what it does – it might come closer to Google than to Sherlock.

Certainly what would be nice is to be able to specify whether you want to find the exact substring within filenames, or to find only some or all of the words listed, in any order. Being able to choose whether or not to search case-sensitively would be another bonus. In the mock-up UI shown earlier in the article, the “contains” criterion under name searching could be expanded to “contains this string” and “contains any words from”; quoted strings could be used to enforce substring matches within the latter option. I am not presently certain as to the best approach for dealing with case sensitivity, however.
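A minimal Python sketch of the two modes follows, with case sensitivity as a simple flag. The function names are hypothetical, and words are matched as substrings rather than whole words, which is an assumption. Quoted phrases are split out with the standard `shlex.split`, which also treats apostrophes as quotes and raises `ValueError` on unbalanced quotes, so a real tool would want a more forgiving parser.

```python
import shlex

def contains_string(filename, text, case_sensitive=False):
    """'contains this string': exact substring match."""
    if not case_sensitive:
        filename, text = filename.lower(), text.lower()
    return text in filename

def contains_any_words(filename, query, case_sensitive=False):
    """'contains any words from': any term matches; quoted terms stay whole."""
    if not case_sensitive:
        filename, query = filename.lower(), query.lower()
    terms = shlex.split(query)  # keeps "quoted phrases" together as one term
    return any(term in filename for term in terms)
```

For a file named “Return of tax 2003.doc”, the query `holiday tax` matches under “contains any words from”, whereas `"tax return"` (quoted) does not, since the quoted phrase must appear as one substring.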

Multiple choice

If you look back at the add criteria image (and the simpler variant below it), you will notice that choices which cannot be repeated are disabled in the menu. For example, if you specify a string with which the filename must end, it is not possible to specify a second such string, as a filename cannot have more than one ending.

This decision imposes a restriction, however – on systems with no support for wildcards (Mac OS, maybe others), you can only select one ending to use within one search. Thus, you could not search concurrently for all files whose names end in either “.sit.bin” or “.sit.hqx”. One possibility would be to place a pop-up menu by each criterion with the option list {and, or, not} for adding boolean operators between search criteria; I’ve always found these to be confusing, though, as the order of precedence is not clear to me and I don’t feel certain that I know what they are going to do. On the other hand, this approach does have some interesting implications for criteria pairs like “contains” and “does not contain”. A simpler, albeit potentially more confusing, solution is to not disable any menu items, creating an implied boolean ‘or’ between repeated items. So, if you select “ends with” twice, the application will return results whose names end in either of the two given endings. The final solution to this issue depends on the target audience of the software, although the latter option can be improved by placing the two “ends with” items adjacent to each other and putting “or” somewhere on the line of the second (within the menu item name or beside it) to offer a hint as to what will happen.
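The implied ‘or’ could work along these lines. This Python sketch groups repeated criteria and ORs them together; that different criteria are then ANDed with each other is an assumption, as are the names `TESTS` and `matches` and the “backup” example.

```python
from collections import defaultdict

# Hypothetical tests for a few Name criteria.
TESTS = {
    "ends with": lambda name, value: name.endswith(value),
    "starts with": lambda name, value: name.startswith(value),
    "contains": lambda name, value: value in name,
}

def matches(filename, criteria):
    """criteria: list of (criterion, value) pairs for the Name category.

    Repeats of one criterion are OR-ed together; different criteria
    are AND-ed (an assumption about how the groups combine).
    """
    groups = defaultdict(list)
    for criterion, value in criteria:
        groups[criterion].append(value)
    return all(
        any(TESTS[criterion](filename, v) for v in values)
        for criterion, values in groups.items()
    )

criteria = [
    ("ends with", ".sit.bin"),
    ("ends with", ".sit.hqx"),
    ("starts with", "backup"),
]
```

Here a file named “backup-1.sit.hqx” matches, because it satisfies one of the two endings and the single “starts with” criterion, while “photos.sit.bin” does not.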

Live searching

Considering that Mac OS X Panther now has live searching by name (the results appear on the fly as you type the search string), and that the UNIX find utility also has very rapid searching, you would expect this tool to also offer live searching. The only consideration involved here is how this affects performing multiple searches – with the results shown in a pane below the criteria, it might be necessary for each search window to contain its own criteria pane and results pane. Alternatively, the results pane could be made detachable so that it can be left open in its own window while further searches are performed.

Another thought – if you leave a search window open, would it automatically update when the contents of the disc change? (thanks to About the Finder… page 5 at Ars Technica for that idea)

Daniel Beardsmore, 2nd March 2004
Comments? Send them to the author. Thanks go to silvestrij for proof-reading.