Vision
Note: The following text is copied from the Beacon vision document
by Jason, written before Beacon itself was implemented. The text may
require some updates.
Beacon’s job is to act as an intermediary between the application and
the filesystem. It will offer an interface for applications to perform
queries on files. The result set can include any known data about the
files that match the query. For example, if one of the files in the
result set is an image, it could include the image’s dimensions, color
space, thumbnail, EXIF data (in the case of a JPEG), precomputed
histogram, and so on.
Queries can include evaluations on any indexed field. One common query
would be “get a list of all files in directory X.” Another possible
query is “get all music by Queen.” Another slightly more complicated
query is “get all music by Moby, and all movies whose soundtrack has
Moby.” Beacon is responsible for gathering all data necessary to
compute such queries and providing an appropriate interface to the
application. How Beacon computes such queries internally may vary, but
the API should not make this difference evident.
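
As a rough illustration of the "one interface for every query" idea, the
sketch below treats each request as a set of field constraints evaluated
over indexed records. The class `Index`, its methods `add` and `query`,
and all field names and paths are hypothetical, chosen only for this
example; they are not Beacon's actual API, and a real implementation
would not scan a list in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    """Toy in-memory index: one dict of metadata per file (hypothetical)."""
    records: list = field(default_factory=list)

    def add(self, **metadata):
        self.records.append(metadata)

    def query(self, **constraints):
        """Return every record whose fields equal all given constraints."""
        return [
            record for record in self.records
            if all(record.get(key) == value
                   for key, value in constraints.items())
        ]


index = Index()
index.add(path="/media/music/a.mp3", parent="/media/music",
          type="audio", artist="Queen")
index.add(path="/media/music/b.mp3", parent="/media/music",
          type="audio", artist="Moby")
index.add(path="/media/video/c.avi", parent="/media/video",
          type="video", soundtrack_artist="Moby")

# "All files in directory X" and "all music by Queen" use the same call.
in_directory = index.query(parent="/media/music")
by_queen = index.query(type="audio", artist="Queen")

# The compound query is expressed here as the union of two simpler ones.
moby = (index.query(type="audio", artist="Moby")
        + index.query(type="video", soundtrack_artist="Moby"))
```

The directory listing is just a query on a `parent` field, so the caller
cannot tell whether the answer came from the filesystem or from stored
metadata.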
Another necessary feature of Beacon is notification of changes to the
results of a query. If an application queries “all files in directory
X,” it should be able to request notifications of changes, so that if,
for example,
a file is added to directory X, the application is appropriately
notified. The actual behavior or performance may vary based on kernel
support or the nature of the query, but the API Beacon provides to the
application must support this functionality.
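
One way to picture this requirement is a query object that keeps its
current result set and calls back into the application when that set
changes. The sketch below is only an assumption about the shape such an
API might take: `LiveQuery`, `subscribe`, `file_added`, and
`file_removed` are hypothetical names, and how the add and remove events
are detected (kernel support, polling, or something else) is
deliberately left out.

```python
import posixpath


class LiveQuery:
    """Holds a query's current results and reports changes (hypothetical)."""

    def __init__(self, predicate):
        self.predicate = predicate      # decides whether a path matches
        self.results = set()
        self.listeners = []

    def subscribe(self, callback):
        """Register callback(event, path); event is 'added' or 'removed'."""
        self.listeners.append(callback)

    def _emit(self, event, path):
        for callback in self.listeners:
            callback(event, path)

    def file_added(self, path):
        # Only files that match the query change the result set.
        if self.predicate(path) and path not in self.results:
            self.results.add(path)
            self._emit("added", path)

    def file_removed(self, path):
        if path in self.results:
            self.results.remove(path)
            self._emit("removed", path)


events = []
query = LiveQuery(lambda p: posixpath.dirname(p) == "/media/photos")
query.subscribe(lambda event, path: events.append((event, path)))

query.file_added("/media/photos/beach.jpg")
query.file_added("/media/music/song.mp3")    # outside directory X: ignored
query.file_removed("/media/photos/beach.jpg")
```

The application sees only the events for files that match its query; the
file added elsewhere produces no notification.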
Common Use-Cases
The typical user will have a collection of media files, likely
organized in one or more hierarchies, either on a local filesystem or
on one accessible via NFS or SMB. Users will also commonly have files
stored on removable media, such as USB drives, CDs, or DVDs.
Let’s explore some scenarios of how the user will want to interact
with Beacon, and how we expect the interface might behave:
- The user enters a directory of images on the local hard disk. The
interface will display a grid of thumbnails for the images in that
directory. The directory contents should be displayed in the
interface as quickly as possible. Files are initially represented
by a “Thumbnail loading” icon, and the icons are updated with the
thumbnails as they are loaded. This process does not block the
interface.
- While the user is in the directory above, she copies a new image to
that directory through some external method. The interface should
quickly update to reflect the addition to the directory with the
“Thumbnail loading” icon and subsequent thumbnail as described
above. The user then deletes the image, and the interface quickly
removes the image from the displayed list.
- The user does a search for all music by Delerium. The interface
displays a list of all songs matching this criterion.
- On another computer, the user copies an MP3 file of a new song by
Delerium to one of her music directories over her local
network. The interface quickly updates the list from the above
query to reflect the new song. (This assumes the MP3 has the proper
ID3 tags.)
- The user inserts a Delerium CD into the CD-ROM drive. The list from
the earlier query updates to show the songs from this CD. (This
assumes the CD is listed in an accessible CDDB directory or that
the CD has already been indexed.)
- The user does a search for all images with the keyword
vacation. The interface presents a list of thumbnails for images
which the user has previously tagged with the vacation keyword.
- The user inserts a DVD into the DVD drive which contains a number
of images from her Hawaii vacation. This DVD was previously viewed
by the user and she tagged the DVD with the vacation
keyword. Shortly after the DVD is inserted, the list from the query
in #6 automatically begins updating to show all images on the DVD.
- The user does a search for all movies directed by Kevin Smith. The
interface displays a list of movies on the hard disk matching this
query, which are represented by images of their DVD covers. The
list further displays all DVD movies by Kevin Smith that the user
has played before. The interface will provide some appropriate
indication that these DVDs are not present and must be inserted
before they can be played. The user saves this query under the
title Movies by Kevin Smith.
- The user inserts a data DVD containing a number of MPEG-4
movies. The interface shows some indication that a DVD has been
inserted. The user navigates to the DVD and it presents a list of
video files (acquired through Beacon) on the root of the DVD with
thumbnails. One of these movies is Dogma, by Kevin Smith. The user
accesses the necessary interface to perform an online lookup by
movie title, and selects the appropriate movie title/year for that
video file. The user then navigates the interface to the saved
query Movies by Kevin Smith and the interface shows the same list
as in #8 with the addition of the recently added Dogma, which is
now represented by an image of the Dogma DVD cover, rather than a
thumbnail as before. The interface indicates that Dogma is stored
on a DVD and is currently accessible.
The above use-cases illustrate some of the functionality the
application will require from Beacon. This list is not comprehensive,
and in many cases it may not be a very good idea for the interface to
behave exactly as described. However, Beacon should make any of the
above possible.
Feature Overview
From these use-cases we can derive a number of features we require Beacon to have:
- List and index files in a requested directory. When Beacon indexes
a file, it gathers any data it can about the file (metadata), which
will depend on media type, and stores that metadata for future
retrieval. Directories with thousands of files should be handled
gracefully.
- To ensure quick response, Beacon must index files, or load
previously indexed metadata, asynchronously. When an application
requests a directory list, Beacon will first obtain minimal data (a
list of file names, sizes, and modification times). If this step
takes less than 0.1 seconds (roughly the longest delay that a user
still perceives as an instant response), Beacon will begin
loading previously indexed data until 0.1 seconds have elapsed. It
will then return to the application the complete file list, and all
metadata it managed to load before the 0.1 second timeout. For
those files whose metadata was not loaded, it will load
asynchronously and inform the application using some notification
mechanism so that the interface may be updated as needed.
- Beacon must be able to monitor directories for changes. A changed
directory is one in which a file has been added, deleted, or
modified without the application being aware of it. There must be
an API through which the application can receive change
notifications.
- There must be some method to query all indexed files by certain
metadata fields. Querying should behave as described in #1 and
#2. In fact, even though they may be implemented differently within
Beacon, the API to obtain a list of files in a directory and to
query by metadata should be the same; they are both queries.
- Similar to #3, Beacon should be able to monitor queries for
changes. This is much more complicated than monitoring a single
directory as in #3 because it requires monitoring all directories
in the user’s media repository and performing automatic indexing of
new or changed files. Because this requires support from the kernel
that may not be present on all systems, this live indexing should
be optional. This means that Beacon should not rely on this
functionality internally.
- Beacon must be able to monitor removable media devices and
determine when removable media has been added or removed. It must
provide an API so that the application can be notified of such
events.
- Beacon must be able to index files on removable media, and
distinguish between two files with the same name but on different
media. For example, if the CD-ROM drive is mounted under /mnt/cdrom and
there exists a file foo.jpg on two different CDs, Beacon should
treat these as separate files, even though they have the same
pathname /mnt/cdrom/foo.jpg.
- Beacon must allow the application to store and retrieve metadata
for a file. For example, in use-case #9, Beacon will retrieve
certain metadata about the video, such as resolution or length,
transparently to the application, but the application itself will
retrieve the DVD cover image and store it with the file through
Beacon.
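
The time-budgeted directory listing described in the second feature
above could be sketched as follows. This is a minimal illustration under
stated assumptions, not Beacon's implementation: `list_directory`,
`load_cached`, and the injectable `clock` parameter are hypothetical
names, and the asynchronous loading of the remaining files is reduced to
returning a `pending` list. The clock is injectable only so that the
example can be run with simulated time.

```python
import time

RESPONSE_BUDGET = 0.1  # seconds, the figure quoted in the feature list


def list_directory(names, load_cached, budget=RESPONSE_BUDGET,
                   clock=time.monotonic):
    """Return (results, pending).

    results maps every file name to its metadata, or to None if the
    metadata was not loaded before the budget ran out.
    pending lists the names that still need to be loaded later.
    """
    start = clock()
    results = {name: None for name in names}   # minimal listing first
    pending = []
    for name in names:
        if clock() - start < budget:
            results[name] = load_cached(name)  # still within the budget
        else:
            pending.append(name)               # defer to background loading
    return results, pending


# Simulated clock: each reading is 0.04 seconds after the previous one.
ticks = iter([0.0, 0.04, 0.08, 0.12, 0.16])
results, pending = list_directory(
    ["a.jpg", "b.jpg", "c.jpg", "d.jpg"],
    load_cached=lambda name: {"name": name},
    clock=lambda: next(ticks),
)
# results holds metadata for a.jpg and b.jpg and None for the other two;
# pending is ["c.jpg", "d.jpg"].
```

The caller always receives the complete file list at once; files in
`pending` would be filled in later and announced through whatever
notification mechanism the API provides.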