How to add entries to the DocDir

The DocDir is generated from two types of files: a directory of papers, DIR, and one file per paper that details the contents of that paper, usually called CONTENTS. Both use SGML syntax; the DTD for the DocDir can be found in docdir.dtd. The DTD is a tremendous help in describing a paper/talk for the DocDir, especially when used together with a powerful SGML-capable editor (emacs with psgml-mode, anybody? :)

The DTD is rather simple; let me know if you need more tags to satisfactorily mark up papers and talks. Please do not improvise with the tags provided. If you want to do something and the current set of tags doesn't let you express it, it's time to extend the DTD.

How it works

The program docdir reads the DIR file, which should pull in the various CONTENTS files, and produces an HTML page, called docdir.html, from this information. It looks for a file called DIR in the current directory and writes docdir.html to the current directory.

docdir understands the following options:

-s, --short
Produce a short HTML page. The page does not contain a list of figures and extras.
-f file
Write the output to the file file instead of docdir.html.
-d
Generated links go through the deliver script to allow counting hits.
-u uri
Use uri as the URI for the deliver script. Implies -d.
-o
Omit broken links. Do not generate links in the HTML that are known to be broken.
-h
Print a list of options.

I usually produce a short listing as the index file with docdir -s -f index.html

To get started, it is best to copy an existing CONTENTS file to the new directory you're working in and modify it instead of creating one from scratch. Then add an <ENTRY> line to the DIR file and run docdir.

The docdir program has two big shortcomings right now:

  1. Error messages are practically non-existent, mainly since I don't feel like putting all the necessary checks into docdir. If you have nsgmls (an SGML parser), you can use the script check to look for syntax errors: run check to validate the DIR file in the current directory. If you don't get any output, everything should be OK. Otherwise, you get more detailed error messages than from docdir.
  2. Closing tags are important. The docdir parser cannot automatically add appropriate closing tags, even though they could be gleaned from the context: in <author>Eva Luator <abstract> ... it is clear where the closing </author> should go, but docdir doesn't know that. This is very likely a source of many errors!
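
As a small illustration of the second point, the fragment below contrasts the unclosed markup with a version that closes every tag. The uppercase tag names follow the tag descriptions in this document and the abstract text is made up. The comments are only annotations; whether docdir's own parser accepts SGML comments is not stated here, so leave them out of real files.

```sgml
<!-- Not accepted by docdir: AUTHOR is never closed -->
<AUTHOR>Eva Luator
<ABSTRACT>A short summary of the paper.</ABSTRACT>

<!-- Accepted: every tag is closed explicitly -->
<AUTHOR>Eva Luator</AUTHOR>
<ABSTRACT>A short summary of the paper.</ABSTRACT>
```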

Structuring your DIR file

The materials for one talk or paper are stored in one subdirectory. When the time comes to write a DIR file that covers those materials, you need to get the description from that subdirectory into the DIR file, preferably without cut and paste. That's what external entities are for in SGML. You tell the SGML processor that you want to include a file by saying

      <!ENTITY name SYSTEM "some/path/to/the/file">
    
in the specification of the document type in the header of the DIR file. It should go where the ... is in <!DOCTYPE DOCDIR SYSTEM [ ... ]>.

With this entity definition, any occurrence of &name; (note the ampersand and semicolon!) in the body of the DIR file will be replaced by the contents of the file some/path/to/the/file.
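
Putting the pieces together, a DIR file for a single paper might look like the sketch below. The entity name mypaper and the path my_new_paper/CONTENTS are made up for illustration. The enclosing <DOCDIR> element and the placement of &mypaper; inside <ENTRY> are assumptions based on the DOCTYPE line above; docdir.dtd is the authority on the actual structure. The comments are annotations only and are best removed from real files.

```sgml
<!DOCTYPE DOCDIR SYSTEM [
  <!-- Hypothetical entity: pulls in the CONTENTS file of one paper -->
  <!ENTITY mypaper SYSTEM "my_new_paper/CONTENTS">
]>
<!-- Assumed root element, named after the document type -->
<DOCDIR>
<!-- base names the directory that holds the files listed in CONTENTS -->
<ENTRY base="my_new_paper">
&mypaper;
</ENTRY>
</DOCDIR>
```

Each additional paper would get its own <!ENTITY ...> declaration in the header and its own <ENTRY> in the body.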

Adding / updating a paper or talk

After you've written your paper, you need to copy all the files that are mentioned in the CONTENTS file for that paper to a new subdirectory. The script ddfiles makes this easier: running ddfiles CONTENTS in your work directory produces a list of all the files mentioned directly or indirectly in CONTENTS. To copy all these files to a permanent location, you can simply run

      mkdir -p ${GEOGP}/papers/my_new_paper
      cp `ddfiles CONTENTS` ${GEOGP}/papers/my_new_paper
    

Tags play by play

The DIR file contains only ENTRY tags.

 <ENTRY base="dir">
base
A directory. All file names in the CONTENTS file are taken to be in this directory.

The following tags can be used in the CONTENTS file detailing the contents of the subdirectory of a paper. Note that order matters: the correct order of elements can be found in the DTD.

Each CONTENTS file contains either a <PAPER> or a <TALK> entry:

 <PAPER file="filename" created="date"> 
 <TALK file="filename" created="date"> 
file
The name of the main file of the paper/talk. If your paper is in paper.tex, this should be "paper". docdir will then look for files called paper.dvi, paper.ps etc. and link them from the generated HTML page.
created
The approximate date when the paper/talk was created, something like "1999-05-02".

Inside the <PAPER> or <TALK> tags, the paper/talk is described by various tags that have to come in the following order:

<TITLE> titletext </TITLE>

The title of the paper/talk.

<PURPOSE> purposetext </PURPOSE>

The purpose of the paper/talk, e.g., which conference it was created for.

<AUTHOR> authorname </AUTHOR>

The author(s) of the paper/talk; for multiple authors, put each author in a separate <AUTHOR> tag.

<PUBLICATION status="submitted|accepted|appeared" pages="pages" volume="vol" number="number"> 
  pubtitle 
</PUBLICATION> 

The publication in which the paper/talk will appear together with some minimal bibliographic information.

status
The current status. Has to be exactly one of the three strings listed above. You need to include this attribute.
pages
The page numbers.
volume, number
The volume and number of the publication.
pubtitle
The title of the publication.

For example:

<PUBLICATION status="appeared" pages="110-128" volume="7" number="2">
  Computer Aided Geometric Design
</PUBLICATION>

<ABSTRACT> abstracttext </ABSTRACT>

The abstract of the paper/talk.

<FORMAT file="filename"> labeltext </FORMAT>

Usually, docdir looks for various formats of the paper/talk, like PostScript, DVI and PDF. The <FORMAT> tag can be used to indicate a file or format of the paper/talk that docdir can't find automatically. There can be an arbitrary number of <FORMAT> tags, including none.

After the above tags, the following tags can appear in any order and any number of times at the end of the paper/talk entry:

<FIG file="filename" generator="filename"> captiontext </FIG>
file
The name of the file with the figure, e.g. figure.eps.
generator
The name of a file used to create the figure in file, for example a MATLAB script.

This tag is helpful in organizing all those little .eps files that are usually needed for a paper/talk. Just stick a <FIG> tag in for each and explain what it is for in the captiontext. This will help everybody tremendously (including yourself) when fishing for a nice figure for the next paper/talk.

To ease the task of finding all figures used in a TeX file, the script latexfigs can be used to extract the names of all .eps files used in a TeX file from the corresponding .log file.

<LINK file="filename"> labeltext </LINK>
file
Name of the file to be linked to.

This tag is for linking additional, random stuff into the generated HTML page, like separate notes made while writing a paper, a README file, etc.
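
To show how the tags fit together, here is a sketch of a complete CONTENTS file that follows the order described above. Apart from the <PUBLICATION> example and the author name Eva Luator, which are taken from this page, all titles, file names and texts are made up for illustration. Whether the file attribute of <FORMAT> takes a full file name including its extension is an assumption. As always, the DTD has the final word on which elements are required.

```sgml
<PAPER file="paper" created="1999-05-02">
<TITLE>An Example Paper</TITLE>
<PURPOSE>Written for a journal submission</PURPOSE>
<AUTHOR>Eva Luator</AUTHOR>
<AUTHOR>A. Coauthor</AUTHOR>
<PUBLICATION status="appeared" pages="110-128" volume="7" number="2">
  Computer Aided Geometric Design
</PUBLICATION>
<ABSTRACT>A short summary of what the paper is about.</ABSTRACT>
<FORMAT file="paper.html">HTML version</FORMAT>
<FIG file="figure.eps" generator="figure.m">What the figure shows</FIG>
<LINK file="README">Notes made while writing the paper</LINK>
</PAPER>
```

With file="paper", docdir would look for paper.dvi, paper.ps and so on next to this file, so only the hypothetical paper.html needs an explicit <FORMAT> tag.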


Last modified: 2000-05-11
  Page created: 1999-04-10