SEA Technical Memorandum #0402, ARC 6.02; Extended Data
Last updated: April 28, 1989
Copyright 1989 by System Enhancement Associates, Inc.



                                  ARC 6.02

                               Extended Data


As of version 6.00, the ARC file archive format has been extended to 
include additional information about the archive itself and the files 
within it.  The purpose of this document is to describe those extended data 
fields.  


Extended data fields are implemented by the definition of additional header 
type codes (refer to the ARC program sources, especially ARC.H, and to 
TM0401).  The following ranges of header type codes have been defined: 

          0        End of archive marker
     1 - 19        Standard compressed files
    20 - 29        Information items
    30 - 39        Control items
    40 and up      Reserved for future use

Most of the values in these ranges have not been assigned, and are reserved 
for future use.
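
As an illustration, the ranges above can be expressed as a simple lookup.  
The following is a minimal sketch in Python; the function name and the 
returned labels are illustrative only and do not appear in the ARC sources.

```python
def classify_header_type(code):
    """Return the range an ARC header type code falls in (per the table)."""
    if code < 0:
        raise ValueError("header type codes are never negative")
    if code == 0:
        return "end of archive"
    if code <= 19:
        return "standard compressed file"
    if code <= 29:
        return "information item"
    if code <= 39:
        return "control item"
    return "reserved"  # 40 and up


# Usage: type 21 is an information item, type 30 a control item.
print(classify_header_type(21))
print(classify_header_type(30))
```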



Information Items
=================

An information item consists of a "standard header" (identical to the 
header for a standard compressed file), followed by one or more data 
records.  The "size" field in the header is the total number of bytes of 
data records which follow the header.  I.e. after reading the header a 
relative seek of "size" bytes should position the file at the start of the 
next header.  The "crc" field should be a calculated CRC of the data.  The 
"length" field may be meaningless, and should be ignored.  The "date" and 
"time" fields should be set to the date and time that the item was last 
modified.
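
The header handling described above might be sketched as follows, in 
Python.  This memorandum does not reproduce the standard header, so the 
layout used here (a marker byte of 1A hex, a type byte, a 13-byte name, 
then size, date, time, crc and length in Intel byte order) is an assumption 
that should be checked against TM0401 and ARC.H.  The function and constant 
names are hypothetical.

```python
import io
import struct

# ASSUMPTION: standard header layout, not taken from this memo.  Fields:
# marker, type, name[13], size, date, time, crc, length (little-endian).
STANDARD_HEADER = struct.Struct("<BB13sIHHHI")
ARC_MARKER = 0x1A  # assumed marker byte preceding every header


def read_information_item(stream):
    """Read one information item, leaving stream at the next header."""
    raw = stream.read(STANDARD_HEADER.size)
    if len(raw) != STANDARD_HEADER.size:
        raise EOFError("truncated header")
    marker, htype, name, size, date, time, crc, length = \
        STANDARD_HEADER.unpack(raw)
    if marker != ARC_MARKER:
        raise ValueError("missing header marker")
    if not 20 <= htype <= 29:
        raise ValueError("not an information item")
    records = stream.read(size)  # "size" counts only the data records
    if len(records) != size:
        raise EOFError("truncated data records")
    # The "length" field is deliberately ignored, as the text directs.
    item_name = name.split(b"\0", 1)[0].decode("ascii", errors="replace")
    return htype, item_name, crc, records


# Usage: a type 20 item holding one 8-byte record, then an end marker.
demo = io.BytesIO(
    STANDARD_HEADER.pack(ARC_MARKER, 20, b"", 8, 0, 0, 0, 0)
    + b"\x08\x00\x00Demo\x00"
    + b"\x1a\x00"
)
htype, item_name, crc, records = read_information_item(demo)
print(htype, repr(item_name), len(records), demo.tell())
```

The sketch returns the stored "crc" value without verifying it, since the 
CRC algorithm is not described in this document.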

The data records in an information item will always have the same format:

    length    A two-byte count of the number of bytes in this record 
              (including the length and type), stored in standard Intel 
              format.  

    type      A one-byte record type code (see below).

    data      The remainder of the record, i.e. <length> bytes less the 
              three bytes of length and type.  Content and format are 
              dependent on which type of record this is and which type of 
              information item it is in.  

Data records are not compressed, but may be encrypted.  The total length of 
all data records in an information item may not exceed 2,000 bytes.
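
A minimal sketch of walking the data records of one information item, in 
Python, is given here for illustration.  It takes the description of the 
"length" field at its word, i.e. it assumes that the count includes the 
two length bytes and the type byte, so that each record carries 
<length> - 3 bytes of data.  It does not handle encrypted records.  The 
function names are hypothetical.

```python
import struct

MAX_TOTAL_BYTES = 2000  # limit on all data records in one item


def build_record(record_type, payload):
    """Pack one record; the stored length counts all three parts."""
    return struct.pack("<HB", len(payload) + 3, record_type) + payload


def parse_records(data):
    """Split the data of an information item into (type, data) pairs."""
    if len(data) > MAX_TOTAL_BYTES:
        raise ValueError("data records exceed 2,000 bytes")
    records = []
    pos = 0
    while pos < len(data):
        if len(data) - pos < 3:
            raise ValueError("truncated record header")
        (length,) = struct.unpack_from("<H", data, pos)  # Intel order
        record_type = data[pos + 2]
        if length < 3 or pos + length > len(data):
            raise ValueError("bad record length")
        records.append((record_type, data[pos + 3:pos + length]))
        pos += length
    return records


# Usage: an archive description (type 0) and a creator (type 1).
item = build_record(0, b"Demo\0") + build_record(1, b"ARC 6.02\0")
for record_type, payload in parse_records(item):
    print(record_type, payload)
```

Records of unknown type are returned like any other, so that the caller 
can decide to ignore them.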


The following types of information items are currently defined:

    20   Archive information

         This item contains information about the archive itself.  The 
         "name" portion of the header should contain a null string.  The 
         following record types are currently defined for this type of 
         information item: 

         0    Archive description;  The data consists of a null-terminated 
              ASCII text string describing the archive itself.  

         1    Creator;  The data consists of a null-terminated ASCII text 
              string giving the name of the program that originally created 
              this archive.  

         2    Modifier;  The data consists of a null-terminated ASCII text 
              string giving the name of the program that last modified this 
              archive.  


    21   File information

         This item contains information about the file which follows this 
         item in the archive.  This item is only valid if the "name" field 
         in its header matches the "name" field in the file which follows.  
         The following record types are currently defined for this type of 
         information item: 

         0    Description;  The data consists of a null-terminated ASCII 
              text string describing the file.  

         1    Long name;  The data consists of a null-terminated ASCII text 
              string giving an alternate name for the file, which may be 
              longer than the "eight dot three" filename allowed under MS-
              DOS.  The "name" field given in the archive headers must 
              still contain a valid MS-DOS filename, which must be unique 
              in the archive.  

         2    Extended dates;  The data consists of extended date/time 
              information about the file.  The exact format and content of 
              the data has not yet been defined.  

         3    Icon;  The data consists of a bitmap icon representing the 
              file.  The exact format and content of the data has not yet 
              been defined.  

         4    Attributes;  The data contains information specifying what 
              attributes the file should be given when it is extracted.  
              The data consists of a null-terminated ASCII text string 
              where each character represents one attribute.  The following 
              attributes are currently defined: 

              R    Read only
              W    Write only
              H    Hidden
              S    System
              N    Network sharable

              Attribute characters are case sensitive and must be given 
              exactly as shown.  Implementors should be prepared to see 
              (and ignore) unknown attribute characters, and to contact us 
              with any suggestions for new attribute characters.  


    22   Operating system information

         This item consists of operating system specific information that 
         would not be useful under other operating systems.  Which 
         operating system it pertains to is given by a unique code in the 
         "length" field.  Record types for this type of information item 
         are specific to the operating system.  Contact SEA if you require 
         this type of information item.  
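
As an example of handling one of the record types above, the attributes 
record (record type 4 of a file information item) might be decoded as in 
the following Python sketch.  The function name and the returned form (a 
set of attribute characters) are illustrative choices only.

```python
# Attribute characters currently defined; they are case sensitive.
KNOWN_ATTRIBUTES = {
    "R": "read only",
    "W": "write only",
    "H": "hidden",
    "S": "system",
    "N": "network sharable",
}


def parse_attributes(data):
    """Return the set of known attribute characters in the record data."""
    text = data.split(b"\0", 1)[0]  # stop at the terminating null
    chars = text.decode("ascii", errors="replace")
    # Unknown characters (including lower case ones) are ignored.
    return {c for c in chars if c in KNOWN_ATTRIBUTES}


# Usage: "r" and "z" are not defined, so only R and H are reported.
for c in sorted(parse_attributes(b"RrHz\0")):
    print(c, KNOWN_ATTRIBUTES[c])
```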



Given the rigidly defined structure of information items, one may safely 
assume that any header type code in the range of 20 to 29 identifies some 
sort of information item, and that the item may be safely processed by the 
guidelines given here.  

Implementors should be prepared to deal with unknown information item types 
and unknown record types, as we plan on being reasonably generous with 
them.  Please contact us if you see a need for, or have a suggestion for, 
any additional information item types or record types.



Control Items
=============

The format of any given control item depends on which type of control item 
it is.  See the notes below for specifics.  The following types of control 
items are currently defined:

    30   Subdirectory

         This item marks the beginning of a subdirectory.  The header is 
         identical in format to a standard header.  The fields in the 
         header should be filled as follows: 

         name      The name of the subdirectory.

         size      The total number of bytes of compressed data for all 
                   entries in the subdirectory, including headers and the 
                   end of subdirectory mark.  I.e. after reading the header 
                   a relative seek of "size" bytes should position the file 
                   at the start of the first header after the subdirectory 
                   and its contents.  

         date      This and the "time" field should be set to match that of 
                   the newest entry within the subdirectory.  

         time      See "date".

         crc       Undefined.  Should be zero.

         length    The total uncompressed length of all of the entries 
                   which make up the subdirectory.  

         A subdirectory may be logically extracted by creating the named 
         subdirectory, and then making that the current directory.  


    31   End of subdirectory

         This item marks the end of a subdirectory.  The header is 
         identical in format to an end of archive marker (i.e. it is a two-
         byte item).  

         An end of subdirectory may be logically extracted by changing to 
         the parent directory ("chdir ..").  
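
The logical extraction of subdirectories described above amounts to 
keeping a stack of directory names.  The following Python sketch shows the 
idea on a list of (header type, name) pairs that are assumed to have been 
read from the archive headers already.  The function name is hypothetical, 
and the "/" separator is an arbitrary choice for the example.

```python
def assign_paths(entries):
    """Return the full path of each file, given (type, name) pairs."""
    stack = []  # names of the subdirectories currently open
    paths = []
    for header_type, name in entries:
        if header_type == 0:          # end of archive marker
            break
        elif 1 <= header_type <= 19:  # standard compressed file
            paths.append("/".join(stack + [name]))
        elif 20 <= header_type <= 29: # information item: no file here
            continue
        elif header_type == 30:       # subdirectory: "mkdir" and "chdir"
            stack.append(name)
        elif header_type == 31:       # end of subdirectory: "chdir .."
            if not stack:
                raise ValueError("end of subdirectory with none open")
            stack.pop()
        else:
            # Unknown control items and reserved codes are not processed.
            raise ValueError("unknown header type %d" % header_type)
    return paths


# Usage: one file at the top level, the rest inside SRC and SRC/DOC.
entries = [
    (2, "README"), (30, "SRC"), (8, "ARC.C"), (30, "DOC"),
    (8, "ARC.DOC"), (31, ""), (8, "ARC.H"), (31, ""), (0, ""),
]
for path in assign_paths(entries):
    print(path)
```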


By their nature, control items are not as rigidly defined as information 
items.  Implementors should not make assumptions about the format or 
content of unknown control item types, and should not attempt to process 
them.  For this reason we expect to be "stingy" in assigning new control 
item types.  



Conclusion
==========

It is our intent to provide a platform whereby implementors of ARC-
compatible utilities can store and retrieve information beyond the limits 
normally associated with the MS-DOS operating system.  We're not knocking 
MS-DOS, but people who have ported ARC to other operating systems have 
often chafed at the loss of information that their operating system 
provides.  We also intend to provide implementors of MS-DOS based ARC-
compatible utilities and programs with increased functionality, to the 
ultimate benefit of their (and our) users.

This enhancement to the standard ARC file archive format is the result of 
the many suggestions we've received, both from users and from implementors.  
We feel we've created a flexible means whereby almost any special needs can 
be accommodated.  If we've left out anything you require, please feel free 
to contact us.  We can be reached by voice between 9 AM and 5 PM Eastern 
time at (201) 473-5153.  You can also leave a message for us on our 
customer support bulletin board at (201) 473-1991.  This is a five-line 
system that is available 24 hours a day at up to 2400 baud.  We can also be 
reached by mail at: 

                    System Enhancement Associates, Inc.
                       21 New Street, Wayne NJ  07470
