future disk storage enhancements

Terry Slattery, SECAD tcs at BRL.MIL
Sat Jan 21 03:09:41 AEST 1989


> 1) Is your interest in this mainly:
>  
>    a) A need for larger storage, ie more files in a filesystem, or
> 
>    b) A need for really long files?
> 
>    If the latter, what sort of file sizes do you think might arise in practical
>    applications for our systems over the next year or so?

Both.  With graphical images, large amounts of storage are needed,
especially when making animation sequences.  Similarly, these
sequences are often stored on some sort of bulk media.  Sometimes the
bulk media must be used in a way which dictates the use of a single
file for efficiency reasons (Exabyte tapes, for example).  I would
like to take a set of animation files (a large number of 1 MB files)
and store them as a single entity and still be able to perform seeks
within this entity.  This isn't possible if the file is larger than
2^31 bytes, since lseek takes its offset as a signed 32-bit long.
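
To put a number on that limit, here is a rough sketch of the
arithmetic.  The 1 MB frame size comes from the example above; the
function and macro names are made up for illustration, and the code
only computes offsets, it does not touch any file.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of one animation file: 1 MB (2^20 bytes). */
#define FRAME_BYTES ((int64_t)1 << 20)
/* Largest offset a signed 32-bit long can represent: 2^31 - 1. */
#define MAX_OFF32   ((int64_t)INT32_MAX)

/* Byte offset of frame k inside the combined file, computed in 64 bits. */
int64_t frame_offset(int64_t k)
{
    assert(k >= 0);
    return k * FRAME_BYTES;
}

/* Nonzero if the start of frame k can be named by a 32-bit seek offset. */
int frame_reachable_32(int64_t k)
{
    return frame_offset(k) <= MAX_OFF32;
}
```

With 1 MB frames, frame 2047 is the last one whose start a 32-bit
offset can reach; frame 2048 begins at exactly 2^31 bytes.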

> 2) Is there any perceived need for large RAW (ie non-filesystem) storage?

Not from me.  But what about optical disks?  Most WORM drives are best
used as raw devices.

> 3) To address large storage objects, we would have to augment the semantics
>    of 'lseek' in some way. Two possibilities: 
>    
>       *	extra defines in the "whence" field specifying units of (say) Kbytes
> 	instead of bytes.
> 
>       * A new system call taking more than one word to allow a larger
> 	range. (Presumably 64 bits, probably typedef'd for convenience).
> 
>    Preferences? Alternatives?
> 
>    Is there really a need to address very large storage objects to byte
>    granularity?

Please make the 64 bit offset your first choice and don't change the
semantics of lseek.  Other vendors have already started work on
"qseek", which takes a quad instead of a long as the offset.  Retain
the lseek syscall in the kernel for some period of time (e.g., two
major releases) and install a library routine named lseek which calls
a new qseek syscall.  Existing binaries then keep working through the
retained lseek syscall, and existing sources, once recompiled, are
linked to qseek through the library routine.  Applications that need
large files can be changed to call qseek directly.
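
The library layering could look roughly like the sketch below.  This
is only an illustration: qseek_sketch, lseek_compat, quad_off and
seek_demo are hypothetical names, the "syscall" is simulated by
forwarding to the host's own lseek (assumed to have a 64-bit off_t),
and int32_t stands in for the 32-bit long of the old interface.

```c
#define _FILE_OFFSET_BITS 64   /* request a 64-bit off_t where it is not the default */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

typedef int64_t quad_off;      /* hypothetical name for the 64-bit "quad" offset */

/* Stand-in for the proposed syscall; here it simply forwards to the host lseek. */
quad_off qseek_sketch(int fd, quad_off offset, int whence)
{
    return (quad_off)lseek(fd, (off_t)offset, whence);
}

/* Library routine with the old 32-bit interface, layered on the new call.
 * If the resulting position does not fit in 32 bits it reports EOVERFLOW.
 * Note that the file position has already been moved by then. */
int32_t lseek_compat(int fd, int32_t offset, int whence)
{
    quad_off pos = qseek_sketch(fd, (quad_off)offset, whence);
    if (pos < 0)
        return -1;             /* errno was set by the underlying call */
    if (pos > INT32_MAX) {
        errno = EOVERFLOW;
        return -1;
    }
    return (int32_t)pos;
}

/* Demonstration on a scratch file; returns 0 when it behaves as described. */
int seek_demo(void)
{
    FILE *f = tmpfile();
    assert(f != NULL);
    int fd = fileno(f);

    /* Small offsets work through the old interface. */
    int ok = lseek_compat(fd, 100, SEEK_SET) == 100;

    /* A position beyond 2^31 is reachable only through the 64-bit call. */
    quad_off big = (quad_off)1 << 32;
    ok = ok && qseek_sketch(fd, big, SEEK_SET) == big;

    /* The old interface cannot report that position, so it fails cleanly. */
    errno = 0;
    ok = ok && lseek_compat(fd, 0, SEEK_CUR) == -1 && errno == EOVERFLOW;

    fclose(f);
    return ok ? 0 : 1;
}
```

The point of the layering is that callers of the old interface never
see a silently truncated offset: either the position fits in 32 bits
or the call fails.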

> 4) How important is extensibility? (ie the ability to add the space of a new 
>    disk to an existing filesystem). If this IS considered important,
>    must it be totally dynamic (ie can be done even when the filesystem is
>    mounted and in use) or could we live with needing to unmount the
>    filesystem to extend it?

We generally don't use local disks, preferring instead to keep our
user files on NFS servers which are regularly backed up by operators.
This feature would be nice on the NFS servers.  Since NFS is stateless,
it would not be unreasonable to have to unmount the partition in order
to extend its size.  

> 5) Do you foresee any demand for 'mirroring' ie automatically keeping data
>    on duplicate/triplicate etc storage for reliability? (We are not currently
>    in the database/transaction-processing market where this is common, but
>    who knows)....

This feature would be nice for critical data.  The database folks are
not the only people who are concerned about data reliability.

You might want to get in touch with Tom Van Baak at Pyramid Technology
(pyramid!tvb) about his paper "Virtual Disks: A New Approach to Disk
Configuration" which was presented at the Winter 1987 Usenix.  The
proceedings I have contain a very short two-page paper which looks
like it may have been truncated; his introduction mentions
experimentation results and a more complete description, both of
which are missing.  He may be able to send you the complete paper.  His work
allowed 1) single disks, 2) concatenated disks, 3) striped disks, and
4) mirrored disks, all under control of a single configuration file.
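
For concreteness, here is a rough sketch of how a logical block number
might be mapped onto physical disks in the concatenated and striped
cases.  It is not taken from the paper; the struct, the function names
and the parameters are all assumptions made for illustration.  In the
mirrored case every disk would simply hold the block at the same
block number, so no mapping function is shown for it.

```c
#include <assert.h>
#include <stddef.h>

/* A physical location: which disk, and which block on that disk.
 * disk == -1 means the logical block is past the end of the volume. */
struct vloc {
    int  disk;
    long block;
};

/* Concatenation: disks are laid end to end.
 * sizes[i] is the number of blocks on disk i. */
struct vloc concat_map(const long *sizes, size_t ndisks, long lblock)
{
    struct vloc loc = { -1, -1 };
    assert(lblock >= 0);
    for (size_t i = 0; i < ndisks; i++) {
        if (lblock < sizes[i]) {
            loc.disk = (int)i;
            loc.block = lblock;
            return loc;
        }
        lblock -= sizes[i];    /* skip over disk i */
    }
    return loc;                /* past the end of the last disk */
}

/* Striping: units of `unit` blocks are dealt round-robin
 * across `ndisks` disks of equal size. */
struct vloc stripe_map(int ndisks, long unit, long lblock)
{
    struct vloc loc;
    assert(ndisks > 0 && unit > 0 && lblock >= 0);
    long stripe = lblock / unit;               /* which stripe unit overall */
    loc.disk  = (int)(stripe % ndisks);        /* disk holding that unit */
    loc.block = (stripe / ndisks) * unit       /* earlier units on that disk */
              + lblock % unit;                 /* position within the unit */
    return loc;
}
```

Both functions are pure arithmetic, which is what makes it plausible
to drive all of the layouts from one configuration file.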

Thanks for asking.

	-tcs


