I've been waiting a while before writing a summary on NFS & large-scale backup
and storage servers. I was hoping to get some more war-stories, and to have time to bone up on these issues myself. Looks bleak though. Hopefully this summary may prod others with more knowledge and hands-on experience to follow up.
I've included most of the email I received at the end, incl. two messages
just to show that Epochs indeed are being used.
-----------------------------------------------------------------------------
A couple of things are clear:
- these questions are of great interest to many people
- there's a large variety of potential technologies coming to market now
- the use of NFS accelerators (of any type) is becoming well known,
  and there are several types of solutions. One should be able to
  select whatever suits one's checkbook. The best known solutions are:
- The high-end: Auspex
- The medium-end: NFS accelerator cards from Legato/Opus (?),
the newer ethernet boards like Interphase
- The low-end: Performance tuning and sw solutions like eNFS (?)
- For the technically minded with no budget but a Sun source license:
  turn off synchronous NFS writes; life should be interesting. Actually
  I think this is pretty safe for some environments. (A quick way to check
  whether writes are even your bottleneck follows right below.)
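Before spending money it may be worth checking whether (synchronous) writes
really are your bottleneck. Untested here, and the numbers will vary, but on
a SunOS 4.x server something like this gives a first impression:

    nfsstat -s      # look at the share of "write" calls under "Server nfs"
    iostat 5        # watch disk activity while the clients are busy

If writes are only a few percent of the mix, an accelerator may not buy you
much.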
The high and medium solutions are required if you need to support a large
number of users / NFS requests on one or a very small group of machines.
(If you don't care about the number of machines, then an SS1 is a wonderful
server :-) The only problem is where to store all the unused 19" monitors.)
One major problem here is all the disk space that gets wasted. (My rule of
thumb is that if I give a group of people 300-400 MB of added storage, they
will fill it up with ftp-files in 4-6 weeks, and 90% of those files will never
be used. Let's not mention font technology :-()
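A crude way of measuring how bad it is: list the files nobody has read for a
couple of months, biggest first. Untested as written, and /home is only an
example path:

    find /home -type f -atime +60 -exec ls -ls {} \; | sort -rn | more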
So this is where the storage servers come in, along with the new buzzword: the
IEEE mass storage reference model. (Forget it at your own peril. The key seems
to be to consider it a "virtual memory extension" for disk storage.) There seem
to be quite a few possible solutions out there, some of which also seem to be
getting some use.
This class can perhaps be subdivided into 4 groups:
- the *real* storage servers
Epoch is the best known, with the largest installed base. There seem to be
at least 2-3 companies preparing to enter this marketplace too. Cost
seems to be in the US$ 60,000+ range for an entry-level system.
These systems combine NFS services, hard disks and optical jukeboxes with
proprietary software for backup and migration facilities. From looking at the
Epoch specs, my only concern is how many NFS users it can really handle,
and the requirement that the migration facility *must* go through the
Epoch hard disks. While the optical storage certainly can be configured to
handle the entire university, the migration facility seems to me to be
more suited to the departmental level. *Real* experience would be nice
here though, and may easily prove me wrong.
- the optical jukeboxes
This should be an interesting solution for all those who figure they
*have* enough NFS servers, they just need the storage facilities. Quite a
few companies have these, but the software solutions seem much less capable
here. I've found no mention of migration facilities, for instance, only
backup.
(For small sites one or a couple of read/write optical drives may be
good enough, coupled with Exabyte technology for backup; a bare-bones
dump example follows after this list.)
- independent SW solutions
Some software solutions exist though. The best known is probably Unitree,
which is available on an impressive number of platforms. For really large
sites this may be a good way to go. (My boss has misplaced the papers we
got on Unitree, so I haven't had the opportunity to study it thoroughly.)
It is not inexpensive, and I don't believe it is a ready-to-run solution.
My impression is that Unitree will take some programmer time to set up and
to tweak. But the added flexibility may be worth it.
- home-grown solutions
These certainly exist. I got one reference to a Usenix paper, and there were
also some interesting presentations at the December 1990 SUG. Once again more
*real* experiences are needed. (I'm hoping that this theme will re-surface
at the LISA V conference in San Diego. Anybody in the know?)
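As for the Exabyte backups mentioned above: for modest sites plain dump to
the no-rewind device still does the job. A bare-bones sketch only -- the
device names, the dump level and the host "tapehost" are just examples:

    dump 0uf /dev/nrst0 /home                # full (level-0) dump, local drive
    rdump 0uf tapehost:/dev/nrst0 /home      # the same, to a remote drive

What the commercial packages add is mostly scheduling, indexing and an
operator interface on top of this.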
Apart from the questions of technical merit for the different solutions, this
is also to a large extent an *administrative* issue. No centralized backup
system can be made to work without considering things like user name space,
departmental autonomy vs. central control etc. Others will have to
worry about that.
The big question seems to be: can you get - and how much will you need -
migration facilities? IMHO it's a brilliant idea if it can be made to work
transparently for both small and large sites. We routinely add disks all over
our network. Disks are cheap investments now, but they certainly create more
work. My favourite system is probably a jukebox that I can hang off one of our
fileservers. But I certainly would like to have most of the software facilities
provided by Epoch and co.
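Until then, a poor man's migration can be faked with a few lines of shell:
move cold files onto the jukebox (or any big, slow filesystem) and leave
symlinks behind. A toy sketch only -- the paths are invented, there is no
error handling, and it assumes filenames without blanks or collisions:

    # Move files untouched for 180 days to the archive area and leave a
    # symlink behind, so users still find them where they used to be.
    find /home -type f -atime +180 -print |
    while read f
    do
            b=`basename $f`
            mv $f /jukebox/archive/$b
            ln -s /jukebox/archive/$b $f
    done

It is nowhere near transparent, of course, which is exactly what Epoch & co.
are selling.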
Regards,
Jan Berger Henriksen
Institute of Informatics E-mail: jan@eik.ii.uib.no
University of Bergen
Høyteknologisenteret
N - 5020 Bergen, Phone : 47-5-544173
Norway Fax : 47-5-544199
-----------------------------------------------------------------------------
From: keves@meaddata.com (Brian Keves - Consultant)
Message-Id: <9107012004.AA02345@casedemo.meaddata.com>
To: jan@eik.ii.uib.no
Subject: Re: Network storage?
The kind of information that is useful when asking this sort of question is
not the number of users, but the number and make of the hosts.
If you are going to have a large number of hosts (>200) and want a very good
Distributed File Server and centralized backup, then you might look at the
Unitree Virtual File System product. It does a lot of things people in the
Mainframe world know about to maintain large disk arrays and backups.
General Atomics markets the product.
Brian
-------------------------------------------------------------------------------
From: etnibsd!vsh@uunet.UU.NET (Steve Harris)
Message-Id: <9107012054.AA19986@etnibsd.UUCP>
To: uunet!eik.ii.uib.no!jan@uunet.UU.NET
Subject: Re: Network storage?
There was a talk some years ago at Usenix about network management
in which the speaker described the system at her University. What it
boiled down to was: "We'll provide you with a standard set of services
(email, backup, etc.) iff you agree to a (reasonable but stringent)
set of rules about how your systems are set up and used. If you
choose not to avail yourselves of this service, you're on your own."
Non-intensive users (e.g., sociology, physics) agreed; intensive
users (comp. sci., math (or parts thereof)) chose to go their own
way. It all seemed very reasonable to me; perhaps that example
could serve as a useful guideline.
Regards,
Steve Harris - Eaton Corp. - Beverly, MA - uunet!etnibsd!vsh
--------------------------------------------------------------------------------
From: "Mark D. Baushke" <mdb@ESD.3Com.COM>
Sender: mdb@ESD.3Com.COM
Organization: 3Com, 5400 Bayfront Plaza, Santa Clara, CA 95052-8145
Phone.......: (408) 764-5110 (Office); (415) 969-8328 (Home)
In-Reply-To: jan@EIK.II.UIB.NO's message of 30 Jun 91 21:52:28 GMT
Subject: Re: Network storage?
On 30 Jun 91 21:52:28 GMT, Jan Berger Henriksen <jan@EIK.II.UIB.NO> said:
Hi Jan,
I am on holiday this week, but here is a quick answer (I had one I
sent out a few months ago to someone else) to your query for
information. If you have additional questions about Auspex, feel free
to drop me a message.
jan> Our university computing centre is migrating from centralized
jan> systems to a distributed computing platform. They have some
jan> -justified- concerns about centralized backup, large-scale nfs
jan> disk-systems etc.
jan> I'd like to hear war-stories about this from others, specifically
jan> experiences in solving these problems, like:
jan> - Epoch storage servers, others ?
jan> - Auspex nfs-servers, others ?
We have an Auspex NS-5000 and really like it. We use Exabyte 8mm tape
as the method of backup. I like it.
jan> - large-scale backup systems ? migration technologies ?
jan> Whatever we come up with should work for a medium-sized
jan> university (10-20.000 students). There's a short implementation
jan> time, so the solutions should work now. Services should
jan> preferably be available for everything from PC's, Mac's to
jan> workstations.
jan> I'll summarize if there's interest *and* informed answers.
jan> Vendor-views are also welcome.
jan> Regards,
jan> Jan Berger Henriksen
jan> Institute of Informatics E-mail: jan@eik.ii.uib.no
jan> University of Bergen
jan> Høyteknologisenteret
jan> N - 5020 Bergen, Phone : 47-5-544173
jan> Norway Fax : 47-5-544199
I am including a slightly out of date copy of another message I sent
to someone about our setup here.
If you have any other questions, feel free to ask.
Later,
-- Mark
mdb@ESD.3Com.COM
[Note: I have removed the information which identifies the person
asking the questions, since I did not obtain permission to quote that
person's questions or identity. -- mdb]
Update: I think I have updated the list below to reflect our current
quantity and configuration, but we have around 25 more SLCs (Sun-4/20)
either on order or in various stages of installation. They have (or
will have) local disks. All new SLCs are being installed with SunOS
4.1.1 and the old ones are scheduled to be upgraded soon. Also, soon
our Sun-3/50 machines will all be taken out of service (being replaced
by the SLCs on order). I believe we also have around 3 SS2s acting as
compute servers.
------- Forwarded Message
From: Mark D. Baushke
Subject: Re: Experiences with the Auspex NS-5000
> What type (and how many) workstations do you have?
We have approximately 100 Sun machines of various flavors:
Workstations:
(1) Sun2/120 - SunOS 3.5
(14) Sun3/50 - mostly SunOS 3.5, one SunOS 4.1
(43) Sun3/60 - mostly SunOS 3.5, some SunOS 4.0.3, one SunOS 4.1
(3) Sun4/110 - SunOS 4.0.3
(2) Sun3/470 - SunOS 4.0.3
(11) SPARCstation 1 - many SunOS 4.0.3c, some SunOS 4.1, some SunOS 4.1.1
(8) SPARCstation 1+ - many SunOS 4.0.3c, some SunOS 4.1, one SunOS 4.1.1
(15) SLC - many SunOS 4.0.3c, some SunOS 4.1, some SunOS 4.1.1
Servers and non-workstation machines:
(3) Sun3/280 - all SunOS 4.0.3
(1) VAX 785 - MORE/bsd 4.3 (mt. xinu)
(3) VAX 750 - MORE/bsd 4.3 (mt. xinu), Eakins 4.2, VMS
(1) VAX 8200 - Ultrix
It might be interesting to note that our machines are spread over many
different sub-networks. The majority of our Sun machines are connected
to four sub-networks each of which is connected to the NS5000.
> How many workstations are you serving with the NS5000?
That depends on what you mean by serving. If you are talking about
serving diskless and/or dataless machines, then the numbers are:
(1) Diskless machines served - Sun3/60, SunOS 4.0.3
(8) Dataless machines served - Sun4c/60, SunOS 4.0.3
Note: SPARCStation 1 == Sun4c/60.
(FYI: Diskless means that the NS5000 provides the root, swap, and usr
disk. Dataless in this case means that the local machine has a local
SCSI disk which is typically used for root and swap only and the usr
disk is provided by the NS5000.)
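[Note from the summarizer: for illustration only, a dataless client's
/etc/fstab in a setup like this might look roughly as follows; the server
name "ns5000" and the exported paths are invented:

    /dev/sd0a            /      4.2  rw          1 1
    /dev/sd0b            swap   swap rw          0 0
    ns5000:/export/usr   /usr   nfs  ro,hard,bg  0 0
    ns5000:/export/home  /home  nfs  rw,hard,bg  0 0

Root and swap stay on the local SCSI disk; /usr and the home directories come
over NFS. -- jan]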
However, we are using the NS5000 as the home for our local development
tools as well as the repository of our current software base. In
addition all of the 'home directories' for the diskless and dataless
machines' users are on the NS5000.
Of course, nearly every machine on our network is getting at least
some files from our NS5000.
We also have a small number of SLCs on order. These machines will
probably run dataless. (We intend to attach some kind of SCSI disk to
each.) In addition, two SPARCStation 2 machines are on order and will
run with swap and the OS on a local disk.
> What were you using for fileservers before the NS5000 and how
> many workstations were you serving from them?
Previously, we mostly used local shoebox disks to hold the operating
system and swap space. We still favor this approach to keep network
traffic reduced.
The few diskless machines we have were serviced by a Sun3/280.
> What problems, if any, have you encountered with the NS5000?
No problems. We are EXTREMELY pleased with the product. It has reduced
network problems (congestion) by allowing us to spread our software
developers over multiple sub-nets.
> How was Auspex to work with
> (did they meet their commitments for delivery and service?
N.B.: We were a beta site. There were a few minor rough spots, but
nothing important was missed. (They were training a new FE at the time
and some messages did not get passed on correctly.)
> was their installation done smoothly and professionally?)
Yes, there were meetings prior to the installation in which we
discussed exactly what was to happen. Everything went smoothly with
the exception of the wrong kind of plug being brought out initially.
This was easily corrected on-site in a few minutes.
> What configuration of NS5000 did you buy?
The initial configuration was:
(1) Host Processor
(2) Ethernet Processor (four network connections)
(1) Storage Processor
(1) File Processor
(6) 600 MB Disk
(1) Exabyte Tape Unit
We have since purchased an additional 600 MB Disk and we are likely to
get more disks in the next 6-9 months. (One nice thing about the
NS5000 is that it scales nicely and we can add new disks incrementally.)
> Any other opinions on the product or the company?
They have a large number of very technically competent people. All of
the initial glitches in the beta software were fixed in a very short
time. Uptime on the NS5000 typically lasts until the
next release of the software (> 90 days).
We are currently running release 1.2 of the Auspex software.
When there are problems (even if it turns out not to be their
problem), Auspex is always quick to respond.
If you have any more specific questions, feel free to ask.
I hope that you find the above useful.
-- Mark D. Baushke     Internet: mdb@ESD.3Com.COM     UUCP: {3comvax,auspex,sun}!bridge2!mdb
------- End of Forwarded Message
--------------------------------------------------------------------------------
From: jan
Message-Id: <9107020936.AA01970@alm.ii.uib.no>
To: keves@meaddata.com
Subject: Re: Network storage?
Thanks, I'm looking at the UniTree product. Your comment on the number of
hosts is reasonable. I haven't given this much thought yet, it was just
dropped in my lap.
Do you know anything about Epoch vs. Unitree, except that the latter is a
pure software product running on lots of different machines?
Jan.
--------------------------------------------------------------------------------
From: keves@meaddata.com (Brian Keves - Consultant)
Message-Id: <9107021239.AA02981@casedemo.meaddata.com>
To: jan@eik.ii.uib.no
Subject: Re: Network storage?
I have not seen Epoch. I am just starting an evaluation of lots of different
DFS (Distributed File Server) and VFS (Virtual File System) products. I may be
able to post my results to the net somewhere, although there may be
non-disclosures in the way.
Good luck.
Brian
--------------------------------------------------------------------------------
From: dfl@dretor.dciem.dnd.ca (Diane Luckmann)
Message-Id: <9107021417.AA23352@dretor.dciem.dnd.ca>
To: jan@eik.ii.uib.no
Subject: Re: Network storage?
We are going through the same exercise of distributing the processing and of centralizing the backups.
While we look at systems (Epoch being one of them), we also try to use common
sense to weigh data availability against data security. A risk assessment is
perfect for that. Ask your users and management which information they are
willing to lose or to wait (???) hours to reconstruct. If their data is
precious, go with central backups that YOU can look after, and create enough
data mirroring to be up in no time; if not, put backup facilities where needed
and entrust users with doing their own backups. (Did I really just say <users>
and <backups> in the same sentence?)
And remember common sense; there is probably a logical and economical blend of backup methods and equipment that will meet all your various users' needs.
I am very interested in a summary. Bye!
--------------------------------------------------------------------------------
From: "J.LINN" <sys044@aberdeen.ac.uk> Date: Wed, 3 Jul 91 10:55:06 BST Message-Id: <A9107030955.AA01920@uk.ac.aberdeen> Reply-To: J.LINN@aberdeen.ac.uk To: jan@eik.ii.uib.no Subject: Re: Network storage?
We have just moved to a slightly more distributed approach, 3 machines instead
of 1, but it will be more distributed next year. There are many problems, but
the one that hits us most arises because files are shared between machines.
When one is down, users find it strange that things don't work, as they do not
consider their particular system distributed. It may be software they cannot
access or even their own data, but it is confusing to them (they may have been
warned that one machine is going down, but do not appreciate the implications).
It is certainly a systems administration nightmare and requires a lot of
thought on how to distribute software and users (we use a static load-sharing
algorithm, i.e. users are assigned specific machines in the hope that all
machines will be loaded fairly equally).
I will be interested in how you get on and other comments.
John Linn (j.linn@uk.ac.aberdeen)
--------------------------------------------------------------------------------
From: Andy Stefancik 234-3049 <bcstec!eerpf001!ajs6143@uunet.uu.net>
Message-Id: <9107031652.AA24447@eerpf001>
To: sun-managers@eecs.nwu.edu
Subject: epoch ls
We have an Epoch optical storage server which exports mounts to Motorola and
SPARC clients running SunOS 4.0.3. We have one directory on the Epoch where
every filename is exactly 20 characters. When we do an ls -l from a Sun client
over the NFS mount on the above filesystem, we get a "file not found" error on
one file. When rlogin'ed to the Epoch the file is there, and ls -l on the Epoch
shows it with no errors. This error does not happen on the other filesystems
with variable-length filenames. It also does not happen when ls -l is run from
a Sun client running SunOS 3.5 or 4.1.1. Epoch suggested that the Sun is not
unpacking the packed messages sent by the Epoch, and that one long integer is
missing at the start of a following message send so their alignment is off,
causing the argument to ls -l of one of the files to be truncated by 8
characters at the end. Does anyone know of a 4.0.3 patch that fixes this
problem?
Andy Stefancik                   Internet: as6143@eerpf001.ca.boeing.com
Boeing Commercial Airplane G.    UUCP: ...!uunet!bcstec!eerpf001!as6143
P.O. Box 3707 MS 64-25           Phone: (206) 234-3049
Seattle, WA 98124-2207
--------------------------------------------------------------------------------
From: CEB4@phoenix.cambridge.ac.uk
Message-Id: <A42BD7236223A060@UK.AC.CAMBRIDGE.PHOENIX>
To: jan@eik.ii.uib.no
Subject: sun-nets posting re network storage
I would be very interested in a summary of responses you get to your request
about large-scale NFS, centralized backup and the like. We currently run an IBM
mainframe under MVS with a large central filestore, but are being pushed
towards running a more distributed Unix system.
Thanks.
Caroline Blackmun
--------------------------------------------------------------------------------
From: Kenny McDonald <c60244%CCFIRIS.AEDC@livid.uib.no>
Subject: Using amd
X-To: unix-wizards@sem.brl.mil
X-Cc: info-iris@ccfiris.aedc
To: Jan Henriksen <jan@eik.ii.uib.no>
We have recently installed an Epoch-1 Model 21 Infinite Storage Server on our network. It comes with the automount program (amd). Currently we have Silicon Graphics machines running NFS and we are exporting /usr & /usr1 from each host. We are mounting them on the clients as: /net/servername/usr & /net/servername/usr1 from /etc/fstab.
The following is part of our /etc/fstab file:
wthomson:/usr   /net/wthomson/usr   nfs soft,bg,rw 0 0
stokes:/usr     /net/stokes/usr     nfs soft,bg,rw 0 0
stokes:/usr1    /net/stokes/usr1    nfs soft,bg,rw 0 0
buford:/usr     /net/buford/usr     nfs soft,bg,rw 0 0
buford:/usr1    /net/buford/usr1    nfs soft,bg,rw 0 0
effrim:/usr     /net/effrim/usr     nfs soft,bg,rw 0 0
effrim:/usr1    /net/effrim/usr1    nfs soft,bg,rw 0 0
rockford:/usr   /net/rockford/usr   nfs soft,bg,rw 0 0
magnum:/usr     /net/magnum/usr     nfs soft,bg,rw 0 0
I want to configure our Epoch-1 so that filesystems on our machines are
automounted when I cd to /net/hostname.
What I need to know is:
1. Is anyone running amd in this type of environment?
2. Is there a way to configure amd to work the following way: when a user on
   stokes cd's to /net/ccfiris/usr1, amd will automatically mount this
   directory?
3. If so, what does the configuration file look like?
Thanks.
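[Note from the summarizer: I have not tried this on an Epoch, but the stock
amd answer to "cd /net/<host> mounts that host's exports" is a one-line
wildcard map using the "host" filesystem type. Roughly -- the map file name,
the mount options and the exact selector syntax should be checked against the
amd manual shipped with your release:

    # /etc/amd.net -- /net/<host> picks up all of <host>'s NFS exports
    /defaults       opts:=rw,soft,bg
    *               type:=host;rhost:=${key};fs:=${autodir}/${rhost}

with amd started along the lines of:

    amd -a /tmp_mnt /net /etc/amd.net

-- jan]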