load sharing

M Gordon mfg at castle.ed.ac.uk
Thu Feb 7 20:20:19 AEST 1991


fwp1 at CC.MsState.Edu (Frank Peters) writes:

>: On 6 Feb 91 16:22:07 GMT, pjw at usna.navy.mil, , jw at math30, (Peter J. Welcher (math FACULTY)) said:

>pjw> The question is, is there any easy way to perform load-sharing, other than by
>pjw> randomly assigning sections or students to hosts ?  

>I once toyed with an idea to do something like this using DNS but
>never implemented it.

>Basically the idea was to define a new record type in my local DNS
>tables called PROG that would run the given program and return the
>result in an A record to the calling program.

...

>I think this idea has the following advantages:

>1.  I'd be willing to bet that the necessary modifications to bind
>    would be relatively trivial.

>2.  Since all that ever gets returned is an A record no modifications
>    are required to the world wide DNS system or to individual
>    resolver clients.  And no front end host beyond the nameserver
>    would need to be involved...none of this 'telnet to machine A and
>    let it decide where you should go' stuff.

>3.  The actual load program can be upgraded/replaced/modified with no
>    changes to the bind code.  I can make leastload return a random
>    host as a first pass, then the least number of users later, then
>    the least loaded cpu and so on for finer levels of balance.  The
>    two tasks (picking a destination and returning it to the user) are
>    isolated.  I always did like modularity.

>Any comments on this idea?  Any reason why it would be especially
>difficult/impractical? 

>Anyone who has actually done this?? :-)
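
The modularity described in point 3 of the quoted message can be sketched roughly as below. This is only an illustration: the host names, the numbers in the table, and every function name except "leastload" are made up for the example, and nothing here is taken from bind or from either poster's code. The point is that the chooser can be swapped without touching whatever hands the answer back to the client.

```python
import random

# Hypothetical host table: name -> (logged-in users, load average).
# The names and numbers are invented for illustration only.
HOSTS = {
    "hostA": (12, 1.8),
    "hostB": (3, 2.5),
    "hostC": (7, 0.4),
}

def pick_random(hosts, rng=random):
    """First pass: any host will do."""
    return rng.choice(sorted(hosts))

def pick_fewest_users(hosts):
    """Second pass: the host with the fewest logged-in users."""
    return min(sorted(hosts), key=lambda name: hosts[name][0])

def pick_lowest_load(hosts):
    """Finer balance: the host with the lowest load average."""
    return min(sorted(hosts), key=lambda name: hosts[name][1])

def leastload(hosts, policy=pick_lowest_load):
    """Return one host name; the caller never sees how it was chosen."""
    return policy(hosts)
```

Changing the balancing rule is then a matter of passing a different policy function, for example leastload(HOSTS, pick_fewest_users).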

I implemented a similar idea for our network of suns.  Named has been
altered to recognise "sun3" and "sun4" as special cases and to use RPC to get
a hostname from a server.  There were several reasons for doing it this way,
rather than having named do the polling itself.

	If a machine is down, named would hang until the poll of the
	dead machine timed out, stopping it from responding to other calls.

	As well as the terminal servers using DNS for name lookup, we have
	some Bridge terminal servers which use their own name server
	machines. The primary name server for these is set to a Bridge box,
	the secondary to the address of our server.  The primary will
	not recognise the name "sun3", so the request is passed on to our
	server, which replies with an address, or with "name unknown" if
	the request is not for "sun3" or "sun4". The same server can
	respond both to RPC requests from named and to the Bridge boxes.

	We still have some people with serial lines into Vaxes. These 
	lines are running a modified getty. Instead of /bin/login the
	modified getty runs a small program which makes an RPC request
	to the server and execs an rlogin to the machine returned. This
	part of the system will gradually disappear as the Vaxes are 
	retired and we move people onto the terminal servers.

	The server is actually two programs: one does the polling and
	puts the results into a shared memory segment, and the other
	responds to RPC requests. This means that the response to
	a request is immediate, even if the polling program is waiting
	on a dead machine.  It also makes it possible to use the
	information gathered for other purposes.  For example, a screen in
	our machine room shows the load average of all our suns, and a
	machine's name starts flashing if it dies, letting us monitor the
	state of machines all over the building.
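
The split between poller and responder can be sketched as follows. This is a minimal illustration and not the code described above: it uses two threads and a lock-protected table in place of two programs and a shared memory segment, and the names LoadTable, poll_forever, answer_request and the probe callback are all hypothetical. A host that fails to answer is recorded as None, so the responder simply skips it and never waits on it.

```python
import threading

class LoadTable:
    """Stand-in for the shared memory segment: host -> load, or None if dead."""

    def __init__(self):
        self._lock = threading.Lock()
        self._loads = {}

    def update(self, host, load):
        with self._lock:
            self._loads[host] = load

    def snapshot(self):
        # A copy, usable for other purposes such as a status display.
        with self._lock:
            return dict(self._loads)

    def least_loaded(self):
        alive = {h: l for h, l in self.snapshot().items() if l is not None}
        if not alive:
            return None
        return min(sorted(alive), key=alive.get)

def poll_forever(table, hosts, probe, interval=30.0, stop=None):
    """Poller: may block inside probe(); only this thread is held up."""
    if stop is None:
        stop = threading.Event()
    while not stop.is_set():
        for host in hosts:
            try:
                table.update(host, probe(host))
            except OSError:
                table.update(host, None)  # mark the host as down
        stop.wait(interval)

def answer_request(table):
    """Responder: reads the latest results and returns at once."""
    return table.least_loaded()
```

The poller would run in its own thread, e.g. threading.Thread(target=poll_forever, args=(table, hosts, probe)), while answer_request is called from whatever serves the lookups.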


Michael
-- 
							 _   _   _    _   _	
Michael Gordon - mfg at castle.ed.ac.uk OR ee.ed.ac.uk	| |_| |_| |__| |_| |   
							| . . . .      . . |    
I spilt spot remover on my dog and now he's gone! 	|_________|~~|_____|    



More information about the Comp.unix.internals mailing list