`Ninth System Administration Conference (LISA ’95)
`Monterey, California, September 18-22, 1995
`
`OpenDist - Incremental Software Distribution
`
Peter W. Osel and Wilfried Gänsheimer
Siemens AG, München, Germany
`
`
`
`OpenDist – Incremental
`Software Distribution
`Peter W. Osel and Wilfried Gänsheimer – Siemens AG, München, Germany
`
`ABSTRACT
`
`OpenDist provides efficient procedures and tools to synchronize our software file
`servers. This simple goal becomes challenging because of the size and complexity of
`supported software, the diversity of platforms, and because of network constraints.
Our current solution is based on rdist (1) [1]. However, it is no longer possible to
synchronize file servers nightly, because it takes several days just to compare distant servers.
`We have analyzed the update process to find bottlenecks in the current solution. We
`measured the effects of network bandwidth and latency on rdist. We created statistics on the
`number of files and file sizes within all software packages.
`We found that not only the line speed, but also the line delay contributes substantially
`to the overall update time. Our measurements revealed that adding a compression mode to
`rdist would not have solved our problem, so we decided to look for a new solution.
`We have compiled a list of requirements for evaluating software distribution solutions.
`Based on these requirements, we evaluated both commercial and freely available tools. None
`of the tools fulfilled our most important requirements, so we implemented our own solution.
`In the following we will describe the overall architecture of the toolset and present
`performance figures for the distribution engine that replaces rdist. The results of the
`prototype implementation are promising. We conclude with a description of the next steps
`for enhancing the OpenDist toolset.
`
`Our Environment
`The CAD Support Group of the Semiconductor
`Division of Siemens AG installs, integrates and dis-
`tributes all software needed to develop Integrated
`Circuits. We have development sites in Germany
(München and Düsseldorf), Austria (Villach), the
United States (Cupertino, CA), and Singapore. The
development sites are connected by leased lines with
a speed of 64 to 128 kBit/s. At each site, a central
`file server stores all software. Client workstations
`mount software from these servers. Software is
`installed and integrated in München and distributed
`to all other development sites. System administra-
`tors of the development sites initiate the transfer on
`the master server in München.
The CAD Support Group takes care of the CAD
software and tools only. A separate department is
responsible for system administration, i.e., mainte-
nance of the operating system and system tools,
backups, etc.
`Our software distribution problem differs in
`many ways from the one solved by traditional
`software distribution tools. Most software distribu-
`tion tools we looked at are designed to distribute a
`moderate number of fairly static software packages
`of moderate size to many clients.
`In contrast, we have to synchronize few file
`servers (under a dozen), which store many (about
`
`200) packages of sizes ranging from tiny (a couple
`of kilobytes) to huge (1.8 GBytes). The total size of
`the software we store is currently 25 GBytes, 10-15
`GBytes are currently being kept up-to-date at all
`sites. Many packages are changed each day. A
`change might update only a single file of a few bytes
`or could change up to 50,000 files for a total of 1
GBytes per day. Every month, about 10 % of the
software changes. Most changes are small, but many
`files are constantly updated. The installation of a
`huge patch or a new software package changes many
`files at once.
There is no separate installation or test server;
all changes are applied to the systems while our
`clients are using them. The changes are tested in
`München and, ideally, copied to all slave file servers
`within one day.
`Synchronizing or cloning file
`servers is the best way to describe our setup.
`
`Our Current Solution
`Our current software distribution process uses
`rdist (1) to find changed files and to update slave
`software servers.
It is no longer possible to compare
two software servers in one night. A complete
check of all software packages on the slave file
server in Singapore would take several days, which is
neither acceptable nor feasible. During that time,
software packages would be in inconsistent states,
and changes of the master software server could take
`
`1995 LISA IX – September 17-22, 1995 – Monterey, CA
`
`181
`
`Page 2 of 15
`
`
`
up to a week to be transferred to the slave file
server. Though it is possible to apply different
update schedules – updating small packages daily,
some weekly – the setup is not satisfactory. With an
ever-increasing number of software packages and an
ever-growing size of each software package, the dis-
tribution process using rdist is no longer acceptable.
`Searching The Bottleneck
`We have analyzed the update process to find
`bottlenecks in our current solution. We analyzed our
`lines and measured bandwidth, latency and compres-
`sion rate (all leased lines are equipped with datamiz-
`ers – devices that compress all traffic). We created
`statistics on the number of files and their size for
`more than 200 software and data packages. Com-
`mercial software packages, technology data and cell
`libraries, as well as many free packages like X11
`and gnu tools were analyzed. We were also
`interested in the compression rate and time of
`software packages and how much the compression
`rate differs when software packages are compressed
`file by file or as a complete archive. We analyzed
`where rdist spends its time during updates. Com-
`pared to the installed software, our change rate is
`small, so finding changed files must be efficient.
`Changes can be rather huge, so the transmission of
`changed files must be efficient, too.
`The Benchmark
We wrote a benchmark suite that measures the
elapsed time needed to perform typical software dis-
tribution operations such as installing, comparing,
deleting, and updating files of different sizes, and in-
stalling symbolic and hard links. All operations were
executed many thousand times to average out differ-
ences in link performance.
`The benchmark measures ping (1), rcp (1), and
`rdist (1) performance and times. Each rdist test runs
`on a directory with an appropriate number of random
`files of the same size. Each test contains an add,
check, update and delete sequence. The file size
increases from 1 Byte to 1 MByte. Thus the effects
of transfer rate and rdist protocol can be separated.
`The rdist part of the benchmark source tree contains
`approximately 5,000 files. This sums up to 10,000
`transferred files, 5,000 check actions, 5,000 delete
`actions and 30 MBytes transferred data per test run.
rcp (1) times are measured for a text, a binary, and a
compressed file of 1 MByte each. This shows the
`achieved on-line compression.
`The leased lines (except the dialup ISDN link)
are shared by many users, so it is not surprising
that the benchmark results varied a lot, sometimes
by more than a factor of three. To make our bench-
mark of the line performance more comparable, we
calculated the average value for the best results of
several runs of the benchmark. Some of the small
numbers are on the order of the timing resolution
and must be interpreted cautiously.
`The Results
`Size and Composition
Software packages vary substantially in size
and composition of file types. However, bigger pack-
ages don't necessarily have bigger files: they have a
few huge files, but the average file size is more or
less independent of the total size of the package
(Diagram 1).
`
Site                               LAN       MÜNCHEN  DÜSSELDORF  VILLACH   CUPERTINO  SINGAPORE
Line Type                          Ethernet  ISDN     X.25        leased    X.25       leased
Nominal Line Speed [kBit/s]        10,000    64       64          128       64         64
Transfer rate [kByte/s]            90-100    6-7      4-5         7-12      2-3        3-4
Ping Response Time [ms]            <1        33-88    188-372     81-311    530-1083   617-1375
rdist file create [s]              0.2       0.2      1.2         0.6       2.1        4.5
rdist file check [s]               0.02      0.06     0.5         0.2       1.0        2.1
rdist file delete [s]              0.1       0.13     0.5         0.3       1.0        2.3
10 kBytes transfer rate [kByte/s]  -         5.9      2.5         4.9       1.5        1.4
Run Benchmark [h]                  -         2.5      7           4         12         24
rdist check SW subset [h]          1         -        16          8         -          >80 (2)
OpenDist check SW subset [h] (3)   0.5       -        2 (1)       0.5       0.75       0.75
rdist check all SW [h] (2)         3         -        69          27        140        290
OpenDist check all SW [h]          1.5       -        5 (1)       1.5       2          3

(1) Increased time, because software pools in Düsseldorf are accessed via NFS not UFS.
(2) Estimated.
(3) This subset consists of technology data and is changed and distributed daily. The subset
    contains approximately 150,000 files with a total of 1.1 GBytes.

Table 1: Line Characteristics
`
`182
`
`1995 LISA IX – September 17-22, 1995 – Monterey, CA
`
`Page 3 of 15
`
`
`
`
`
Though our leased lines are equipped with datamiz-
ers that compress network traffic, it is worthwhile
to compress archives before transmission. Datamiz-
ers increased the transmission rate of uncompressed
data by 10-15 %, whereas gzip reduced the data to a
third of its original size.
`Compression Rate
`On a SPARCstation 10/41 (Solaris 2.4, 128
`MBytes memory) gzip created compressed data at a
`rate of 65 kByte/s, many times faster than the speed
`of our leased lines. This figure is important to know
`when you want to pipeline the creation, compression,
and transmission of update archives. In case the
throughput of the lines is in the same order of mag-
nitude as the gzip output rate, it would be advisable
to decrease the compression level.
`Decompression
`Decompressing the archives with gunzip (1) is
`usually six times faster than compressing the data.
`Decompression time does not depend significantly on
`the compression quality chosen for compression
`(Diagram 2).
`Compression and Archives
`It is better to compress an archive of files than
`to archive compressed files. Compressing complete
`packages is significantly faster and creates smaller
`archives than compressing each file separately and
archiving the compressed files. For example, archiv-
ing and compressing X11R6 completed in three
minutes of elapsed time, and the overall size was
reduced by 55 %. Compressing each individual file
and archiving the compressed files in a second step
took five minutes of elapsed time and reduced the
overall file size by only 45 %. All tests were per-
formed several times on an unloaded machine.
`Compressing individual files and archiving them
`needs many more file and disk operations compared
`to archiving the uncompressed files and compressing
the archive. Compressing several small files (or
small network packets) is not as efficient as
compressing the files in a single run.
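The effect can be reproduced with standard tools. A minimal sketch (file counts and contents are synthetic, not the X11R6 measurement):

```shell
set -e
work=$(mktemp -d)
mkdir "$work/pkg"
# Many small files with shared content, as in a typical software package.
for i in $(seq 1 50); do
  seq 1 40 | sed 's/^/shared header line /' > "$work/pkg/file$i"
done
# Variant 1: archive first, then compress the archive as a whole.
tar cf - -C "$work" pkg | gzip -c > "$work/pkg.tar.gz"
# Variant 2: compress each file separately, then archive the results.
for f in "$work/pkg"/file*; do gzip "$f"; done
tar cf "$work/pkg-gz.tar" -C "$work" pkg
# Variant 1 is much smaller: gzip can exploit redundancy across files,
# and tar's per-file header overhead is compressed away.
wc -c < "$work/pkg.tar.gz"
wc -c < "$work/pkg-gz.tar"
rm -rf "$work"
```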
`Transmission and Archives
`It is better to transmit an archive of files than
`to transmit each file individually. Depending on the
`file transfer protocol used, the latency of the line has
a high impact on transfer rates. The smaller the files
and the higher the latency, the greater the delay
caused by inefficient protocols.
The latency increases the time rdist needs to
check or create files. If you have many files, rdist
needs a long time to compare master and slave
server. If many or all files changed (e.g., when in-
stalling a new software package), rdist will need
much more time to transfer all files. The average
file size of our software packages is 30 kBytes
(Diagram 3). To our Singapore site, we need about
10 seconds (3 kByte/s) to transfer a file of this size.
However, rdist needs more than 4 seconds to create
the new file, for a total transmission time of 14
`
[Diagram 1: Package Size vs. File Count – log-log plot, Overall Package Size (1E+04 to 1E+09) vs. Number of Files (1E+00 to 1E+05).]
`
`Compression Factor
`The average compression factor of our software
`packages is three. Most of our software packages
`were compressed by this factor, though we observed
`compression factors between two and five.
When using gzip (1), you can regulate the
compression speed between fast (less compression)
and slow (best compression). For our software pack-
ages, increasing the compression quality reduces the
compressed file size by less than 5 %; the compres-
sion time, however, sometimes increased by more
than 200 % (Diagram 2). The default compression
level of 6 is a good compromise, so we decided to
use it.
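The trade-off is easy to check directly. A small sketch (the sample data is synthetic, so the exact percentages will differ from our packages):

```shell
set -e
sample=$(mktemp)
# Synthetic, compressible sample data.
seq 1 5000 | sed 's/$/ some repeated payload text/' > "$sample"
# Compare sizes at fast (-1), default (-6), and best (-9) levels.
for level in 1 6 9; do
  bytes=$(gzip -c -"$level" "$sample" | wc -c)
  echo "gzip -$level: $bytes bytes"
done
rm -f "$sample"
```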
`
[Diagram 2: Gzip Compression Quality and Speed – relative performance (compression rate, compression time, decompression time) vs. gzip compression level 2-8.]
`
`
`
`
`
`seconds (40 % increase), a 30 % decrease in transfer
rate. The transfer rate for 10-kByte files is only
half of the normally achievable transfer rate (see
Table 1).
`
`150000
`
`100000
`
`50000
`
`0
`
`Number of Files
`
`10
`
`100
`
`1000
`
`10000 100000 1E+06 1E+07 1E+08
`File Size Range
`Diagram 3: File size range (All packages)
`
Besides avoiding the protocol overhead, the
transmission of archives has additional advantages.
`By first transferring all changed files to a holding
`disk, and installing changes locally on the remote
`server from the holding disk, the time during which
`the software package is in an inconsistent state is
`significantly reduced. Moreover, we can use the
`same tools to archive and roll-back changes. The
`installation of changes can be done asynchronously,
so a system administrator at the remote site can
easily postpone updates. These advantages compen-
sate for the disadvantage of needing holding disks to
temporarily store the file archives.
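A staged update along these lines can be sketched with standard tools (directory names and the package layout are illustrative):

```shell
set -e
master=$(mktemp -d); slave=$(mktemp -d); holding=$(mktemp -d)
mkdir "$master/pkg" "$slave/pkg"
echo "version 2" > "$master/pkg/tool"
echo "version 1" > "$slave/pkg/tool"
# 1) Pack the changed files into one compressed archive on the master.
tar cf - -C "$master" pkg | gzip -c > "$holding/update.tar.gz"
# 2) The archive would now be shipped (rcp, ftp, tape) to the slave's
#    holding disk; the live package stays consistent the whole time.
# 3) Install locally from the holding disk: a short inconsistency window,
#    and the same archive can be kept for roll-back.
gzip -dc "$holding/update.tar.gz" | tar xf - -C "$slave"
cat "$slave/pkg/tool"
rm -rf "$master" "$slave" "$holding"
```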
`
`
`rdist and Latency
Although the line speed from München to Vil-
lach and to Singapore differs by only a factor of
two, the time needed to run the rdist benchmark
differs by a factor of six (see Table 1 and Diagram
4). The ping response time (and therefore latency)
has a greater impact on the time rdist needs to create
or compare files than the line speed does.
`Benchmark Summary
Our measurements have revealed that the line
speed is not the only bottleneck: the latency also
plays an important role. rdist compares source and
target directory file by file. Because the time for
this is proportional to the latency, and because our
change rate is small compared to the installed
software, adding compression to rdist would not
have solved our problem. rdist spent most of its
time trying to figure out what to update, and not
actually updating files. On the other hand, if a new
version of our biggest software package is installed,
we have to transmit 1.8 GBytes, so transmission
must be optimized, too. The transmission of single
files is another bottleneck: in our environment, the
protocol overhead and transmission time are of the
same order of magnitude, which reduces the average
actual transfer rate by up to 30 %. For an efficient
solution in our environment, files that have to be
updated must be archived first and then transmit-
ted as one large file.
Upgrading our lines would not solve our prob-
lem, because the latency would not become small
enough. It would also be a very costly solution.
We found that we had to tackle two problems:
making the finding of changed files more efficient,
and making the transmission of data more efficient.
We began to look for a new solution.
`
`Requirements for Software Maintenance
`We compiled a long list of requirements that a
`new solution should fulfill. Here are some of the
`more important ones:
`Optimal Support of Incremental Distribution
We do not want to trace changes as they are
applied and re-apply them at a later date on slave
file servers. Changes should be found by comparing
the status of the master and the slave file server.
Comparison should be stateless – it should not
depend on update history. Each file server is admin-
istered by an independent system administrator group,
so we don't want to rely on what we think the status
is; rather, we have to check the actual status of
the remote file server. We have to detect changes
applied by remote administrators.
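A stateless comparison of this kind can be sketched with per-server index files; cksum here stands in for whatever attributes are actually compared (names, sizes, dates, checksums):

```shell
set -e
master=$(mktemp -d); slave=$(mktemp -d)
echo same > "$master/a"; echo same > "$slave/a"   # unchanged file
echo new  > "$master/b"                           # exists only on the master
echo v2   > "$master/c"; echo v1 > "$slave/c"     # changed on the master
midx=$(mktemp); sidx=$(mktemp)
# Each server builds its own small index; only the indexes cross the WAN.
(cd "$master" && find . -type f | sort | xargs cksum) > "$midx"
(cd "$slave"  && find . -type f | sort | xargs cksum) > "$sidx"
# Index lines present only on the master mark files to (re)send.
diff "$sidx" "$midx" | grep '^>' | awk '{print $4}'
rm -rf "$master" "$slave" "$midx" "$sidx"
```

No history is consulted: whatever a remote administrator changed by hand simply shows up as an index difference on the next run.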
`Update Programs Currently Executing
Files that are updated may not be overwritten.
The old file has to be moved and unlinked; then the
new file has to be moved to its final destination.
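The rename-based update can be sketched as follows; a process that already has the old file open keeps reading the old inode until it closes it:

```shell
set -e
d=$(mktemp -d)
echo "old binary" > "$d/prog"
echo "new binary" > "$d/prog.new"
# Never overwrite in place: move the old file aside first...
mv "$d/prog" "$d/prog.old"
# ...rename the new file to its final destination (atomic within one fs)...
mv "$d/prog.new" "$d/prog"
# ...then unlink the old name; the old inode survives until its last user exits.
rm "$d/prog.old"
cat "$d/prog"
rm -rf "$d"
```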
`
[Diagram 4: Line Characteristics per site (MUC, D, VI, CUP, SIN) – benchmark time, ping response time, observed transfer rate, and nominal line speed.]
`
`184
`
`1995 LISA IX – September 17-22, 1995 – Monterey, CA
`
`Page 5 of 15
`
`
`
`
`Do Not Require Root Permission To Run
We install all software using unprivileged
accounts and try to avoid using root permissions as
much as possible. Synchronizing software file
servers should be done using an unprivileged
account, too. If root permissions are required (e.g., to
update entries in system files, or for programs that
require a user or group s-bit with system owner-
or group-ship), a script should be created that is
executed separately by the system administrator of
the slave file server.
`Support Mapping
To allow localization, a flexible mapping of,
e.g., file and path names, permissions, and owner
should be supported. Software packages might be
owned by different accounts on different servers.
Symbolic links replace files to implement site-
specific changes (e.g., for configuration files).
`Support Execution of Scripts Before and After
`Updates
Before and after an update or roll-back, it
should be possible to execute scripts on the server
and client. You might want to shut down a database
server and restart it after the update has finished.
License servers might have to be re-started if
license files were updated.
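One way to sketch such hooks; the script names pre.sh and post.sh are our invention, not OpenDist's actual convention:

```shell
set -e
pkg=$(mktemp -d)
# A package may ship optional pre- and post-update scripts.
printf 'echo "pre: shutting down license server (placeholder)"\n'  > "$pkg/pre.sh"
printf 'echo "post: restarting license server (placeholder)"\n' > "$pkg/post.sh"
# Run a hook only if the package provides it.
run_hook() { if [ -f "$1" ]; then sh "$1"; fi; }
run_hook "$pkg/pre.sh"
echo "update: installing files"
run_hook "$pkg/post.sh"
rm -rf "$pkg"
```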
`Transfer Data Efficiently and Reliably
The data transfer must be efficient, because our
links are slow and have high latency. If the link
fails for a short period of time and the data transfer
is aborted, only the missing data should be transmitted,
not all of the data again.
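Resuming at the byte level can be sketched with standard tools: the receiver reports how much it already has, and the sender ships only the tail:

```shell
set -e
src=$(mktemp); dst=$(mktemp)
seq 1 1000 > "$src"
head -c 1500 "$src" > "$dst"        # simulate a transfer aborted mid-file
have=$(wc -c < "$dst")
# Send only the bytes the receiver is missing (tail -c +N starts at byte N).
tail -c +"$((have + 1))" "$src" >> "$dst"
cmp -s "$src" "$dst" && echo "resume complete"
rm -f "$src" "$dst"
```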
`Be Humble
Do not require special installation of software
packages. We don't want to change the installation
of commercial software, and we have to support a
variety of different package types.
`Should Support Roll-back
It should be possible to undo at least one
update. If an update of a software package intro-
duces problems, it should be possible to go back to
the previous state of the software package. Also, if
an update fails, roll back the already applied changes
to return to a consistent state.
`Minimize Inconsistent States
`The time that a software package is in an
`inconsistent state (the time between the first and the
`last change that is applied) should be as small as
`possible. The time between applying changes to
`software packages that depend on each other should
`be as small as possible, too. Roll-back changes, if
`an update did not complete successfully.
`Avoid Errors Pro-actively
Try to verify in advance whether an update is
likely to succeed, e.g., check whether the target
server has enough disk space to store new or
changed files.
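Such a pre-flight check can be sketched with df (1); the required size would come from the archive about to be installed (the path and the 1024 kB figure are illustrative):

```shell
# Abort early if the target filesystem cannot hold the update.
target=/tmp                 # illustrative target path
need_kb=1024                # would be derived from the archive size
free_kb=$(df -kP "$target" | awk 'NR==2 {print $4}')
if [ "$free_kb" -lt "$need_kb" ]; then
  echo "abort: only ${free_kb} kB free on $target, need ${need_kb} kB" >&2
  exit 1
fi
echo "ok: ${free_kb} kB free on $target"
```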
`Should Be Flexible
It should be easy to choose alternative distribu-
tion media: e.g., tape, email, or a direct network link.
Comparing the status of master and slave file servers
should be possible even if no direct network link
exists between the servers. The tool should be
modular and extensible. Tool interfaces should exist
and be well-documented.
`Should Use Standards
`Use well-known existing standards and standard
`tools as much as possible. Do not re-invent wheels.
`Should Be Cost Effective
The cost for the product, its installation and
customization, and its maintenance must be accept-
able.
`
`Evaluation of Alternatives
We took a look at freely available tools, as
well as commercial tools, proposed standards, and
papers dealing with software management ([14],
[17], [20], [21]).
`Freely Available Tools
The tools that we looked at can be categorized
as follows: tools that help to maintain source code
and install software in a heterogeneous platform
environment, like rtools [22]; tools whose primary
focus is network- and disk-space-efficient installation
and unified setup and access by users in a campus
network – tools like ‘‘The depot’’ [3], [18], depot-lite
[7], opt_depot [24], ldd [4], lude [8], and beam [19]
fall in this category; and tools that are designed to
distribute software, like rdist [1], fdist [6], mirror
[25], track [9], sup [2], and SPUDS [5].
All of these tools lack efficient incremental software
distribution over slow WAN links. Many make bad
assumptions about the set-up of software packages,
imposing restrictions that are too tight to work in
our multi-vendor system environment. None cares
about controlling the transmission in terms of media,
scheduling, interruption, or measurement and self-
adaptation. No roll-back support exists.
`Commercial Tools
Some commercial data distribution tools exist,
as well as software management tools that provide
additional functions to cover a broader range of the
software life-cycle, e.g., packaging, installation, and
de-installation. XFer from ViaTech, MLINK/ACM
& DistribuLink from Legent, Tivoli/Courier from
Tivoli, and DSM-SAX from SNI fall more into the
data distribution category, whereas HP OpenView
Software Distributor from Hewlett-Packard and Sun-
DANS from Sun Microsystems [13] fall into the
latter category.
`
`
`
`
`
GUIs and object-oriented methods and policies
ease software packaging and automate distribution
and gathering tasks. Typical application fields for
these commercial tools are large companies with
diverse offices and many client machines, like finan-
cial or insurance companies.
All commercial tools claim to comply with stan-
dards, although it is sometimes hard to tell which
standard they mean. Only one commercial tool pur-
ports to be compliant with the draft of the POSIX stan-
dard 1387.2 (formerly 1003.7.2), Software Adminis-
tration.
We found it difficult to explain exactly what
we mean by incremental update to some tool ven-
dors. No package had proper mechanisms built in.
Incremental updates can be added to most commer-
cial tools by writing scripts. Price is also a problem.
Truly powerful tools don't start below $50,000 just
for the licenses; add an equivalent amount for in-
stallation, customization, maintenance, and updates.
One benefit of commercial tools is to reduce the
required skill and cost of the personnel at remote
sites. This does not apply to our few, demanding
development sites.
Usage of commercial tools is problematic if
you want to establish links to external companies.
You need to buy licenses, and so does your partner.
All evaluated distribution software tools require that
daemons are running on both target and source
locations, and you have to pay a license fee per
master and per client.
It is foreseeable that not all companies (esp.
small consulting groups) are willing or able to spend
the extra money and the extra effort of installation.
Therefore it is very important for us to have a tool
that can be used without any restrictions, at least on
the client side.
Two software packages looked promising, though:
Tivoli/Courier
Tivoli Systems implemented an extensive col-
lection of system administration tools. One of them,
Tivoli/Courier, allows automatic software distribu-
tion and control of server and workstation
configuration. Tivoli/Courier is embedded into the
Tivoli Management Framework. Together with
other tools this forms a complete management
environment. A graphical user interface and an
object-oriented approach allow easy maintenance of
large systems. Tivoli/Courier allows defining
software packages and different styles of scheduling,
and specifying which files are updated at what time.
Scripts can be added to customize the management
environment to special requirements.
Tivoli/Courier does not fit our requirements
with respect to incremental distribution as we need it
(it could be implemented by external scripts). It was
not clear if we could run Tivoli/Courier stand-alone
without the framework or the other system adminis-
tration tools. A direct network link is mandatory.
All hosts involved in the update process need
licenses, and the software has to be installed as root.
Reference customers seem to have a different profile
(many hosts to update, static package design, smaller
volume) than we have.
If we already had the Tivoli Management Environ-
ment in productive use as the basis of our system
administration, it would make sense to evaluate the
performance of Tivoli/Courier. For the time being, it
would be too costly to implement the Management
Environment just to use the Tivoli/Courier part.
`XFer
`
XFer from ViaTech Technology was the second
tool we evaluated very closely. XFer is optimized to
solve the standard software distribution task: update
many hosts spread over the globe with packages of
(in our opinion) modest size and well-known struc-
ture. Compression and packaging of updates are
standard. Packages, hosts, and other resources are
objects and can be managed efficiently. Machines
can be grouped in ‘‘profiles’’. These profiles allow
sending updates to machines which require certain
software, regardless of type and location.
But at the time of the evaluation there was no
support built in for incremental distribution (in our
terms). The high entry costs would pay off for many
client hosts, but not for the few servers we run.
Database and protocol overhead are not known.
XFer would profit if installed together with coopera-
tive network, user administration, and configuration
management tools, which are not available at our
sites.
`POSIX 1387.2: Software Administration
P1387.2 provides the basis for standardized
software administration. It includes commands to
install, remove, configure, and verify software. A
distribution format (install image) of software is
defined, along with commands to create and to merge
distribution images. Provision is also made for
tracking what software is installed and what its level
is. Commands may be directed on one system to
occur on any number of systems throughout your
network.
There is a set of concessions to operating sys-
tems not based on POSIX.1 and POSIX.2, and there
are exception conditions, so that systems such as
DOS can conform to P1387.2.
`Status
`P1387.2 has passed its ballots within IEEE and
`has become the first POSIX system administration
`project to complete its work. Now P1387.2 has to
`be approved by the IEEE Standards Board, registered
`by ISO as a Committee Draft, and soon will be bal-
`loted as a Draft International Standard.
`
`186
`
`1995 LISA IX – September 17-22, 1995 – Monterey, CA
`
`Page 7 of 15
`
`
`
`
Copies of the current draft, P1387.2/Draft 14a,
April 1995, are available from the IEEE Computer
Society and from the IEEE Standards Office. A pre-
vious version of the draft (P1387.2/Draft 13) is also
available by anonymous ftp [16]. See [15] and [23]
for a more detailed discussion of the status of
P1387.2 as of April 1995.
Suggestions for follow-on activity include a
guide to best use of the current standard, a profile
for DOS (and related) systems, version and patch/fix
management, policies in distributed management
(especially related to the definition of the ‘‘success’’
of an operation) and associated recovery policies, file
sharing and client management, hierarchical distribu-
tion, scheduling, and queuing and queue manage-
ment.
Because P1387.2 does not specify the means by
which distributed functions occur, the System
Management Working Group at X/Open is working
to provide the necessary specifications to permit dis-
tributed interoperability.
`Relevance
P1387.2 focuses on the distribution of software
by installation. Software service (patching software)
has explicitly been left out of the standard, because
the existing schemes currently in use were too diverse.
A possible solution is described in the rationale,
though.
Once ISVs and all our internal developers of
software, technology, and library data use P1387.2
for their products, initial installation of software
might become easier (or at least more similar).
Hopefully even software service (i.e., applying
patches) will eventually become standardized.
However, unless all changes to all our
software are done using standard procedures, we will
have to clone file servers. Even if we were able to
install all changes in a standard way, we would need
some kind of queuing, so that we can first test the
changes before all servers are updated. Updates
would have to be scheduled for night time. We
don't believe that there will be a standard or a com-
mercial product for cloning file servers any time
soon.
`
`OpenDist
No available tool met all of our important require-
ments, so we decided to implement our own tool set.
OpenDist consists of administrative tools taking care
of scheduling updates, statistical tools to report
changes and performance of updates, and distribution
tools that do the actual update. Design goals are modu-
larity and flexibility. The tools should be indepen-
dently usable entities, easily exchangeable, and
should work together in changeable configurations.
All tools are implemented in Perl [12], Version 5.
Wherever possible, existing standard tools are used,
e.g., GNU tar (gtar (1)), gzip, etc.
`
All tools currently use a command-line inter-
face. A more convenient graphical user interface for
casual users will be added using TkPerl or an HTML
browser.
`Administrative Tools
These manage the scheduling of updates and
call the distribution tools. Software administrators
can subscribe and unsubscribe to software packages,
query the software package database for information
about each software package, temporarily postpone
the update of selected packages, force an immediate
update of selected packages, or roll back updates.
There are tools to browse the update history and to
retrieve information about the status of each file
server and software package.
Information about our software packages is
stored in a flat (ASCII) file database, and includes:
name and purpose of the package, status (test, old,
current), dependencies between packages, grouping of
packages into bundles, recommended update fre-
quency, contact information for the package maintainer,
etc.
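A flat-file database of this kind might look as follows; the field layout and the entries are our illustration, not the actual OpenDist schema:

```shell
db=$(mktemp)
# One colon-separated record per package:
# name, status, update frequency, maintainer (hypothetical fields).
cat > "$db" <<'EOF'
X11R6:current:daily:jdoe
celllib:test:weekly:asmith
techdata:current:daily:mmeier
EOF
# Query: which packages are distributed daily?
awk -F: '$3 == "daily" {print $1}' "$db"
rm -f "$db"
```

A plain text format keeps the database readable, diffable, and editable with standard tools, which fits the "use existing standard tools" design goal.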
`Statistical Tools
These display performance information.
Transfer rates and update duration are indicators of
bottlenecks and problems with WAN lines. We
need early indicators to be able to upgrade our net-
work in a timely manner. The update history can be
shown, as well as the current and historic free-disk
status on the file servers.
`Distribution Tools
These distribute software packages, replacing
rdist in our environment. The tools are optimized
for low-speed links with high latency. Software
updates are transmitted in a compressed format.
They try hard to never leave a package in an
inconsistent state, i.e., an update must be completed
or has to be rolled back. Pre- and post-processing
scripts are supported, e.g., to save files before updating
or to restart a license server after updating.
`
`The OpenDist Distribution Engine
The OpenDist distribution engine is imple-
mented in several independent stages. Each stage
has clearly defined input and output data formats.
Tools can easily be combined and exchanged as long
as the interface does not change. The different steps
can be pipelined and performed in parallel when
updating several software packages, to further speed
up the update and to optimize resource usage. For
lines that do not support independent data transfers
in both directions, the sending of updates and the
retrieving of index files should not be done in paral-
lel.
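Pipelining the stages means archiving, compression, and transfer overlap instead of running one after the other. A local sketch (a real run would put rsh/ssh between the compress and decompress stages):

```shell
set -e
src=$(mktemp -d); dst=$(mktemp -d)
mkdir "$src/pkg"
echo "payload" > "$src/pkg/file1"
# Archive | compress | (network would go here) | decompress | unpack:
# every stage starts as soon as the previous one produces data.
tar cf - -C "$src" pkg | gzip -c | gzip -dc | tar xf - -C "$dst"
cat "$dst/pkg/file1"
rm -rf "$src" "$dst"
```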
`
`For our important packages we run the update
`once a day on the distribution server. The distribu-
`tion process is controlled and mainly run on this
`
`
`
`
`
server. We do not use the master file server but a
dedicated machine as the distribution server. This
server must have direct access to all file servers.
`The index files and archives are temporarily stored
`on a holding disk.
`The Update Flow
`A package is updated in the following steps:
• INDEX
`Create an index of the software package on
`the master and all slave servers. Compress
`and transfer to the distribution server’s data-
`base.
• COMPARE
`Compare the master index against each slave
`index. Output a list of changed file attributes.
`Exceptions are handled here.
• BUILD-ARCHIVE
`Build a compressed