Software structure and distribution • How to share your code with the rest of the world • Why it has to be… • AA: Kurtis Heimerl (kheimerl@cs) • AB: YongChul Kwon (yongchul@cs)
Overview • You’ve already “shared code” • Example 1: sharing your code with yourself • You write a method as part of some application • You invoke it from multiple places • Example 2: sharing someone else’s code • You run Firefox • There are many techniques in between • The spectrum illustrates some design principles (or at least issues)
General Issues • How hard is it for the user to make use of the shared code? • What “environment” is required? • How wide is the potential audience? • How flexible is the code? • What decisions are hard coded? • What can be customized by each user? How? • Are “easy to use” and “flexible” antonyms? • To what extent does the shared code help the user debug misuse? • Not just “if you use it correctly, you get the following results,” but also anticipating common problems and helping users get over them • Example: documentation vs. hotline vs. open source vs. informational error messages • How efficient is it to simply invoke the code?
Basic Options • Distribute an application • Example: Firefox • Distribute object (compiled) code: • Example: standard C library (Java API) • Distribute source: • Example: skeleton code in assignments • We’ll assume code is written in C, but for many of the issues it really doesn’t matter (for some it does).
Why C? - portability • A C compiler is usually the first compiler available when a new platform is established • Omnipresence • Fast • If it is well-written • Source level compatibility • Reuse source code • Philosophy of the UNIX family • Compile & Install & Run on a new platform • Binary level compatibility? • Any ideas? • Talk about it later
Why source level compatibility? • Because there are more than 150 OSes: 1BSD/2BSD/3BSD/4BSD/4.4BSD Lite 1/4.4BSD Lite 2/386 BSD/Acorn RISC iX/Acorn RISC Unix/AIX/AIX PS/2/AIX/370/AIX/6000/AIX/ESA/AIX/RT/AMiX/AOS Lite/AOS Reno/ArchBSD/ASV/Atari Unix/A/UX/BOS/BRL Unix/BSD Net/1/BSD Net/2/BSD/386/BSD/OS/CB Unix/Chorus/Chorus/MiX/Coherent/CTIX/CXOS/Darwin/Debian GNU/Hurd/DEC OSF/1 ACP/Digital Unix/DragonFly BSD/Dynix/Dynix/ptx/ekkoBSD/Eunice/FireFly BSD/FreeBSD/GNU/GNU-Darwin/Gnuppix GNU/Hurd-L4/HPBSD/HP-UX/HP-UX BLS/IBM AOS/IBM IX/370/Interactive 386/ix/Interactive IS/IRIX/Linux/Lites/LSX/Mac OS X/Mac OS X Server/Mach/MERT/MicroBSD/Mini Unix/Minix/Minix-VMD/MIPS OS RISC/os/MirBSD/Mk Linux/Monterey/more/BSD/mt Xinu/MVS/ESA OpenEdition/NetBSD/NeXTSTEP/NonStop-UX/Open Desktop/Open UNIX/OpenBSD/OpenServer/OpenSolaris/OPENSTEP/OS/390 OpenEdition/OS/390 Unix/OSF/1/PC-BSD/PC/IX/Plan 9/Plurix new/PWB/PWB/UNIX/QNX/QNX RTOS/QNX/Neutrino/QUNIX/ReliantUnix/Rhapsody/RISC iX/RT/SCO UNIX/SCO UnixWare/SCO Xenix/SCO Xenix System V/386/Security-Enhanced Linux/Silver OS/Sinix/Sinix ReliantUnix/Solaris/SPIX/SunOS/Triance OS/Tru64 Unix/Trusted IRIX/B/Trusted Solaris/Trusted Xenix/TS/Tunis/UCLA Locus/UCLA Secure Unix/Ultrix/Ultrix 32M/Ultrix-11/Unicos/Unicos/mk/Unicos/mp/Unicox-max/UNICS/UNIX 32V/UNIX Interactive/UNIX System III/UNIX System IV/UNIX System V/UNIX System V Release 2/UNIX System V Release 3/UNIX System V Release 4/UNIX System V/286/UNIX System V/386/UNIX Time-Sharing System/UnixWare/UNSW/USG/Venix/Xenix OS/Xinu/xMach/z/OS Unix System Services/ (from http://www.levenez.com/unix/) • Number of required distributions • Number of binary distributions = SUM(number of architectures supported by each OS) • Number of source distributions = 1 • But users have to compile it on their own systems
Option 1: Distribute a .exe • What does the user have to do to take advantage of your code? • How many different types of systems can your code run on? • What assumptions does your code make about the configuration of the user’s machine?
Option 2: Distribute a library • What does the user have to do to take advantage of your code? • How many different types of systems can your code run on? • What assumptions does your code make about the configuration of the user’s machine?
Library • Real life • You want to solve a very difficult equation • Will you devise a new numerical analysis method? • Programming • You want to store 1 GB of data in a B+-tree • Will you write your own B+-tree? • Save your time! • There are well-defined numerical analysis methods • There are plenty of B+-tree implementations • In a “library”
A library • IS • A file that contains a collection of precompiled functions • HAS • An index of the functions in the library • COMES WITH • Header files that let the C compiler type-check parameters and return types • A header file has • Definitions of data structures • Declarations of functions
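A minimal sketch of the header/library split described on this slide, collapsed into one file for brevity. The names `point_t` and `point_add` are hypothetical and used only for illustration; in a real library the first part would sit in a header file and the second part would be precompiled into the library file.

```c
/* --- what would live in the header (hypothetical point.h) --- */

/* Definition of a data structure shared by the library and its users. */
typedef struct {
    int x;
    int y;
} point_t;

/* Declaration only: lets the compiler type-check every call site. */
point_t point_add(point_t a, point_t b);

/* --- what would be precompiled into the library --- */

point_t point_add(point_t a, point_t b)
{
    point_t sum = { a.x + b.x, a.y + b.y };
    return sum;
}
```

A caller that includes only the declaration can still be compiled to an object file; the call to `point_add` is one of the "holes" the linker fills in later.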
Link library with program • Before linking • Your object file (.o) contains a lot of holes where you make function calls • Linking • Appends the compiled functions to your binary • Fills in the holes with the functions' addresses in your binary • Does every program need its own copy of the printf() code?! • No.
Shared library • Some functions are too common to copy everywhere • Why don’t we share the code instead of copying it into every program? • Shared library • Only one copy on disk • The linker specially tags the holes • The loader fills in the holes while your program is being loaded into memory • Reference • http://en.wikipedia.org/wiki/Dynamic_linking
Writing code - functions • Can we use any function in our programs? • Standard C library • POSIX • What if we use platform-dependent functions? • BSD 4, UNIX System V • GNU • Linux/Mac OS X/Win32 • Your program won’t compile on other platforms!
Writing code - functions • But the non-standard function I used is so convenient! • I know. But then you sacrifice portability • What shall we do then? • Implement our own • Use a portable library • Example • bzero(char *, int) • BSD-specific. Fills the given buffer with zeros • How can we fix it?
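One possible answer to the question above is to implement our own version on top of a function every C platform provides. This is a sketch, not the slides' own solution; the name `my_bzero` is hypothetical, while `memset` is part of the standard C library.

```c
#include <stddef.h>
#include <string.h>

/* Portable stand-in for the BSD-specific bzero(): zero n bytes at buf.
 * memset() is standard C, so this compiles on any conforming platform. */
void my_bzero(void *buf, size_t n)
{
    memset(buf, 0, n);
}
```

Call sites then use `my_bzero(buffer, sizeof buffer)` wherever they previously used `bzero`.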
Writing code – allowing customization • How can a user interact with your program? • No, a GUI is not the right answer here. (A GUI is an application all on its own…) • Command line • Is the command line enough? • User ID/Current working directory/Home directory/OS/Hostname/Path/Login shell/Language/… • This information is useful to every program you run • Environment variables • NAME=VALUE pairs • Define the runtime environment per user
Environment variables • System-wide environment • OS/HOSTNAME • User-specific environment • HOME/PWD/TERM/SHELL/PROMPT/LANG/LOCALE • System-wide default environment • PATH/MANPATH/PAGER/LANG/LOCALE • Can be overridden by the user • Programs can use this implicit information • Default behavior • They are passed down to the processes you invoke • By your shell!
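A sketch of how a program might combine these sources when choosing a setting such as a pager. The precedence shown (explicit command-line value, then environment variable, then built-in default) is a common convention assumed here, not something the slides prescribe, and `resolve_setting` is a hypothetical helper name.

```c
#include <stdlib.h>

/* Pick a setting: an explicit command-line value wins, then the
 * environment variable, then the built-in default. */
const char *resolve_setting(const char *cli_value,
                            const char *env_name,
                            const char *fallback)
{
    const char *env_value;

    if (cli_value != NULL)
        return cli_value;

    env_value = getenv(env_name);   /* NULL when the variable is unset */
    if (env_value != NULL && env_value[0] != '\0')
        return env_value;

    return fallback;
}
```

For example, `resolve_setting(NULL, "PAGER", "more")` uses the user's PAGER if it is set and falls back to `more` otherwise.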
How can we use them? • [Figure: process memory layout, in order: CODE, DATA, HEAP, STACK, ARGV, ENV, KERNEL] • By using getenv(3), setenv(3) • getenv is standard C; setenv and putenv are POSIX functions, not part of the C standard • Where are the environment variables stored? • Usually right beyond your stack, next to argv • Have a look at execve(2): int execve(const char *filename, char *const argv[], char *const envp[]);
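To make the NAME=VALUE layout concrete, here is a sketch of a lookup over an array shaped like the `envp` argument of execve(2). The function name `lookup_env` is hypothetical; real programs would normally just call `getenv`.

```c
#include <stddef.h>
#include <string.h>

/* Scan a NULL-terminated array of "NAME=VALUE" strings, the same layout
 * execve() receives as envp, and return the VALUE part for name.
 * Returns NULL when no entry matches. */
const char *lookup_env(char *const envp[], const char *name)
{
    size_t name_len = strlen(name);
    size_t i;

    for (i = 0; envp[i] != NULL; i++) {
        /* Match "NAME" followed immediately by '='. */
        if (strncmp(envp[i], name, name_len) == 0 && envp[i][name_len] == '=')
            return envp[i] + name_len + 1;
    }
    return NULL;
}
```

Checking for the `=` right after the name keeps a lookup of `HOM` from accidentally matching `HOME=...`.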
Problem: tools • Now we need to build our programs • What do we use? • At least a preprocessor, C compiler, assembler, and linker • Libraries • Platform-specific functions • Problem? • What if the names of the programs are different? • What if the paths to the programs are different? • What if the programs accept different options? • What if the versions of the programs are different? • …
More problems: source • We have implemented our own functions for portability • What if the platform has the function already? • What if the platform does not have the function? • We have used a library for portability • What if the library does not exist? • What if the platform has a different library with the same interface? • What if the header and library files are in a different location?
Solution to source problem • How can we address the source-related problems? • By conditional compilation • #define, #if, #else, #elif, #endif • Typical usage • #if HAVE_FUNC1 Use func1 • #else Use our own implementation • #endif • Define constants that record whether specific features are available on the current system • Results in a huge header file which contains this information
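A sketch of the typical usage above, applied to the earlier bzero example. `HAVE_BZERO` is a hypothetical feature macro and `clear_buffer` a hypothetical name; the assumption is that a generated header or a compiler flag defines the macro on platforms that provide `bzero`.

```c
#include <stddef.h>
#include <string.h>

/* HAVE_BZERO is a hypothetical feature macro. A generated config.h (or a
 * -DHAVE_BZERO=1 compiler flag) would define it on platforms that ship
 * bzero(); when it is undefined, #if treats it as 0. */
#if HAVE_BZERO
#include <strings.h>            /* platform provides bzero() */
#define clear_buffer(buf, n) bzero((buf), (n))
#else
/* Our own implementation, built only where bzero() is missing. */
static void clear_buffer(void *buf, size_t n)
{
    memset(buf, 0, n);
}
#endif
```

The rest of the program calls `clear_buffer` everywhere and never needs to know which branch was compiled.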
Solution to tool problem • We can use variables in the Makefile • Replace the names of tools with variables • gcc -o myprog myprog.o myprog1.o • $(CC) -o myprog myprog.o myprog1.o • Append include path and library path • gcc -I/usr/local/myprog/include -o myprog … • $(CC) $(INC_PATH) -o myprog … • Now we only need to fill in the variables
Set up the build environment • We need to • Construct a header file of feature lists • Fill in the variables of the Makefiles • Nothing we can do except • Collect information about the build system • Generate a customized Makefile • Wait… this is just what everybody does! • GNU auto* tools • autoconf • automake • libtool • Save a huge amount of time in this process
Autoconf • Autoconf produces a configure script that • Collects information about the current system • Generates a header file (config.h) • Generates Makefiles from templates (Makefile.in) • What should we do? • Describe the libraries, tools, and features you use in configure.in • Run autoconf; it will generate a shell script named configure • The script does all the dirty work
Automake • Automake • Generates the Makefile.in templates from Makefile.am; configure then turns them into customized Makefiles using the information it collected • What shall we do? • Write a template Makefile.am in every directory under your source tree • Specify the templates in configure.in • Run automake
Ready to distribute • Congratulations! • We have got a portable source distribution! • Now users can compile & install our program by typing • ./configure • collects information • make • compiles the source code • make install • copies binaries and documents to the proper locations
Kernel compile? • Initialize the build environment • make mrproper • Collect information • make menuconfig • Compile • make or make bzImage • Install • make install • Exactly what we have covered!
Pros & Cons • Source distribution • Binary distribution
Case study: Java • Can we distribute binary? • Why? • Do we need to distribute source code? • Why? • Do we need to collect information of the system? • Why? • Are there concepts of header file and library file? • What are they? • Are there any concepts similar to environment variable? • What are they?
Case study: Windows • Can we distribute binary? • Why? • Do we need to distribute source code? • Why? • Do we need to collect information of the system? • Why? • Are there concepts of header file and library file? • What are they? • Are there any concepts similar to environment variable? • What are they?
Case study: Java • Can we distribute binary? • Yes, as long as the classes are compiled to a bytecode version no newer than the one the user's JVM supports. • Do we need to distribute source code? • Not necessary. But it helps people who want to optimize the program for their environment • Do we need to collect information of the system? • Yes. Especially if the program interacts with other programs, or to check whether the required class libraries are present • Are there concepts of header file and library file? • Each class file contains all information about return types and parameter types as well as the binary code • The compiler can do type-checking if it can find the classes • The classloader does loading and linking at run time • Are there any concepts similar to environment variable? • java.lang.System.getProperty() is analogous to getenv() • java.lang.System.setProperty() is analogous to setenv() or putenv()
Case study: Windows • Can we distribute binary? • Yes. Because Windows supports only the x86 architecture. • Well, let’s ignore Windows for the Alpha platform • Do we need to distribute source code? • Not necessary. But it helps people who want to optimize the program for their environment. • Do we need to collect information of the system? • Not necessary unless the program requires other programs or libraries. • Are there concepts of header file and library file? • Yes. It is basically written in C/C++. • Are there any concepts similar to environment variable? • There are environment variables just like UNIX. Type ‘set’ in cmd.exe • Registry