Last updates: Thu Dec 1 10:00:21 2005 ... Tue Mar 20 09:34:53 2018
What Ada compilers are available?
The only Ada compiler that we have is GNU gnat, available only on GNU/Linux x86 and x86-64 systems.
What C compilers are available?
Almost all of our systems have a vendor-provided traditional C compiler named cc, and a 1989 ISO Standard C compiler named c89. Oracle/Sun Solaris, and most current GNU/Linux and BSD systems, also have a 1999 ISO Standard C compiler named c99.
All of our systems also have the GNU C compiler, named gcc. On GNU/Linux, FreeBSD, Mac OS X, OpenBSD, and NetBSD systems, the native compiler cc is just a version of the GNU C compiler, and usually older than the gcc that is installed in /usr/local/bin, or on newer systems, /usr/uumath/bin.
The GNU C compiler is normally version 4.x.y, or on older systems, 3.x.y or 2.95.3. However, several systems also have newer versions named gcc-5.x.y, gcc-6.x.y, gcc-7.x.y, or gcc-8.x.y. Many systems have additional versions of the GNU C compiler; list them like this:
% ls /usr/local/bin/*gcc*
% ls /usr/uumath/bin/*gcc*
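The same listing can be wrapped in a small function for sh-compatible shells that skips directories that do not exist on a particular system. This is only a sketch: the function name list_compilers is our own invention for illustration, not an installed command.

```shell
# Print every file whose name contains the string $1 in each of the
# directories given as the remaining arguments.
# list_compilers is a hypothetical helper name, not an installed command.
list_compilers() {
    name=$1
    shift
    for dir in "$@"
    do
        # Skip directories that do not exist on this system.
        [ -d "$dir" ] || continue
        for file in "$dir"/*"$name"*
        do
            # An unmatched wildcard is left unexpanded, so test for existence.
            [ -e "$file" ] && echo "$file"
        done
    done
    return 0
}

list_compilers gcc /usr/local/bin /usr/uumath/bin
```

Replacing gcc with another string, such as clang or g++, lists the versions of other compilers in the same way.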
As of early 2007, GNU/Linux x86-64 and IA-32 systems, and FreeBSD 6.0 IA-32, have dgcc, an extended version of the GNU C compiler with support for decimal floating-point arithmetic, a feature that has been proposed for addition to future ISO Standards for C and C++; see the /usr/local/src/dectest/doc directory for documentation. Support for other platforms will follow as soon as the compiler team completes the necessary development work.
The dgcc compiler itself recognizes three new decimal data types, but comes with no library support, so all that it can do is provide add, subtract, multiply, and divide operations on decimal data. The mathcw library provides the necessary library support with a complete C99 function repertoire, and all functions are documented in section 3cw of the local manual pages. For example,
% man -s 3cw sqrt
provides a description of the supported binary and decimal floating-point data types, and how to compile and link code that uses them. Decimal floating-point suffixes for constants are DF (32-bit format, 7 decimal digits), DD (64-bit format, 16 decimal digits), and DL (128-bit format, 34 decimal digits). The mathcw library manual pages for scanf() and printf() describe the new input/output format extensions, which also include support for digit grouping, exponent-width control, based numbers, and binary and octal floating-point data representations. A book about the new library is available.
The decimal floating-point support underneath the dgcc compiler and mathcw library is provided by the IBM decNumber C library, which is based on 25 years of experience with decimal arithmetic in software in the Rexx and NetRexx programming languages, and forms the basis of firmware and hardware implementations of decimal arithmetic in IBM zSeries mainframes (2006) and IBM PowerPC CPUs (2007).
Apple Mac OS X, Compaq/DEC OSF/1 Alpha, GNU/Linux x86-64, GNU/Linux IA-32, GNU/Linux IA-64, SGI IRIX MIPS, and Oracle/Sun Solaris SPARC systems also have the AT&T/Princeton C compiler, lcc. That compiler is small and fast, and has excellent diagnostics, especially with the -A -A command-line option. It conforms strictly to the 1989 ISO Standard, and as a result, will not compile certain system header files that rely on compiler-specific C-language extensions. Nevertheless, if you can get your code to compile successfully with it, then you can be fairly confident of its portability to other systems.
Some systems (GNU/Linux IA-32 and x86-64, Mac OS X) have the LLVM clang and clang++ compilers and static analyzer, and some even have several different versions that you can list like this:
% ls /usr/local/bin/clang*
% ls /usr/uumath/bin/clang*
The DotGNU C/C# compiler, cscc (available on most of our GNU/Linux, FreeBSD, and Solaris 10 SPARC systems) compiles C and C# code into executables that can be run in the Common Language Infrastructure (CLI) virtual machine, implemented in Microsoft .NET Framework on Microsoft Windows, the Mono Project mono on Unix, and the DotGNU ilrun on Unix.
All of the GNU/Linux x86-64, IA-32, and IA-64 systems, and some of the Mac OS X Intel-based systems, have a relatively old version of the Intel C/C++ compiler, icc.
All of the GNU/Linux CentOS 6 and 7 x86-64 systems have the Portland Group C / C++ / Fortran compiler family, pgcc and pgc++ (same as pgCC).
GNU/Linux systems on IA-32, x86-64, and IA-64 have the Linux Standard Base (LSB) C compiler, lsbcc. It is a binary wrapper around a recent version of gcc that provides a much-stricter compilation and run-time environment intended to provide binary application portability across multiple O/Ses on the same CPU platform. If your code will compile, build, and validate with this compiler, you have achieved a worthy level of portability. The associated lsbappchk tool can be used to check LSB conformance of compiled executables and shared libraries.
The GNU/Linux IA-32 and x86-64 systems also have the Solaris Studio compilers (c89, cc, c99, CC, f77, f90, and f95) ported from Solaris to those operating systems. Because their names conflict with already-installed compilers, they are not in the default system search path, PATH. To use them, we recommend that you run them by explicit pathname, e.g., /usr/local/ashare/sys/solstudio12.2/bin/cc, rather than adjusting the search path, so as not to hide vendor-provided compilers, or better, use local aliases (solc89, solcc, solc99, solCC, solf77, solf90, and solf95).
Because of the use of the C99 data type long long in many GNU/Linux header files, it is unlikely that solc89 will be usable on such systems, since it rejects that data type. Use solcc or solc99 instead.
Stable versions of the Oracle/Sun Solaris SPARC and IA-32 compilers are available under the names solc89, solcc, solc99, solCC, solf77, solf90, and solf95.
The GNU/Linux x86-64, IA-32, and IA-64 systems have the opencc compiler from the Open64/PathScale Project, which is a derivative of the venerable SGI MIPS C compiler.
The GNU/Linux x86-64 and IA-32 systems have the nvcc compiler, which nVidia developed from the Open64/PathScale Project. That compiler is capable of generating code to use the high-performance floating-point units in nVidia graphics display processors that are available on some of our systems. It should be noted, however, that most current graphics processors do not implement arithmetic that fully conforms to the IEEE 754 Standard. More information is available at the nVidia CUDA Project site. Copies of documentation files are available locally in the directory /usr/local/src/cuda/.
GNU/Linux, OpenBSD, and Solaris (all on IA-32), and Solaris on SPARC, have pcc, a version of the original Portable C Compiler that first made C available outside Unix systems in the late 1970s. That enhanced version supports C99 language features, but does not supply the C99 library. The compiler is also installed on several other platforms, but may not be fully operational on them.
The IA-32, IA-64, and x86-64 GNU/Linux systems have the tcc tiny C compiler. It is approaching C99 conformance, and is amazingly fast: 15 times faster than icc, 10 times faster than gcc, and 4 times faster than lcc, in tests with large libraries of code. It generates only code for IA-32 and x86-64, but the resulting IA-32 executables run without problems on IA-64 as well.
The GNU/Linux (x86-64/EM64T, IA-32, and IA-64) and SGI IRIX systems have the upc compiler, a Unified Parallel C extension of gcc.
The Solaris SPARC systems and GNU/Linux (x86-64, IA-32, and IA-64) have the ch interpreter for C/C++, with a built-in help system for the syntax of those languages.
The IA-32 GNU/Linux systems have the cyclone compiler. Cyclone is a C-like language that introduces language extensions for type-safe programming. Some C programs compile without problems with the Cyclone compiler, and if they do not, then the errors reported may well indicate unsafe coding practices that should be repaired.
What C++ compilers are available?
The C++ language has been in development since 1979, but its evolutionary path has meandered a lot, so it was not until 1998 that an ISO C++ Standard appeared (and was updated in 2003). Consequently, our C++ compilers each represent a different snapshot of that evolution, and portability of C++ code suffers seriously. Even a simple Hello, world program needs about three different versions to work on all of our systems.
Regrettably, the complexity of the C++ language has delayed compiler and library development, so even in late 2005 when this sentence was written, almost none of our C++ compilers claim conformance to the 1998 ISO C++ Standard. However, the Intel and Oracle/Sun compilers claim full conformance, and the very recent GNU compilers are fairly close to conformance.
All of our systems have a vendor-provided traditional C++ compiler named either CC, or on GNU/Linux, Mac OS X, NetBSD, and OpenBSD, c++.
There is regrettably no standard file-naming convention for C++ source and header files. Some systems expect suffixes .C and .H, most expect .cc and .hh, and a few may also accept .cxx and .hxx. The first style is highly undesirable, because it does not port to case-insensitive filesystems, such as those of Microsoft Windows and older Mac OS systems.
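To see whether a source directory already contains names that differ only in lettercase, such as hello.C and hello.c, a short pipeline of standard Unix tools is enough. The sketch below is for sh-compatible shells, and the function name case_collisions is hypothetical.

```shell
# Read file names, one per line, on standard input, and print (folded to
# lowercase) each name that occurs more than once when case is ignored.
# case_collisions is a hypothetical helper name.
case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Check the names in the current directory; no output means no collisions.
ls | case_collisions
```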
All of our systems also have the GNU C++ compiler, named g++. On GNU/Linux, FreeBSD, Mac OS X, OpenBSD, and NetBSD systems, the native compiler c++ is just a version of the GNU C++ compiler, and is usually older than the g++ that is installed in /usr/local/bin.
The GNU C++ compiler is normally version 4.x.y or, on very old systems, 3.x.y or 2.95.3.
The Oracle/Sun Solaris CC compiler ported to GNU/Linux IA-32 and x86-64 is available; see the FAQ entry for C compilers.
The GNU/Linux x86-64, IA-32, and IA-64 systems all have an old version of the Intel C/C++ compiler, icc. It behaves as a C++ compiler according to the source file extensions, but you can force compilation of C code in C++ mode with the -Kc++ option.
GNU/Linux systems on IA-32, x86-64, and IA-64 have the Linux Standard Base (LSB) C++ compiler, lsbc++. It is a binary wrapper around a recent version of g++ that provides a much-stricter compilation and run-time environment intended to provide binary application portability across multiple O/Ses on the same CPU platform. The associated lsbappchk tool can be used to check LSB conformance of compiled executables and shared libraries.
The SGI IRIX MIPS systems have three additional C++ compilers, DCC, NCC, and OCC. The latter is based on the early practice of translating C++ to C.
The GNU/Linux x86-64, IA-32, and IA-64 systems have the openCC compiler from the Open64/PathScale Project, which is a derivative of the venerable SGI MIPS C++ compiler.
Hint: It is often very illuminating to compile C code with C++ compilers, since they are much stricter than C compilers. It is much better to find problems at compile time, than after a core dump at run time. Although there are a few obscure areas where C and C++ disagree, almost all C code that conforms to the 1989 ISO C Standard should be readily compilable by C++ compilers.
Both the 1998 and 2003 ISO C++ Standards base their run-time library on the 1989 ISO C Standard, so if you use new library features of the 1999 ISO C Standard, you may lose the ability to test your code with C++ compilers, and you will certainly lose portability to those few platforms and systems where there is no C compiler yet for the 1999 ISO Standard.
What C# compilers are available?
The Java-like C# programming language is supported in the Mono Project installations on GNU/Linux x86-64, IA-32, and IA-64 systems, Apple Mac OS X PowerPC systems, and Oracle/Sun Solaris SPARC systems. On the 64-bit x86-64 and IA-64 systems, the current installations are for the IA-32 subset architecture, since we have yet to find native 64-bit versions for our operating systems.
C# is also supported by the DotGNU Portable.NET installations on GNU/Linux on x86-64, IA-32, MIPS, PowerPC, and SPARC, FreeBSD on IA-32, and Solaris 10 SPARC.
Like Java, C# is based on a virtual machine, and the executables, once compiled on any supported platform, will run on any other platform with the DotGNU and Mono virtual machines, or the Microsoft .NET virtual machine. The compiled code is in a format called Common Intermediate Language (CIL), also known as Microsoft Intermediate Language (MSIL).
Unlike older Java virtual machines, which interpreted code at run time, often slowly, with C#, programs are translated to native code at run time by a just-in-time-compiler. They then execute at native speeds. Modern Java virtual machines do much the same.
The Mono C# compiler for language version 1.0 is mcs, and that for version 2.0 is gmcs. Their output files are .exe files for recent Microsoft Windows versions, as well as for the Mono runtime system.
The DotGNU C# compiler is cscc.
C# programs written for the DotGNU and Mono Project compilers should compile and run without change with the Microsoft C# compiler, csc, and the AT&T/Princeton C# compiler, lcsc.
Here is an example of compilation and execution of a simple program on a local Unix system:
% cat hello.cs
using System;
class Hello
{
    static void Main()
    {
        Console.WriteLine("Hello, World from C#");
    }
}
% mcs hello.cs
% ls -l hello.*
-rw-rw-r-- 1 jones devel  117 Mar 13 19:07 hello.cs
-rwxr-xr-x 1 jones devel 3072 Mar 13 19:05 hello.exe
% mono hello.exe
Hello, World from C#
Here is the same example under DotGNU:
% cscc -o hello.exe hello.cs
% ls -l hello.*
-rw-r--r-- 1 jones wheel  110 Apr 19 12:16 hello.cs
-rwxrwxr-x 1 jones wheel 2560 Apr 24 10:34 hello.exe
% ilrun hello.exe
Hello, World from C#
With the DotGNU and/or Mono Project systems installed, it is possible to run any software built for .NET, and there is expected to be a rapidly growing market for that platform.
What Fortran compilers are available?
Most of our systems have compilers for the 1978 (called Fortran 77), 1990, and 1995 ISO Standards, with the expected names f77, f90, and f95.
The expected file suffixes are .f, .f90, and .f95, respectively.
Because Fortran 90 introduced a new (somewhat) free-form source format, the later compilers assume the old Fortran 66 and 77 style 72-column fixed format if the file suffix is .f, and otherwise, expect the free-form format. Some have command-line options to specify the source format: consult the appropriate compiler manual pages for details.
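The suffix rule can be written down as a tiny function for sh-compatible shells. It is only a sketch of the default behavior described above: it ignores compiler-specific command-line options, and the function name fortran_source_form is hypothetical.

```shell
# Print the source form that a Fortran 90/95 compiler is expected to
# assume from the file suffix alone: .f means the old 72-column fixed
# form, and any other suffix means free form.
# fortran_source_form is a hypothetical helper name.
fortran_source_form() {
    case $1 in
        *.f) echo fixed ;;
        *)   echo free ;;
    esac
}

fortran_source_form myprog.f      # prints "fixed"
fortran_source_form myprog.f90    # prints "free"
```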
There are no widespread conventions for suffixes of header files: both .inc and .h have been used.
Most of our systems have the GNU Fortran 77 compiler, g77. Only the GNU/Linux x86-64, IA-32, and IA-64 systems, and the Mac OS X systems, have the GNU Fortran 90/95 compiler, gfortran.
Almost all of our systems have the AT&T Bell Laboratories Fortran-to-C translator, f2c, which handles Fortran 77 only. On Apple Mac OS X, that translator is used internally by f77. Since the translation produces intermediate files with suffix .c, you must take care not to have C and Fortran source files named with a common base name, e.g., myprog.c and myprog.f: compilation on a system that uses f2c would destroy your C source code.
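Before compiling Fortran code on a system where f77 may use f2c, it can be worth checking a directory for such name clashes. The following is a minimal sketch for sh-compatible shells; the function name find_f2c_collisions is hypothetical.

```shell
# Report each .c file in directory $1 (default: the current directory)
# that shares its base name with a .f file, and so could be overwritten
# when f2c translates the Fortran file.
# find_f2c_collisions is a hypothetical helper name.
find_f2c_collisions() {
    dir=${1:-.}
    for fortran in "$dir"/*.f
    do
        # An unmatched wildcard is left unexpanded, so test for existence.
        [ -e "$fortran" ] || continue
        base=${fortran%.f}
        [ -e "$base.c" ] &&
            echo "$base.c would be overwritten by translating $fortran"
    done
    return 0
}

find_f2c_collisions .
```

No output means that no .c file in the directory is at risk.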
We have the NAG Fortran 90 and 95 compilers, nagf90 and nagf95, on Compaq/DEC OSF/1 Alpha, GNU/Linux IA-32 and IA-64, SGI IRIX MIPS, and Oracle/Sun Solaris SPARC. These compilers were probably the first available anywhere for Fortran 90 and 95, and are unusual in that they translate the Fortran source code to C, then invoke the native C compiler to compile it. That makes it possible for the vendor to support them on most common architectures, since there is no machine-dependent backend needed.
All of the GNU/Linux CentOS 6 and 7 x86-64 systems have the Portland Group C / C++ / Fortran compiler family, pgf77, pgf90, and pgf95.
The Oracle/Sun Solaris f77, f90, and f95 compilers ported to GNU/Linux IA-32 and x86-64 are available; see the FAQ entry for C compilers.
The GNU/Linux x86-64 systems have the PathScale Fortran compilers, pathf90 and pathf95. See the remark above about setting the run-time library path for these compilers.
The GNU/Linux x86-64, IA-32, and IA-64 systems have the openf90 and openf95 compilers from the Open64/PathScale Project, which are derivatives of the venerable SGI MIPS Fortran compilers.
The GNU/Linux x86-64, IA-32, and IA-64 systems, and some of the Mac OS X Intel-based systems, have the Intel Fortran compiler ifort.
The GNU/Linux x86-64 systems have the gfortran compiler.
On GNU/Linux x86-64 systems, only gfortran supports REAL*10 and real (kind = 10) :: (80-bit IEEE 754 arithmetic in hardware).
On GNU/Linux x86-64 systems, only gfortran, ifort, solf90, and sunf90 support REAL*16 (128-bit IEEE 754 arithmetic in software). You can get automatic precision increase from REAL*8 and DOUBLE PRECISION to REAL*16, without source-code changes, like this:
% gfortran -fdefault-real-8 -fdefault-integer-8 myprog.f && ./a.out
% solf90 -xtypemap=integer:64,real:64,double:128 myprog.f && ./a.out
In 2017, a few systems began to supply a Fortran compiler named flang. Like clang, it is built on top of the LLVM compiler base. We have successfully built and installed it on our CentOS 7, Ubuntu 16, Fedora 27, and TrueOS systems.
Finally, we have the struct and strsf3 converters on some systems: they convert Fortran 66 to Ratfor and SFTRAN3, respectively. SFTRAN3, in particular, is syntactically close to Fortran 77, so the conversion may be helpful in cleaning up really ancient code and moving it forward to the late 1970s. We at one time had a license for the commercial Cobalt Blue Fortran structurer, but it has lapsed for lack of use.
What Fortran-to-C translators are available?
As far as we know, there are only three Fortran-to-C translators in existence:
The first is installed on almost all local systems (only GNU/Linux MIPS lacks it). We no longer have a license for the second. The third is available on GNU/Linux IA-32, IA-64, and x86-64, IRIX MIPS, OSF/1 Alpha, and Solaris SPARC systems.
What Java compilers are available?
Most of our systems have the javac compiler, the javah C header and stub file generator, the javadoc Java API documentation generator, and the java Java interpreter.
Most also have the GNU Java compiler, gcj, and the companion gcjh header-file generator and gij Java interpreter.
However, beginning with version 7 of the GNU compiler family, all support for the Java language, and the gcj compiler, have been removed.
GNU/Linux IA-32, x86-64, and IA-64 systems, and Mac OS X systems, have the IBM Jikes Project compiler, jikes. That compiler requires that the pathname for rt.jar be defined in the CLASSPATH or JIKESPATH variable, or on the command line:
% jikes -classpath /usr/java/jre1.5.0_03/lib/rt.jar Hello.java
% java Hello
Hello, world
If your system does not appear to have a copy of rt.jar, simply make a private copy from a Java directory on some other system, most likely from the directory trees /usr/j* or /usr/local/j*.
GNU/Linux IA-32, x86-64, and IA-64 systems also have the IBM Java implementation. To use it, add the directory /opt/ibm/java-x86_64-60/bin (for x86-64 systems) or /opt/ibm/java-java-i386-60/bin (for IA-32 and IA-64 systems) to the front of your PATH variable (see the example below for another Java implementation). Because new versions are installed as soon as they are released, the directory names shown may be out of date. Run ls /opt/ibm to see what versions are available, adjust the PATH variable setting accordingly, and report the discrepancy in this documentation to systems staff.
GNU/Linux IA-32 and x86-64 systems have the Apache Harmony Open Source Java SE installed. This is a completely independent implementation of the entire Java environment. To use it, set one environment variable, and add the Harmony binary directory to the front of your PATH variable:
# csh and tcsh shells:
% setenv JAVA_HOME /usr/local/ashare/harmony/harmony-jdk-current
% set path=($JAVA_HOME/bin $path)
# sh, bash, ksh, pdksh, and zsh shells:
$ JAVA_HOME=/usr/local/ashare/harmony/harmony-jdk-current ; export JAVA_HOME
$ PATH=$JAVA_HOME/bin:$PATH ; export PATH
When new Harmony releases are installed, old ones are usually preserved. Run the command ls /usr/local/ashare/harmony to see what versions are available. You can switch versions just by changing harmony-jdk-current to a version-specific name, such as harmony-6.0-jdk-917296.
Many of our systems have multiple versions of Java compilers. You can choose a different one either by giving its full pathname, or else by putting its directory first in your PATH variable. On GNU/Linux systems, look for subdirectories under /opt/ibm and /usr/lib/jvm. On Oracle/Sun Solaris systems, look for directories /usr/j*.
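If you switch between Java installations often, repeatedly adding directories to the front of PATH leaves duplicate entries behind. The sketch below, for sh-compatible shells, moves a directory to the front instead. The function name path_with_first is hypothetical, and the directory /usr/lib/jvm/CHOSEN-VERSION/bin is a placeholder, not a real installation: substitute a directory that exists on your system.

```shell
# Print the colon-separated list $2 with directory $1 placed first,
# dropping any later occurrence of $1 from the list.
# path_with_first is a hypothetical helper name.
path_with_first() {
    first=$1
    rest=
    old_ifs=$IFS
    IFS=:
    for dir in $2
    do
        [ "$dir" = "$first" ] && continue
        rest=$rest:$dir
    done
    IFS=$old_ifs
    echo "$first$rest"
}

# Placeholder directory: replace it with a real Java bin directory.
java_bin=/usr/lib/jvm/CHOSEN-VERSION/bin
PATH=$(path_with_first "$java_bin" "$PATH")
export PATH
```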
What Lisp compilers are available?
We support three main dialects of Lisp:
Emacs Lisp is most widely available, since it is part of the highly portable GNU Emacs text editor. We have it on all of our Unix operating systems, including Mac OS X, and also on Microsoft Windows systems where Emacs has been installed. Complete documentation for the language can be found in the elib, elisp, and elisp-intro nodes of the info system, available inside Emacs (type C-h i to enter, ? for quick help, h for a tutorial, and q to quit), as well as in the standalone info and xinfo utilities. It is also described in the books GNU Emacs Lisp Reference Manual (ISBN 1-882114-73-6) and An Introduction to Programming in Emacs Lisp (ISBN 1-882114-56-6).
Emacs Lisp is unusual among modern Lisp dialects in having dynamic scoping instead of lexical scoping, a feature that its author maintains is essential for the behavior of Emacs.
Emacs Lisp is certainly the most widely used of all Lisp dialects: many of the characters that you type in the Emacs text editor result in execution of one or more Lisp functions.
We have at least two implementations of Common Lisp: clisp and gcl (GNU Common Lisp). Both are available on GNU/Linux IA-32 and x86-64, and on Solaris SPARC. GNU/Linux (PowerPC and SPARC), Mac OS X, NetBSD, OSF/1 Alpha, SGI IRIX, and Solaris IA-32 have only clisp. Neither is available on FreeBSD, GNU/Linux (Alpha, IA-64, and MIPS), and OpenBSD. gcl is documented in the gcl-si node of the info system, and the cl node describes support for part of Common Lisp inside Emacs Lisp.
Some systems also have ecl (Embeddable Common Lisp) and sbcl (Steel Bank Common Lisp).
ISLISP, defined by ISO Standard ISO/IEC 13816:1997, is a simpler language derived from Common Lisp. Some systems have ISLISP installed under the equivalent names islisp, openlisp, and uxlisp.
For Scheme, we have three implementations:
We attempt to build each new release of these Lisp implementations on all of our systems, but sadly, most are not as portable as we would like. GNU Clisp is a notable exception: thanks to work in 2017–2018 by its maintainer and the author of this FAQ, recent versions now can be built on almost every modern platform.
Some systems have newlisp.
The Reduce symbolic algebra system is based on Portable Standard Lisp, but its syntax is quite different from modern Lisp, and it is unlikely that new code would be written in it. Reduce can also be built on top of another Lisp variant, Codemist Standard Lisp (csl), but it too is unlikely to be encountered outside of Reduce installations.
What Pascal compilers are available?
We have the GNU Pascal compiler, gpc, only on Compaq/DEC OSF/1 Alpha, GNU/Linux x86-64 and IA-32, and Oracle/Sun Solaris SPARC.
We have the Free Pascal compiler, fpc and ppc386, only on GNU/Linux IA-32, x86-64, PowerPC, and SPARC systems, and on Mac OS X PowerPC systems.
What other compilers are available?
There are a surprising number for older or less-well known languages. For details, see Fun with Fibonacci, where a small programming problem, the computation of the famous Fibonacci sequence, is solved in at least 45 different programming languages.
Recently developed programming languages and their compilers are:
What scripting languages are available?
Scripting languages are small languages that are usually much easier to learn than ordinary programming languages. Scripting languages are usually interpreted, rather than compiled, allowing rapid code development and testing at the possible expense of somewhat slower runtime. Most of them have powerful features for string processing, and support associative arrays, which allow strings as indexes, providing a powerful text-to-text mapping capability. Most scripting languages have only two data types: strings and numbers. Type declarations are not required for program variables. Strings have unlimited lengths, and numbers are represented as floating-point values (usually the IEEE 754 64-bit binary format, although we have variants that provide 80-bit and 128-bit binary arithmetic, and 128-bit decimal arithmetic).
Popular scripting languages available on most of our systems include at least these (and parenthesized lists show names for variants):
These scripting languages are documented in numerous books, and most also have entries in the online info system available inside the emacs text editor, and with the standalone info program, or at least are summarized in Unix man pages.
All of our systems have the GNU debugger, gdb, and almost all have the GUI front end for it, ddd.
Compaq/DEC Alpha OSF/1 and SGI IRIX systems also have dbx, xdbx, and xxgdb.
GNU/Linux x86-64 and IA-32 systems also have fdb and idb.
GNU/Linux x86-64, IA-32, and PowerPC systems also have valgrind, a tool suite for debugging and profiling.
GNU/Linux IA-64 systems also have idb.
Oracle/Sun Solaris IA-32 and x86-64 systems also have adb and dbx.
Oracle/Sun Solaris SPARC systems also have adb, dbx, xgdb, and xxdbx. The dbx debugger has a very useful feature in the check command, which has options for checking memory access, memory leaks, and heap memory use; they can be helpful in catching use of uninitialized variables, and out-of-bounds array and pointer references.
Oracle/Sun Solaris systems have the ctrace C program debugger. They also have the Solaris Studio integrated development environment, sunstudio, for editing, compiling, debugging, and tuning C, C++, Fortran, and Java programs.
Systems with the NAG compilers have the dbx90 debugger for Fortran 90/95.
All systems have the Electric Fence library which can be used by including the options -L/usr/local/lib -lefence on a compiler command that produces an executable program. That library replaces the standard C library memory-allocation routines with new versions that check for heap-memory corruption, out-of-bounds pointer references, and duplicate free-memory operations.
All of our systems have the GNU gprof utility for analyzing execution traces and producing histograms of function times, and the gcov utility for producing a test-coverage report showing execution frequencies of lines of code, and importantly, identifying unused (and, likely, untested) code. They work for any language that can be compiled with the GNU Compiler Collection.
GNU/Linux x86-64 and IA-32 systems, and Oracle/Sun Solaris systems, also have the tcov test-coverage utility. It works for any language that can be compiled with the Oracle/Sun compilers.
On all systems, the pawk and pgawk interpreters for awk programs produce execution profiles.
On all systems, dprofpp profiles perl program execution.
ocamlprof profiles execution of ocaml programs.
What syntax checkers are available?
Syntax checkers look for common portability problems and especially dark corners of the language where language standards are unclear, or decree that compiler behavior is implementation defined. All code developers should use them routinely.
For Fortran (66, 77, 90, 95, and HPF), use the excellent ftnchek. It also has options to produce prettyprinted declarations. If you develop Fortran code, then you should be using ftnchek as often as a Fortran compiler, because it is the only tool that we have that can do cross-module checking. Module interface errors are not diagnosed by any Fortran compiler that we know of, because those compilers process each routine in isolation.
For C and C++, use splint, or if unavailable, its predecessor lclint. Otherwise, fall back to lint (Compaq/DEC OSF/1 Alpha, FreeBSD, NetBSD, OpenBSD, SGI IRIX MIPS, and Oracle/Sun Solaris). Other tools that can be helpful in detecting portability and security problems are antic, flawfinder, its4, and rats.
A few systems (GNU/Linux IA-32, Mac OS X) have the clang static analyzer, which can also function as a compiler.
A few systems (GNU/Linux IA-32 and x86-64) have the cppcheck analyzer for C++ and C code.
We do not know of any separate syntax checkers for C# and Java, except for jlint. The languages are strongly typed, and carefully defined in terms of an underlying virtual machine that is identical on all platforms, so compilation should detect all syntactic errors, and there should be no portability pitfalls of the type that other syntax checkers look for.
What system call tracers are available?
FreeBSD, Mac OS X, NetBSD, and OpenBSD systems have the ktrace kernel tracer that creates a trace-data file that must be subsequently analyzed with a separate utility, kdump.
GNU/Linux systems (all CPU architectures) have the strace system-call tracer.
Compaq/DEC OSF/1 Alpha systems have the trace system-call trace utility.
SGI IRIX systems have the /usr/sbin/par tool (not to be confused with the /usr/local/bin/par paragraph-reformatting utility) for system-call tracing.
Oracle/Sun Solaris systems have the truss system-call tracer.
What prettyprinters are available?
Prettyprinters are utilities that clean up and standardize the layout of code, usually making it more readable than most handwritten code is.
For Fortran 66, use /usr/local/plot79/pretty. For Fortran 77, try strsf3 or sf3lex.
For C, use the ancient and venerable, but rigidly inflexible, C beautifier, cb (Compaq/DEC OSF/1 Alpha, SGI IRIX MIPS, and Oracle/Sun SPARC), or better the powerful and flexible indent, available everywhere. The author of this FAQ has a personal options file for that program that looks like this:
% cat $HOME/.indent.pro
-bap -bcd -bl -bli0 -di4 -i4 -ncdb -nce -ncs -nfc1 -nip -npcs -nsc
The LLVM 3.4 compiler release of January 2014 includes a new tool for prettyprinting C, C++, and Java programs, clang-format. Run it with the command-line option --style=WebKit to get output similar to that produced by indent with the options shown earlier.
For C++, try bcpp; its many options are documented in its manual page.
For C, C++, and Java, try astyle, also available everywhere. It too has lots of options that let you fine-tune the final appearance of your code.
Warning: For many programmers, code layout is a matter of strong personal preference: be prepared to experiment with prettyprinter options to get the appearance that you seek.
Where are programming languages specified?
Although textbooks and vendor manuals often describe programming language syntax, semantics, and libraries in considerable detail, they are secondary sources. They cannot be considered definitive specifications of how the languages and their libraries are expected to behave, especially when vendors are so fond of supplying often-proprietary extensions.
The formal specifications of major programming languages are complex documents published by national standards bodies, such as ANSI (American National Standards Institute) or BSI (British Standards Institute), or international standards bodies, such as ECMA (European Computer Manufacturers Association) or ISO (International Organization for Standardization).
It is a curious fact that most standards bodies resell work at high prices that was entirely developed at no cost to the standards organization by volunteers over many years at considerable personal expense.
Although electronic versions of all ECMA Standards are free, both printed and electronic versions from other standards bodies are often expensive, sometimes with costs several times that of a typical textbook, and sometimes even when the document has only a few pages. Prices vary: ANSI sometimes charges much less than ISO for the same document. In rare cases, electronic copies of older standards are free.
The licenses of electronic copies usually forbid redistribution or public posting, and permit only a single printed copy to be made. For that reason, we cannot put copies of these documents in public directories. Please consult systems staff if you need to see an original standards document.
However, for academic and other noncommercial purposes, it may be sufficient to have access to recent working drafts of the standards committees. For that reason, we have collected a number of these documents, and any related free standards and tutorials, in subdirectories of the local (not Web-accessible) directory /usr/local/doc: c, clisp, cobol, cs (C#), f77, fortran, icon, java, LIA (language-independent arithmetic), pascal, and scheme.
How can I speed up compilations?
There are at least four different ways that you can make compilations of large software packages run faster:
How do I find missing libraries?
For vendor-provided compilers, the associated compiler-provided libraries should always be found automatically at link time, because their locations are either supplied by the compiler to the linker, or they are found in a set of directories defined in a system configuration file. On our GNU/Linux systems, that file is /etc/ld.so.conf. It may have include directives that cause additional configuration files to be read, typically files named /etc/ld.so.conf.d/*.conf.
However, we have numerous locally installed compilers, and it has proven impossible to add their library locations to the system configuration file, because doing so soon produces conflicts and confusion about which library versions are needed. It also produces maintenance headaches when local additions are discarded by a subsequent vendor-supplied system update.
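To see which directories the system configuration file currently names, a small shell function can follow its include directives. This is only a sketch: list_ld_dirs is a hypothetical helper, not a system command, and it assumes that each non-comment line holds either a directory or an include directive whose relative patterns are resolved against the directory of the including file.

```shell
#!/bin/sh
# list_ld_dirs FILE: print the directories named in a loader configuration
# file, following "include" directives. Hypothetical helper for illustration.
# The body is a subshell, so the cd and the variables do not leak to the caller.
list_ld_dirs() (
    CDPATH= cd "$(dirname "$1")" || exit 1
    while read -r word rest; do
        case $word in
            '' | '#'*)                  # skip blank lines and comments
                ;;
            include)
                for f in $rest; do      # unquoted, so that wildcards expand
                    if [ -f "$f" ]; then
                        list_ld_dirs "$f"
                    fi
                done
                ;;
            *)                          # assume anything else is a directory
                printf '%s\n' "$word"
                ;;
        esac
    done < "$(basename "$1")"
)
```

On a GNU/Linux system, it would be run as list_ld_dirs /etc/ld.so.conf.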
When you specify a compiler library option -lxyz, it refers to either a static library named libxyz.a, or a shared library named libxyz.so. The latter is often a symbolic link to a versioned filename, such as libxyz.so.3.1.4. Curiously, that library version is frequently unrelated to the software package version from which it was created and installed. The first number in the version string (here, 3.1.4) is called the major version, the second is the minor version, and the third is the patch level. For most libraries, the major version changes only when the programming interface changes incompatibly, such as by removing a global symbol, or changing function-call signatures (number and types of arguments and return values). Adding new functions to a library does not require a major version change, because existing user code is likely to be unaffected; however, the minor version number is likely to be incremented for such additions. Thus, software that was originally linked against library version 3.1.4 should continue to work for versions 3.1.5, 3.1.6, 3.1.7, …, and will probably work for versions 3.2.0, 3.2.1, …. However, when version 4.0.0 is installed, many O/S distributions delete the older library versions, forcing you to update your own executables, which at least means recompiling and relinking, and might also require code modifications if signatures are changed for library functions that you call.
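The version convention just described can be checked mechanically. The following sketch splits a versioned library filename into its major, minor, and patch numbers, and tests whether two libraries share a major version. The function names so_version and same_major are made up for this example, and the code assumes that the filename ends in .so. followed by up to three dot-separated numbers.

```shell
#!/bin/sh
# so_version FILE: print "major minor patch" for a name like libxyz.so.3.1.4.
# Missing components are reported as 0. The body is a subshell, so the
# change to IFS stays local. Hypothetical helper for illustration.
so_version() (
    IFS=.
    set -- ${1##*.so.}      # drop everything through ".so.", then split at dots
    printf '%s %s %s\n' "${1:-0}" "${2:-0}" "${3:-0}"
)

# same_major OLD NEW: succeed when both libraries have the same major version,
# which by the usual convention means a compatible programming interface.
same_major() {
    [ "$(so_version "$1" | cut -d' ' -f1)" = "$(so_version "$2" | cut -d' ' -f1)" ]
}

so_version libxyz.so.3.1.4      # prints: 3 1 4
if same_major libxyz.so.3.1.4 libxyz.so.3.2.0; then
    echo "relinking is probably not needed"
fi
```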
Unless overridden by -Bstatic or -Bdynamic options, the linker chooses a shared library in preference to a static library. Code from static libraries is copied into the executable program at link time, and such libraries are therefore not required at run time. By contrast, code from shared libraries is only referenced in the executable, and the reference only includes the library filename, not its complete pathname.
Using shared libraries reduces executable file size, and ensures that any subsequent library updates are automatically included the next time that you run the program, but costs a bit more time in program startup, because the run-time loader has to find the required shared libraries, map them into memory, and then patch all addresses in the memory image of the executable that refer to shared library symbols.
Thus, for locally installed libraries, you need to tell the linker where the library is found at link time with a -L option, and you should also tell the linker to record those paths in the executable with a -Wl,-rpath, option so that they can be found at run time. On Solaris systems, use -R instead. The -Wl,-rpath, and -R options expect to be followed by a colon-separated list of directories, whereas each -L option names a single directory and can be repeated. Here are examples:
# GNU/Linux CentOS 5 and 6:
% cc -o myprog myprog.c -L/usr/local/lib64 -Wl,-rpath,/usr/local/lib64 -lgmp
# GNU/Linux CentOS 7:
% cc -o myprog myprog.c -L/usr/uumath/lib64 -Wl,-rpath,/usr/uumath/lib64 -lgmp
# Solaris:
% cc -o myprog myprog.c -L/usr/uumath/lib -R/usr/uumath/lib -lgmp
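Because each library directory has to be named twice, once for the link step and once for the run-time path, it can help to generate the pair of options. The sketch below does that for the GNU-style -Wl,-rpath, form only; lib_flags is a hypothetical helper name, and the directories are assumed to contain no spaces.

```shell
#!/bin/sh
# lib_flags DIR...: print matching -L and -Wl,-rpath, options for each
# directory. Hypothetical helper for illustration; GNU-style options only.
lib_flags() {
    flags=
    for dir in "$@"; do
        flags="$flags${flags:+ }-L$dir -Wl,-rpath,$dir"
    done
    printf '%s\n' "$flags"
}

lib_flags /usr/uumath/lib64
# prints: -L/usr/uumath/lib64 -Wl,-rpath,/usr/uumath/lib64
```

A compile command could then be written as cc -o myprog myprog.c $(lib_flags /usr/uumath/lib64) -lgmp.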
If you omit the -Wl,-rpath, or -R options, the executable can still be run, but only if you set an environment variable to a colon-separated list of directories that the run-time loader must search for libraries:
# csh and tcsh shells:
% setenv LD_LIBRARY_PATH /usr/uumath/lib64
# Bourne-family shells (sh, ash, bash, dash, ksh, mksh, pdksh, zsh, ...):
% LD_LIBRARY_PATH=/usr/uumath/lib64
% export LD_LIBRARY_PATH
% ./myprog
The LD_LIBRARY_PATH solution works fine for a single executable, but is impractical for a system-wide solution, because it is likely to lead to conflicts, and eventually means that numerous executables on the system cannot be run unless that variable is set correctly.
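One way to limit the side effects is to set the variable for a single command only, rather than exporting it into the whole login session. In Bourne-family shells, a variable assignment placed in front of a command applies just to that command. The wrapper below is a sketch of that idea; with_libdir is a hypothetical name, and it is intended for running external programs.

```shell
#!/bin/sh
# with_libdir DIR COMMAND [ARGS...]: run COMMAND with DIR prepended to
# LD_LIBRARY_PATH for that one process; the caller's setting is left alone.
# Hypothetical helper for illustration.
with_libdir() {
    dir=$1
    shift
    LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" "$@"
}
```

For example, with_libdir /usr/uumath/lib64 ./myprog runs the program with the extra search directory, without changing the environment of the shell itself.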
The clang and gcc compiler families support an option to list the installation directories, and the directories that are searched for executable programs and libraries. The output is hard to read because the lists are long pathnames separated by colons. Here is how to make them somewhat more readable, by changing each colon to a newline and a horizontal tab:
% gcc --print-search-dirs myprog.c | sed -e 's@:@\n\t@g'
install
         /usr/lib/gcc/x86_64-redhat-linux/4.8.5/
programs
         =/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/
        /usr/libexec/gcc/x86_64-redhat-linux/4.8.5/
        /usr/libexec/gcc/x86_64-redhat-linux/
        /usr/lib/gcc/x86_64-redhat-linux/4.8.5/
        /usr/lib/gcc/x86_64-redhat-linux/
        /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../x86_64-redhat-linux/bin/x86_64-redhat-linux/4.8.5/
        /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../x86_64-redhat-linux/bin/
libraries
         =/usr/lib/gcc/x86_64-redhat-linux/4.8.5/
        /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../x86_64-redhat-linux/lib/x86_64-redhat-linux/4.8.5/
        /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../x86_64-redhat-linux/lib/../lib64/
        /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../x86_64-redhat-linux/4.8.5/
        /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/
        /lib/x86_64-redhat-linux/4.8.5/
        /lib/../lib64/
        /usr/lib/x86_64-redhat-linux/4.8.5/
        /usr/lib/../lib64/
        /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../x86_64-redhat-linux/lib/
        /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../
        /lib/
        /usr/lib/
The embedded relative paths can be simplified with commands like either of these:
% realpath /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/
/usr/lib64
% ( cd /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/; pwd )
/usr/lib64
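The sed command shown earlier relies on \n and \t being interpreted in the replacement text, which not every sed implementation does. As a possibly more portable sketch, the library list alone can be extracted and split with tr. The function name library_dirs is made up for this example, and it assumes that the compiler labels the list with a line beginning libraries: = as gcc does.

```shell
#!/bin/sh
# library_dirs: read "gcc --print-search-dirs" output on standard input and
# print the library search directories, one per line.
# Hypothetical helper for illustration.
library_dirs() {
    sed -n 's/^libraries: =//p' | tr ':' '\n'
}
```

It would be used as gcc --print-search-dirs | library_dirs, and each line of its output could then be passed to realpath to remove the embedded relative components.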
For libraries that are supplied with the compiler itself, notably those for the C++ and Fortran compilers built from a gcc distribution, you first have to locate the library in its installation tree:
% find /usr/uumath/ashare/gcc/gcc-8* -name 'libstdc++*' | sort /usr/uumath/ashare/gcc/gcc-8.1.0/lib/libstdc++.a /usr/uumath/ashare/gcc/gcc-8.1.0/lib/libstdc++.la /usr/uumath/ashare/gcc/gcc-8.1.0/lib/libstdc++.so /usr/uumath/ashare/gcc/gcc-8.1.0/lib/libstdc++.so.6 /usr/uumath/ashare/gcc/gcc-8.1.0/lib/libstdc++.so.6.0.25 /usr/uumath/ashare/gcc/gcc-8.1.0/lib/libstdc++.so.6.0.25-gdb.py /usr/uumath/ashare/gcc/gcc-8.1.0/lib/libstdc++fs.a /usr/uumath/ashare/gcc/gcc-8.1.0/lib/libstdc++fs.la /usr/uumath/ashare/gcc/gcc-8.1.0/lib64/libstdc++.a /usr/uumath/ashare/gcc/gcc-8.1.0/lib64/libstdc++.la /usr/uumath/ashare/gcc/gcc-8.1.0/lib64/libstdc++.so /usr/uumath/ashare/gcc/gcc-8.1.0/lib64/libstdc++.so.6 /usr/uumath/ashare/gcc/gcc-8.1.0/lib64/libstdc++.so.6.0.25 /usr/uumath/ashare/gcc/gcc-8.1.0/lib64/libstdc++.so.6.0.25-gdb.py /usr/uumath/ashare/gcc/gcc-8.1.0/lib64/libstdc++fs.a /usr/uumath/ashare/gcc/gcc-8.1.0/lib64/libstdc++fs.la % file /usr/uumath/ashare/gcc/gcc-8.1.0/lib/libstdc++.so /usr/uumath/ashare/gcc/gcc-8.1.0/lib/libstdc++.so: symbolic link to libstdc++.so.6.0.25
Our local convention for complex software packages, such as language compilers and SQL databases, is to install them in a version-specific directory under a basename directory in the path $prefix/ashare, where ashare means architecture-dependent, but otherwise shareable across similar systems. A particular compiler installation tree can then be shared across dozens of similar systems, either by mounting a directory from a common fileserver, or by copying that directory nightly from a fileserver. We use both approaches, depending on the system, on available disk space, and on frequency of use.
The convention on many GNU/Linux distributions is that the lib directory contains 32-bit libraries, and the lib64 directory holds the 64-bit libraries. Compilers on all of our public GNU/Linux systems default to a 64-bit world. Newer distributions of several other operating systems have simply dropped support for 32-bit executables entirely, so they supply only lib directories with 64-bit libraries.
You then supply the directory location in your compiler invocation, like this on a CentOS 7 system:
% g++-8.1.0 myprog.cc -o myprog \
      -L/usr/uumath/ashare/gcc/gcc-8.1.0/lib64 \
      -Wl,-rpath,/usr/uumath/ashare/gcc/gcc-8.1.0/lib64
Such long library paths are a distinct nuisance, and the best way to handle them is to embed them in a Makefile with definitions and rules like this:
prefix          = /usr/uumath
GCCVER          = 8.1.0
CXX             = g++-$(GCCVER)
CXXFLAGS        = $(INC) $(OPT) $(XCXXFLAGS)
INC             = -I$(prefix)/include
LDFLAGS         = -L$(prefix)/ashare/gcc/gcc-$(GCCVER)/lib64 \
                  -Wl,-rpath,$(prefix)/ashare/gcc/gcc-$(GCCVER)/lib64 \
                  -L$(prefix)/lib64 \
                  -Wl,-rpath,$(prefix)/lib64
LIBS            = -llapack -lblas -lfftw3
OPT             = -g -g3 -O
XCXXFLAGS       =

myprog: myprog.cc
	$(CXX) $(CXXFLAGS) myprog.cc -o $@ $(LDFLAGS) $(LIBS)
You could then compile and link your program with the specified compiler version using the command
% make myprog
or temporarily override the version, prefix, and optimization level like this:
% make GCCVER=9-20180429 prefix=/usr/local OPT=-O3 myprog
The XCXXFLAGS option in the Makefile stands for extra C++ compiler options, and its default value is empty. You might use it like this
% make XCXXFLAGS=--print-search-dirs myprog
to temporarily add one or more compiler options.
You can list the libraries that your executable needs like this:
% ldd myprog
        linux-vdso.so.1 => (0x00007ffda60d0000)
        libstdc++.so.6 => /usr/uumath/ashare/gcc/gcc-8.1.0/lib64/libstdc++.so.6 (0x00007ff6850ca000)
        libm.so.6 => /lib64/libm.so.6 (0x00007ff684d8f000)
        libgcc_s.so.1 => /usr/uumath/ashare/gcc/gcc-8.1.0/lib64/libgcc_s.so.1 (0x00007ff684b77000)
        libc.so.6 => /lib64/libc.so.6 (0x00007ff6847b4000)
        /lib64/ld-linux-x86-64.so.2 (0x000055c51dad0000)
In that output, each library filename is followed by the pathname at which the run-time loader found it. However, if the library is missing, the pathname field is replaced by the string "not found".
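For executables that depend on many libraries, the missing ones can be picked out automatically. The filter below is a sketch that assumes the GNU/Linux ldd output format, in which a missing library appears as a line of the form name => not found; missing_libs is a hypothetical helper name.

```shell
#!/bin/sh
# missing_libs: read ldd output on standard input and print the names of
# the libraries that the run-time loader could not find.
# Hypothetical helper for illustration; assumes GNU/Linux ldd output.
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}
```

It would be used as ldd myprog | missing_libs, and it prints nothing when every library is found.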