Course Introduction
Why Study Programming Languages?
- You will have new means of expressing ideas, regardless of language
- Learning new languages will be easier
- You will be better able to discern between different languages and choose the right tool for the job
- You will get better use of the languages you already know
- You will gain better understanding of the significance of design and implementation decisions
Language Paradigms
One way to classify programming languages is by paradigm. This refers to the set of features, model of computation and mindset of different languages. In the early days of computing, dividing languages by paradigm made more sense. Now all common languages support multiple paradigms. People have suggested many ways of classifying by paradigm, but here are some of the most widely referenced:
- Imperative
- Variables, assignment, iteration are key
- Computation done by updating variables
- Has two sub-paradigms:
- Procedural: code is organized into functions, such as in C, Fortran, BASIC, Pascal, etc.
- Object-oriented: code is organized into classes, such as in C++, Java, Python, Ruby, etc.
- Functional
- Computation done by applying functions to parameters
- Does not rely on modifying variables (some languages disallow this)
- Recursion more common than iteration
- Examples: LISP, Haskell, ML
- Logic
- Programs are written by specifying the rules of a system, in any order
- Computation is carried out by an engine which uses the rules to reach conclusions
- Prolog is the most widely-used logic programming language
Many modern languages have adopted features of multiple paradigms. In particular, imperative languages like C++, Java, Python, and C# have adopted more functional programming features. Likewise, newer languages such as Rust, Kotlin, and Swift have all embraced multiple paradigms.
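As a rough illustration of the imperative and functional styles described above, here is the same computation (summing the squares of a list) written three ways in Python, one of the multi-paradigm languages mentioned. The function names are made up for this sketch.

```python
from functools import reduce


def sum_squares_imperative(numbers):
    """Imperative style: computation is done by updating a variable in a loop."""
    total = 0
    for n in numbers:
        total = total + n * n
    return total


def sum_squares_functional(numbers):
    """Functional style: functions are applied to values, nothing is reassigned."""
    return reduce(lambda acc, n: acc + n * n, numbers, 0)


def sum_squares_recursive(numbers):
    """Functional style with recursion taking the place of iteration."""
    if not numbers:
        return 0
    return numbers[0] ** 2 + sum_squares_recursive(numbers[1:])


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(sum_squares_imperative(data))  # 30
    print(sum_squares_functional(data))  # 30
    print(sum_squares_recursive(data))   # 30
```

All three return the same result. The difference lies in whether the answer is built up by modifying `total` or by composing function applications.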
Design Trade-Offs
Languages have different design goals. In a perfect world, languages would have every good quality, but many times these are in conflict with each other. Languages have to make trade-offs such as:
- Reliability vs. Efficiency
- Checking array bounds
- Checking for exceptions
- Readability vs. Writability
- Allowing very concise forms can make code easy to write but hard to read.
- APL:
(~R∊R∘.×R)/R←1↓⍳R
(This program finds all prime numbers from 1 to R).
- Regular expressions
- Control vs. Ease of Use
- Pointers allow for direct control of memory but are hard to use correctly.
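The readability vs. writability trade-off mentioned for regular expressions can be sketched in Python. The pattern below (a simple YYYY-MM-DD check) and the names `TERSE`, `READABLE` and `is_date` are assumptions chosen for illustration. Both patterns are intended to match the same strings, but one is quicker to write and the other is easier to read.

```python
import re

# Concise form: quick to write, harder to read later.
TERSE = re.compile(r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")

# The same pattern in verbose mode, where whitespace and comments are ignored.
READABLE = re.compile(
    r"""
    ^
    (\d{4})                  # year: four digits
    -
    (0[1-9]|1[0-2])          # month: 01 to 12
    -
    (0[1-9]|[12]\d|3[01])    # day: 01 to 31
    $
    """,
    re.VERBOSE,
)


def is_date(text, pattern=READABLE):
    """Return True if text looks like YYYY-MM-DD (no calendar validation)."""
    return pattern.match(text) is not None


if __name__ == "__main__":
    for sample in ["2024-05-17", "2024-13-01", "17/05/2024"]:
        print(sample, is_date(sample, TERSE), is_date(sample, READABLE))
```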
Implementation Methods
Different programming languages expose the underlying machine in different ways.
Compilation
- Programs are translated into machine language
- Faster execution
- Takes time to compile
- Can be less flexible
- Usually taken by C, C++, Rust, Go, etc.
Interpretation
- An interpreter program runs the code
- Less efficient
- Very flexible
- Easier to implement
- Taken by initial versions of dynamic languages
Hybrid Implementation
- Compromise between compilation and interpretation
- Compiles down to "bytecode" which is interpreted
- Just in Time Compilation
- Faster than pure interpretation
- Typically taken by Java, Python, JavaScript, etc.
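To make the bytecode idea a little more concrete, the sketch below uses the `dis` module from Python's standard library to list the bytecode instructions that CPython (the reference Python implementation) produces for a small function. The helper `opcode_names` is a name invented for this example, and the exact instructions printed vary from one Python version to another.

```python
import dis


def add(a, b):
    return a + b


def opcode_names(func):
    """Return the names of the bytecode instructions func was compiled into."""
    return [instruction.opname for instruction in dis.Bytecode(func)]


if __name__ == "__main__":
    # The source was compiled to bytecode when the function was defined.
    # The interpreter then executes these instructions one at a time.
    for name in opcode_names(add):
        print(name)
```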
A Quick History of Notable Languages
Early Languages
- The analytical engine was a machine designed (but never built) by Charles Babbage. Ada Lovelace wrote programs for this machine in the form of detailed descriptions of how to configure it.
- The earliest electronic computers were "programmed" by rewiring them
- Later computers used binary instruction codes
- Then assemblers were developed
Konrad Zuse's Plankalkül, credited with being the first high-level language, was designed in 1945 and included arrays, structures and floating point numbers. An assignment statement to assign the expression A[4] + 1 to A[5] would look like this:
  | A + 1 => A
V | 4        5        (subscripts)
S | 1.n      1.n      (data types)
Fortran
Fortran (originally called FORTRAN) was first developed in 1955 by John Backus at IBM. It was one of the first compiled programming languages.
Fortran was designed for scientific computing, so it had support for arrays and floating point numbers. Fortran originally had a number of different flaws:
- No strings
- No integers
- No stack
- Names limited to six characters
- Programs could only be in one file
Fortran evolved over the years to fix these flaws, and is now comparable with languages like C++. Here is an example Fortran program for adding two numbers:
      program add
c     This is a simple program to read 2 numbers and print the sum
      implicit none
      real A,B,S
      print *, ' This program adds 2 real numbers'
      print *, ' Type them in now separated by a comma or space'
      read *, A,B
      S = A + B
      print *, 'The sum of ', A,' and ' , B
      print *, ' is ' , S
      stop
      end
Fortran revolutionized compiler technology and is still used in high-performance computing today, though hardly at all outside of that niche.
LISP
LISP was developed in 1958 by John McCarthy at MIT. It was created for artificial intelligence. It focuses on linked lists and symbolic computation.
LISP syntax is based on S-expressions, which are lists surrounded by parentheses. All code and data are in the form of S-expressions. Below is an example of a linked list of 3 numbers:
(3 4 5)
Below is a mathematical expression:
(* 2 (+ a 1))
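To show how such a nested prefix expression is evaluated, here is a minimal Python sketch. It is not how a real LISP system works. It models S-expressions as nested tuples with the operator first, and the names `OPERATORS` and `evaluate` are invented for the example.

```python
import operator

# Hypothetical mini-evaluator: the operator name comes first, operands follow.
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr, env):
    """Evaluate a prefix expression using the variable bindings in env."""
    if isinstance(expr, (int, float)):
        return expr          # a number evaluates to itself
    if isinstance(expr, str):
        return env[expr]     # a symbol is looked up in the environment
    op, first, *rest = expr
    result = evaluate(first, env)
    for operand in rest:
        result = OPERATORS[op](result, evaluate(operand, env))
    return result


if __name__ == "__main__":
    # Corresponds to (* 2 (+ a 1)) with a bound to 4.
    print(evaluate(("*", 2, ("+", "a", 1)), {"a": 4}))  # 10
```

The inner list is evaluated first and its result becomes an operand of the outer one. Because the program itself is just a list, it can be built and inspected like any other data.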
Below is an if/else expression in LISP (the language that introduced this structure):
(if (< a b)
    a
    b)
Below is a factorial function written in LISP:
(defun factorial (n)
  (if (<= n 1)
      1
      (* n (factorial (- n 1)))))
LISP was also the first language to have garbage collection.
LISP was hurt by the fact that it was significantly slower than other languages when it first came out, and the fact that there were many different versions of it that were incompatible (Scheme and Common LISP being the biggest).
COBOL
COBOL was developed in 1959 by a committee, drawing heavily on Grace Hopper's earlier FLOW-MATIC language. COBOL was initially designed to make programming easier for non-engineers by making it closer to English. This also made it very verbose.
Below is a program that reads in two numbers, multiplies them together and prints the result:
$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. Multiplier.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 A PIC 9 VALUE ZEROS.
01 B PIC 9 VALUE ZEROS.
01 Result PIC 99 VALUE ZEROS.
PROCEDURE DIVISION.
DISPLAY "Enter first number (1 digit) : " WITH NO ADVANCING.
ACCEPT A.
DISPLAY "Enter second number (1 digit) : " WITH NO ADVANCING.
ACCEPT B.
MULTIPLY A BY B GIVING Result.
DISPLAY "Result is = ", Result.
STOP RUN.
COBOL was the first language to separate data from code. In an effort to make code more readable, it accepted names up to 30 characters long, with hyphens. It also had much better I/O support than other languages at the time, allowing for printed reports of data in columns.
COBOL was the standard in business applications for many years. It also led to the Y2K crisis, which was caused by storing years as only two digits.
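The arithmetic behind the Y2K problem can be sketched in a few lines of Python. The function names and the account example are made up for illustration. The point is only that a year stored as two digits loses its century.

```python
def years_between_two_digit(start_yy, end_yy):
    """Naive difference between two years stored as two digits (00 to 99)."""
    return end_yy - start_yy


def years_between_four_digit(start_year, end_year):
    """The same calculation with full four-digit years."""
    return end_year - start_year


if __name__ == "__main__":
    # An account opened in 1985 and checked in 2000.
    print(years_between_two_digit(85, 0))        # -85, since 2000 is stored as 00
    print(years_between_four_digit(1985, 2000))  # 15
```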
ALGOL 60
ALGOL was an attempt to create a universal language that would be portable and able to solve problems in any field.
ALGOL introduced many ideas that are prevalent today:
- Names of any length
- Subscripts in brackets []
- Semicolons to separate statements
- else if clauses
- Pass by value and pass by name
Unfortunately, ALGOL included no string handling or I/O as these were seen as too dependent on the machine.
ALGOL was important to the development of computing for several reasons:
- Introduced new language ideas
- First language to have its syntax formally defined
- It was used to publish algorithms for many years
- First language to work on multiple types of machines
Unfortunately it was never widely used because:
- It lacked I/O and string handling
- It was hard to implement because it had so many features
- Fortran (thanks to IBM) was entrenched as the standard
Here is an example ALGOL program to find the average of an array of numbers:
begin
  integer N;
  Read Int(N);
  begin
    real array Data[1:N];
    real sum, avg;
    integer i;
    sum:=0;
    for i:=1 step 1 until N do
      begin real val;
        Read Real(val);
        Data[i]:=if val<0 then -val else val
      end;
    for i:=1 step 1 until N do
      sum:=sum + Data[i];
    avg:=sum/N;
    Print Real(avg)
  end
end
BASIC
BASIC was first developed at Dartmouth in 1964. It was designed to be as simple as possible so that non-engineers could program in it. BASIC was also the first popular language to be primarily interpreted instead of compiled.
BASIC was popularized because it appeared just before a time when computers were becoming more accessible:
- PCs such as the Apple II, TRS-80, VIC-20 and Commodore 64 all included BASIC interpreters
- Computing magazines distributed programs by listing the source code users could type in themselves
- Schools had time-sharing computers that many students could log into
BASIC also suffered from the problem of having many incompatible versions. Versions of BASIC are still used today, though much less than in the past.
Below is a TI-Basic program to multiply two numbers:
PROGRAM:HELLOWLD
:ClrHome
:Disp "Enter two numbers:"
:Input A
:Input B
:A*B→C
:Disp C
SIMULA 67
SIMULA was developed in 1967 at the Norwegian Computing Center. It was intended to do simulations for the sciences and other fields.
SIMULA is important because it was the first object-oriented language and introduced many new features:
- Objects
- Classes
- Inheritance
- Dynamic function binding (virtual functions)
- Co-routines
SIMULA is no longer widely used, but it was hugely influential on the OO languages that followed. Its syntax was based heavily on that of ALGOL.
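The features SIMULA introduced (classes, objects, inheritance and dynamic function binding) are easiest to see in a small example. The sketch below uses Python rather than SIMULA, and the `Shape`, `Circle` and `Square` classes are invented for illustration.

```python
import math


class Shape:
    """Base class: declares the operation every shape must provide."""

    def area(self):
        raise NotImplementedError

    def describe(self):
        # Which area() runs is decided at run time by the object's class.
        return f"{type(self).__name__} with area {self.area():.2f}"


class Circle(Shape):  # Circle inherits describe() from Shape
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Square(Shape):  # Square inherits describe() from Shape
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


if __name__ == "__main__":
    for shape in [Circle(1.0), Square(2.0)]:
        print(shape.describe())
```

`describe` is written once in the base class, yet it calls a different `area` for each object. This is the dynamic binding that virtual functions provide.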
Pascal
Pascal was developed by Niklaus Wirth (who had worked on the ALGOL committee) in 1970. His goal was to make a language that would encourage good programming practices and be good for teaching programming. It took and simplified concepts from other languages.
It was widely used for teaching in the 70s and 80s and spawned more powerful languages such as:
- Object Pascal
- Delphi
- Oberon
- Modula
- Turbo Pascal
Below is a Pascal program to multiply two numbers:
program Multiply;
var
  A : Integer; {comments go in braces}
  B : Integer;
  R : Integer;
begin
  Write('Enter two numbers: ');
  Read(A);
  Read(B);
  WriteLn('Product is: ');
  Write(A * B)
end.
C
C was originally designed in 1972 by Dennis Ritchie at AT&T Bell Labs. It was originally written to develop systems software such as UNIX. It is one of the most widely used programming languages of all time and influenced many others following it.
C was essentially designed to be a portable assembly language. The first version of UNIX, along with other operating systems of the day, was written in assembly. Using a language like C allows for portable OS development.
Unlike some languages, C today is still very similar to its first incarnation.
C became so popular due to:
- The success of UNIX
- Portability
- Simplicity
- Speed
- The book The C Programming Language by Kernighan and Ritchie
ML
ML was developed in 1973 by Robin Milner at the University of Edinburgh. It was initially developed for the field of automated theorem proving.
ML is a functional language that influenced others such as:
- Haskell
- OCaml
- F#
- Scala
One large contribution of ML is type inference. This means that the compiler is able to figure out the types of variables without the programmer's help. ML pioneered other language features that we will learn about through Haskell.
An example in OCaml (an ML dialect) that prints the sum of two numbers using a function:
let sum a b =
  a + b

let () = print_int (sum (read_int ()) (read_int ()))
Ada
Ada was designed over many years by the US Department of Defense as a reliable language for military and government use. It was a huge language, including features of many languages that had come before.
Ada contributed several new ideas:
- Packages
- Exception handling
- Better concurrency
Unfortunately Ada's complexity hurt its success as a language. The compilers for it were initially hard to write and buggy.
Later developments of Ada fixed these problems and good compilers became available. Ada was not used much outside of the US military.
Later OO Languages
C++ was developed at AT&T Bell Labs by Bjarne Stroustrup in 1983. It was originally called "C with Classes" and attempted to add modern OO features to C. C++ is a tremendously large language with both low-level and high-level constructs.
C++ was successful because it was mostly backwards-compatible with C. Also, although the language is very complicated, its features were introduced gradually over many years.
Java was developed in 1990 by Sun Microsystems. It is an evolution of C++ with many of the more difficult and dangerous features removed, including:
- Pointers
- delete
- unsigned numbers
- Free functions
- Inline assembly
- Unchecked arrays
- Multiple inheritance
- Operator overloading
- goto
Objective-C was developed by Brad Cox and Tom Love in the early 80s and was later adopted by NeXT for their NeXTSTEP OS, which came to Apple when it acquired NeXT. Objective-C is similar to C++ in that it adds OO features to C. The language was nearly dead until the iPhone was released, which used Objective-C as its primary language.
C# was developed by Microsoft in 2000, primarily to avoid patent issues with Sun. C# was very similar to Java when it was first released, but subsequent versions of C# have added several features not found in Java:
- Properties
- Operator overloading
- Local functions
- Record types
- Dynamic typing support via the dynamic keyword
Scripting Languages
Perl was developed by Larry Wall in 1987 to be an easier-to-use alternative to UNIX scripting languages. Perl has great text-processing features such as string functions and regular expression matching. Perl's syntax is inspired by C and UNIX shells, and is considered by some to be ugly. Perl was also used for early server-side web programs in the form of CGI scripts.
JavaScript was developed in 1995 at Netscape Communications (whose browser code later became the basis of Mozilla) as a scripting language for their web browser. It was initially going to be called "LiveScript" but was renamed to try to ride the popularity of Java. JavaScript is mostly used for client-side web scripting, but the popularity of the language has led to it branching into other domains as well. The first version of JavaScript was implemented in about ten days, leading to some unusual quirks, mostly involving the type system.
PHP was developed by Rasmus Lerdorf in 1995 because he was unsatisfied with existing tools for building his personal web site. PHP has long since eclipsed Perl as the most common server-side web scripting language. PHP syntax is inspired by C and Perl.
Python was first developed by Guido van Rossum in 1991. The primary focus for Python was to create a readable, elegant syntax. Python was initially used as a scripting language (often in place of Perl), though it has since expanded into many other areas. Python is widely used as a "glue language" to connect packages written in higher-performance languages (usually C), such as in machine learning.
Ruby was developed by Yukihiro Matsumoto in 1995. It had similar goals to Python, though Python is focused more on readability and Ruby more on writability. Ruby was mostly used in Japan until the development of Ruby on Rails, which made it much more popular.
Newer Languages
A number of more recently developed languages have gained some measure of popularity:
- Rust was developed beginning in 2006 by Mozilla. Its goal is to create a better systems programming language, with strong reliability without sacrificing performance.
- Go was developed in 2007 at Google with the goal of making a language similar to C but with modern features such as garbage collection and memory safety. It also has a strong emphasis on concurrency.
- Kotlin was started in 2011 by JetBrains (the company that makes the IntelliJ IDE) as an improved version of Java with type inference and a less verbose syntax. It has become the main language for Android mobile development.
- TypeScript was developed by Microsoft beginning in 2012. It adds static typing to JavaScript, with the goal of easier development and better reliability.
- Swift was developed in 2014 at Apple with the goal of replacing Objective-C as the primary language in their platforms. Swift includes type inference and functional programming features, and better reliability than Objective-C.