INF3331 - Autumn 2013

Exercises INF3331 H12

Remember that all assignments, when possible, should be well documented and supplied with a runtime example. If the program is not working, a pseudo runtime example must be provided. Failing to meet these requirements automatically disqualifies the assignment from being approved.

Week 1: Sep 2-6 (Bash intro)

1.1 Modify the scientific 'Hello world' script

Write a script that takes as input a string containing a mathematical expression, evaluates the expression and prints the result to the screen. Example of use:

./hw.sh "5+5"

Should give output like

Hello world, 5+5 = 10.

Name of scriptfile: hw.sh (exercise class: Bash)

1.2 Use find to bundle files

Write a script that takes as input a directory (path) name and a filename base (such as "*.*", "*.txt" etc). The script shall search the given directory tree, find all files matching the given filename, and bundle them into a single file. Executing the resulting bundle file as a script should recreate the original files. Hint: Combine the find command with the bundle script from the lectures.

Name of scriptfile: bundle.sh (exercise class: Bash)

1.3 Eternal loop interrupted with Ctrl-C

Write a script that takes no input arguments, and repeatedly writes "All work and no play makes Jack a dull boy." The loop should be eternal, but when interrupted by Ctrl-C the script should print a suitable message to the screen before exiting. Hint: Use the bash line "trap control_c SIGINT" to capture the Ctrl-C interruption.

Name of scriptfile: jack.sh (exercise class: Bash)

Week 2: Sep 9-13 (Python intro)

2.1 Become familiar with electronic Python documentation

Write a script that prints a uniformly distributed random number between -1 and 1 on the screen. The number should be written with four decimals as implied by the %.4f format.

To create the script file, you can use a standard editor such as Emacs or Vim on Unix-like systems. On Windows you must use an editor for pure text files - Notepad is a possibility, but I prefer to use Emacs or the "IDLE" editor that comes with Python (you usually find IDLE on the start menu, choose File -> New Window to open up the editor). IDLE supports standard key bindings from Unix, Windows, or Mac (choose Options -> Configure IDLE... and Keys to get a menu where you can choose between the three classes of key bindings).

The standard Python module for generation of uniform random numbers is called random. To figure out how to use this module, you can look up the description of the module in the Python Library Reference. Go to docs.python.org, choose your release (in the upper left corner, probably 2.7 or 3.3), go to the Library Reference and to the Index (upper left corner). You will see an index of Python functions, modules, data structures, etc. Find the item random (module) in the index and follow the link. This will bring you to the manual page for the random module. In the bottom part of this page you will find information about functions for drawing random numbers from various distributions (do not use the classes in the module, use plain functions). Also apply pydoc to look up documentation of the random module: just write pydoc random on the command line.
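As a minimal sketch of how a plain function from the random module can be combined with the %.4f format (written in Python 3 syntax, so print is a function; the helper name formatted_uniform and its parameters are hypothetical and not part of the exercise text):

```python
import random

def formatted_uniform(low=-1.0, high=1.0):
    """Draw one uniform random number in [low, high] and format it with %.4f."""
    x = random.uniform(low, high)   # plain module-level function, not a class
    return "%.4f" % x

print(formatted_uniform())
```

Calling random.seed with a fixed integer before drawing makes the sequence reproducible, which is convenient while testing.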

Name of scriptfile: printrandom.py (exercise class: Python)

2.2 Extend Exercise 2.1 with a loop

Extend the script from Exercise 2.1 such that you draw n random uniformly distributed numbers, where n is given on the command line, and compute the average of these numbers.

Name of scriptfile: averagerandom.py (exercise class: Python)

2.3 Find five errors in a script

Consider the following Python code:

#!/usr/bin/ env python
import sys, random
def compute(n):
     i = 0; s = 0
     while i <= n:
         s += random.random()
          i += 1
     return s/n

n = sys.argv[1] 
print 'the average of %d random numbers is %g" % (n, compute(n))

There are five errors in this file - find them!

Name of scriptfile: averagerandom2.py (exercise class: Python)

2.4 Basic use of control structures

To get some hands-on experience with writing basic control structures in Python, we consider an extension of the Scientific Hello World script hw.py from the lecture notes. The script is now supposed to read an arbitrary number of command-line arguments and write the sine of each number to the screen. Let the name of the new script be hw2a.py. As an example, we can write

python hw2a.py 1.4 -0.1 4 99

and the program writes out

Hello, World!
sin(1.4)=0.98545
sin(-0.1)=-0.0998334
sin(4)=-0.756802
sin(99)=-0.999207

Traverse the command-line arguments using a for loop. The complete list of the command-line arguments can be written sys.argv[1:] (i.e., the entries in sys.argv, starting with index 1 and ending with the last valid index). The for loop can then be written as for r in sys.argv[1:].
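The for loop described above can be sketched as follows. This is Python 3 syntax, and the function name sine_lines is a hypothetical helper; in a real script the list would be sys.argv[1:] rather than the fixed list used here.

```python
import math

def sine_lines(args):
    """Return one 'sin(r)=value' string per argument string in args."""
    lines = []
    for r in args:                      # r is a string such as '1.4'
        lines.append("sin(%s)=%g" % (r, math.sin(float(r))))
    return lines

# In a script, args would be sys.argv[1:]; a fixed list is used here.
for line in sine_lines(["1.4", "-0.1", "4", "99"]):
    print(line)
```

Note that each command-line argument arrives as a string and must be converted with float before it is passed to math.sin.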

Make an alternative script, hw2b.py, where a while loop construction is used for handling each number on the command line.

In a third version of the script, hw2c.py, you should take the natural logarithm of the numbers on the command line. Look up the documentation of the math module in the Python Library Reference to see how to compute the natural logarithm of a number. Include an if test to ensure that you only take the logarithm of positive numbers. Running, for instance,

python hw2c.py 1.4 -0.1 4 99

should give this output:

Hello, World!
ln(1.4)=0.336472
ln(-0.1) is illegal
ln(4)=1.38629
ln(99)=4.59512

Name of scriptfiles: hw2a.py, hw2b.py, hw2c.py (exercise class: Python)

Week 3: Sep 16-20 (Python tasks)

3.1 Combine two-column data files to a multi-column file

Write a script inverseconvert.py that performs the 'inverse process' of the following script:

#!/usr/bin/env python
import sys, math, string
usage = 'Usage: %s infile' % sys.argv[0]

try:
    infilename = sys.argv[1]
except:
    print usage; sys.exit(1)
    
ifile = open(infilename, 'r') # open file for reading

# read first comment line (no further use of it here):
line = ifile.readline()

# next line contains the increment in t values:
dt = float(ifile.readline())

# next line contains the name of the curves:
ynames = ifile.readline().split()

# list of output files:
outfiles = []
for name in ynames:
    outfiles.append(open(name + '.dat', 'w'))

t = 0.0    # t value
# read the rest of the file line by line:
for line in ifile:
    yvalues = line.split()
    if len(yvalues) == 0: continue  # skip blank lines
    for i in range(len(outfiles)):
        outfiles[i].write('%12g %12.5e\n' % \
                          (t, float(yvalues[i])))
    t += dt
for file in outfiles:  file.close()

For example, if we first apply the script above to the specific test file:

some comment line
1.5
  tmp-measurements  tmp-model1  tmp-model2
       0.0             0.1         1.0
       0.1             0.1         0.188
       0.2             0.2         0.25

we get three two-column files tmp-measurements.dat, tmp-model1.dat, and tmp-model2.dat. Running

python inverseconvert.py outfile 1.5 \
         tmp-measurements.dat  tmp-model1.dat  tmp-model2.dat

should in this case create a file outfile, which is almost identical to the test file above. Only the first line should differ (inverseconvert.py can write anything on the first line). For simplicity, we give the time step parameter explicitly as a command-line argument (it could also be found from the data in the files).

Hint: When parsing the command-line arguments, one needs to extract the name model1 from a filename model1.dat stored in a string (say) s. This can be done by s[:-4] (all characters in s except the last four ones). Chapter 3.4.5 describes some tools that allow for a more general solution to extracting the name of the time series from a filename.
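One standard way to generalize the s[:-4] trick is the os.path module; whether this is the tool the referenced chapter has in mind is an assumption. The sketch below (Python 3 syntax, with the hypothetical helper name series_name) compares the two approaches:

```python
import os.path

def series_name(path):
    """Strip directory and extension: 'data/tmp-model1.dat' -> 'tmp-model1'."""
    base = os.path.basename(path)        # drop any leading directories
    root, ext = os.path.splitext(base)   # split off the last extension
    return root

s = "tmp-model1.dat"
print(s[:-4])            # slicing: assumes a four-character '.dat' suffix
print(series_name(s))    # works for any extension length
```

The slicing version silently gives a wrong name if the extension is not exactly four characters long, which is why the splitext variant is the more general choice.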

Name of scriptfile: inverseconvert.py (exercise class: Python)

3.2 Annotate a filename with the current date

Write a function that adds the current date to a filename. For example, calling the function with the text myfile as argument results in the string myfile_Aug22_2010 being returned if the current date is August 22, 2010. Read about the time module in the Python Library Reference to see how information about the date can be obtained.
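A minimal sketch of how the time module can produce such a date string, in Python 3 syntax. The optional parameter when is a hypothetical addition that makes the function testable with a fixed date; it is not required by the exercise.

```python
import time

def add_date(filename, when=None):
    """Append the date as _MonDD_YYYY; `when` is a time.struct_time (default: now)."""
    if when is None:
        when = time.localtime()
    # %b = abbreviated month name (locale dependent), %d = day, %Y = year
    return filename + time.strftime("_%b%d_%Y", when)

fixed = time.strptime("2010-08-22", "%Y-%m-%d")
print(add_date("myfile", fixed))
```

The month abbreviation produced by %b follows the active locale, so the English form Aug assumes the default C locale.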

Name of scriptfile: add_date.py (exercise class: Python)

3.3 Make a specialized sort function

Suppose we have a script that performs numerous efficiency tests. The output from the script contains lots of information, but our purpose now is to extract information about the CPU time of each test and sort these CPU times. The output from the tests takes the following form:

...
f95 -c -O0  versions/main_wIO.f F77WAVE.f
f95 -o app  -static main_wIO.o F77WAVE.o   -lf2c
app < input > tmp.out
CPU-time: 255.97   f95 -O0 formatted I/O
f95 -c -O1  versions/main_wIO.f F77WAVE.f
f95 -o app  -static main_wIO.o F77WAVE.o   -lf2c
app < input > tmp.out
CPU-time: 252.47   f95 -O1 formatted I/O
f95 -c -O2  versions/main_wIO.f F77WAVE.f
f95 -o app  -static main_wIO.o F77WAVE.o   -lf2c
app < input > tmp.out
CPU-time: 252.40   f95 -O2 formatted I/O
...

First we need to extract the lines starting with CPU-time. Then we need to sort the extracted lines with respect to the CPU time, which is the number appearing in the second column. Write a script to accomplish this task. A suitable testfile with output from an efficiency test can be found here.

Hint: Find all lines with CPU time results by using a string comparison of the first 8 characters to detect the keyword CPU-time. Then write a tailored sort function for sorting two lines (extract the CPU time from the second column in both lines and compare the CPU times as floating-point numbers).
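The hint describes a comparison function taking two lines, which is the Python 2 style of sorting. As a sketch in Python 3 syntax, the same idea can be expressed with a key function instead; the names cpu_time and rank_cpu_times are hypothetical.

```python
def cpu_time(line):
    """Extract the CPU time (second whitespace-separated column) as a float."""
    return float(line.split()[1])

def rank_cpu_times(lines):
    """Return the lines starting with 'CPU-time', sorted by CPU time."""
    hits = [line for line in lines if line.startswith("CPU-time")]
    return sorted(hits, key=cpu_time)

sample = [
    "app < input > tmp.out",
    "CPU-time: 255.97   f95 -O0 formatted I/O",
    "f95 -c -O1  versions/main_wIO.f F77WAVE.f",
    "CPU-time: 252.47   f95 -O1 formatted I/O",
    "CPU-time: 252.40   f95 -O2 formatted I/O",
]
for line in rank_cpu_times(sample):
    print(line)
```

Comparing the times as floats rather than strings matters: as strings, '99.5' would sort after '252.40'.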

Name of scriptfile: ranking.py (exercise class: Python)

Week 4: Sep 23-27 (Regular Expressions)

4.1 Count words in a text

Write a script that reads a filename and a word from the command line, and reports the number of times the word occurs in the file. When an optional argument -i is provided at the command line, the count should be case insensitive. Another command line flag -b should make the script respect word boundaries. Consider the following text in the file football.txt:

Football is a game involving two teams, 11 players on each team,
three referees, two goals and a ball. Players use their feet to
kick the ball into the opponent's goal. Goal keepers can use both
feet and arms inside a certain area. After 90 minutes, the Germans
win the game.

Running the script count_words.py on this file should report:

$ python count_words.py football.txt ball
Number of occurrences of string 'ball': 3
$ python count_words.py -i football.txt goal
Number of occurrences of string 'goal' (case insensitive): 3
$ python count_words.py -i -b football.txt goal
Number of occurrences of word 'goal' (case insensitive): 2
$ python count_words.py -b football.txt goal
Number of occurrences of word 'goal': 1
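The two flags map naturally onto features of the re module: the re.IGNORECASE flag and the \b word-boundary anchor. The following is a sketch in Python 3 syntax of just the counting step, with command-line parsing left out; the function name count_matches and its keyword arguments are hypothetical.

```python
import re

def count_matches(text, word, ignore_case=False, word_boundaries=False):
    """Count non-overlapping occurrences of word in text."""
    pattern = re.escape(word)               # treat the word literally
    if word_boundaries:
        pattern = r"\b" + pattern + r"\b"   # match whole words only
    flags = re.IGNORECASE if ignore_case else 0
    return len(re.findall(pattern, text, flags))

football = """Football is a game involving two teams, 11 players on each team,
three referees, two goals and a ball. Players use their feet to
kick the ball into the opponent's goal. Goal keepers can use both
feet and arms inside a certain area. After 90 minutes, the Germans
win the game."""

print(count_matches(football, "ball"))
print(count_matches(football, "goal", ignore_case=True, word_boundaries=True))
```

Escaping the word with re.escape keeps characters such as '.' or '*' in the search word from being interpreted as regex syntax.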

Name of scriptfile: count_words.py (exercise class: Regex)

4.2 Find errors in regular expressions

Consider the following script:

#!/usr/bin/env python
"""Find all numbers in a string."""
import re
r = r"([+\-]?\d+\.?\d*|[+\-]?\.\d+|[+\-]?\d\.\d+[Ee][+\-]\d\d?)"
c = re.compile(r)
s = "an array: (1)=3.9836, (2)=4.3E-09, (3)=8766, (4)=.549"
numbers = c.findall(s)
# make dictionary a, where a[1]=3.9836 and so on:
a = {}
for i in range(0,len(numbers)-1,2):
    a[int(numbers[i])] = float(numbers[i+1])
sorted_keys = a.keys(); sorted_keys.sort()
for index in sorted_keys:
    print "[%d]=%g" % (index,a[index])

Running this script produces the output

[-9]=3
[1]=3.9836
[2]=4.3
[8766]=4

while the desired output is

[1]=3.9836
[2]=4.3E-09
[3]=8766
[4]=0.549

Go through the script, make sure you understand all details, figure out how the various parts are matched by the regular expression, and find the error.

Name of scriptfile: regexerror.py (exercise class: Regex)

4.3 Explain the behavior of regular expressions

In a user interface we want to offer a compact syntax for loops: [0:12,4] means a loop from 0 up to and including 12 with steps of 4 (i.e., 0, 4, 8, 12). The comma and step are optional, so leaving them out as in [3.1:5] implies a unit step (3.1 and 4.1 are generated in this example). Consider the two suggestions for suitable regular expressions below. Both of them fail:

>>> loop1 = '[0:12]'     # 0,1,2,3,4,5,6,7,8,9,10,11,12
>>> loop2 = '[0:12, 4]'  # 0,4,8,12
>>> r1 = r'\[(.+):(.+?),?(.*)\]'
>>> r2 = r'\[(.+):(.+),?(.*)\]'
>>> import re
>>> re.search(r1, loop1).groups()
('0', '1', '2')
>>> re.search(r2, loop1).groups()
('0', '12', '')
>>> re.search(r1, loop2).groups()
('0', '1', '2, 4')
>>> re.search(r2, loop2).groups()
('0', '12, 4', '')

Explain in detail why the regular expressions fail. Use this insight to construct a regular expression that works.

Name of scriptfile: loop_regex.py (exercise class: Regex)

4.4 Interpret a regex code and find programming errors

The following code segment is related to extracting lower and upper limits of intervals (read Chapters 8.2.5 and 8.2.6):

real = \
 r"\s*(?P<number>-?(\d+(\.\d*)?|\d*\.\d+)([eE][+\-]?\d+)?)\s*"
c = re.compile(real)
some_interval = "[3.58652e+05 , 6E+09]"
groups = c.findall(some_interval)
lower = float(groups[1][c.groupindex['number']])
upper = float(groups[2][c.groupindex['number']])

Execute the Python code and observe that it reports an error (index out of bounds in the upper = assignment). Try to understand what is going on in each statement, print out groups, and correct the code.

Name of scriptfile: findallerror.py (exercise class: Regex)

Week 5: Sep 30-Oct 4 (Regular expressions, Python tasks)

5.1 Generate data from a user-supplied formula

Suppose you want to generate files with (x,y) data in two columns, where y is given by some function f(x). (Such files can be manipulated by, e.g., the datatrans1.py script from Chapter 2.2.) You want the following interface to the generation script:

xygenerator.py start:stop,step func

x values are generated, starting with start and ending with stop, in increments of step. For each x value, you need to compute the textual expression in func, which is an arbitrary, valid Python expression for a function involving a single variable with name x, e.g., 'x**2.5*cosh(x)' or 'exp(-(x-2)**2)'. You can assume that from math import * is executed in the script.

Here is an example of generating 1001 data pairs (x,y), where x=0,0.5,1,1.5,...,500 and f(x)=x*sin(x):

xygenerator.py '0:500,0.5' 'x*sin(x)'

The xygenerator.py script should write to standard output - you can then easily direct the output to a file.

Try to write the xygenerator.py script as compactly as possible. You will probably be amazed at how much can be accomplished in a 10+ line Python script! (Hint: use eval.)
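The eval hint can be illustrated with a small sketch in Python 3 syntax. Instead of from math import *, it builds an explicit namespace from the math module, which is a design choice made for this example only; the function name evaluate is hypothetical.

```python
import math

def evaluate(func, x):
    """Evaluate the expression string func for a given x using names from math."""
    # Expose sin, cos, exp, pi, ... to the expression, plus the variable x.
    namespace = {name: getattr(math, name) for name in dir(math)
                 if not name.startswith("_")}
    namespace["x"] = x
    return eval(func, namespace)

print("%g %g" % (0.5, evaluate("x*sin(x)", 0.5)))
```

Since eval runs whatever Python expression it is given, this approach is only appropriate when the formula comes from a trusted user.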

Name of scriptfile: xygenerator.py (exercise class: Regex)

5.2 Write an improved function for joining strings

Perl's join function can join an arbitrary composition of strings and lists of strings. The purpose of this exercise is to write a similar function in Python (the string.join function, or the built-in join function in string objects, can only join strings in a list object). The function must handle an arbitrary number of arguments, where each argument can be a string, a list of strings, or a tuple of strings. The first argument should represent the delimiter. As an illustration, the function, here called join, should be able to handle the following example:

list1 = ['s1','s2','s3']
tuple1 = ('s4', 's5')
ex1 = join(' ', 't1', 't2', list1, tuple1, 't3', 't4')
ex2 = join('  #  ', list1, 't0')

The resulting strings ex1 and ex2 should read

't1 t2 s1 s2 s3 s4 s5 t3 t4'
's1  #  s2  #  s3  #  t0'

Hint: Variable number of arguments in functions is treated in Chapter 3.3.3, whereas Chapter 3.2.11 explains how to check the type of the arguments.
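One possible sketch of the two ingredients mentioned in the hint, a variable number of arguments (*args) and a type check (isinstance), in Python 3 syntax. It assumes that every argument that is not a list or tuple is a plain string, and it flattens only one level of nesting.

```python
def join(delimiter, *args):
    """Join strings and lists/tuples of strings with the given delimiter."""
    pieces = []
    for arg in args:
        if isinstance(arg, (list, tuple)):
            pieces.extend(arg)      # flatten one level
        else:
            pieces.append(arg)      # assumed to be a plain string
    return delimiter.join(pieces)

list1 = ['s1', 's2', 's3']
tuple1 = ('s4', 's5')
print(join(' ', 't1', 't2', list1, tuple1, 't3', 't4'))
print(join('  #  ', list1, 't0'))
```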

Name of scriptfile: join.py (exercise class: Python)

5.3 Use tools to document a script

Equip one of your previous scripts with doc strings. Use either HappyDoc or Epydoc and their light-weight markup languages to produce HTML documentation of the script. The documentation should be user-oriented in a traditional man page style, and both the documentation and the documented script should be handed in.

Name of scriptfile: your_script.py, your_doc.html (exercise class: Python)

5.4 Prefix name of digital image files with date and time

JPEG images taken by digital cameras normally have names like img_1238.jpg. For sorting purposes it might be convenient to have the time and date when the picture was taken as an initial part of the filename. Modern digital cameras encode a lot of information about the picture, including the time and date, into a header in the JPEG file. The coding scheme of the header often varies with the camera vendor, apparently making it necessary to use vendor-specific software to extract picture information. However, the jhead program can parse JPEG headers coming from most digital cameras.

Running jhead on a JPEG file results in output like

File name    : tmp2.jpg
File size    : 179544 bytes
File date    : 2003:03:29 10:58:40
Camera make  : Canon
Camera model : Canon DIGITAL IXUS 300
Date/Time    : 2002:05:19 18:10:03
Resolution   : 1200 x 1600
Flash used   : Yes
Focal length : 11.4mm  (35mm equivalent: 79mm)
CCD width    : 5.23mm
Exposure time: 0.017 s  (1/60)
Aperture     : f/4.0
Focus dist.  : 1.17m
Exposure bias:-0.33
Metering Mode: matrix
Jpeg process : Baseline

A sample output is found here.

Write a Python function gettime that reads a text like the one above, extracts the time and date when the picture was taken (Date/Time), and returns the time and date information as two tuples of strings (HH,MM,SS) and (YYYY,MM,DD). The first tuple corresponds to the time HH:MM:SS (hours, minutes, seconds), whereas the second tuple corresponds to the date YYYY:MM:DD (year, month, day). (Hint: It might be convenient to split each line wrt. : into a list l, re-join l[1:], split wrt. whitespace, and then wrt. :.)
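The splitting steps in the hint can be sketched as follows, in Python 3 syntax. Returning None when no Date/Time line is present is a hypothetical choice; the exercise does not say what should happen in that case.

```python
def gettime(jhead_output):
    """Return ((HH, MM, SS), (YYYY, MM, DD)) from the Date/Time line."""
    for line in jhead_output.splitlines():
        l = line.split(":")                     # split wrt. ':'
        if l[0].strip() == "Date/Time":
            # re-join the rest, then split wrt. whitespace
            date, clock = ":".join(l[1:]).split()
            return tuple(clock.split(":")), tuple(date.split(":"))
    return None   # assumption: no Date/Time line found

sample = """File name    : tmp2.jpg
File date    : 2003:03:29 10:58:40
Camera make  : Canon
Date/Time    : 2002:05:19 18:10:03
Resolution   : 1200 x 1600"""

print(gettime(sample))
```

Testing the key before the first colon, rather than searching for a date pattern, avoids picking up the File date line by mistake.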

Write another Python function prefix that takes the time and date string tuples from gettime as input, together with the name of the image file, and prefixes the filename with the date and time. For example, if img_4978.jpg is the original filename, and the tuples returned from gettime are (18,10,03) and (2002,05,19), the returned string from prefix reads

2002_05_19__18_10_03__img_4978.jpg

If the filename already is on this form, no prefix should be added (i.e., the original filename is returned).

In case you have collections of images produced by digital cameras, the filenaming functionality presented in this exercise can be very convenient. The JPEG header is destroyed by many photo manipulation and viewing programs, so keeping the date in the filename preserves this important information. Since the JPEG header is easily destroyed, you should apply jhead and the renaming procedure to fresh files that have not yet been processed by other programs.

Name of scriptfile: jpegrename.py (exercise class: Python)

Week 6: Oct 7-11 (Python tasks)

6.1 Find the paths to a collection of programs

A script often makes use of other programs, and if these programs are not available on the computer system, the script will not work. This exercise shows how you can write a general function that tests whether the required tools are available or not. You can then terminate the script and notify the user about the software packages that need to be installed.

The idea is to write a function findprograms that takes a list of program names as input and returns a dictionary with the program names as keys and the programs' complete paths on the current computer system as values. Search the directories in the PATH environment variable for the desired programs (see Chapter 3.2.5). Allow a list of additional directories to search in as an optional argument to the function. Programs that are not found should have the value None in the returned dictionary.
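A sketch of the PATH search described above, in Python 3 syntax and assuming a Unix-like system where program files carry no extension. The parameter name extra_dirs and the decision to search the additional directories before PATH are assumptions made for this example.

```python
import os

def findprograms(programs, extra_dirs=None):
    """Map each program name to its full path, or None if it is not found."""
    dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    if extra_dirs:
        dirs = list(extra_dirs) + dirs      # assumption: extra dirs first
    result = {}
    for program in programs:
        result[program] = None
        for d in dirs:
            candidate = os.path.join(d, program)
            # must be a regular file that the current user may execute
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                result[program] = candidate
                break
    return result

print(findprograms(["python", "no_such_program_12345"]))
```

On Windows the same idea would also need to try suffixes such as .exe, which this sketch does not do.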

Here is an illustrative example of using findprograms to test for the existence of some utility programs used in the course:

programs = {
  'gnuplot'  : 'plotting program',
  'gs'       : 'ghostscript, ps/pdf interpreter and previewer',
  'f2py'     : 'generator for Python interfaces to F77',
  'swig'     : 'generator for Python interfaces to C/C++',
  'convert'  : 'image conversion, part of the ImageMagick package',
  }

installed = findprograms(programs.keys())
for program in installed.keys():
    if installed[program]:
        print "You have %s (%s)" % (program, programs[program])
    else:
        print "*** Program %s was not found on the system" % (program,)

Name of scriptfile: findprograms.py (exercise class: Python)

6.2 Find old and large files in a directory tree

Write a function that traverses a user-given directory tree and returns a list of all files that are larger than X Mb and that have not been accessed in the last Y days, where X and Y are parameters to the function. Include an option in this function that moves the files to a subdirectory trash under /tmp (you need to create trash if it does not exist).

Hints: Use shutil.copy and os.remove to move the files (and not os.rename; it will not work for moving files across different filesystems). First build a list of all files to be removed. Thereafter, remove the files physically. The age of a file can be tested with os.path.getatime or the os.stat functions. Read about these functions in the Python Library Reference. You may also want to read about the time module.
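The first step in the hints, building the list of candidate files, can be sketched with os.walk, os.path.getsize and os.path.getatime. This is Python 3 syntax; the function and parameter names are hypothetical, 1 Mb is assumed to mean 2**20 bytes, and the move to the trash directory is deliberately left out.

```python
import os
import time

def old_and_large(root, min_mb, min_days):
    """List files under root larger than min_mb and not accessed for min_days."""
    cutoff = time.time() - min_days * 24 * 3600   # oldest allowed access time
    min_bytes = min_mb * 1024 * 1024              # assuming 1 Mb = 2**20 bytes
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if (os.path.isfile(path)
                    and os.path.getsize(path) > min_bytes
                    and os.path.getatime(path) < cutoff):
                found.append(path)
    return found
```

Collecting the paths first and acting on them afterwards, as the hints suggest, avoids modifying the directory tree while it is being traversed.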

To test the script, you can run a script fakefiletree.py that generates a directory tree (say) tmptree with files having arbitrary age (up to one year) and arbitrary size between 5 Kb and 10 Mb:

fakefiletree.py tmptree

If you find that fakefiletree.py generates too many large files, causing the disk to be filled up, modify the arguments in the maketree function call. Remember to remove tmptree when you have finished the testing.

Name of scriptfile: old_and_large.py (exercise class: Python)

6.3 Estimate the chance of an event in a dice game

What is the probability of getting at least one 6 when throwing two dice? This question can be analyzed theoretically by methods from probability theory (the result is 1/6 + 1/6 - (1/6)*(1/6) = 11/36). However, a much simpler and much more general alternative is to let a computer program 'throw' two dice a large number of times and count how many times a 6 shows up. Computer experiments of this type, involving uncertain events, are often called Monte Carlo simulation.

Create a script that in a loop from 1 to n draws two uniform random numbers between 1 and 6 and counts how many times p a 6 shows up. Write out the estimated probability p/float(n) together with the exact result 11/36. Run the script a few times with different n values (preferably read from the command line) and determine from the experiments how large n must be to get the first three decimals (0.306) of the probability correct.

Hint: Check out the random module to see how to draw random uniformly distributed integers in a specified interval.

Remark. Division of integers (p/n) yields in this case zero, since p is always smaller than n. Integer p divided by integer n implies in Python, and in most other languages, integer division, i.e., p/n is the largest integer that when multiplied by n becomes less than or equal to p. Converting at least one of the integers to float (p/float(n)) ensures floating-point division, which is what we need.
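The simulation loop can be sketched as follows in Python 3 syntax; the function name estimate_six is hypothetical. The float conversion from the remark is kept even though Python 3 performs floating-point division for p/n by default.

```python
import random

def estimate_six(n):
    """Estimate the probability of at least one 6 in a throw of two dice."""
    p = 0
    for _ in range(n):
        d1 = random.randint(1, 6)   # randint includes both end points
        d2 = random.randint(1, 6)
        if d1 == 6 or d2 == 6:
            p += 1
    return p / float(n)

print("estimate: %.4f  exact: %.4f" % (estimate_six(100000), 11 / 36.0))
```

The counter is increased once per throw of the pair, so a double 6 counts as a single event, in line with the "at least one 6" formulation.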

Name of scriptfile: dice2.py (exercise class: Python)

6.4 Determine if you win or lose a hazard game

Somebody suggests the following game. You pay 1 unit of money and are allowed to throw four dice. If the sum of the eyes on the dice is less than 9, you win 10 units of money, otherwise you lose your investment. Should you play this game?

Hint: Use the simulation method from Exercise 6.3.

Name of scriptfile: dice4.py (exercise class: Python)

6.5 Implement a class for vectors in 3D

The purpose of this exercise is to program with classes and special methods. Create a class Vec3D with support for the inner product, cross product, norm, addition, subtraction, etc. The following application script demonstrates the required functionality:

>>> from Vec3D import Vec3D
>>> u = Vec3D(1, 0, 0)  # (1,0,0) vector
>>> v = Vec3D(0, 1, 0)
>>> str(u)        # pretty print
'(1, 0, 0)'
>>> repr(u)       # u = eval(repr(u))
'Vec3D(1, 0, 0)'
>>> u.len()       # Euclidean norm
1.0
>>> u[1]          # subscripting
0.0
>>> v[2]=2.5      # subscripting w/assignment
>>> print u**v    # cross product
(0, -2.5, 1)      # (output applies __str__)
>>> u+v           # vector addition
Vec3D(1, 1, 2.5)  # (output applies __repr__)
>>> u-v           # vector subtraction
Vec3D(1, -1, -2.5)
>>> u*v           # inner (scalar, dot) product
0.0

We remark that class Vec3D is just meant as an illustrative exercise. Serious computations with a class for 3D vectors should utilize either a NumPy array (see Chapter 4), or better, the Vector class in the Scientific.Geometry.Vector module, which is a part of ScientificPython (see Chapter 4.4.1).

Name of scriptfile: vec3d.py (exercise class: Python)

Week 7: Oct 14-18 (Python tasks)

7.1 Extend the class from Exercise 6.5

Extend and modify the Vec3D class from Exercise 6.5 such that operators like + also work with scalars:

u = Vec3D(1, 0, 0)
v = Vec3D(0, -0.2, 8)
a = 1.2
u+v  # vector addition
a+v  # scalar plus vector, yields (1.2, 1, 9.2)
v+a  # vector plus scalar, yields (1.2, 1, 9.2)

In the same way we should be able to do a-v, v-a, a*v, v*a, and v/a (a/v is not defined).

Name of scriptfile: vec3d_ext.py (exercise class: Python)

7.2 Vectorize a constant function

The function

def initial_condition(x):
    return 3.0

does not work properly when x is a NumPy array. In that case the function should return a NumPy array with the same shape as x and with all entries equal to 3.0. Perform the necessary modifications such that the function works for both scalar types and NumPy arrays.

Name of scriptfile: vectorize_function.py (exercise class: Python)

7.3 Vectorize a numerical integration rule

The integral of a function f(x) from x=a to x=b can be calculated numerically by the Trapezoidal rule:

Integral of f(x) from a to b = (h/2)*(f(a) + f(b)) + \
                               h*sum(i=1,...n-1) f(a+i*h)

where h equals (b-a)/n (the number of integration points is (n+1)). Implement this approximation in a Python function containing a straightforward loop.
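A sketch of the straightforward loop version of the formula above, in Python 3 syntax; the function name trapezoidal is hypothetical. The vectorized version and the timings are left as the exercise asks.

```python
def trapezoidal(f, a, b, n):
    """Approximate the integral of f from a to b with n intervals."""
    h = (b - a) / float(n)
    s = 0.5 * (f(a) + f(b))         # end points carry weight h/2
    for i in range(1, n):           # interior points i = 1, ..., n-1
        s += f(a + i * h)
    return h * s

print(trapezoidal(lambda x: 1 + x, 0.0, 1.0, 10))
```

For a linear integrand such as 1+x the Trapezoidal rule is exact up to rounding, which makes it a convenient check of the implementation.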

The code will run slowly compared to a vectorized version. Make the vectorized version and introduce timings to measure the gain of vectorization. Use the functions

f_1(x) = 1+x   
f_2(x) = exp(-x*x)*log(x + x*sin(x))

as test functions for the integration. (Hint: Implement f such that it operates on a vector x of all the evaluation points a+i*h, i=0,...,n.)

Name of scriptfile: vectorize_integration.py (exercise class: Python)

7.4 Make a class for sparse vectors

The purpose of this exercise is to implement a sparse vector. That is, in a vector of length n, only a few of the elements are different from zero:

>>> a = SparseVec(4)
>>> a[2] = 9.2
>>> a[0] = -1
>>> print a
[0]=-1 [1]=0 [2]=9.2 [3]=0
>>> print a.nonzeros()
{0: -1, 2: 9.2}
>>> b = SparseVec(5)
>>> b[1] = 1
>>> print b
[0]=0 [1]=1 [2]=0 [3]=0 [4]=0
>>> print b.nonzeros()
{1: 1}
>>> c = a + b
>>> print c
[0]=-1 [1]=1 [2]=9.2 [3]=0 [4]=0
>>> print c.nonzeros()
{0: -1, 1: 1, 2: 9.2}
>>> for ai, i in a:  # SparseVec iterator
        print 'a[%d]=%g ' % (i, ai),
a[0]=-1  a[1]=0  a[2]=9.2  a[3]=0

Implement a class SparseVec with the illustrated functionality. Hint: Store the nonzero vector elements in a dictionary.

Name of scriptfile: SparseVec.py (exercise class: Class)

7.5 Implement Exercise 6.3 using NumPy arrays

Solve the same problem as in Exercise 6.3, but use NumPy and a vectorized algorithm. That is, generate two (long) random vectors of uniform integer numbers ranging from 1 to 6, find the entries that are 6 in at least one of the two arrays, count these entries and estimate the probability. Insert CPU-time measurements in the scripts and compare the plain Python loop and the random module with the vectorized version utilizing NumPy functionality.

Hint: You may use the following NumPy functions: random.randint, ==, +, >, and sum (read about them in the NumPy reference manual).

Name of scriptfile: dice2_NumPy.py (exercise class: Python)

7.6 Implement Exercise 6.4 using NumPy arrays

Solve the same problem as in Exercise 6.4, but use NumPy and a vectorized algorithm. Generate a random vector of 4*n uniform integer numbers ranging from 1 to 6, reshape this vector into an array with four rows and n columns, representing the outcome of n throws with four dice, sum the eyes and estimate the probability. Insert CPU-time measurements in the scripts and compare the plain Python solution in Exercise 6.4 with the version utilizing NumPy functionality.

Hint: You may use the NumPy functions random.randint, sum, and < (read about them in the NumPy reference manual, and notice especially that sum can sum the rows or the columns in a two-dimensional array).

Name of scriptfile: dice4_NumPy.py (exercise class: Python)

Week 8: Oct 21-25 (Python tasks)

8.1 Assignment and in-place modifications of NumPy arrays

Consider the following script:

from scitools.numpytools import *
x = sequence(0, 1, 0.5)
# y = 2*x + 1:
y = x;  y*=2;  y += 1
# z = 4*x - 4:
z = x;  z*=4;  z -= 4
print x, y, z

Explain why x, y, and z have the same values. How can the script be changed such that y and z get the intended values?

Name of scriptfile: NumPy_assignment.py (exercise class: Python)

8.2 Process comma-separated numbers in a file

Suppose a spreadsheet program stores its table in a file row by row, but with comma-separated rows, as in this example:

"activity 1", 2376, 256, 87
"activity 2", 27, 89, 12
"activity 3", 199, 166.50, 0

Write a script that loads the text in the first column into a list of strings and the rest of the numbers into a two-dimensional NumPy array. Sum the elements in each row and write the result as

"activity 1" : 2719.0
"activity 2" : 128.0
"activity 3" : 365.5

The script should of course treat any number of rows and columns in the file. Try to write the script in a compact way.

Name of scriptfile: process_spreadsheet.py (exercise class: Python)

8.3 Matrix-vector multiply with NumPy arrays

Define a matrix and a vector, e.g.,

A = array([[1, 2, 3], [4, 5, 7], [6, 8, 10]], float)
b = array([-3, -2, -1], float)

Use the NumPy reference manual to find a function that computes the standard matrix-vector product A times b (i.e., the vector whose i-th component is the sum from j=0 to 2 of A[i,j]*b[j]).

Name of scriptfile: matvec.py (exercise class: Python)

8.4 Replace lists by NumPy arrays

Modify the convert2.py script from Chapter 2.5 such that floating-point values are stored in NumPy arrays instead of plain Python lists. A suitable test data file is found here.

Name of scriptfile: convert2_wNumPy.py (exercise class: Python)

8.5 Rock, Paper, Scissors

Description

Rock, paper, scissors, also known as roshambo, is a simple child's game that is frequently used to settle disputes. In the game, a rock breaks the scissors, the scissors cut the paper, and the paper covers the rock. Each option is equally likely to prevail over another. If the players choose the same object a draw is declared and the game is repeated until someone prevails. For more information than you ever thought it was possible to collect about rock, paper, scissors, check out the Web page of the World RPS Society. In this computerized version the human player competes against the computer, which chooses a rock, paper, or scissors randomly. The game proceeds until the human player quits the game or until a predetermined score is reached (e.g., 11 pts.), at which time the final tally is displayed. Solutions with fewer if statements are considered more elegant.
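One way to decide a round without any if statements is modular arithmetic on the positions of the choices in a list. The sketch below (Python 3 syntax, hypothetical names CHOICES and round_winner) shows only this decision step, not the full game loop, and is one possible approach rather than the intended solution.

```python
CHOICES = ["rock", "paper", "scissors"]   # each item beats the one before it

def round_winner(human, computer):
    """Return 'draw', 'human' or 'computer' for one round."""
    # The cyclic distance is 0 for a draw, 1 if the human's choice is one
    # step ahead (and therefore wins), and 2 if the computer's choice is.
    diff = (CHOICES.index(human) - CHOICES.index(computer)) % 3
    return ["draw", "human", "computer"][diff]

print(round_winner("rock", "scissors"))
```

The ordering of the list is what makes this work: rock sits one step "after" scissors when the list is read cyclically, so the same rule covers all three winning pairs.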

Input

The human player enters the number of points required for a win. During the play of the game the human player selects whether to play a rock, paper, or scissors by using the keyboard. The human player may also end the game by pressing the Control-D sequence at any time. (Ending the game early does not allow a winner to be determined if the human player is ahead.)

Output

The program will display the winner of each roshambo round along with the running score. At the conclusion of the game, the computer will display the overall winner and the final score.

Sample session

Welcome to Rock, Paper, Scissors!

How many points are required for a win? 3

Choose (R)ock, (P)aper, or (S)cissors? r
Human: rock      Computer: paper      Computer wins!
Score: Human 0   Computer 1

Choose (R)ock, (P)aper, or (S)cissors? r
Human: rock      Computer: scissors   Human wins!
Score: Human 1   Computer 1

Choose (R)ock, (P)aper, or (S)cissors? p
Human: paper     Computer: paper      A draw
Score: Human 1   Computer 1

Choose (R)ock, (P)aper, or (S)cissors? s
Human: scissors  Computer: paper      Human wins!
Score: Human 2   Computer 1

Choose (R)ock, (P)aper, or (S)cissors? r
Human: rock      Computer: scissors   Human wins!

Final Score: Human 3   Computer 1

This exercise is copied from openbookproject.net under the GNU Free Documentation License.

Name of scriptfile: roshambo.py

Week 9: Oct 28-Nov 1 (Python tasks)

9.1 Wrapping C++

Wrap a C++ array class, either using Swig (recommended) or by hand-coding the wrapper functions using the Python/C API. The C++ source code can be found here: MyArray.h, MyArray.cpp. You do not need to wrap all the functionality of the class, but the minimum requirement is that you must be able to create 1D and 2D arrays from Python, with either real or integer values. You should then be able to assign to and read from array entries.

Hint 1: The MyArray class is a template class, which is not directly supported by Swig. You need to tell Swig to generate wrapper code for specific values of the template parameter, such as double or int. The directive %template is central.

Hint 2: As you will see, the overloaded indexing operators () are not wrapped correctly by Swig. To access array entries from Python, you need to extend the class with additional functions (using the directive %extend). Overloaded functions set() and get() will work. You can also implement the __setitem__ and __getitem__ functions to enable standard Python indexing, but this is somewhat more challenging to implement for general 1D or 2D arrays.
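
The reason standard indexing is harder for general 1D or 2D arrays is visible on the Python side: a[i] passes a single integer to __getitem__, while a[i, j] passes the tuple (i, j). The following pure-Python stand-in illustrates what the wrapped class would have to handle. It is a sketch only; the class name ArraySketch and the row-major storage are assumptions and are not taken from MyArray.h.

```python
class ArraySketch(object):
    """Pure-Python stand-in (hypothetical) for a wrapped 1D/2D array."""

    def __init__(self, n1, n2=None):
        self.shape = (n1,) if n2 is None else (n1, n2)
        size = n1 if n2 is None else n1 * n2
        self._data = [0.0] * size

    def _offset(self, key):
        # a[i] arrives as an int; a[i, j] arrives as the tuple (i, j).
        index = key if isinstance(key, tuple) else (key,)
        if len(index) != len(self.shape):
            raise IndexError("expected %d indices" % len(self.shape))
        for i, n in zip(index, self.shape):
            if not 0 <= i < n:
                raise IndexError("index out of range")
        if len(index) == 1:
            return index[0]
        return index[0] * self.shape[1] + index[1]  # row-major (assumed)

    def __getitem__(self, key):
        return self._data[self._offset(key)]

    def __setitem__(self, key, value):
        self._data[self._offset(key)] = value
```

With this protocol in place, v = ArraySketch(3); v[1] = 2.5 and m = ArraySketch(2, 3); m[1, 2] = 7 both work with ordinary Python syntax. The set() and get() route avoids the int-versus-tuple dispatch entirely.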

To get the exercise approved you should submit the swig interface file (or the C/C++ wrapper code if you write your own), a short shell script that builds the module (running swig and compiling), and a short Python script showing that it works.

Name of deliverables: mywrap.i, make_module.sh, test_array.py

9.2 Literary Analysis

Description

Your English teacher has just asked you to write a paper comparing two of the works from the free, online literature library at Project Gutenberg. Since you are a computer scientist, you decide to put your skills to use. You plan to compare your two favorite works of classic literature in terms of the vocabulary used in each. Since this is a bit outside the scope of the assignment as described by your English teacher, you ask for permission before you proceed. Intrigued by your proposal, your English teacher agrees and you are ready to go.

You plan to write a program that will take a text file as input and return a report listing alphabetically all the words in the file and the number of occurrences of each.
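
The core of such a report can be sketched with the standard library. The example below assumes that a word is a run of ASCII letters, optionally containing apostrophes, and that case is ignored; both are assumptions, since the exercise does not define what counts as a word. The function names are hypothetical.

```python
import re
from collections import Counter


def word_report(text):
    """Return an alphabetical list of (word, count) pairs for text."""
    # Assumption: words are ASCII letters with optional inner apostrophes.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return sorted(Counter(words).items())


def format_report(pairs):
    """Turn (word, count) pairs into one report line per word."""
    return "\n".join("%-20s %d" % (word, count) for word, count in pairs)


print(format_report(word_report("The cat and the hat. The End!")))
```

For a real run the text would come from reading the input file, for instance with open(filename).read().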

This exercise is copied from openbookproject.net under the GNU Free Documentation License.

Name of scriptfile: gutenberg.py

9.3 Math Quiz

Description

The following program runs a math quiz consisting of 10 multiplication problems involving operands between 1 and 10:

from random import randint 

correct = 0

for i in range(10):
    n1 = randint(1, 10)
    n2 = randint(1, 10)
    prod = n1 * n2
    
    ans = input("What's %d times %d? " % (n1, n2))
    if ans == prod:
        print "That's right -- well done.\n"
        correct = correct + 1
    else:
        print "No, I'm afraid the answer is %d.\n" % prod

print "\nI asked you 10 questions.  You got %d of them right." % correct
print "Well done!"

Your mission will be to do the following:

1. Modify the program so that the user can choose how many questions they will be asked.
2. Add levels to the program:
   - Beginner - with operands between 1 and 10
   - Intermediate - with operands between 1 and 25
   - Advanced - with operands between 1 and 100
3. Modify the message at the end so that it says:
   - Well done!: if the user answered more than 2/3 of the questions correctly.
   - You need more practice: if they get between 1/3 and 2/3 of the questions correct.
   - Please ask your math teacher for help!: if they get less than 1/3 of the questions correct.
4. Allow the user to start another quiz without restarting the program.
5. Let the user choose the question type: addition, subtraction, multiplication, or mixed.
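
The level ranges and the thresholds for the final message can be kept in small helpers, separate from the question loop. The sketch below is one possible arrangement; the names LEVELS, make_operands and final_message are hypothetical. Integer arithmetic is used for the 1/3 and 2/3 comparisons to avoid floating-point rounding, and scores of exactly 1/3 or 2/3 are assumed to fall in the middle category.

```python
from random import randint

# Largest operand for each level, as listed in the task.
LEVELS = {"beginner": 10, "intermediate": 25, "advanced": 100}


def make_operands(level):
    """Return two random operands between 1 and the level's upper limit."""
    upper = LEVELS[level]
    return randint(1, upper), randint(1, upper)


def final_message(correct, total):
    """Choose the closing message from the fraction of correct answers."""
    if 3 * correct > 2 * total:   # more than 2/3 correct
        return "Well done!"
    if 3 * correct < total:       # less than 1/3 correct
        return "Please ask your math teacher for help!"
    return "You need more practice"
```

For example, final_message(7, 10) returns "Well done!", while final_message(6, 9) returns "You need more practice" because 6/9 is exactly 2/3.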

This exercise is copied from openbookproject.net under the GNU Free Documentation License.

Name of scriptfile: math_quiz.py

Week 10: Nov 4-8 (Python tasks)

10.1 A web calculator

The script simplecalc.py implements a very simple calculator with a GUI. Make a web version of this calculator. (The core of the script, to be reused in the web version, is found in the function named calc.)

Name of scriptfile: simplecalc.cgi

10.2 A Django app for a movie database

All necessary info to solve this exercise is found in part 1 and part 3 of the Django tutorial (https://www.djangoproject.com/).

Create a Django app for registering movies in a database. The minimum requirements to get this exercise approved are:

- A data model for storing the title and publication date of each movie.

- A view that lists all the movies in the database.

It is not necessary to build functionality for putting data into the database. To save time, it is possible to use "python manage.py shell" and skip part two of the Django tutorial.

10.3 Extend 10.2 with a search API

Build a RESTful web service API to search for movies in the database created in exercise 10.2, and a client with urllib2.

This will be introduced in the lecture on November 5. Examples may be found in the source code and slides from the lecture in 2011 (https://github.com/espenak/inf3331-djevelskap).
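
On the client side, the work splits into building the request URL and decoding the response. The sketch below covers only those two steps and makes several assumptions that are not given in the exercise: the query parameter is called title, the service answers with a JSON list, and each entry has the keys title and pub_date. The function names are hypothetical as well. The import fallback lets the same code run under Python 3 and under Python 2, where urllib2 is available.

```python
import json

try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2, used alongside urllib2


def build_search_url(base_url, title):
    """Append a URL-encoded query string (assumed parameter: title)."""
    return base_url + "?" + urlencode({"title": title})


def parse_movies(body):
    """Decode an assumed JSON list of {"title": ..., "pub_date": ...}."""
    return [(movie["title"], movie["pub_date"]) for movie in json.loads(body)]
```

Under Python 2 the response body would typically be fetched with urllib2.urlopen(url).read() and then passed to parse_movies.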

10.4 Corporate spammer (from exam 2010)

Write a Python script that reads contacts from a csv (comma separated values) file named “contacts.csv”, in the form:

Surname1, Given Name1, Company1, Email-address1,
Surname2, Given Name2, Company2, Email-address2,

For each of these contacts, the following personalized email should be written and sent:

Dear Given NameX SurnameX,
It is a pleasure for me to inform you that CompanyX, as a
valued customer of Foomatic Inc., has been given the great
and exclusive opportunity to buy our new software FooChart at
a reduced rate. Please see the enclosed leaflet, and do not
hesitate to contact me on Monday if you have any further
questions.
Sincerely, Corporate Weasel Foomatic Inc.

The subject of the email should be:
“Pre-sale on FooChart from Foomatic Inc.”

Here, Given NameX, SurnameX, and CompanyX should be replaced with the contact information read from “contacts.csv”. An image file named “foochart.jpg” should be attached to each email. An example contact file with fake emails is found here. Add your own contact info to this list to verify that the script works.
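
Reading the contacts and composing the message can be sketched with the standard csv and email modules. The example below stops short of sending anything; the function names, the dictionary keys and the sender argument are hypothetical, and the trailing comma on each contact line is assumed to produce an empty fifth field that can be ignored.

```python
import csv
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

SUBJECT = "Pre-sale on FooChart from Foomatic Inc."
BODY = (
    "Dear %(given)s %(surname)s,\n"
    "It is a pleasure for me to inform you that %(company)s, as a\n"
    "valued customer of Foomatic Inc., has been given the great\n"
    "and exclusive opportunity to buy our new software FooChart at\n"
    "a reduced rate. Please see the enclosed leaflet, and do not\n"
    "hesitate to contact me on Monday if you have any further\n"
    "questions.\n"
    "Sincerely, Corporate Weasel Foomatic Inc.\n"
)


def read_contacts(lines):
    """Yield one dict per contact line; lines is any iterable of strings."""
    for row in csv.reader(lines, skipinitialspace=True):
        fields = [field.strip() for field in row]
        if len(fields) >= 4 and fields[3]:
            yield {"surname": fields[0], "given": fields[1],
                   "company": fields[2], "email": fields[3]}


def build_message(contact, image_bytes, sender):
    """Compose the personalized email with the image attached."""
    message = MIMEMultipart()
    message["Subject"] = SUBJECT
    message["From"] = sender
    message["To"] = contact["email"]
    message.attach(MIMEText(BODY % contact))
    image = MIMEImage(image_bytes, _subtype="jpeg")
    image.add_header("Content-Disposition", "attachment",
                     filename="foochart.jpg")
    message.attach(image)
    return message
```

In a complete script, read_contacts would be given the open “contacts.csv” file, image_bytes would come from reading “foochart.jpg” in binary mode, and each message would be handed to smtplib for delivery.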

Published Sep. 11, 2013 14:37 - Last modified Nov. 4, 2013 15:32

