Posts Tagged ‘misc’

Still on Problems 1 and 2

May 19, 2009

Yes, I know that I’ve been saying 2 problems per week, and we started early last week on problems 1 and 2. However, I thought some extra time on the first two problems would give everybody some time to get going. So I set the “Current Problems” to be problems 1 and 2, due this coming Friday (May 22). I, at least, still have things to say about problem 2 :). There’s a little widget in the right-hand sidebar of the blog with the “Current Problems”. I’ll change them after this Friday.

Please leave comments below answering the following question: Do you want me to write a post each time I change the “Current Problems”, or are you happy to just move to the next problems after the “Current Problems” “due date” as passed?

If you’re already done with the “Current Problems” any given week, feel free to go on ahead working on whatever. But if you do so, please refrain from posting solutions. Let’s try to stay, online anyway, at least a little bit together. If you’re ahead, I’d say it’s a good time to spend reading some documentation or something, like the various tutorials or whatever other references you find. If you find any nice references, please share. This reading might show you other ways to approach a problem you’ve already done, which would be helpful to see.

In the same spirit, though, after we get going, if you have something new to say about a previously “Current Problem”, feel free to write a post.

If you seriously disagree with this (or any other) “policy”, let’s talk it out in the comments below.

Advertisements

Performance Testing

May 18, 2009

For no particularly good reason, I decided to clean up the script I used recently to get my performance testing data, and thought I’d share. The script is a bash script, to be run on linux machines. The way I’ve been setting things up (inspired by what Eric said he was doing), I make a directory (folder) for each problem number, and in that directory I have a bunch of .py files, one for each different version of the program. I call my scripts, e.g., prob1-v1.py, prob1-v2.py, and so on. Each of the scripts (that I want to test anyway) is set up to take command line arguments (see my other post for how to do that). The script below assumes that the argument that is changing comes first, and any other arguments are fixed, and come after that. Also, it only works for pure python scripts currently, no sage.

The output of my script is (when things work) a URL that uses the Google Charts API to draw a graph. Visiting that output URL using your favorite (open source) browser should turn up a sweet graph (hopefully). To include the graph in a post here, click the button next to ‘Upload/Insert’ above the editor box (when you mouseover the button, it says ‘Add an Image’), then click ‘From URL’ in the top of the “popup” and go from there.

There’s as much flexibility built in to this script as I could handle this evening. You’ll likely have to tinker with some of the lines to get them working with whatever scheme you’ve got going at home. You’ll also have to tinker with them for different problems, because (at least) the command line arguments will change. Sections you are most likely to have to change have been noted. As things are currently shown below, I am comparing scripts prob1-v3.py and prob1-v6.py (designed to solve problem 1) with inputs ranging from 1000 to 10000 in multiples of 1000. Both of these scripts also take the list of numbers you want to sum multiples of, in the base case of the problem this is “3 5”, which is used below (variable E).

#/bin/bash

###################################
### initial setup block. mess with these values
###################################

# assume scripts have names "probP-vN.py". e.g., "prob1-v3.py"
P=1 # the problem number
V="3 6" # the versions to compare

T=50 # number of times to run each program with each input for averaging

# I should be the string of values to use as inputs
# this will be the changing argument that performance is tested against,
# which scripts should be expecting as the first argument
I="$(seq 1000 1000 10000)"

# the extra arguments that should get passed each time, not varying
E="3 5"

##################################
### url output block. may want to mess with it
##################################
# documentation available at:
# http://code.google.com/apis/chart/
##################################
# set things up here, or, after completion,
# mess with just these parts of the output
##################################
echo -n "http://chart.apis.google.com/chart?"
echo -n "cht=lxy&" # type
echo -n "chs=300x200&" # size
echo -n "chxt=x,y&" # place tickmarks on x and y axis
echo -n "chxr=0,1,10,1|1,0,5,1&" # the tickmark numbers on the axes
echo -n "chco=FF0000,0000FF&" # color for each graph
echo -n "chdl=Sets|Sums&" # legend
echo -n "chds=1000,10000,0,5,1000,10000,0,5&" # pairs of min/max for each data set
echo -n "chd=t:" # data, to be output below

##################################
### generating data block. probably mostly safe to not mess with
##################################

# generate the string of x values, from I (just put , between values)
X="$(for J in ${I}; do echo -n "${J},"; done; echo -n "|")"

( # wrap up all the plot data output, for some post-processing

# start testing
for W in ${V}; do

    # print the string of "x" values for the Google Chart data chd=t:
    echo -n ${X}

    # loop through each input to be tested
    for J in ${I}; do
	C="scale=10;(" # string we'll pass to bc to calculate average

	# run the program ${T} times, getting runtime each time
	for A in $(seq 1 ${T}); do
	    O="$(python prob${P}-v${W}.py ${J} ${E})" # program's output

	    # pull out just the runtime of the output
	    R="$(echo ${O} | awk '{print $3}')"
	    # bc doesn't like scientific notation
	    R="$(echo ${R} | sed -s 's/e/*10^/')"
	    # scale by 1000 for no particularly good reason
	    # except if runtimes are along the lines of milliseconds...
	    R="$(echo "scale=10;1000*${R}" | bc)"

	    # concatenate R to C, and a plus sign
	    C="${C}${R}+"
	done

	# cap off C, compute the average, truncate to 4 characters
	echo -n "$(echo "${C}0)/${T}" | bc | head -c 4),"
    done

    echo -n "|"
done

# all done, pull out extraneous symbols
) | sed -s 's/,|/|/g' | sed -s 's/|$//'

echo

Anyway, if you feel like using it, go right ahead. If not, no worries. It remains to be seen how much I’ll use it myself. Please share success/failure stories. If you’d like to rearrange the script to accomodate your home setup, I’m happy to try to help. I’m sure there’s all sorts of room for improvement, but I hope it’s good enough for now.

Now, what was I supposed to be doing today?

Update: Just after posting, I realized that I failed to mention that this script makes an assumption about the output of the python scripts it is running. It expects the output to look like “1234 in 0.00456587 sec”, where 1234 is the answer calculated, and 0.00456587 is the time as calculated in, e.g., the scripts of my original comparison of solutions for problem 1.

Assignments

May 14, 2009

So I noticed that two solutions used code like this to generate the Fibbonaci numbers:


a,b=0,1

...

a, b = b, a+b

Being used to older languages like C, these multiple assignments would not have been allowed.  But beyond that, it raises the question of the order in which the assignments are made.  For example, I would have thought they would be processed just as they are written, in which case the second assignment would be adding “b” to itself instead of the old “a” (which is why I used an extra temporary variable in my solution.) Since this code gets the right answer, this is apparently not the case.

All Python data types are objects, so what if the multiple assignment had more complicated side effects?  Do all the objects involved get copied in memory as an assignment like this takes place to save their state? This is probably all wrapped up in the reference system.  Maybe someone could clarify the assignment process for me.  (And save my lazy ass the trouble of looking it up.)

A Word of Warning

May 13, 2009

Well, I learned one thing from my first post: Python is really strict about indentation, since that’s how it determines loops and conditionals and such (no braces!).  I think this is a great thing, and most editors which are made for programming will help you out with this.  But I did notice that copying and pasting my code lost a couple of the tabs (though it should be fixed now), and this might make the interpreter give errors to anyone who tried it.  So I guess we gotta be a bit careful about the formatting as we post these things . . .

Hello, World

May 13, 2009

This blog is intended to contain the work of a collection of math graduate students at the University of Virginia as we learn Python (and/or Sage) by using projecteuler.net as a problem library. For those of us that need tutorials, the Python one is here, and Sage here. Surely lots of other documentation will be found and used (and linked to) in this process.

We will (at least, at the start) set two problems each week as our goal, the link in the sidebar giving which problems, and the “due date.” All of us are encouraged to post whatever we come up with as far as solutions, even if it is similar to an already posted solution. We should also feel free to post programming questions here, and any math questions or facts that we encounter while working on these problems. We should link to any pages we find helpful (programming resources, or math resources), and leave comments on each others posts whenever we are inspired to do so.

I encourage each author to tag each of their posts with their initials, so we can track authors. Presumably wordpress will also handle that for us, so we might drop this habit when we see how it all works. Also tag posts with the problem number, for example: use the tag “prob1” and “prob2”. Then all of the content for each problem can be quickly accessed. You might also consider giving your posts titles that indicate the author and the problem. Maybe also use the tag “math” if you talk about the math of a solution, “question” for questions you post, “misc” for other things? These are just ideas – do whatever makes sense to you at the time, and we’ll work it out.

If you are beaten to a solution, you may try gathering up some of the other contributor’s solutions to compare them, e.g. based on speed or generality or… It will also be fun to see what happens when different solutions claim different answers. If you modify a solution that you have already posted, e.g. to fix an error or something, start a new post (with links back), or add to your old post – do not remove content from your old posts. It’s instructive to see errors.

Let’s put all of our code in

# Preformatted Text
print "Hello, World!"

to distinguish it from other text. For authors new to wordpress, the button at the far right, at the top of the editor when you make a new post, gives more formatting options. This is where you will be able to set code as preformatted (and indented, if you chose to use it).

If any of the authors would like to mess around with more settings for this blog, they are welcome to email me, and I will change their status to administrator. Additionally, if authors have ideas about how to change how things are set up or run, they are welcome to just make a post so that everybody can see it (tag it “misc” or “admin” perhaps?). While I may have been the one to start this thing, I really want it to be a group project.

Well, that’s all I can think of for now. Happy hacking!