Lab #1 Photon counting with a PM tube - statistics of light

Two weeks - 8/29 & 9/5 (Labor Day is 9/4). Report is due on 9/12.
For show-and-tell on 9/5 you should have progressed to step 5.

Goals

What are the fundamental physical limitations on the detection of light? How do you specify how bright a source of light is? How precisely can that brightness be specified? What determines the precision? What are the statistical properties of photons?

Reading assignments:
- Unix tutorial
- IDL tutorial
- TeX tutorial
- Taylor Chapters 1 & 2

Summary

This lab comprises four topics:
- Collect data from the photometer experiment and make a plot of the counts per sample versus time.
- Plot histograms of sets of samples from the photometer.
- Compute the mean and standard deviation for your samples and investigate the variability of the count rate.
- Compare the observed histograms with the theoretical Poisson probability distribution function.

1) GETTING DATA FROM THE PHOTOMULTIPLIER TUBE (PMT)

Before you modify anything associated with the PMT make sure you have read and understood the following.

- Read the PMT web page http://ugastro.berkeley.edu/class/ay122/photon/photon.html
- The PMT is very sensitive---it detects individual photons. Do not expose the PMT to direct room light!
- Do not expose the PMT to light that drives the count rate above 1 MHz. To find the count rate, multiply the counts per sample by the sample rate, e.g., 1000 counts per sample at a sample rate of 1 kHz corresponds to 1,000,000 counts per second, or 1 MHz.

The first step is to acquire a time sequence of digitized data and save the data as a file. Log into one of the Sun workstations. Make a Unix subdirectory for your project and type the following at the Unix prompt:

    % echo counter nsamples=100 rate=1000 fname=c.100.1000.dat | sendphot

(Note: % indicates the Unix prompt.) This cryptic command takes 100 samples of data from the photomultiplier tube at a rate of 1000 Hz and puts the data in a file called c.100.1000.dat.
You should get a message indicating success. Since you asked for 100 samples, this file should contain 100 lines. In this example, how long is the counter active? What about nsamples=1000 rate=100?

A sample rate of 1000 Hz means that the computer accumulates counts from the PMT for 1/1000 of a second per sample. At this rate you should get a few counts per sample. If your counts per sample are significantly higher or lower than this, ask an instructor for help. Play around with the sample rate and convince yourself that the length of each sample is inversely proportional to the sample rate.

The maximum sample rate is 5000 Hz. Note that not all sample rates are permitted. If you ask for a sample rate that is not supported, the counter program will choose the closest one available.

2) READING DATA

The first challenge is to read in the data file. View your file using emacs and notice the format: it is a single column of numbers. There is an IDL program that reads a file into computer memory and assigns it to an IDL variable, in this example called "mydata":

    IDL> readcol, 'c.100.1000.dat', mydata

Notice that the file name must be enclosed in single quotes. In IDL, text enclosed in single quotes is called a string.

3) PLOTTING DATA

Once you have the data in an IDL variable you can plot it using the IDL command

    IDL> plot, mydata

To make a useful graph you need to plot your data as a function of time. Label the x-axis (time) and the y-axis (counts). Make sure that your labels include units. A plot title is also helpful and can be used to denote the sample rate. Invoke IDL help by typing a question mark at the IDL command line. Look up "PLOT" in the index and figure out how to add a title and axis labels to your plot. A useful on-line resource is http://idlastro.gsfc.nasa.gov/idl_html_help/home.html

4) STATISTICS---Making a Histogram

Calculate the mean and standard deviation of the count rate. The IDL function "TOTAL" will turn out to be very useful.
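
As a starting point, the mean and standard deviation can be built from TOTAL as sketched below. This is only one possible approach, not a required solution. It assumes the data file c.100.1000.dat from step 1 exists, and it assumes the sample standard deviation (N-1 in the denominator); the variable names n, xbar and s are illustrative.

```idl
; Read the single column of counts into the variable "mydata".
readcol, 'c.100.1000.dat', mydata

; Number of samples and the mean counts per sample.
n = n_elements(mydata)
xbar = total(mydata) / n

; Sample standard deviation (assumption: N-1 in the denominator).
s = sqrt(total((mydata - xbar)^2) / (n - 1))

print, 'mean = ', xbar, '   standard deviation = ', s
```

The built-in IDL functions MEAN and STDDEV can be used to cross-check the values computed by hand.
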
Bin the data and make a histogram of the counts. Binning means sorting the data into distinct categories and counting the number of occurrences in each category. Use my example program in the statistics hand-out to help you solve this programming problem. Does the histogram plot that you have made really reflect the data that you collected? Carefully compare the list of counts in the data file with the plot.

Once you can plot histograms with confidence, repeat the experiment, say, six times and compare the results. Does the histogram change? Do you always get the same mean count rate? Become adept at inspecting the histogram plot and guessing what the mean and standard deviation are. Calculate the mean and standard deviation of the six count rates you just measured.

If you want to automate the procedure of acquiring data, create a file containing the data-acquisition command above, and cut and paste the line as many times as needed, changing whatever parameters you need. When your file is ready for execution, save it and type at the Unix prompt

    % source myfile.script

Be sure to use a unique file name for each sequence of data.

Examining the data in IDL will challenge your ability to write "FOR" loops. A quick and sophisticated way to approach repetitive tasks involves writing the entire sequence of data acquisition in IDL using FOR loops. The simplest IDL FOR loop can be executed at the command line. Try this:

    IDL> for i=0,9 do print,i

If you are plotting multiple sequences of data, investigate what happens when you set the IDL plotting variable

    IDL> !p.multi = [0,2,3]

What does this do? To get back to normal plotting set

    IDL> !p.multi=0

Now repeat with a bigger sample of data, i.e., increase the number of samples by a factor of 10, but keep the sample rate the same as before.

    % echo counter nsamples=1000 rate=1000 fname=c.1000.1000.dat | sendphot

Again, take six sets of data and repeat the exercise of calculating the mean and standard deviation for each longer set of data.
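
The repeated calculation over six data sets is a natural use of a multi-line FOR loop. The sketch below is one way to organize it, under stated assumptions: the six files are assumed to be named c.1000.1000.1.dat through c.1000.1000.6.dat (hypothetical names, use whatever unique names you chose), and the procedure name sixsets is also illustrative. Multi-line loops need BEGIN and ENDFOR, so the code is meant to be saved in a file called sixsets.pro rather than typed at the prompt.

```idl
; Hypothetical file names: c.1000.1000.1.dat ... c.1000.1000.6.dat
pro sixsets
  nsets = 6
  xbar = fltarr(nsets)   ; mean counts per sample of each set
  s    = fltarr(nsets)   ; standard deviation of each set

  for i = 0, nsets-1 do begin
    ; Build the file name for set i+1 and read its single column.
    fname = 'c.1000.1000.' + strtrim(i+1, 2) + '.dat'
    readcol, fname, counts
    n = n_elements(counts)
    xbar[i] = total(counts) / n
    s[i] = sqrt(total((counts - xbar[i])^2) / (n - 1))
    print, fname, xbar[i], s[i]
  endfor

  ; Mean and standard deviation of the six means.
  mom = total(xbar) / nsets
  sdev = sqrt(total((xbar - mom)^2) / (nsets - 1))
  print, 'mean of means = ', mom, '   std. dev. of means = ', sdev
end
```

After saving the file, compile and run it with ".run sixsets" followed by "sixsets" at the IDL prompt.
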
What do you notice about the mean count rate and the standard deviation for these sequences? Calculate the mean and standard deviation of the six count rates you just measured. Why is the ensemble of measurements different when you take 100 samples and 1000 samples?

5) MEAN AND STANDARD DEVIATION

Let's try to get to the root of these variations. Show that there is a relation between the number of counts and the standard deviation. Take a sequence of data with increasingly long sample times (i.e., decreasing sample rates), e.g.,

    echo counter nsamples=100 rate=5000 fname=c.100.5000.dat | sendphot
    echo counter nsamples=100 rate=3333 fname=c.100.3333.dat | sendphot
    echo counter nsamples=100 rate=2500 fname=c.100.2500.dat | sendphot
    echo counter nsamples=100 rate=1666 fname=c.100.1666.dat | sendphot
    echo counter nsamples=100 rate=1250 fname=c.100.1250.dat | sendphot
    echo counter nsamples=100 rate=1111 fname=c.100.1111.dat | sendphot
    echo counter nsamples=100 rate=1000 fname=c.100.1000.dat | sendphot
    echo counter nsamples=100 rate=833 fname=c.100.833.dat | sendphot
    echo counter nsamples=100 rate=714 fname=c.100.714.dat | sendphot
    echo counter nsamples=100 rate=666 fname=c.100.666.dat | sendphot
    echo counter nsamples=100 rate=625 fname=c.100.625.dat | sendphot
    echo counter nsamples=100 rate=555 fname=c.100.555.dat | sendphot
    echo counter nsamples=100 rate=500 fname=c.100.500.dat | sendphot
    echo counter nsamples=100 rate=454 fname=c.100.454.dat | sendphot
    echo counter nsamples=100 rate=416 fname=c.100.416.dat | sendphot
    echo counter nsamples=100 rate=384 fname=c.100.384.dat | sendphot
    echo counter nsamples=100 rate=370 fname=c.100.370.dat | sendphot

Where do these odd sample rates come from? The counter in the PC has a clock rate of 10 kHz, and the sample rate must be this clock rate divided by an integer. So if you ask for a sample rate of 3000 Hz the counter will give you the nearest allowed value: 10000 Hz/3 = 3333 Hz.
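
The arithmetic behind the allowed rates can be checked with a few lines of IDL. This is a sketch under assumptions, not a description of the counter program itself: it assumes the allowed rates are exactly 10000 Hz/n for integer n >= 2, that "nearest" means smallest difference in Hz, and it only searches divisors up to 50 (an arbitrary cut-off). The variable names are illustrative.

```idl
; Assumption: allowed rates are 10 kHz / n for integer n >= 2.
clock = 10000.0                   ; counter clock rate in Hz
request = 3000.0                  ; requested sample rate in Hz
n = findgen(49) + 2               ; divisors 2, 3, ..., 50 (arbitrary upper limit)
allowed = clock / n               ; candidate sample rates in Hz

; MIN returns the smallest difference; k receives the index of the closest rate.
diff = min(abs(allowed - request), k)
print, 'divisor = ', n[k], '   sample rate (Hz) = ', allowed[k]
```

For a request of 3000 Hz this picks the divisor 3, i.e., about 3333 Hz, since 2500 Hz (divisor 4) is farther away. The rates in the file names above are the same quotients with the fractional part dropped.
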
The fastest that the counter can operate is 10000 Hz/2 = 5000 Hz.

Calculate the mean count and the standard deviation for each sequence. Suppose "xbar" and "s" are arrays holding the means and standard deviations. Use the command

    IDL> plot, xbar, s^2

to plot the variance (the standard deviation squared) against the mean. Now overplot a line representing y = x,

    IDL> oplot, xbar, xbar

i.e., plot the mean versus itself. What does this tell you about the relation between mean and variance for counting (Poisson) statistics?

6) POISSON DISTRIBUTION

Plot a histogram for one of your sequences with a small count rate, e.g., 2-4 counts per sample, and lots of samples, e.g., 1000. Calculate the mean count per sample and compare the histogram with the theoretical Poisson distribution for that mean,

    P(x,mu) = mu^x / ( factorial(x) * exp(mu) ).

Use IDL's OPLOT procedure to compare the observations and prediction. Think about that! How do you compare a histogram and a theoretical probability distribution? The Poisson distribution gives a probability. You have measured counts. Explain how to choose the correct scaling factor (or normalization) to compare the measured and theoretical distributions. Does the Poisson distribution provide a good description of the data?

Now arrange for the counts per sample to increase (be careful that the count rate does not exceed 1 MHz). Aim for several hundred counts per sample. Plot the histogram again. What has happened to the shape of the histogram? Calculate the mean and standard deviation and overplot the corresponding Gaussian probability distribution,

    P(x,mu,sigma) = exp(-0.5*((x-mu)/sigma)^2)/(sigma * sqrt(2 * !pi)).

Is a Gaussian curve a good approximation to the Poisson distribution?

7) STANDARD DEVIATION OF THE MEAN

The more events you count, the more accurately you can measure the number of counts per sample (i.e., the count rate). To illustrate the effect take *ten* sets of data with a given number of samples, say 16. Choose a fixed sample rate, say 1000 Hz.
For each of these ten sets calculate the mean. Due to statistical variations the ten means will be different, so also calculate the mean of the means (MOM) and the standard deviation of the means (SDOM). The SDOM is a measure of how precisely we know the average counts per sample.

How does the SDOM vary with the number of samples? Intuition suggests that if we have more samples in each of our ten measurements the SDOM will be smaller. To quantify this effect repeat with 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, etc. samples. Don't vary the sample rate. For each sample size consider the ten data sets and calculate the MOM and the SDOM. Plot the MOM and the SDOM as a function of the number of samples. Describe how the MOM and SDOM vary as the sample size increases. Based on your knowledge of Poisson statistics, predict the SDOM given the measured mean count per sample and the sample size. Use the IDL OPLOT procedure to compare your prediction with the data.

If I want to improve the accuracy of a measurement of the mean by a factor of two, by what factor do I need to increase the number of samples? How accurate is your best estimate of the count rate, i.e., how accurate is the MOM?

Is it possible to construct a light source for the photometer experiment that would not show variations in the count rate?

Write up your lab report as a LaTeX document describing each of the above exercises. Show your results by including IDL plots in your report. Your report is due on September 12 at 6PM. NO EXTENSIONS WILL BE GRANTED.

EQUIPMENT

PMT + dark box, counter, dim light source of controllable intensity. Access to IDL programming environment.