Kurt Schwehr Planet Search 1991 This was accepted by NASA Ames as a Contractor Report. Who knows what contractor number it got assigned, but it was accepted. Title: Planet Search 1991, Using a Robotic Photometric Telescope to Search for Extrasolar Planets TABLE OF CONTENTS 1. INTRODUCTION 2. FIT 2.1 Introduction 2.2 Compiling and linking 2.3 Input files 2.4 Program FIT 2.5 FIT - More Notes 2.6 References 3. TAB 3.1 Generating Data Tables with TAB 3.2 TAB Instruction List 4. PLOT-NIT 4.1 Looking at Night to Night Tables (- Night Table -) 5. PLOT-CLR 5.1 Looking at Within Night Tables (-Color Table-) 6. PLOTTING 6.1 Batch Printing of Graphs 6.2 CLR File 6.3 NIT File 7. PERIOD 7.1 Program 7.2 Synopsis 7.3 Purpose 7.4 Description 7.5 Computer Requirements 7.6.0 Instructions/User Manual 7.6.1 Preparing a data file 7.6.2 Files required to run 7.6.3 Description of a Sample Run 7.6.4 The report 7.6.5.0 Important Definitions 7.6.5.1 Minimum Frequency 7.6.5.2 Nyquist Frequency 7.7 Compiling 7.8 Algorithm 7.9 Program Flow 7.10 Bugs 7.11 Suggestion for future work 7.12 References 1.0 Introduction These programs were written to analyze robotic telescope data from the Fairborn Observatory in Arizona. The programs were developed for William J. Borucki at NASA Ames Research Center by Bob Hogan and Kurt D. Schwehr. Bod Hogan wrote FIT and TAB. Kurt D. Schwehr wrote PLOT-NIT, PLOT-CLR, and PERIOD. This manual was written by Kurt D. Schwehr, except for section 2.0 FIT. Direct all questions and comments to: William Borucki MS 245-3 NASA Ames Research Moffett Field, CA. (415)604-6492 The process of data analysis is taken car of by five programs. The data is transferred from the observatory's computer every couple of days. Each ATIS file contains data for one night and is compressed into an archive (ARC). Once the files are on the local computer, the program FIT does some initial data reduction and creates a very large binary data file. The program TAB is used to extract data for each star. The data is stored in ASCII tables that can be inspected, graphed and analyzed. PLOT-NIT and PLOT-CLR will plot up the data for viewing and printing. Finally, PERIOD is used to determine if any of the stars observed are likely to be variable. The next sections describe each of the five programs. 2.0 FIT 2.1 INTRODUCTION This software package is designed to interpret and analyze photometric data originating from an ATIS controlled facility. The package consists of two FORTRAN programs that extract and process photometric counts of selected stars and areas of the sky. The ATIS user must organize photometric measurements into groups. Each group defines a set of star and sky observations. The processing programs use only the relevant information within groups to correct and reduce star data. The programs are FIT, which does the initial data reduction, and TAB which generates ASCII text data tables. 2.2 COMPILING AND LINKING This software should work on all PC, XT, and compatible hardware. The programs were compiled with MircoSoft (MS) FORTRAN 4.0. The modular structure of the codes require linking to an ATIS library in addition to the standard FORTRAN modules. 2.3 INPUT FILES This table indicates all files used and generated by the ATIS software. Identifiers with the dots refer to the filename extensions. The nuemonic identifiers refer to specific files. Input: Catalog.tbl ! Star catalog Refstar.tbl ! List of reference stars for each group A0*.arc ! The ATIS data files compress in an archive Output: Sum.dat ! Summery of the data, stars, and coeff derived. Dat.dat ! Output file required by TAB to calculate the extinction- ! corrected brightness. Generally, all programs need to know where to find their input files, what nights to process, color filter number and what files to create. 2.4 PROGRAM FIT This program is the workhorse of the package. Its function is to compute nightly extinction coefficients using corrected photometric data of selected stars. In addition, all counts are corrected for dead time and the sky component is subtracted from all star counts. The program input requirements are, The data file directory The range of nights to process (JD) Color filter number Accepted upper limit on first order extinctions Accepted upper limit on first order extinction errors Accepted lower limit on airmass differences The output file name The algorithm has been designed for robustness and efficiency. Data is grouped in chronologically ordered time zones. The data collected in each time zone is reduced as follows, 1) Airmasses are calculated at all times and counts are corrected for dead time. 2) Sky counts are subtracted from Star counts. Sky counts at times between sky events are found by linear interpolation. 3) Corrected star counts are scanned for outliers by computing a mean count and sigma and searching for counts greater than two sigma from the mean. 4) A two parameter linear extinction model is used to fit data from each extinction star and estimate parameter values and their errors. Outliers are not used. Maximum and minimum airmasses are found. The model, MO = MT + E0*X , where MO is the measured extinction star magnitudes converted from counts as MO = log10(-2.5*counts) MT and E0 are the two parameters estimated E0error is the estimated relative error of E0 X is the airmass DX is the airmass difference (maximum - minimum) 5) A weighted average of the star E0 values is determined. The weight function is the airmass difference. EM = sum ( E0*DX ) / sum ( DX ) sum is over all stars whose E0, E0error, DX values fall inside their user defined bounds. After steps 1) to 5) are performed for each time zone , the set of EM values are used to model the temporal behavior of the extinction. 6) These EM data are used to fit a three parameter model, E0 + E1 * time + E2 * time**2 time is the relative JD time measured from the JD time associated with the first star event of the night. The parameters E0, E1, and E2 are called the nightly extinction coefficients. Program FIT saves the reduced nightly star and sky data and extinction coefficients as fixed length records in a direct access binary file. An ASCII summary file is also generated. 2.5 FIT - More Notes I. Purpose: This program calculates the extinction coeff vs time for each day using the data available from all the stars observed during that night. The only data that are used are those for which the extinction coeff meets the following criteria: 1) 0 .LE. ext. coeff. .LE. MAX (input value) 2) delta air mass change .GE. 0.05 3) relative error .LE. 0.20 4) points in a _____ interval that are ___ are removed. (<- unreadable) 5) if sky .GE. star; toss out the point 6) will not use any star designated as "variable" in the calculation of the extinction coeff. 7) airmass .LE. 2.0 8) 0 .LE. extinction coeff .LE. 1.0 2.6 REFERENCES This material was taken from a readme file from a FIT program disk and notes made by William J. Borucki. It is not necessarily up to date with the latest version of TAB. 3.0 TAB In order to analyze the data retrieved from the Fairborn Observatory's Robotic telescope, I wrote several programs to view the data. These programs can be found on the accompanying disk under the directory \EXE. Also an the disk in the directory \TAB, is Bob Hogan's program TAB2 which generates data tables from the binary file DAT.DAT. 3.1 Generating Data Tables with TAB The program TAB2 is designed to use the data in the binary file DAT.DAT to produce tables of the data. The program requires that two data files are in the current directory: REFSTAR.TBL and CATALOG.TBL. If they are not there, TAB2 will crash. REFSTAR contains the list of the reference star for each group. CATALOG contains the current RA and DEC, magnitude, color index, and E (which is something to do with being a variable star?). Also, TAB2 requires that DAT.DAT be in the current directory. The best way to explain tab is show two examples of generating tables: EXAMPLE 1: Generate a Color Table and a Night Table for HD 114762 DRIVE>TAB2 file days stars extinction coefficients * outliers color night show save print quit >days 1 500 ! Set the date range for finding data >color 114862 ! make a color table for the star ! ** Working ** for a long time >show ! view the table created ** lots of data ** >save 114762.clr ! save in the file "114762.CLR" >night 114762 ! generate a night table ! Wait awhile -- the bigger the DAT.DAT the longer >show ! Display table ! the night command saves the data for you. ** lots of data ** >quit ! Exit to DOS DRIVE>DIR 114762.* ! See what you have created. 114762 CLR 162606 09-13-91 10:58p ! File size may be different 114762 NIT 8938 09-13-91 11:18p 114762 PLT 8282 09-13-91 11:18p ! This is a NIT file without a ! header 3 file(s) 179826 bytes DRIVE> ! Your done EXAMPLE 2: Generate night tables for all stars. DRIVE>TAB2 ! Run tab2 >days 1 500 ! Select JD range >night * ! Create night tables for all stars ! Let run overnight >quit ! Back to DOS This version of TAB does not allow you do what is in example 2 with Color Tables so you will want to use I/O redirection to run a large batch of color tables over night like this: Copy con:redirect.dat days 1 500 color 1642 save 001642.clr ! always make sure the number here is 6 characters ... ... ! repeat as many times as needed color 300000 save 300000.clr quit ^Z TAB2 copy con:save.lst 114762.nit ^Z ! Hold down and press Z DRIVE> If you want to view several files at a time, just add each one to a line in this file. You can also use any text editor to make this file. Once you have done this you are ready to view the night table. Start the program: DRIVE>plot-nit The current ranging parameters are: ! controls the range of plot. Minimum: 0.10 Mag ! default settings Maximum: Range of points. Do you like these (y/n): These are the range of the magnitudes displayed. The current settings force the program to have the Y-axis scale show a change of 0.10 magnitude. If you wish to have all your graphs have the same scale, type "n" and . Then put in the same value for both the maximum and minimum values. If you like the values enter "y" or just . The current filter to plot is: filter 3 - VISUAL Do you like this(y/n): Type "y" or just if you want this filter to be graphed. Otherwise, to change filters, type "n" . You will then be asked the number of the filter you want. Next you will be asked if you want error bars. The error bars are taken from the % error list in the NIT table for the filter. Just pressing is the same as "y" . Finally, you are asked if you want outliers marked. This function does two things: it shows where the mean value is on the right hand side and it marks points that are outside of one sigma. The outliers are marked with a yellow circle. It was put in for the purpose of drawing attention to points near the top and bottom of the graph, as they can often go unnoticed. Pressing is the same as "y" . Here is a sample of what you will see: The Graph Explained The top line of the graph's header contains the information on what the data is being viewed. The second line says the filter plotted, the date the table was made and the Julian Dates (-244800) that this data covers. On the bottom, after the X-Axis label, there are two numbers labeled "Mean" and "NStdDev". The "mean" number is the mean value of the points plotted. The "NStdDev" is the normalized standard deviation of the points. The bottom line of text on the screen is the command menu. Use the left and right arrow keys to move across the menu and to select an option. The options are: "QUIT" or , exit to DOS; "NEXT", go to the next table listed in SAVE.LST; "PREV", go to the previous table in SAVE.LST; "PRNT QUE", saves the table name (see BATCH PRINTING); "PRINT", print the graph on the printer(must be an HP LaserPrinter or DeskJet), "ViewData", displays the table used to make the current graph; and "Empty", does nothing. The viewdata command requires a program called BROWSE.COM. To quit viewing a table, press . If you do not have it, a substitute can be made: DRIVE>copy con:browse.bat @echo off type %1 | more pause ^Z ! This is hold down and press Z DRIVE> 5.0 PLOT-CLR 5.1 Looking at Within Night Tables - Color Table - The program used to view within night data is "PLOT-CLR." Before you read this section, please read "Looking at Night to Night Tables." The two programs are very similar and use much of the same source code. Only the things that are very different are explained. First, the required files: TMSRB.FON, graphics fonts; DDIR.COM, a directory displayed; BROWSE.COM, the data viewer; *.CLR, the within night table; *.BIN, the binary version of *.CLR; *.PTS, and the number of data points in *.BIN. The *.BIN and *.PTS files are created with the program "PACKDATA." This was done to allow moving around from night to night at a reasonable speed. An example of how to run the program for the file 114762.CLR: DRIVE>packdata 114762 DRIVE> Then run the graphing program. DRIVE>plot-clr When the program begins, the major change is that you will see: Press Enter if you are using a new file: (Otherwise press anything else.) If you press enter, the program will give you a list of color tables that you may view. It will then copy the file to a binary format in a temporary file. If you pressed any other key, the program will display a list of binary color tables to view. The program will then ask you to enter the HD number of the file you wish to view. It is recommended that you do not use the temporary binary method, because you must regenerate the binary every time you wish to view a color table. This can take a long time for large color tables. The only change when the graph is displayed is that there is a new menu item: "GOTO NIGHT." This option allows you to go directly any JD you wish. If it does no find it or if it has only one point, it will simply put you back at the first night. BUGS: the previous command does not work from the first night. To use it, goto the NEXT night then goto the PREV night and you will be put at the last night in the data set. 6.0 PLOTTING 6.1 Batch Printing of Graphs Batch printing of graphs can be accomplished with the program "IIP-N." Simply put at list of files to print in the file "SAVE.LST." It will take these and plot them. It handles NIT files and CLR files that contain only one day. The program PLOT2 can be used to generate the file. The menu command "PRNT QUE" puts the view file's name into a file name "PLOT.LST." T batch print the selected graphs, rename the PLOT.LST to SAVE.LST and run IIP-N as follows: DRIVER>ren plot.lst save.lst DRIVER>iip-n 6.2 CLR File This file contains a text table of within night data for a particular star. It may hold anywhere from one to hundreds of nights of data each with a number of samples. The first line of the file will state: " - Color Table - ". The third line says the star's HD number, the date the table was created, the group the star belongs to, and the star to which this star is referenced. The fifth line explains each of the columns of data. Julian Date is the actual date minus 2448000. 6.3 NIT File A file containing a text table of the night to night data for a particular star. The table contains one point for each night in each filter. Each line is that mean value for that particular night. The table is titled " - Night Table - ". The third line is the same as in the CLR file. 7.0 PERIOD CREATION : Kurt D. Schwehr - August, 1991 LAST CHANGED: Kurt D. Schwehr - December 15, 1991 7.1 PROGRAM PERIOD.EXE 7.2 SYNOPSIS Generate a Periodogram/Power Spectrum, find the peaks, and generate a graph and a report. 7.3 PURPOSE To analyze robotic telescope data for periodic events (or any time series data). 7.4 DESCRIPTION PERIOD uses an algorithm developed by Scargle which allows the signal power to computed at any frequency. From this power it is possible to determine what the probability that the power represents a real frequency in the data. 7.5 COMPUTER REQUIREMENTS - IBM PC/compatible or greater - 256 Kb or more - VGA graphics adapter - HP DeskJet or LaserJet printer, any model (for hardcopies) 7.6.0 INSTRUCTIONS/USER MANUAL 7.6.1 Preparing a data file and the data. The data file is a file containing whatever data you wish to analyze. It can contain up to 5000 data points and must be in ASCII text format. You may have columns containing other data for each point. Now write down the column number for the time index and magnitude of the data. Write down the number of lines that precede your data. They are usually notes and column titles. Now check that your data columns are separated only Spaces, Commas or TABS. If there are other non numeric characters in your data, the program will only analyze a small portion of your data. It is important that as much data be present as possible. The more periods of the cycle that you are looking for that exist in your data the more likely the program will find it. Unfortunately, as the number of samples increases, the time increases. Large data sets will take a long time to run. 7.6.2 Files required to run. - PERIOD.EXE -> Must be in the current directory or path - TMSRB.FON -> Must be in the current directory - your data file -> Must be in the current directory 7.6.3 Description of a Sample Run Here is the data file, "131156.NIT" used for the sample: - Night Table - Star : 131156 Date : 1991-11-06 Group : 148 Ref Star : 131265 Day # Mag 2 % Err # Mag 3 % Err # Clr 2-3 % Err 346.680 10 5.1374 .44 11 4.6359 .31 11 .7004 .35 347.633 3 5.1284 .11 3 4.6269 .28 3 .7015 .38 348.696 11 5.1162 .22 12 4.6222 .56 11 .6937 .35 350.682 3 5.1135 .11 3 4.6226 .22 3 .6909 .27 351.698 11 5.1302 .16 11 4.6309 .37 11 .6988 .37 353.699 9 5.1360 .31 9 4.6313 .43 8 .7060 .32 354.653 3 5.1322 .86 3 4.6371 .68 3 .6951 .18 355.762 15 5.1188 .57 14 4.6231 .35 14 .6959 .33 356.653 3 5.1194 .18 3 4.6234 .01 3 .6960 .18 357.654 6 5.1302 .22 6 4.6318 .22 6 .6984 .20 359.655 6 5.1390 .23 6 4.6375 .18 6 .7014 .31 360.702 9 5.1306 .35 8 4.6267 .49 9 .7016 .65 361.706 10 5.1216 .35 9 4.6234 .25 10 .6969 .45 364.658 3 5.1416 .57 3 4.6484 1.19 3 .6932 .67 367.745 11 5.1249 .44 11 4.6280 .39 12 .6958 .70 371.662 3 5.1398 .15 3 4.6379 .13 3 .7019 .15 372.665 3 5.1339 .24 3 4.6345 .11 3 .6993 .31 373.663 3 5.1263 .06 3 4.6230 .17 3 .7033 .22 374.677 6 5.1174 .43 6 4.6184 .80 6 .6989 .89 377.695 9 5.1372 .16 9 4.6355 .27 10 .7020 .46 378.737 8 5.1394 .66 8 4.6389 .92 8 .7004 .75 379.667 8 5.1267 .35 7 4.6246 .18 7 .7025 .20 380.714 15 5.1222 .29 15 4.6242 .40 15 .6973 .31 381.728 11 5.1137 .19 11 4.6187 .31 11 .6959 .19 382.663 7 5.1163 .14 7 4.6166 .36 7 .6987 .29 383.681 8 5.1288 .46 7 4.6280 .26 7 .6974 .40 386.666 8 5.1269 .39 7 4.6256 .23 8 .7034 .62 387.710 5 5.1171 .24 5 4.6181 .24 5 .6990 .23 388.684 7 5.1127 .26 8 4.6175 .24 7 .6948 .21 390.689 12 5.1272 .55 11 4.6303 .82 12 .6990 .81 391.737 9 5.1265 .32 10 4.6285 .62 10 .6993 .67 392.707 1 5.1115 .00 1 4.6232 .00 1 .6882 .00 393.753 12 5.1202 .23 12 4.6226 .21 13 .6988 .38 394.696 5 5.1162 .81 5 4.6209 .49 5 .6953 .47 In the sample run: the "!" is followed by comments, computer output is in a different font, and user input is underlined. D:\DATA>PERIOD ! Begin the program Enter filename: 131156.NIT ! Enter your data file Enter Unit Time (EX: Sec,Min,Hr,Day) up to 3 characters: Day Number of text lines to skip in the beginning of your data file: 6 Enter the two column numbers to be read: 1,6 Reading Data . . . Number of Data Points = 34 Maximum Frequency(Nyquist Freq.) = 7.011875E-01 (Cycles/Day) Minimum Frequency = 1.041319E-02 (Cycles/Day) Change Minimum(y/N)? N Change Maximum(y/N) ? N Mean for 2nd col # of data = 4.628 Variation = .000 Enter the subject (up to 60 characters): Star: HD 131156 Filter: Mag3-VIS GRP:148 File:131156.NIT ! Enter whatever title you like Do you want to send the results to the printer(y/N)? N ! Enter 'y' if you want a hardcopy Computing Initial Survey . . . xx% Done ! This is how far through calculation it is Analyzing 3 peaks . . . ! How many peaks it is looking for Here is the graph of the periodogram/power spectrum: Here is the report: Frequency range: .0000 TO .7012 (Cycles/Day) Nyquist Freq.: .7012 Frequency Omega Period Power Probability --------- ----- ------ ----- ----------- .15636 .98 6.40 10.4 .999 .01823 .11 54.85 3.3 .251 .61775 3.88 1.62 2.6 .077 D:\DATA> ! Back to DOS 7.6.4 The report Omega(w) is defined in f(t) = sin w(t) + x. Period is the length of time to complete one cycle. Power is the relative power contained in the signal. Probability is the chance that a signal is real. 7.6.5.0 Important Definitions 7.6.5.1 Minimum Frequency This is the lowest frequency that the power spectrum can distinguish in the data. In order for the equation to detect the frequency it must see at least one full cycle or period of the wave. The more periods in the data, the more likely that the power spectrum will show the wave energy. 7.6.5.2 Nyquist Frequency The Nyquist frequency is the point a which it is no longer possible to detect frequencies above. It is given by: fn = 1 / (Average Time Step). If you need more information on this please refer to Presses' Numerical Recipes or another text on power spectrum analysis. 7.7 COMPILING This program was compiled with MS FORTRAN 5.0 under the large memory model. The command line for compiling was: "FL /FPi PERIOD.FOR POWER.FOR GRAPHICS.LIB" 7.8 ALGORITHM The algorithm used was taken from Horne(1986). Please refer to the paper for more information as this section is completely explained in it. Horne, page 757: For a time series X(tj), where i = 1,2,...,No Px(w) = (1/2) [(Sum12 / Sum2) + (Sum32 / Sum4)] where: tan (2wt) = (sum 2wtj)/(sum cos 2wtj) => sum is from j=1 to j=No sum1 = X(tj) cos w(tj - t) => " sum2 = cos2 w(tj - t) => " sum3 = X(tj) sin w(tj - t) => " sum4 = sin2 w(tj - t) => " For the following equations to produce the correct probability, they must be properly normalized: 1) The data must have the mean value subtracted from is. 2) PN(w) = PX(w)/s2 Ni = Number of Selected Frequencies Ni = -6.362 + 1.193*NumPoints + 0.00098 * NumPoints2 F = Probability that frequency is not real F = 1 - (1 - e (-PN) )Ni Probability = 1 - F 7.9 PROGRAM FLOW This is a description of the program. FILE: PERIOD.FOR - Initialize: Question User Read Data File Calculate constants Prepare Data - Calculate Survey: Get an idea what it looks like - Find Peaks: Survey each peak Find the maximum value for peak - Start Graphics: Find the range and domain Set graphics mode Draw graph and ticks Put on the title Label the axises - Plot Survey: Draw the points of the survey - Plot Peak Surveys: Plot all the points Draw a line at each peak - Print: Send to printer if selected Reset screen to text mode - Report: Show each peak and it's probability Send to printer if selected - Exit: Return to DOS 7.10 BUGS The one serious bug is in reading data files. If the program hits a character between numbers that is not a separator (CR, LF, Space, or TAB) then it thinks that it has hit the end of the data. This is a problem to our work, as we mark certain data points with an asterisk (*). Therefor we must strip them from the data file before running it through PERIOD. The way to see if this is happening to a particular data set is to check the answer given after "Number of data points:." If this value is far lower than you expect, this is probably what is happening. 7.11 SUGGESTIONS FOR FUTURE WORK - Incorporate this program to work automatically in other programs - Set it up for other video displays - I attempted to set up the program to use the LFIT routine from Numerical Recipes. This would find a quadratic approximation (c + cX + cX^2) to a number of samples of the peak. It would then use the derivative of this to find a better approximation of the peak. I was unable to get it working. 7.12 REFERENCES HORNE, J.H. and BALIUNAS S., "A Prescription for Period Analysis of Unevenly Sampled Time Series," The Astrophysical Journal, 302:757-763, 1986 March 15. PRESS, H.P. et. al.,Numerical Recipes, The Art of Scientific Computing (FORTRAN Version), Cambridge Press, 1989. SCARGLE, J. D., "Studies In Astronomical Time Series Analysis. II. Statistical Aspects of Spectral Analysis of Unevenly Space Data," The Astrophysical Journal, 263:835-853, 1982 December 15.