AstronomyOnline.org
How the SETI@Home Project Works - by Ricky Leon Murphy:

Introduction
What is SETI@Home and Why Use It?
SETI@Home - The large supercomputer
Who is involved
How is the search performed?
How does the program work?
Data Collection
Finding Candidates
Testing Data Integrity
Removing Radio Interference
Identify Final Candidates
Verification - What is Next?
Summary
References

Back to Top | Back to Astrobiology
 

Introduction

On a clear night, you can see literally hundreds of stars. That number increases greatly when looking through a telescope. I recall a statement made by the late Dr. Carl Sagan: the number of stars in the known Universe outnumbers all of the grains of sand on every beach on Earth. That is an enormous number of stars! If a small fraction of those stars are capable of supporting a system of planets, and if a fraction of those planets are capable of supporting life, and if a fraction of the life-bearing planets are capable of supporting intelligent life, there will still be an enormous number of civilizations within the Universe (if that phrase sounds somewhat familiar, a variation of this statement is from the movie Contact). The problem is: will any of those civilizations attempt to send out a signal to alert other civilizations of its existence? This is the foundation of the Search for Extra-Terrestrial Intelligence, or SETI. The founding father of SETI is Frank Drake. As a newly graduated student of Astronomy, Drake worked at the National Radio Astronomy Observatory in Green Bank, West Virginia. During telescope quiet time, he was allowed to use the telescope to search for a signal of extra-terrestrial origin (Shostak, page 153-154). Using home-made equipment, Drake scanned the frequencies above and below the 1420 MHz radiation emitted by the Hydrogen atom (Shostak, page 151, 154). This first SETI search was named Project Ozma, and while no signal was detected, it demonstrated that such a search is feasible. Since Project Ozma, there have been several SETI efforts by various organizations and universities. While most of these searches are performed by professional astronomers and those with the means to search on their own, there is one in which anyone with a computer can participate: SETI@Home.

So what exactly is SETI@Home and why should I use it?

SETI@Home is software designed to run as a screen saver on a personal computer. It processes work units issued by the University of California at Berkeley, which makes the computer one participant in a very large network that behaves like a supercomputer. However, before explaining what SETI@Home is, the first important question is why the program should be used in the first place. Without knowing why, the rest of the questions are meaningless. If you are considering using SETI@Home, you have already answered the question! If you are like me, you want to know what and who is out there. The likelihood of life elsewhere was first framed by the famous Drake Equation. Authored by Frank Drake, the Drake Equation is not a math problem with a single answer; it is an illustration of the probability that life can exist elsewhere in our Universe. This is the equation:

$$N = R_* \, f_p \, n_e \, f_l \, f_i \, f_c \, L$$

$N$ = the number of intelligent civilizations
$R_*$ = the birthrate of suitable, long-lived stars in our galaxy (between 1 and 10)
$f_p$ = fraction of stars that have planets – about 50%
$n_e$ = number of planets per planetary system where life can be sustained – at least 1
$f_l$ = fraction of those planets (from $n_e$) where life actually arises – taken as 1
$f_i$ = fraction of $f_l$ where intelligent life evolves – taken as 1
$f_c$ = fraction of $f_i$ that communicates (or is willing to communicate) – taken as 1
$L$ = length of time over which the civilization can communicate – can be any number

(Equation and parameters borrowed from Shostak, page 180 to 181)

The above values can be any number, and the results can certainly be argued; there is a host of variables to consider – such as the presence of water, what type of star a planet orbits, and what is truly required for life. One thing to remember is Earth counts as part of the equation.
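
Since the equation is just a product of seven factors, it is easy to try out. The short sketch below plugs in the values from the list above. The lifetime of 10,000 is an assumption I picked only to produce a number, and the function name `drake` is my own.

```python
def drake(r_star, f_p, n_e, f_l, f_i, f_c, lifetime):
    """Multiply the seven Drake factors to estimate N."""
    return r_star * f_p * n_e * f_l * f_i * f_c * lifetime

# Illustrative assumption only: the text says L "can be any number".
ASSUMED_L = 10_000

# R* is quoted as between 1 and 10, f_p as about 50%, the rest as 1.
low = drake(1, 0.5, 1, 1, 1, 1, ASSUMED_L)
high = drake(10, 0.5, 1, 1, 1, 1, ASSUMED_L)
print(f"N ranges from {low:,.0f} to {high:,.0f}")
```

Change any one factor and the answer scales with it, which is exactly why the results can be argued so easily.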

Even if you do not believe there may be life beyond Earth, being a part of a worldwide network of computer users shows that such a network can be used to help solve other problems requiring intensive computation. A good example is the protein folding project at Stanford University, called Folding@Home (http://www.stanford.edu/group/pandegroup/folding/).

The mission statement on the SETI Institute website (www.seti.org) says it best: “The mission of the SETI Institute is to explore, understand and explain the origin, nature and prevalence of life in the universe.”

SETI@Home is a large network of computers acting like a large supercomputer.

A standard supercomputer is a machine used to process large amounts of data and to solve problems that are too difficult and time-intensive for a single user or computer. A supercomputer of this kind is a network of several computers controlled by a single server using special server software (Microsoft makes a version of this called Advanced Server). Generally the other computers are not accessed directly by a user, but are given instructions by the server. A very good example is the supercomputer at Swinburne University. This $700,000 machine boasts 1080 gigaflops (http://supercomputing.swin.edu.au). A gigaflop is one billion floating point operations per second (FLOPS); the more FLOPS, the better. A floating point operation is a type of arithmetic instruction built into a processor, like your garden variety Pentium processor, that works on numbers containing a decimal point. The processor stores each number as a fixed count of significant digits plus an exponent that records where the decimal point belongs, so the point can "float" and both very large and very small values are handled accurately and efficiently. Because of its large processing power, the Swinburne supercomputer is used to simulate galaxy formation and collisions by mapping out the motions of millions of simulated stars individually, although a program could be written to analyze SETI work units or any other project deemed fit by the programmer and the computer owner. While the Swinburne supercomputer has 160 computers in its arsenal, there are over four million personal computers[1] processing SETI@Home work units (as of November 3, 2003). This comes to about 50,000 gigaflops! A dedicated supercomputer with this processing power would cost about $35,000,000 (R1, slide 26). Such a network is capable of processing large amounts of data and saves a tremendous amount of money.
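
As a rough sanity check on that price tag, the figures quoted above can be scaled directly. This is only a back-of-the-envelope sketch: it assumes cost grows linearly with gigaflops, which real hardware pricing does not promise, and it uses the 400,000 simultaneously active computers mentioned in footnote [1].

```python
SWINBURNE_COST_USD = 700_000
SWINBURNE_GFLOPS = 1080
SETI_GFLOPS = 50_000
ACTIVE_COMPUTERS = 400_000  # average number running at once, per footnote [1]

# Price of one gigaflop on the dedicated machine.
cost_per_gflop = SWINBURNE_COST_USD / SWINBURNE_GFLOPS

# Assumption: cost scales linearly with processing power.
scaled_cost = cost_per_gflop * SETI_GFLOPS

# Average contribution of each active personal computer.
gflops_per_pc = SETI_GFLOPS / ACTIVE_COMPUTERS

print(f"About ${cost_per_gflop:,.0f} per gigaflop")
print(f"About ${scaled_cost:,.0f} for {SETI_GFLOPS:,} gigaflops")
print(f"About {gflops_per_pc} gigaflops per active computer")
```

The scaled figure lands a little above 32 million dollars, the same ballpark as the $35,000,000 quoted from R1.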

Who is involved with SETI@Home?

While it is easy to associate SETI@Home with the SETI Institute, the SETI@Home project is an extension of the SERENDIP (Search for Extraterrestrial Radio Emissions from Nearby Developed Intelligent Populations) project designed and run by the University of California at Berkeley. As the name suggests, SERENDIP is looking for radio emissions that are not natural, but produced either deliberately via a radio beacon, or as a by-product of a civilization's technology, like our TV and radio emissions. SERENDIP is an ongoing project, and is already on its fourth version, called SERENDIP IV. All of its listening equipment is housed at the Arecibo Radio Observatory, and is continuously recording and analyzing data (unless the telescope is closed for routine repair). While the University of California at Berkeley is the brains behind this project, everyone who downloads and uses the SETI@Home client software is a part of the SETI@Home project.

What are we looking for exactly, and how is the search performed?

This is the fun part. A civilization of intelligence at least equal to our own will either deliberately send out some type of beacon, or emanate radio noise as a result of its technology. These signals can be sent optically, through radio waves, or by some other method; however, our own atmosphere limits us to only optical or radio wave detection (Universe, page 140). An optical pulse detector can be used, but visible light suffers from extinction (Universe, page 457), meaning the signal weakens as it travels through the interstellar medium and is absorbed by interstellar dust and debris. While searching for an optical pulse (called Optical SETI, or OSETI) is gaining favor, a telescope and other equipment are required to perform such a search, and the cost will alienate just about anyone who wants to join the search (more on OSETI here: http://www.coseti.org/radobs31.htm). A radio telescope can be tuned to a specific frequency, such as just above and below the emission line of Hydrogen. An example of what radio can do is our ability to map the structure of our own galaxy using a radio telescope, something that is not possible using an optical telescope (Universe, page 568). By scanning above and below the frequency emitted by Hydrogen, we may have luck detecting a deliberate signal. Hydrogen is the most abundant element in the Universe, and any intelligent civilization would know this. Intelligent civilizations would also know that Hydrogen emits radio waves at 21cm (see figure 1).

Figure 1.  A 21cm photon is released when the hydrogen atom goes from a higher energy state to a lower energy state. Think of this frequency as the interstellar dial-tone. A radio telescope is tuned to 1420MHz to listen at this frequency (Image from: http://instruct1.cit.cornell.edu/courses/astro101/lec08.htm).

By using the Hydrogen frequency as an interstellar dial-tone, an intelligent civilization may send a signal using a frequency above or below the frequency of Hydrogen. This gives us a place to start looking – or listening (Shostak, page 151).
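
The 21cm wavelength and the 1420MHz frequency are the same thing described two ways, since wavelength equals the speed of light divided by frequency. A quick check, using the rounded 1420 MHz figure from the text:

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458
HYDROGEN_LINE_HZ = 1420e6  # rounded value used in the text

# wavelength = c / frequency, converted from metres to centimetres
wavelength_cm = SPEED_OF_LIGHT_M_PER_S / HYDROGEN_LINE_HZ * 100
print(f"{wavelength_cm:.1f} cm")
```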

To give an idea of how we listen for this, let’s examine the pieces required to perform the task. First we need something to gather the signal, so we will need a radio telescope. Radio is a portion of the electromagnetic spectrum, but we cannot see radio since its wavelengths are much longer than those of visible light. A radio dish is used to collect the longer wavelengths, which is why a radio dish is so large in diameter. Like all things in astronomy, the larger the diameter, the more sensitive the dish becomes. The choice for such a dish is the largest radio dish currently available: the Arecibo Radio Observatory. The diameter of this telescope is a whopping one thousand feet.

Figure 2. (Image borrowed from: http://www.naic.edu/aisr/sas/sashomeframe.html)

Radio signals are received by the dish at the bottom of the photo (Figure 2) and are reflected to the feed horn hanging above the dish. The feed horn gathers the radio signals and sends them through wires to an amplifier (because sometimes the signal is very weak and needs to be amplified). The amplifier sends the signal to a special instrument called a spectrum analyzer (sometimes the spectrum analyzer can operate without an amplifier; such decisions are left to the engineer setting up the equipment). The band of interest here runs from 1418.5 MHz to 1421.5 MHz (http://setiathome.ssl.berkeley.edu/newsletters/newsletter7.html), and the analyzer divides the spectrum into channels 0.6Hz wide. Across its full range the analyzer handles 168 million such channels, which at 0.6Hz each corresponds to roughly 100 MHz of spectrum. The SERENDIP IV project uses this spectrum analyzer along with a supercomputer to examine the frequencies in real time. Since data analysis is in real time, only strong signals are looked for, and each frequency is only analyzed for 1.7 seconds. Regardless of the amount of processing performed in real time, it is still not enough. This is the reason for the initial concept of SETI@Home. There is so much information that the supercomputer evaluating the signal in real time cannot possibly process the excess data. This immense data stream is a direct result of SERENDIP IV operating constantly, using a technique called “piggyback” (R1, slide 4). This means that no matter what type of scientific research is performed at Arecibo, the SERENDIP IV spectrum analyzer is receiving information with no effect on the research project in progress. The result is 35 gigabytes of data every day. To put this size into perspective, I have my entire CD collection totaling 850 titles in MP3 format on my computer. For over 6,000 songs, only 26 gigabytes is used. Just like I cannot possibly listen to every song in one day, there is no way for the computers at SERENDIP to evaluate all of this data.
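
The relationship between bandwidth, channel width and channel count is simple division, so the numbers in this section can be checked against each other. The sketch below is only arithmetic on the quoted figures, and it assumes a gigabyte is 10^9 bytes. It shows that a 3 MHz band holds about 5 million channels of 0.6Hz, while 168 million such channels span about 100 MHz, so the two figures presumably describe different bands.

```python
CHANNEL_WIDTH_HZ = 0.6
BAND_LOW_MHZ = 1418.5
BAND_HIGH_MHZ = 1421.5
TOTAL_CHANNELS = 168e6
DAILY_BYTES = 35e9          # assuming 1 gigabyte = 10**9 bytes
SECONDS_PER_DAY = 86_400

# Channels needed to cover the 1418.5 to 1421.5 MHz band.
band_hz = (BAND_HIGH_MHZ - BAND_LOW_MHZ) * 1e6
channels_in_band = band_hz / CHANNEL_WIDTH_HZ

# Spectrum covered by 168 million channels of the same width.
span_mhz = TOTAL_CHANNELS * CHANNEL_WIDTH_HZ / 1e6

# Average recording rate implied by 35 gigabytes per day.
bytes_per_second = DAILY_BYTES / SECONDS_PER_DAY

print(f"{channels_in_band:,.0f} channels in the 3 MHz band")
print(f"{span_mhz:.1f} MHz covered by 168 million channels")
print(f"{bytes_per_second:,.0f} bytes recorded per second")
```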

How does the program work?

Because so much more processing power is required, there had to be an efficient yet cost effective way to create a virtual supercomputer. It was suggested that a screen saver program be designed that would analyze this data and anyone who wanted to download this program could do so for free. This idea was a huge success. Within three months, there were 1,000,000 users worldwide (R1, slide 10). As of today, the number exceeds four million (http://setiathome.ssl.berkeley.edu/totals.html).

Interested in signing up? The first thing that must be done is to download the program. The download is available here: http://setiathome.ssl.berkeley.edu/download.html.  During installation, you will be asked to create an account if you wish. Unlike most free software on the Internet, SETI@Home is not one to send you junk e-mail, so feel free to provide your data. The good news is that you will be given credit for locating a signal, so go ahead and give your real name. Once installation is complete, the program is pre-configured to run as a screen saver after 10 minutes of idle time.

Since computers today are very fast, I suggest changing the program settings to analyze data all the time. Unless you edit video on your PC, you will not notice a dramatic decrease in performance.

Once installed, the program is ready to analyze. There are five steps to the entire SETI@Home process: (http://setiathome.ssl.berkeley.edu/process_page).

Step One: Data Collection

The SERENDIP computers record data from the Arecibo Observatory on digital linear tapes (these are like really big cassettes, but store digital or binary information). The tapes are sent through the mail to the SETI@Home center at the University of California at Berkeley. The tapes are checked quickly for recording problems, gaps and other errors, and the affected data are removed. The remaining data is broken into numerous 348-kilobyte pieces called work units. These work units are sent to the SETI@Home client software for analysis. It is important to know that one work unit can be sent to several computers to be analyzed. This is important for step three, which will be discussed later.
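
Breaking a recording into work units amounts to slicing a long stream of bytes into fixed-size pieces. Here is a minimal sketch of that idea. It is not the Berkeley splitter: the function name is hypothetical, the "recording" is a block of zero bytes, and I assume a kilobyte is 1024 bytes and a gigabyte is 10^9 bytes.

```python
WORK_UNIT_BYTES = 348 * 1024  # assuming 1 kilobyte = 1024 bytes

def split_into_work_units(data: bytes, size: int = WORK_UNIT_BYTES):
    """Yield consecutive fixed-size slices; the final slice may be shorter."""
    for start in range(0, len(data), size):
        yield data[start:start + size]

# Stand-in for tape data: one million zero bytes.
recording = bytes(1_000_000)
units = list(split_into_work_units(recording))
print(len(units), "work units; last one holds", len(units[-1]), "bytes")

# Whole work units that fit in one day's 35 gigabytes of data.
units_per_day = 35 * 10**9 // WORK_UNIT_BYTES
print(units_per_day, "full work units per day")
```

On these assumptions a single day of recording yields close to a hundred thousand work units, before any unit is sent out more than once.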

Step Two: Finding Candidate Signals

Before embarking on a guided tour of the SETI@Home program, it is important to help define some of the key words used in this section. Most of the terms here are used primarily in radio astronomy, so even the most adept amateur astronomer may not know these terms.

· Fast Fourier Transform, or FFT – a mathematical algorithm that converts a signal expressed as a function of time into a signal expressed as a function of frequency.

· Baseline Smoothing – for the SETI@Home screen saver to delve deeply into a single signal, the broadband signal received by SERENDIP needs to be whittled down to a narrow band. This is the first step of Baseline Smoothing. The second step removes any obvious noise and ensures each frequency is at the same level in volume (like the volume knob on your stereo).

· Chirping – the added Doppler effect on a signal from the rotating Earth.

· De-Chirping – the removal of the Earth's rotation effects from a Doppler shifted signal.

· Doppler Shift – the shift of a spectrum toward lower frequencies if the source is moving away from us, or toward higher frequencies if the source is moving toward us. Think of the noise an automobile makes as it speeds past your ear.

· Gaussian – the rise and fall of a signal as it travels through the beam of a radio telescope, gradually increasing as it enters the beam and gradually decreasing as it leaves the beam. Celestial objects take about 12 seconds to cross the beam of the Arecibo dish.

· Gaussian fit – the length of time it takes a signal to enter and leave the telescope beam.

· Gaussian power – the strength of the Gaussian signal as it enters and leaves the telescope beam.

· Pulses – an oscillating signal of a particular duration.

· Triplet – three equally spaced pulses.

· Radio Frequency Interference, or RFI – interference from the Earth or from a source near Earth.
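
To make the first term in the list concrete, here is a small sketch of an FFT at work. It is a textbook recursive radix-2 version written in plain Python, not the routine inside the SETI@Home client, and the 5 Hz test tone and 64 samples per second are values I made up. The point is only that a signal described in time comes out described in frequency, with a clear peak at the tone.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return list(samples)
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

SAMPLE_RATE_HZ = 64   # hypothetical sampling rate
N = 64                # one second of samples
TONE_HZ = 5           # hypothetical test tone

# Time-domain signal: a pure sine wave.
signal = [math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE_HZ) for i in range(N)]

# Frequency-domain power; only the first half is unique for a real signal.
spectrum = fft(signal)
power = [abs(x) ** 2 for x in spectrum[: N // 2]]

peak_bin = max(range(len(power)), key=power.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE_HZ / N
print("Strongest frequency:", peak_hz, "Hz")
```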

Now that we have identified some key words, let’s tour the program!        

This is the SETI@Home screen saver program.

It can be divided into the following sections: Data Analysis, Data Info, User Info, and the pretty spectrum on the bottom.

First of all, let’s examine the pretty spectrum at the bottom of this screen:

This actually serves no scientific purpose. What it does show is a graphical representation of the Fast Fourier Transform, or FFT, currently in progress. It also demonstrates how the signal strength is over time. We’ll discuss the FFT function under the Data Analysis header.

This box is the Data Info box:

This gives descriptive information about the current work unit. It shows the exact location in the sky using the Right Ascension and Declination coordinate system. It also shows the date on which this signal was recorded, and the source of the signal, usually the Arecibo Radio Observatory. A radio telescope is tuned to a particular frequency to listen; in this case, the base frequency is 1.420859375 GHz (1420.859375 MHz).

The User Info box, shown here,

gives the user's total statistics since starting to use SETI@Home. If getting credit is important to you, be sure to give your name and e-mail address when creating your account.

The most important box is the Data Analysis. Everything happens in this portion of the SETI@Home screen. The following information is fun to read, but remember you do not have to remember any of the processes. The program does it for you. When an account is created and the program is ready to receive its first work unit, the work unit is downloaded with the progress shown here:

Once the data is downloaded, the program performs a Baseline Smooth:

What this does is eliminate any broadband interference and normalize the level of each signal. Sometimes when data is collected, the level of intensity can vary. The Baseline Smoothing changes the levels so they are all the same intensity. This allows the FFT’s to perform their work equally on each signal. This also helps eliminate any ambient noise contributed by interstellar Hydrogen (http://www.computer.org/cise/articles/seti.htm).
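
One common way to level out a varying baseline is to divide each value by the average of its neighbours. I do not know the exact method the client uses, so treat the sketch below as a generic running-mean normalization with a made-up window size and made-up data. After smoothing, a steadily rising baseline sits near 1 and a narrow spike still stands out.

```python
def baseline_smooth(power, window=5):
    """Divide each value by the mean of its neighbourhood so the baseline sits near 1."""
    half = window // 2
    smoothed = []
    for i in range(len(power)):
        lo = max(0, i - half)
        hi = min(len(power), i + half + 1)
        local_mean = sum(power[lo:hi]) / (hi - lo)
        smoothed.append(power[i] / local_mean)
    return smoothed

# Made-up data: a slowly rising baseline with one narrow spike at index 10.
raw = [10 + i for i in range(20)]
raw[10] += 60

flat = baseline_smooth(raw)
spike_index = max(range(len(flat)), key=flat.__getitem__)
print("Spike found at index", spike_index)
```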

De-chirping removes the problems associated with the Doppler shift (the program labels this stage “chirping data”).

This is very important since the Earth rotates on its axis and revolves around the Sun. In addition, the source location, if it were a planet, is also rotating on its axis and revolving around its star. This can add additional Doppler shift to an already shifted signal. To understand the effects of a Doppler shift, stand near a freeway and listen to the automobiles speed past. As an auto approaches, the sound is slightly higher in pitch than the normal sound when the auto is next to you. The pitch drops as the auto speeds past. This is an example of a Doppler shift. In the case of SETI, the sound is the signal sent by some intelligence. This “chirped” signal changes in frequency over time. The signals are “de-chirped” using FFT’s through trial and error, trying drift rates between plus and minus 50Hz per second, in an effort to smooth out, or normalize, the signal.
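
The trial and error can be pictured like this: multiply the data by the opposite of each candidate drift, transform it, and keep the drift that gives the sharpest peak. The sketch below does that on a made-up signal drifting at 30 Hz per second. Everything in it is an assumption for illustration: the sample rate, the coarse 10 Hz per second trial steps, the reading of the plus and minus 50 range as Hz per second, and the slow direct transform used in place of a real FFT.

```python
import cmath
import math

SAMPLE_RATE_HZ = 128.0  # hypothetical sampling rate
N = 128                 # one second of data

def chirped_tone(start_hz, drift_hz_per_s):
    """Complex tone whose frequency drifts linearly with time."""
    out = []
    for i in range(N):
        t = i / SAMPLE_RATE_HZ
        phase = 2 * math.pi * (start_hz * t + 0.5 * drift_hz_per_s * t * t)
        out.append(cmath.exp(1j * phase))
    return out

def dechirp(samples, trial_drift):
    """Undo a linear drift by multiplying with the opposite phase ramp."""
    out = []
    for i, s in enumerate(samples):
        t = i / SAMPLE_RATE_HZ
        out.append(s * cmath.exp(-1j * math.pi * trial_drift * t * t))
    return out

def peak_power(samples):
    """Largest power found in a direct (slow) discrete Fourier transform."""
    best = 0.0
    for k in range(N):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / N)
                  for i, s in enumerate(samples))
        best = max(best, abs(acc) ** 2)
    return best

received = chirped_tone(start_hz=20.0, drift_hz_per_s=30.0)

# Try drift rates from -50 to +50 in coarse steps and keep the sharpest result.
trials = range(-50, 51, 10)
best_drift = max(trials, key=lambda d: peak_power(dechirp(received, d)))
print("Best trial drift rate:", best_drift, "Hz per second")
```

Only the correct trial collapses the smeared signal back into a single frequency, which is why its peak is the tallest.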

Once the data has been “de-chirped,” each frequency resolution is processed by the FFT mathematics process,

converting the signal from a function of time to a function of frequency. Notice the Doppler drift rate and the Resolution portions of the Data Analysis window. The Doppler drift rate is the current Doppler shift for the actual signal. The resolution is the current frequency undergoing the FFT. Each FFT scans for frequencies between 0.075 and 1,221 Hz and is looking for Gaussians, pulses, and triplets.

While the program is performing the FFT’s, Gaussian changes are processed.

As the Earth rotates the Arecibo dish across the sky, a signal creeps across the dish’s beam. A signal appears on one side of the dish as a faint signal. As the signal reaches the center, the level of the signal is increased only to begin decreasing once again as the signal leaves the beam on the opposite side. The time it takes the signal to pass through the beam is a Gaussian fit. The intensity of the signal is called the Gaussian power. For a signal to be marked as a candidate signal, the Gaussian fit must be a small number (12 seconds or less), and the Gaussian power must be high (3.5 times the normal background noise). The data analysis window will display the current best Gaussian fit and Gaussian power. The significance of 12 seconds is simple: this is how long it takes for a star to pass through the beam of the Arecibo dish. Anything longer than 12 seconds is probably man made, or something very close to Earth.
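
The two thresholds in this paragraph translate into a very short test. The sketch below also models the rise and fall of the signal as a Gaussian curve. The function names are mine, and treating the transit time as the full width at half maximum of the curve is an assumption, not something the SETI@Home documentation states.

```python
import math

MAX_TRANSIT_S = 12.0    # longest allowed beam crossing, from the text
MIN_POWER_RATIO = 3.5   # peak must be 3.5 times the background, from the text

def beam_profile(t, peak, centre, transit_s):
    """Gaussian-shaped power at time t.

    Assumption: transit_s is the full width at half maximum of the curve.
    """
    sigma = transit_s / (2 * math.sqrt(2 * math.log(2)))
    return peak * math.exp(-((t - centre) ** 2) / (2 * sigma ** 2))

def is_candidate(transit_s, peak_power, background_power):
    """Apply the two thresholds described above."""
    short_enough = transit_s <= MAX_TRANSIT_S
    strong_enough = peak_power >= MIN_POWER_RATIO * background_power
    return short_enough and strong_enough

print(is_candidate(transit_s=12.0, peak_power=4.0, background_power=1.0))
print(is_candidate(transit_s=20.0, peak_power=10.0, background_power=1.0))
```

The first call passes both tests; the second is strong enough but lingers in the beam too long, the mark of something man made or very close to Earth.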

Other signal properties are also analyzed. The signaling civilization may send a radio signal that is pulsed in nature.

The program performs what is called a Fast Folding Algorithm to look for weak, repeating pulses. Interference of some localized variety can result in an artificial pulse so a limit has been set. A pulse score greater than 1 is flagged.
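
The idea behind folding is to cut the data into pieces one trial period long and stack them, so that a weak pulse repeating at that period adds up while everything else stays flat. The sketch below is plain folding on made-up data, not the optimized Fast Folding Algorithm the client runs, and the contrast measure is my own stand-in rather than the program's pulse score.

```python
def fold(samples, period):
    """Sum the samples that share the same position within each period."""
    profile = [0.0] * period
    for i, value in enumerate(samples):
        profile[i % period] += value
    return profile

def contrast(profile):
    """Ratio of the tallest folded bin to the average bin (illustrative only)."""
    return max(profile) / (sum(profile) / len(profile))

# Made-up data: constant background with a weak pulse every 7 samples.
samples = [1.0] * 70
for i in range(3, 70, 7):
    samples[i] += 0.5

# Try a range of periods and keep the one where the pulse stacks up best.
best_period = max(range(2, 11), key=lambda p: contrast(fold(samples, p)))
print("Best period:", best_period, "samples")
```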

A triplet is three equally spaced pulses. If the center pulse is found to be equidistant from the other two, the results are flagged.
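
Checking for a triplet comes down to asking whether the middle pulse falls halfway between the other two. A minimal sketch, with a hypothetical function name and a tolerance I chose arbitrarily:

```python
def is_triplet(t1, t2, t3, tolerance=0.01):
    """True when the middle pulse sits midway between the outer two.

    Times are in seconds; the tolerance is an illustrative assumption.
    """
    first, middle, last = sorted((t1, t2, t3))
    midpoint = (first + last) / 2
    return abs(middle - midpoint) <= tolerance

print(is_triplet(1.0, 2.0, 3.0))
print(is_triplet(1.0, 2.5, 3.0))
```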

The image above is of chirping data, but notice the Best Pulse below the Doppler drift rate. With a score of 1.11, this work unit will be flagged when it is sent back to the computers at Berkeley.

When one work unit is processed, with Doppler effects removed and Gaussians searched for on frequencies between roughly 0.07Hz and 1200Hz, about 175,000,000,000 mathematical operations are performed (R1, slide 14).

Signals are found on a routine basis while running the program. An example is the peak Pulse value found on the images above. There are many sources of signals processed by the program, and they are mostly terrestrial in origin. Other objects in the Universe can also be responsible for a Gaussian signal or pulse, an example of which is a pulsar. This rapidly spinning neutron star[2] can have a pulsed signal (R1, slide 39). Regardless of the source of a signal or a pulse, all results must go through the verification process.

Step Three: Testing Data Integrity

A single work unit may be processed by several different client computers. This is a very important tool for data verification. If a signal is found by one client, that result is compared to the results for the same work unit from other clients. If the signal is present in the results posted by all of the clients, then the signal is marked for stage 4. Because the properties of each client machine are different, there is some leeway given to the comparison. The differences can come from processing errors or from different versions of the client software. Either way, a signal is verified if the signal properties reported by the clients match each other to at least 70%.

Additionally, the Arecibo Radio Observatory scans over a particular area of sky two or three times. All of these results are compared to the verified work units. This helps rule out any equipment malfunction that might contribute to a false signal.

If a signal does not match the other results by other clients, then the signal is not verified; that particular signal does not make it to the next level. However, because of Radio Frequency Interference, there are a large number of work units that make it to stage 4.

Step Four: Removing Radio Interference

The verified signals are passed through this fourth phase. Radio Frequency Interference, or RFI, is a reality when dealing with radio astronomy or SETI. There are two common types of interference: the “always on” interference that results from the system hardware or software, and the short period interference. The short period interference can come from a host of things, such as microwave ovens, a car starting, cell phone transmissions, or even satellites. Luckily, both of these types of interference can be removed. The “always on” interference occurs at only 5 frequencies (1418.75, 1419.00, 1420.00, 1421.00 and 1421.25 MHz) and can therefore easily be removed (SETI@Home: http://setiathome.ssl.berkeley.edu/process_page/removing_rfi.html).  Notice the 1420.00MHz frequency. It is essentially the frequency of atomic Hydrogen, the most abundant element in the Universe. Short period interference is removed by comparing the multiple instances of the same work unit. The Gaussian and pulse signal may be identical, but ambient noise in one version of the work unit may mask the results as compared to another version of the work unit. The more samples of the same work unit, the better.
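
Because the “always on” interference sits at known frequencies, removing it is a matter of discarding anything that lands on one of them. The sketch below shows the idea. The guard width of 0.001 MHz and the sample candidate list are assumptions for illustration, not values from the SETI@Home pipeline.

```python
ALWAYS_ON_MHZ = (1418.75, 1419.00, 1420.00, 1421.00, 1421.25)

def is_always_on_rfi(freq_mhz, guard_mhz=0.001):
    """True if the frequency falls within the guard band of a known RFI line."""
    return any(abs(freq_mhz - line) <= guard_mhz for line in ALWAYS_ON_MHZ)

def remove_always_on(candidates_mhz):
    """Keep only the candidates that avoid every known RFI line."""
    return [f for f in candidates_mhz if not is_always_on_rfi(f)]

# Made-up candidate frequencies in MHz.
candidates = [1419.0, 1419.5, 1420.0004, 1420.3]
print(remove_always_on(candidates))
```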

Step Five: Identifying final signal candidates

Once a candidate signal passes the data integrity and radio interference removal steps, it is re-verified. Further examination is given to rule out Earth bound RFI. The signal undergoes a Persistency Check, which means a Gaussian, pulse or triplet must be consistent in location and frequency across time. The purpose is to further eliminate any additional RFI. Once RFI has been ruled out and a signal verified, a SETI@Home team will need to re-observe the candidate signals. A team from the University of California at Berkeley will travel to Arecibo to examine each signal's location. If the signal is verified by this observation, the location of the signal is relayed to other observatories for continued verification.

By having other observatories review the signal verification, any system errors or localized interference are ruled out. Additionally, more scientific weight is granted to the verified signal if two or more other observatories are able to duplicate the results.

A signal was or was not verified. What happens next?

Regardless of what signal is detected on your computer, it is very important not to get excited. For a signal to be a verified signal from an extra-terrestrial intelligence, it must pass through all five steps of the data collection, data analysis and verification processes. SETI@Home has made a declaration of its policy on releasing candidate signal information. It is available here: http://setiathome.ssl.berkeley.edu/declaration.html. It states specifically that no candidate signal will be released as a signal from an extra-terrestrial intelligence unless it has passed strict verification processes. The five step method described above meets this requirement.

Once the signal is verified, every computer running the SETI@Home software responsible for processing that unit will receive an official telegram for the purpose of notification (R1, slide 40). Those individuals will also receive credit by having their names attached to the discovery.

Processing work units does not guarantee a verified result, but processing work units is very important. One of the many sponsors of the SETI@Home project is the Planetary Society. They have made arrangements so that anyone who runs the SETI@Home software can print out a certificate. You can get yours here: http://www.planetary.org/html/UPDATES/seti/seti_certificate_instructions.html.

The SETI@Home website continues to update information on a regular basis. To view the current overall status, check here: http://setiathome.ssl.berkeley.edu/process_page/

For the current map of signal candidates, check here: http://setiathome.ssl.berkeley.edu/candidates.html

Summary:

For decades now, we have been pointing our radio dishes into space hoping to detect a signal from an extra-terrestrial intelligence. Since Frank Drake demonstrated that such a search can be performed, we have collected an enormous amount of data to analyze. It may seem like the search is in vain, but that is far from the truth. Even with such a large network of computers running the SETI@Home software and analyzing signal data collected at Arecibo, there is still plenty of data left to examine. Most of all, work units still need to be analyzed more than once, and areas of sky still need to be scanned more than once, to help with the verification process. There is still much work to be done. By creating this world-wide virtual supercomputer, SETI@Home has demonstrated that such a network can exist. Over four million people are willing to be a part of science, and that speaks volumes about human nature. And most importantly, one work unit can help make a difference in the verification or detection of that one special signal: the one that could be the first interstellar phone call. Even if a signal never makes it through the entire process, a world-wide network of computers like this one could be used to perform analysis for some other type of problem, such as other SETI experiments or even medical research, like the Folding@Home project sponsored by Stanford University. So let’s keep those computers running!

References:
 

Computer Society: http://www.computer.org/cise/articles/seti.htm 

Folding@Home, Stanford University: http://www.stanford.edu/group/pandegroup/folding

Freedman, Roger A. Universe: 6th Edition. W.H. Freeman and Company, 2002 

The Nature of the Universe: http://instruct1.cit.cornell.edu/courses/astro101/lec08.htm

SETI@Home: http://setiathome.ssl.berkeley.edu/index.html

SERENDIP: http://seti.berkeley.edu/serendip/oldindex.html

Shostak, Seth. Sharing the Universe: Perspectives on Extraterrestrial Life. Berkeley Hills Books. Berkeley California, 1998.  

Swinburne Centre for Astrophysics and Supercomputing: http://supercomputing.swin.edu.au  

[R1] Swinburne University of Technology. “Let’s Get Technical” HET608, Module 17, Activity 2.  SAO, 2003.


 

[1] Not all computers run at the same time. On average, there are around 400,000 computers running at one time.

[2] A neutron star is a rapidly spinning core of a star that has ended its life with a supernova

 

©2004 - 2013 Astronomy Online. All rights reserved. Contact Us. Legal. Creative Commons License
The works within is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.