Folding@home and BOINC (Berkeley Open Infrastructure for Network Computing)

Tips and Advocacy

Click here for a low-tech version: Guaranteed Human Life Extension
 
You have heard the phrase "knowledge is power", but this may be the first time you have heard that "power is knowledge".

Dr. Isaac Asimov


Folding@home is based upon the science of Molecular Dynamics where molecular chemistry and mathematics are combined in computers to predict how protein molecules fold in three spatial dimensions over time.


When I first heard about this, I recalled Isaac Asimov's sci-fi masterpiece colloquially known as The Foundation Trilogy which is based upon the fictional branch of science called psychohistory where statistics, history and sociology are combined in computers to predict humanity's future. How did Asimov conceive of such things?

Years ago, I became infected with an Isaac Asimov inspired optimism about humanity's future and have since felt the need to contribute to it. While Folding@home will not cure my "infection of optimism", I am convinced Dr. Asimov (who received a Ph.D. in Biochemistry from Columbia in 1948, then was employed as a Professor of Biochemistry at the Boston University School of Medicine, staying there until 1958) would be fascinated by something like this. Click Asimov's picture to view his 1992 pop-ed article about protein.

Dr. Asimov, I'm computing these protein folding sequences in memory of you and your work.

I was considering a financial charitable donation to Folding@home when it occurred to me that my money would be better spent by:
  1. making a knowledgeable charitable donation to all of humanity by increasing my Folding@home computations (which will advance medical discoveries along with associated pharmaceutical treatments). I was already folding on a half-dozen computers anyway so all I needed to do was purchase used video cards on eBay.

  2. convincing other people to follow my example.



Protein Folding Overview

Science Problem:

Misfolded proteins have been associated with many diseases as well as age-related illnesses. However, proteins are so much more complicated than other molecules that it is not possible to begin a chemical experiment without first providing hints to researchers about "where to look" and "what to look for". Since the behavior of "atoms-in-molecules" as well as "atoms-between-molecules" can be computed (Molecular Dynamics), it makes more sense to begin with a computer analysis. The permitted configurations can then be passed on to experimentalist researchers.


Real-world observation: Cooking an egg causes the clear protein (albumen) to unfold into long strings, with the result that they now can intertwine into a tangled network which will stiffen and scatter light (reflect white light). No chemical change has occurred but taste, volume and color have been altered.

Computer Solutions:

Using the most powerful single processor (CPU) available today, simulating the folding possibilities of one large protein molecule for one millisecond of chemical time might require one million days (about 2,737 years) of computational time. However (and this is where you come in), if the problem is sliced up and assigned to 100,000 donor PCs via the internet, the computational requirement would drop to 10 days. Convincing your friends, relatives, and co-workers to do the same could drop the requirement to 1 day (with 1 million computers).

Chemical time in nature      Simulation time
                             one computer         one computer      100,000 computers   1 million computers
1 µs (0.000001 seconds)      1,000 days           2.73 years        14.4 mins           1.44 mins
1 ms (0.001 seconds)         1,000,000 days       2,737 years       10 days             1 day
1 s  (1.0 seconds)           1,000,000,000 days   2,737,850 years   27 years            2.7 years
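
The arithmetic behind the table can be checked with a short script. This is only a sketch: it assumes perfect linear scaling (no network or scheduling overhead), it takes this page's estimate of 1,000 single-CPU days per microsecond of chemical time as given, and the name `simulation_days` is made up for illustration.

```python
# Sketch: reproduce the table's arithmetic, assuming the work splits evenly
# across computers (an idealization; real projects lose time to overhead).

DAYS_PER_MICROSECOND = 1_000   # this page's estimate for one CPU
DAYS_PER_YEAR = 365.25


def simulation_days(chemical_microseconds, computers=1):
    """Wall-clock days needed when the work is shared evenly (hypothetical helper)."""
    return chemical_microseconds * DAYS_PER_MICROSECOND / computers


# One row per chemical-time entry in the table: label, microseconds simulated.
for label, micros in [("1 us", 1), ("1 ms", 1_000), ("1 s", 1_000_000)]:
    one = simulation_days(micros)
    print(
        f"{label}: {one:,.0f} days ({one / DAYS_PER_YEAR:,.2f} years) on one computer, "
        f"{simulation_days(micros, 100_000):,.2f} days on 100,000 computers, "
        f"{simulation_days(micros, 1_000_000):,.3f} days on 1 million computers"
    )
```

Under the same assumptions, `simulation_days(1_000, 460_000)` gives roughly 2.17 days, which matches the December 2011 estimate in the notes below.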

Additional notes for techies:
  1. Special-purpose research computers like IBM's BlueGene and Roadrunner employ 10,000 to 20,000 processors (CPUs) joined by many kilometers of optical fiber to do something similar.
     
  2. As of December 2011, the Folding@home project consists of 460,000 active processors (some CPUs, some GPUs), which is equivalent to 6.4 PetaFLOPS. This means that the million-day protein simulation problem could theoretically be completed in about 2.17 days (1,000,000 / 460,000). But since there are many more protein molecules than DNA molecules, humanity could be at this for quite some time to come. Adding your computers to Folding@home will permanently advance mankind's progress in protein research.

    Personal Comment: There are almost 1 billion accounts registered on Facebook. Even if some of these entries represent "companies" or "duplicate entries by people", I am still shocked that there are fewer than one million "active" accounts at Folding@home. We all know there are other distributed computing projects on the internet, but "fewer than one million protein-folders" is almost a crime against humanity.
     
  3. Side Note: When the Human Genome Project (to study human DNA) was being planned it was thought that the task may require 100 years. However, technological change in the area of computers, robotic sequencers, and use of the internet to coordinate the activities of a large number of universities (each assigned a small piece of the problem) allowed the human genome project to publish results after only 15 years. 

  4. Distributed computing projects like Folding@home and BOINC have only been possible since 1995 when the world-wide-web (which was first proposed in 1989 to solve a document sharing problem at CERN) began to make the internet popular and ubiquitous.
     
  5. Distributed computing projects like Folding@home and BOINC have only been practical since 2005 when the CPUs in personal computers began to out-perform mini-computers and enterprise servers. This was partly because...
    1. AMD added 64-bit support to their x86 processor technology calling it x86-64.
    2. Intel followed suit calling their 64-bit extension technology EM64T
    3. Both AMD and Intel extended their respective streaming technologies in CPUs
    4. DDR2 (fast) memory became popular
    5. Intel added DDR2 support to their Pentium 4 processor line
    6. AMD added DDR2 support to their Athlon 64 processor line
       
  6. Since then, these technological improvements have only made computers both faster and cheaper:
    1. multi-core (each core is a fully functional CPU) chips from all manufacturers
    2. shifting analysis from each CPU core into multiple (hundreds to thousands) streaming processors found in high-end graphics cards
      1. ATI (now AMD) Radeon graphics cards
      2. NVIDIA GeForce graphics cards
      3. development of high performance "graphics" memory technology (e.g. GDDR3 and GDDR4) to avoid the processing stalls that occur when processors outpace their memory.
    3. Intel's abandonment of NetBurst which meant a return to shorter instruction pipelines starting with Core2
      (Note that AMD never went to longer pipelines; a long pipeline is only efficient when running a static CPU benchmark - not running code in real-world operating systems like Windows, UNIX, and Linux)
    4. introduction of DDR3 memory
    5. Intel replacing 20-year-old FSB (front-side bus) technology with a new approach called QPI (QuickPath Interconnect). See their Core i7 chips.
      Note: this technology was invented by DEC for their Alpha CPUs and named CSI (Common System Interconnect). Compaq bought DEC in 1998. The Alpha Engineering team was sold to Intel in 2001 during the merger discussions between HP and Compaq. The merger was completed in 2002.
    6. The AMD equivalent of QPI is called HyperTransport which has been described as a multipath Ethernet targeted for use within a computer system.
       
  7. As is true in any "demand vs. supply" scenario, most consumers didn't need the additional computing power which meant that chip manufacturers had to drop their prices just to keep the computing marketplace moving. This was good news for people setting up "folding farms".
     
  8. Shifting from brute-force "Chemical Equilibrium" algorithms to techniques involving Bayesian statistics and Markov Models will enable some exponential speedups.

(mostly) Stanford Links

More Information About Proteins and Protein-Folding Science

Protein Videos

Online Documents

(Stanford's) Targeted Diseases:

This "folding knowledge" will be used to develop new drugs for treating diseases such as:

Reference Links: Folding@home - FAQ Diseases

My Computational Statistics

Using your ATI graphics card to increase science

Stream Computing at ATI (now a division of AMD)

Stream Computing at folding@home

ATI-GPU Caveats:

 
New (2011-July) ATI Graphics Card Caveat:

Some distributed science projects require double-precision floating-point but Folding@home is not one of them. That said, I just replaced a defective graphics card with a brand new ATI HD-5570 graphics card and none of the official GPU clients seem to support it. While poking around the Folding Forum I came up with a beta GPU3 client which does work. Click the following link for more information: http://foldingforum.org/viewtopic.php?f=59&t=14683&p=144648

If you do require double-precision floating-point then you had better do some research before trekking to the store:

Using your Playstation3 (PS3) to do protein-folding science

Folding with NVIDIA

GPU Programming (not required to use folding-at-home)

  1. Scalar vs. Vector
     
    1. CPUs (central processing units) are scalar processors which execute instructions sequentially
      • Some RISC processors can exploit certain kinds of instruction-level parallelism. In some cases they can execute instructions out-of-order.
      • Some CISC processors support SIMD (single instruction - multiple data) instructions for certain applications involving DSP or multi-media.
         
    2. GPUs (graphics processing units) are PC-based vector processors which easily execute parallel operations
      • The ATI x1950 was released in 2006 with 36 processors (pixel shaders)
      • The ATI HD-3870 was released in 2007 with 320 processors (unified shaders)
      • The ATI HD-4870 was released in 2008 with 800 processors (unified shaders)
      • This is an increase of 22 times in only 2 years. (Moore's Law expects transistor counts to double every 18 months.)
      • Since graphics cards have their own large memory systems, they should be thought of as private computer systems within your computer. Also, this private memory is not going to be disturbed by interrupting devices etc.
         
  2. In many cases vector processors are easily two orders of magnitude (100 times) more powerful than scalar processors.
  3. Nvidia Links
  4. ATI Links
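
The growth figures in item 1 can be set beside the Moore's Law expectation with a few lines of arithmetic. This is a minimal sketch that assumes the shader counts and the 18-month doubling period quoted above. The function name is hypothetical, and pixel shaders and unified shaders are not strictly comparable units, so treat the ratio as a rough indication only.

```python
# Sketch: compare the shader-count growth quoted above with what an
# 18-month doubling period would predict over the same 2 years.

def moores_law_factor(months, doubling_months=18):
    """Growth factor expected from doubling every `doubling_months` (hypothetical helper)."""
    return 2 ** (months / doubling_months)


shaders_2006 = 36    # ATI X1950 (pixel shaders), figure from the list above
shaders_2008 = 800   # ATI HD-4870 (unified shaders), figure from the list above

observed = shaders_2008 / shaders_2006   # growth actually quoted on this page
expected = moores_law_factor(24)         # 24 months at one doubling per 18 months

print(f"observed growth over 2 years: {observed:.1f}x")
print(f"Moore's Law expectation:      {expected:.2f}x")
```

The gap between the two numbers is the point of the comparison: the processor count on these cards grew far faster than a simple doubling rule would suggest.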

Hacker Sites + Tools

  1. http://www.robpol86.com/Pages/imagecfg.php - a tool to periodically tweak "processor affinity" of any windows process
  2. http://go.microsoft.com/fwlink/?LinkId=4544 - Windows Server 2003 Resource Kit Tools (Also works with XP)
    includes cool stuff like: imagecfg , sleep (used to pause a script), timezone , etc.
  3. http://technet.microsoft.com/en-us/sysinternals/default.aspx
    includes lots of free system and process monitoring tools from Microsoft
  4. http://www.jsifaq.com/SF/Tips/Tip.aspx?id=3542 - setting affinity
  5. http://distributed.org.ua/forum/index.php?showtopic=1149 - Affinity Changer utility

Microsoft Windows Scripting and Programming

  1. MS-DOS/MSDOS Batch Files: Batch File Tutorial and Reference
  2. MS-DOS @wikipedia
  3. Batch file @wikipedia
  4. Microsoft Windows XP - Batch files

Experimental Stuff for Windows Hackers and Gurus

Here are some DOS commands for creating and starting a Windows Service that executes a DOS script. Note that sc requires the space after each "=" sign.

sc create neil369 binpath= "cmd /k start c:\folding-0\neil987.bat" type= own type= interact
sc start  neil369

Once created, you can stop/start/modify a service graphically from this Windows location:

                Start >>  Programs >> Administrative Tools >> Services

Stopped services may only be deleted from DOS like so:

sc query  neil369
sc delete neil369

Console Client Startup Script for GPU1

caveat: no longer required with the newer GPU2 clients used with ATI cards from the HD-2000 series onward

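@echo off
rem GPU console client control script (for the older GPU1 clients).
rem "sleep" is not a built-in command on Windows XP; it comes from the
rem Windows Server 2003 Resource Kit Tools listed under Hacker Sites + Tools.
echo "================================="
echo "GPU console client control script"
echo "================================="
echo "sleeping 2 minutes while Windows is starting"
echo "you may wish to start TASK-MANAGER to set CPU Affinity" 
@echo on
rem wait for Windows to finish starting, then move to the client folder
sleep 120
cd /d c:\folding-0
:myloop
rem run the client; if it ever exits, pause and start it again
echo "starting the GPU console client"
fah6-win-gpu-console.exe -local
echo ">>> the console has just exited <<<"
echo "did someone do a 3-finger salute?"
echo "sleeping 1 minute while the system stabilizes"
sleep 60
goto myloop
rem ==============================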

Notes:

        Start >> Programs >> Startup

Neat Folding Farms

BOINC (Berkeley Open Infrastructure for Network Computing)

Other Protein Analysis Projects

POEM@home (via BOINC)

Rosetta@home (via BOINC)

World Community Grid (via BOINC)

Biology Science Links

Wikipedia Links:

Protein Data Bank Links

Local Links


Neil Rieck
Kitchener - Waterloo - Cambridge, Ontario, Canada.