Rosetta@home: Difference between revisions

From BOINC Projects
Al Piskun (talk | contribs)
elaborate and add images
Al Piskun (talk | contribs)
No edit summary
Tags: Mobile edit Mobile web edit
 
(3 intermediate revisions by the same user not shown)
Line 1: Line 1:
<div style="background-color: #D4E2FC; border-top: 1px solid #5F92F2; font-size: larger; padding-left: 15px; margin: 12px -5px -5px -5px;">'''BOINC project page template'''</div>
{{Infobox software
| name                = Rosetta@home
| logo                = Rosettahome.png
| logo caption        = Rosetta@home logo
| screenshot          = Rosetta.gif
| caption              = Rosetta@home screensaver showing protein folding simulations
| status              = Active
| category            = Bioinformatics, Protein structure prediction, Distributed computing
| compute              = CPU
| dependencies        = [[wikipedia:Berkeley Open Infrastructure for Network Computing|BOINC]]


[[File:{{#setmainimage:Rosettahome.png}}|alt=Rosetta@home logo|center|frameless]]
| developer            = Baker Laboratory
| author              = [[wikipedia:David Baker (biochemist)|David Baker]] and collaborators
| sponsor              = [[wikipedia:University of Washington|University of Washington]]
| maintainer          = Baker Laboratory and RosettaCommons
| released            = {{Start date and age|2005|06|06}}
| repository          = https://github.com/RosettaCommons


[https://boinc.bakerlab.org/rosetta/ '''''Rosetta@home'''''] is a '''''[[wikipedia:Volunteer computing|volunteer distributed computing]]''''' project that uses [https://boinc.berkeley.edu/ '''''BOINC'''''] to help researchers predict and design the three-dimensional structures of proteins. The project is operated by the [[wikipedia:David Baker (biochemist)|Baker Laboratory]] at the [[wikipedia:University of Washington|University of Washington]] and is one of the best known and most scientifically productive projects in the BOINC ecosystem.<ref>https://boinc.bakerlab.org/rosetta/</ref><ref>https://en.wikipedia.org/wiki/Rosetta@home</ref>
| programming language = C++, C
| operating system    = Windows, Linux, macOS
| size                = Varies by work unit


[[File:Rosetta.gif|alt=Rosetta@home screensaver|Rosetta@home screensaver showing protein folding simulations]]
| stats as of          = {{Start date and age|2026|05|22}}
| average performance  = Several PFLOPS distributed across volunteer hosts
| active users        = 25000
| total users          = 1000000
| active hosts        = 45000
| total hosts          = 3000000


Rosetta@home officially launched in 2005 as a successor to earlier distributed protein-folding experiments and quickly became one of the largest volunteer computing projects in the world. The project allows ordinary volunteers to donate spare CPU power from home computers in order to perform extremely large numbers of protein-folding calculations that would otherwise require enormous supercomputing resources.<ref>https://en.wikipedia.org/wiki/Rosetta@home</ref>
| cpu performance      = Large-scale distributed CPU processing


== Why Rosetta@home? ==
| website              = https://boinc.bakerlab.org/rosetta/
| license              = Mixed proprietary and academic research licensing
}}


Proteins are essential biological molecules responsible for nearly every process inside living cells. Understanding how proteins fold into their complex three-dimensional structures is one of the central challenges of modern biology. Incorrectly folded proteins are associated with many diseases, including Alzheimer's disease, Parkinson's disease, Huntington's disease, cystic fibrosis, and certain cancers.<ref>https://en.wikipedia.org/wiki/Protein_folding</ref>
[https://boinc.bakerlab.org/rosetta/ '''''Rosetta@home'''''] is a '''[[wikipedia:Volunteer computing|volunteer distributed computing]]''' project that uses the [[wikipedia:Berkeley Open Infrastructure for Network Computing|BOINC]] platform to help researchers predict and design the three-dimensional structures of proteins. The project is operated by the [[wikipedia:David Baker (biochemist)|Baker Laboratory]] at the [[wikipedia:University of Washington|University of Washington]] in Seattle, Washington, and is considered one of the most scientifically successful and widely recognized BOINC projects.<ref>{{cite web|url=https://boinc.bakerlab.org/rosetta/|title=Rosetta@home}}</ref><ref>{{cite web|url=https://en.wikipedia.org/wiki/Rosetta@home|title=Rosetta@home}}</ref>


Rosetta@home enables volunteers worldwide to contribute computing power toward:
Rosetta@home officially launched in 2005 as a public volunteer computing extension of the Rosetta protein modeling software suite. Volunteers donate spare CPU resources from personal computers to perform large-scale molecular simulations involving protein folding, protein docking, and protein design.<ref>{{cite journal|last=Das|first=Rhiju|last2=Baker|first2=David|title=Macromolecular Modeling with Rosetta|journal=Annual Review of Biochemistry|year=2008|volume=77|pages=363–382|doi=10.1146/annurev.biochem.77.062906.171838}}</ref>


* Predicting protein structures
== Overview ==
* Designing entirely new proteins
* Developing vaccines
* Creating antiviral therapies
* Studying cancer-related proteins
* Research into Alzheimer's disease and other neurodegenerative disorders
* Understanding immune system interactions


[[File:Protein_structure_examples.png|thumb|Examples of protein structures from Wikipedia]]
Proteins are biological macromolecules composed of amino acid chains that fold into highly complex three-dimensional structures. The function of a protein depends heavily on its final folded conformation, and predicting how proteins fold from their amino acid sequence remains one of the central problems in computational biology and biochemistry.<ref>{{cite web|url=https://en.wikipedia.org/wiki/Protein_folding|title=Protein folding}}</ref>


During the COVID-19 pandemic, Rosetta@home received worldwide attention for helping researchers design proteins capable of binding to the SARS-CoV-2 spike protein. Some designed proteins showed promise as antiviral therapeutics and diagnostic tools.<ref>https://www.ipd.uw.edu/covid-19/</ref><ref>https://www.nature.com/articles/s41586-021-03819-2</ref>
Rosetta@home allows volunteers around the world to contribute spare computing power toward scientific research involving protein structure prediction, protein docking, computational enzyme design, and the study of molecular interactions. The project has also been used in vaccine research, antiviral therapeutic development, cancer-related protein analysis, and studies involving neurodegenerative disorders such as [[wikipedia:Alzheimer's disease|Alzheimer's disease]], [[wikipedia:Parkinson's disease|Parkinson's disease]], and [[wikipedia:Huntington's disease|Huntington's disease]]. By distributing millions of calculations across volunteer computers, Rosetta@home enables scientific simulations that would otherwise require extremely large supercomputing facilities.


== Goal ==
== Scientific basis ==


The primary goal of Rosetta@home is to determine and design accurate three-dimensional protein structures using computational methods. Researchers use the Rosetta software suite to explore millions of possible protein conformations in search of the most energetically favorable structures.<ref>https://www.rosettacommons.org/</ref>
Protein folding is governed by thermodynamics and molecular interactions. Rosetta software attempts to identify energetically favorable conformations by minimizing an approximate free-energy function while exploring large numbers of possible molecular arrangements.


The project also aims to:
The Rosetta platform combines several computational approaches, including Monte Carlo sampling, energy minimization, fragment assembly, comparative modeling, ab initio structure prediction, and protein docking simulations. Modern Rosetta methods also incorporate statistical and machine-learning-assisted scoring functions to improve prediction accuracy.


* Design new proteins not found in nature
The Rosetta energy function approximates the free energy of a candidate structure as a weighted sum of individual terms, and the search seeks conformations that minimize it:
* Improve understanding of protein folding
* Develop new therapeutics and vaccines
* Advance computational biology and bioinformatics
* Provide open scientific tools to the research community


[[File:Protein_folding.png|thumb|Illustration of protein folding pathways]]
<math>E_{total} = \sum_i w_iE_i</math>
 
where <math>E_i</math> represents individual energy terms and <math>w_i</math> represents weighting coefficients applied to those terms.
 
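The project also uses stochastic Monte Carlo methods that accept or reject each conformational change according to the Metropolis criterion from statistical thermodynamics. A change that lowers the energy is always accepted, and a change that raises it is accepted with a probability that falls off exponentially:
 
<math>P = \min\left(1, e^{-\Delta E / kT}\right)</math>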
 
where <math>\Delta E</math> is the change in energy, <math>k</math> is the Boltzmann constant, and <math>T</math> is temperature.
 
[[File:Protein_structure_examples.png|thumb|Examples of protein structures]]


== History ==
== History ==


Rosetta@home was created by researchers from the Baker Laboratory led by Professor [[wikipedia:David Baker (biochemist)|David Baker]]. The project became publicly available through the BOINC platform in 2005 and rapidly attracted volunteers from around the world.<ref>https://en.wikipedia.org/wiki/Rosetta@home</ref>
The Rosetta software project originated during the late 1990s at the Baker Laboratory under the leadership of Professor [[wikipedia:David Baker (biochemist)|David Baker]]. Early versions of Rosetta focused primarily on ab initio protein structure prediction and rapidly gained recognition within computational biology research communities.<ref>{{cite journal|last=Simons|first=K. T.|last2=Kooperberg|first2=C.|last3=Huang|first3=E.|last4=Baker|first4=D.|title=Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions|journal=Journal of Molecular Biology|year=1997|volume=268|issue=1|pages=209–225|doi=10.1006/jmbi.1997.0959}}</ref>


The Rosetta software itself dates back to the late 1990s and evolved into one of the world's leading protein modeling platforms. Rosetta has since been used by thousands of researchers and institutions globally.<ref>https://www.rosettacommons.org/about</ref>
Rosetta@home became publicly available through BOINC in 2005 and quickly attracted a large international volunteer community. During the late 2000s and early 2010s, the project became one of the flagship scientific applications within the BOINC ecosystem.<ref>{{cite web|url=https://web.archive.org/web/*/https://boinc.bakerlab.org/rosetta/|title=Archived Rosetta@home pages}}</ref>


The project experienced major growth during:
The project experienced substantial growth during major scientific initiatives involving influenza and HIV research, CASP protein structure prediction competitions, and the development of computational protein design methods. Public participation increased dramatically again during the COVID-19 pandemic as global attention focused on antiviral research and computational biology.
* The CASP protein structure prediction competitions
* Influenza and HIV research initiatives
* The COVID-19 pandemic
* Major breakthroughs in computational protein design


Discussion threads preserved on the [[wikipedia:Wayback Machine|Wayback Machine]] and historical BOINC forums show Rosetta@home becoming one of the flagship BOINC projects during the late 2000s and early 2010s, often competing near the top of global BOINC statistics rankings.<ref>https://web.archive.org/web/*/https://boinc.bakerlab.org/rosetta/</ref>
== CASP participation ==


== Methods ==
Rosetta methods achieved significant success in the [[wikipedia:Critical Assessment of protein Structure Prediction|CASP]] competitions, which evaluate computational protein structure prediction methods using experimentally determined structures that have not yet been publicly released.


Rosetta@home distributes small computational tasks known as ''work units'' to volunteer computers. Each work unit explores different possible shapes or interactions for a protein molecule.
Performance in CASP competitions helped establish Rosetta as one of the leading protein prediction frameworks in computational biology.<ref>{{cite journal|last=Moult|first=John|title=Critical assessment of methods of protein structure prediction (CASP): Round XIII|journal=Proteins|year=2019|volume=87|issue=12|pages=1011–1020|doi=10.1002/prot.25823}}</ref>


The Rosetta software applies:
== Methods ==
* Monte Carlo sampling
* Energy minimization algorithms
* Fragment assembly methods
* Comparative modeling
* Protein docking simulations
* De novo protein design


[[File:PDB 1p5t EBI.jpg|thumb|Protein docking simulation example]]
[[File:PDB 1p5t EBI.jpg|thumb|Protein docking simulation example]]


Each volunteer computer independently calculates possible structures and returns the results to project servers, where researchers analyze the data and identify promising protein conformations.<ref>https://boinc.bakerlab.org/rosetta/rah_about.php</ref>
Rosetta@home distributes small computational tasks known as ''work units'' to volunteer computers through BOINC. Each work unit evaluates different possible conformations or molecular interactions involving proteins, with completed results returned to project servers for further scientific analysis.
 
The Rosetta software suite contains multiple specialized scientific modules for different forms of biomolecular modeling. Ab initio methods attempt to predict protein structures directly from amino acid sequences without relying entirely on experimentally solved templates. Protein docking simulations study how proteins interact with other proteins or molecules, while RosettaDesign allows researchers to computationally create entirely new proteins not found in nature.<ref>{{cite journal|last=Gray|first=Jeffrey J.|title=Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations|journal=Journal of Molecular Biology|year=2003|volume=331|issue=1|pages=281–299|doi=10.1016/S0022-2836(03)00670-3}}</ref><ref>{{cite journal|last=Kuhlman|first=Brian|title=Design of a novel globular protein fold with atomic-level accuracy|journal=Science|year=2003|volume=302|issue=5649|pages=1364–1368|doi=10.1126/science.1089427}}</ref>


Unlike projects focused solely on raw computational throughput, Rosetta@home often performs highly complex scientific simulations requiring sophisticated modeling techniques and extensive statistical analysis.
Many Rosetta methods use libraries of experimentally observed protein fragments during conformational searches. This fragment-based approach significantly reduces the complexity of the protein-folding problem while improving the likelihood of identifying physically realistic structures.


== COVID-19 research ==
== COVID-19 research ==


Rosetta@home became heavily involved in COVID-19 research beginning in early 2020. Volunteers worldwide donated massive amounts of computing power to support urgent pandemic-related research efforts.
[[File:Protein_folding.png|thumb|Illustration of protein folding pathways]]
 
Rosetta@home became heavily involved in COVID-19 research beginning in early 2020. Public awareness of the project increased dramatically during the pandemic as volunteers contributed substantial additional computing power toward urgent SARS-CoV-2 research efforts.<ref>{{cite web|url=https://www.ipd.uw.edu/covid-19/|title=Institute for Protein Design COVID-19 research}}</ref>


Researchers used Rosetta to:
Researchers used Rosetta software to study viral protein structures, investigate spike-protein interactions, and design synthetic mini-proteins capable of binding tightly to the SARS-CoV-2 spike protein. Some of these engineered proteins demonstrated strong neutralizing capabilities in laboratory studies and were investigated as potential antiviral therapeutics and diagnostic tools.<ref>{{cite journal|last=Cao|first=Longxing|title=De novo design of picomolar SARS-CoV-2 miniprotein inhibitors|journal=Nature|year=2021|volume=595|issue=7867|pages=551–556|doi=10.1038/s41586-021-03819-2}}</ref>
* Design mini-proteins that bind the coronavirus spike protein
* Study viral protein structures
* Develop potential antiviral therapeutics
* Assist vaccine-related research


[[File:SARS-CoV-2_without_background.png|thumb|SARS-CoV-2 illustration from Wikipedia]]
The project received substantial international media coverage during this period, resulting in large increases in volunteer participation and overall BOINC activity.<ref>{{cite web|url=https://www.reddit.com/r/BOINC/|title=r/BOINC discussions}}</ref>


One highly publicized achievement involved the creation of synthetic mini-proteins capable of neutralizing SARS-CoV-2 in laboratory experiments.<ref>https://www.nature.com/articles/s41586-021-03819-2</ref>
== RosettaCommons ==


The project received substantial media coverage during this period, leading to large increases in volunteer participation from around the world.<ref>https://www.reddit.com/r/BOINC/</ref>
[[File:SARS-CoV-2_without_background.png|thumb|Illustration of SARS-CoV-2]]


== Project team / Sponsors ==
The broader Rosetta software ecosystem is maintained by [[wikipedia:RosettaCommons|RosettaCommons]], an international consortium of universities, medical research institutes, and scientific organizations collaborating on computational structural biology software development.<ref>{{cite web|url=https://www.rosettacommons.org/about|title=About RosettaCommons}}</ref>


Rosetta@home is operated by the [https://www.bakerlab.org/ Baker Laboratory] at the [[wikipedia:University of Washington|University of Washington]] in [[wikipedia:Seattle|Seattle]], [[wikipedia:Washington (state)|Washington]], USA.
RosettaCommons coordinates development of the Rosetta biomolecular modeling framework and supports scientific workshops, educational resources, and collaborative research initiatives. The consortium has played a major role in advancing computational protein design and structural bioinformatics, and Rosetta software is now widely used throughout the international molecular biology research community.


Key figures associated with the project include:
== Project team and sponsors ==
* [[wikipedia:David Baker (biochemist)|David Baker]]
* RosettaCommons collaborators
* Researchers from multiple international institutions


[[File:University of Washington Red Square golden hour Seattle Washington.jpg|thumb|University of Washington campus]]
[[File:University of Washington Red Square golden hour Seattle Washington.jpg|thumb|University of Washington campus]]


The broader Rosetta development community, known as RosettaCommons, includes scientists from universities and research institutes worldwide.<ref>https://www.rosettacommons.org/</ref>
Rosetta@home is operated primarily by the [https://www.bakerlab.org/ Baker Laboratory] at the [[wikipedia:University of Washington|University of Washington]] in Seattle, Washington. The project was founded by Professor [[wikipedia:David Baker (biochemist)|David Baker]], whose research group became internationally recognized for advances in protein structure prediction and computational protein design.
 
In addition to the Baker Laboratory, Rosetta@home benefits from contributions by RosettaCommons scientists and researchers from numerous universities and scientific institutions around the world. The collaborative nature of the project has made Rosetta one of the largest and most influential computational biology frameworks developed through academic research partnerships.


== System requirements ==
== System requirements ==


Rosetta@home primarily supports:
Rosetta@home supports Microsoft Windows, Linux, and macOS and performs its scientific calculations on the CPU rather than using GPU acceleration. Work units may run for several hours depending on processor performance and user-selected runtime settings, and some tasks can require moderate to high levels of system memory.
* Windows
* Linux
* macOS


The project mainly uses CPU processing rather than GPU acceleration. Work units can be memory intensive and may run for several hours depending on system performance and user configuration.
The BOINC platform allows volunteers to configure CPU utilization, network scheduling, temperature limits, disk usage quotas, and other operational settings. Rosetta@home applications also support checkpointing, allowing computations to resume after interruptions or system restarts.


BOINC allows volunteers to:
== Community ==
* Limit CPU usage
* Pause computing during active computer use
* Restrict network activity
* Control temperature and power settings


== Community ==
Rosetta@home has maintained a large and active international volunteer community since its launch in 2005. Volunteers commonly participate through BOINC teams, distributed computing forums, Reddit communities, and statistics aggregation websites such as BOINCstats and Free-DC.


Rosetta@home has maintained a large and active volunteer community for many years. Volunteers participate through:
The project has historically been one of the most visible and competitive projects within the BOINC ecosystem, with many volunteer teams contributing substantial computing resources during community competitions and distributed computing challenges. Historical BOINC forums and archived discussions show Rosetta@home frequently ranking among the largest volunteer computing projects of its era.
* BOINC teams
* Project message boards
* Reddit communities
* Distributed computing forums
* Statistics websites


[[File:BOINC Logo custom.png|thumb|BOINC logo]]
== Scientific impact ==


Many volunteers join competitive BOINC teams that contribute large amounts of computing power and participate in distributed computing events and challenges.
Rosetta@home has contributed to major scientific advances in protein structure prediction, computational enzyme engineering, structural bioinformatics, antiviral therapeutic design, and synthetic protein development. Research performed using Rosetta methods has helped establish computational protein design as a major field within modern molecular biology.


The project is frequently discussed on:
The project achieved particular recognition through strong performances in CASP protein structure prediction competitions and through the development of novel synthetic proteins and antiviral binders. During the COVID-19 pandemic, Rosetta-related research became widely known for its work involving SARS-CoV-2 spike-protein inhibitors and de novo designed mini-proteins.
* Reddit BOINC communities
* BOINCstats
* Free-DC
* Team forums
* Historical distributed computing communities archived online


== Scientific results ==
Scientific publications related to Rosetta@home and the Rosetta software suite are archived through BOINC and RosettaCommons publication databases.<ref>{{cite web|url=https://boinc.berkeley.edu/pubs.php#Rosetta@home|title=BOINC scientific publications}}</ref>


Rosetta@home has contributed to numerous scientific breakthroughs in:
[[File:Protein_structure.jpg|thumb|Rendered protein structure]]
* Protein structure prediction
* Protein design
* Enzyme engineering
* Vaccine development
* Viral research
* Computational biology


Notable accomplishments include:
== Scientific publications ==
* Successful participation in CASP competitions
* Development of novel synthetic proteins
* COVID-19 antiviral protein design
* Advances in computational enzyme design


Scientific results:
Rosetta-related research has produced hundreds of peer-reviewed scientific papers published in journals including ''Nature'', ''Science'', ''Proceedings of the National Academy of Sciences'', ''Journal of Molecular Biology'', and ''Proteins''.
* https://boinc.berkeley.edu/pubs.php#Rosetta@home


[[File:Protein_structure.jpg|thumb|Rendered protein structure]]
Selected publications include:


== Scientific publications ==
* {{cite journal|last=Simons|first=K. T.|title=Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions|journal=Journal of Molecular Biology|year=1997|doi=10.1006/jmbi.1997.0959}}
* {{cite journal|last=Kuhlman|first=Brian|title=Design of a novel globular protein fold with atomic-level accuracy|journal=Science|year=2003|doi=10.1126/science.1089427}}
* {{cite journal|last=Gray|first=Jeffrey J.|title=Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations|journal=Journal of Molecular Biology|year=2003|doi=10.1016/S0022-2836(03)00670-3}}
* {{cite journal|last=Das|first=Rhiju|title=Macromolecular Modeling with Rosetta|journal=Annual Review of Biochemistry|year=2008|doi=10.1146/annurev.biochem.77.062906.171838}}
* {{cite journal|last=Cao|first=Longxing|title=De novo design of picomolar SARS-CoV-2 miniprotein inhibitors|journal=Nature|year=2021|doi=10.1038/s41586-021-03819-2}}


Rosetta-related research has produced hundreds of peer-reviewed scientific papers across major journals including ''Nature'', ''Science'', and ''PNAS''.
Additional publication lists are available through the BOINC publications archive and the RosettaCommons publications database.


Key publication areas include:
== See also ==
* Protein folding prediction
* Protein interface design
* Synthetic protein engineering
* Antiviral therapeutics
* Computational enzyme design


Scientific publications:
* [[wikipedia:BOINC|BOINC]]
* https://boinc.berkeley.edu/pubs.php#Rosetta@home
* [[wikipedia:Protein folding|Protein folding]]
* https://www.rosettacommons.org/publications
* [[wikipedia:RosettaCommons|RosettaCommons]]
* [[wikipedia:Distributed computing|Distributed computing]]
* [[wikipedia:Computational biology|Computational biology]]
* [[wikipedia:David Baker (biochemist)|David Baker]]


== External links ==
== External links ==
Line 181: Line 167:
* [https://boinc.berkeley.edu/ BOINC]
* [https://boinc.berkeley.edu/ BOINC]
* [https://boincstats.com/en/stats/145/project/detail BOINCstats project statistics]
* [https://boincstats.com/en/stats/145/project/detail BOINCstats project statistics]
* [https://boinc.berkeley.edu/pubs.php#Rosetta@home BOINC scientific publications]
[[File:BOINC Logo custom.png|BOINC logo|center|frameless|150x150px]]
== References ==
{{Reflist}}

Latest revision as of 13:37, 29 May 2026


Rosetta@home
Rosetta@home screensaver showing protein folding simulations

Project
Status: Active
Category: Bioinformatics, Protein structure prediction, Distributed computing
Compute: CPU
Requires: BOINC

Development
Developer: Baker Laboratory
Author: David Baker and collaborators
Sponsor: University of Washington
Maintainer: Baker Laboratory and RosettaCommons
Initial release: June 6, 2005 (21 years ago)
Repository: https://github.com/RosettaCommons

Software
Written in: C++, C
Operating system: Windows, Linux, macOS
Size: Varies by work unit

BOINC statistics
Stats as of: May 22, 2026 (0 years ago)
Performance: Several PFLOPS distributed across volunteer hosts
Active users: 25,000
Total users: 1,000,000
Active hosts: 45,000
Total hosts: 3,000,000

Analytics
CPU performance: Large-scale distributed CPU processing

Metadata
Website: https://boinc.bakerlab.org/rosetta/
License: Mixed proprietary and academic research licensing

Rosetta@home is a volunteer distributed computing project that uses the BOINC platform to help researchers predict and design the three-dimensional structures of proteins. The project is operated by the Baker Laboratory at the University of Washington in Seattle, Washington, and is considered one of the most scientifically successful and widely recognized BOINC projects.[1][2]

Rosetta@home officially launched in 2005 as a public volunteer computing extension of the Rosetta protein modeling software suite. Volunteers donate spare CPU resources from personal computers to perform large-scale molecular simulations involving protein folding, protein docking, and protein design.[3]

Overview

Proteins are biological macromolecules composed of amino acid chains that fold into highly complex three-dimensional structures. The function of a protein depends heavily on its final folded conformation, and predicting how proteins fold from their amino acid sequence remains one of the central problems in computational biology and biochemistry.[4]

Rosetta@home allows volunteers around the world to contribute spare computing power toward scientific research involving protein structure prediction, protein docking, computational enzyme design, and the study of molecular interactions. The project has also been used in vaccine research, antiviral therapeutic development, cancer-related protein analysis, and studies involving neurodegenerative disorders such as Alzheimer's disease, Parkinson's disease, and Huntington's disease. By distributing millions of calculations across volunteer computers, Rosetta@home enables scientific simulations that would otherwise require extremely large supercomputing facilities.

Scientific basis

Protein folding is governed by thermodynamics and molecular interactions. Rosetta software attempts to identify energetically favorable conformations by minimizing an approximate free-energy function while exploring large numbers of possible molecular arrangements.

The Rosetta platform combines several computational approaches, including Monte Carlo sampling, energy minimization, fragment assembly, comparative modeling, ab initio structure prediction, and protein docking simulations. Modern Rosetta methods also incorporate statistical and machine-learning-assisted scoring functions to improve prediction accuracy.

The Rosetta energy function approximates the free energy of a candidate structure as a weighted sum of individual terms, and the search seeks conformations that minimize it:

<math>E_{total} = \sum_i w_iE_i</math>

where <math>E_i</math> represents individual energy terms and <math>w_i</math> represents weighting coefficients applied to those terms.

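The project also uses stochastic Monte Carlo methods that accept or reject each conformational change according to the Metropolis criterion from statistical thermodynamics. A change that lowers the energy is always accepted, and a change that raises it is accepted with a probability that falls off exponentially:

<math>P = \min\left(1, e^{-\Delta E / kT}\right)</math>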

where <math>\Delta E</math> is the change in energy, <math>k</math> is the Boltzmann constant, and <math>T</math> is temperature.
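
The two formulas in this section can be tried out in a few lines of Python. The sketch below is only an illustration and is not taken from the Rosetta code base: the term names (<code>vdw</code>, <code>hbond</code>, <code>solvation</code>), their weights, and the function names are all hypothetical. The acceptance test uses the usual Metropolis form, in which the probability is capped at 1 so that moves lowering the energy are always accepted.

```python
import math
import random


def total_energy(terms, weights):
    """Weighted sum E_total = sum_i w_i * E_i over named energy terms."""
    return sum(weights[name] * value for name, value in terms.items())


def metropolis_accept(delta_e, kt, rng=random):
    """Accept downhill moves always; uphill moves with probability exp(-dE/kT)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / kt)


# Hypothetical term names, weights, and values, chosen for illustration only.
weights = {"vdw": 1.0, "hbond": 2.0, "solvation": 0.5}
current = {"vdw": -3.0, "hbond": -1.5, "solvation": 2.0}
proposed = {"vdw": -2.0, "hbond": -2.5, "solvation": 2.0}

# Change in total energy if the proposed conformation replaced the current one.
delta = total_energy(proposed, weights) - total_energy(current, weights)
accepted = metropolis_accept(delta, kt=1.0)
```

Here the proposed conformation has the lower weighted energy, so the move is accepted regardless of the random draw. Raising <code>kt</code> makes uphill moves more likely to be accepted, which is how such a search can escape local minima.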

[Figure: Examples of protein structures]

History

The Rosetta software project originated during the late 1990s at the Baker Laboratory under the leadership of Professor David Baker. Early versions of Rosetta focused primarily on ab initio protein structure prediction and rapidly gained recognition within computational biology research communities.[5]

Rosetta@home became publicly available through BOINC in 2005 and quickly attracted a large international volunteer community. During the late 2000s and early 2010s, the project became one of the flagship scientific applications within the BOINC ecosystem.[6]

The project experienced substantial growth during major scientific initiatives involving influenza and HIV research, CASP protein structure prediction competitions, and the development of computational protein design methods. Public participation increased dramatically again during the COVID-19 pandemic as global attention focused on antiviral research and computational biology.

CASP participation

Rosetta methods achieved significant success in the CASP competitions, which evaluate computational protein structure prediction methods using experimentally determined structures that have not yet been publicly released.

Performance in CASP competitions helped establish Rosetta as one of the leading protein prediction frameworks in computational biology.[7]

Methods

[Figure: Protein docking simulation example]

Rosetta@home distributes small computational tasks known as work units to volunteer computers through BOINC. Each work unit evaluates different possible conformations or molecular interactions involving proteins, with completed results returned to project servers for further scientific analysis.

The Rosetta software suite contains multiple specialized scientific modules for different forms of biomolecular modeling. Ab initio methods attempt to predict protein structures directly from amino acid sequences without relying entirely on experimentally solved templates. Protein docking simulations study how proteins interact with other proteins or molecules, while RosettaDesign allows researchers to computationally create entirely new proteins not found in nature.[8][9]

Many Rosetta methods use libraries of experimentally observed protein fragments during conformational searches. This fragment-based approach significantly reduces the complexity of the protein-folding problem while improving the likelihood of identifying physically realistic structures.
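The idea of a fragment-based move can be sketched in Python as follows. This is a toy illustration rather than Rosetta's implementation: it assumes a conformation is stored as a list of backbone (phi, psi) torsion-angle pairs, and the names `insert_fragment`, `propose_move`, and `toy_library`, along with all angle values, are invented for the example.

```python
import random


def insert_fragment(torsions, fragment, start):
    """Return a copy of torsions with the fragment's angles written at start."""
    if start < 0 or start + len(fragment) > len(torsions):
        raise ValueError("fragment does not fit at this position")
    new_torsions = list(torsions)  # leave the caller's conformation untouched
    new_torsions[start:start + len(fragment)] = fragment
    return new_torsions


def propose_move(torsions, library, rng=random):
    """Pick a random window position and a random library fragment for it.

    library maps a start index to a list of candidate fragments, where each
    fragment is a list of (phi, psi) pairs.
    """
    start = rng.choice(sorted(library))
    fragment = rng.choice(library[start])
    return insert_fragment(torsions, fragment, start)


# Toy data: a six-residue chain and two three-residue candidate fragments.
toy_chain = [(-60.0, -45.0)] * 6
toy_library = {
    0: [[(-120.0, 130.0)] * 3],
    3: [[(-65.0, -40.0)] * 3],
}
toy_candidate = propose_move(toy_chain, toy_library)
```

Because each move replaces a whole window with angles that have been observed together, the search only visits locally plausible backbone shapes instead of sampling every angle independently. In a full search, each candidate would then be scored and accepted or rejected.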

COVID-19 research

Illustration of protein folding pathways

Rosetta@home became heavily involved in COVID-19 research beginning in early 2020. Public awareness of the project increased dramatically during the pandemic as volunteers contributed substantial additional computing power toward urgent SARS-CoV-2 research efforts.[10]

Researchers used Rosetta software to study viral protein structures, investigate spike-protein interactions, and design synthetic mini-proteins capable of binding tightly to the SARS-CoV-2 spike protein. Some of these engineered proteins demonstrated strong neutralizing capabilities in laboratory studies and were investigated as potential antiviral therapeutics and diagnostic tools.[11]

The project received substantial international media coverage during this period, resulting in large increases in volunteer participation and overall BOINC activity.[12]

RosettaCommons

Illustration of SARS-CoV-2

The broader Rosetta software ecosystem is maintained by RosettaCommons, an international consortium of universities, medical research institutes, and scientific organizations collaborating on computational structural biology software development.[13]

RosettaCommons coordinates development of the Rosetta biomolecular modeling framework and supports scientific workshops, educational resources, and collaborative research initiatives. The consortium has played a major role in advancing computational protein design and structural bioinformatics, and Rosetta software is now widely used throughout the international molecular biology research community.

Project team and sponsors

University of Washington campus

Rosetta@home is operated primarily by the Baker Laboratory at the University of Washington in Seattle, Washington. The project was founded by Professor David Baker, whose research group became internationally recognized for advances in protein structure prediction and computational protein design.

In addition to the Baker Laboratory, Rosetta@home benefits from contributions by RosettaCommons scientists and researchers from numerous universities and scientific institutions around the world. The collaborative nature of the project has made Rosetta one of the largest and most influential computational biology frameworks developed through academic research partnerships.

System requirements

Rosetta@home supports Microsoft Windows, Linux, and macOS, and its scientific calculations run primarily on the CPU rather than the GPU. Work units may run for several hours, depending on processor performance and user-selected runtime settings, and some tasks require moderate to high amounts of system memory.

The BOINC platform allows volunteers to configure CPU utilization, network scheduling, temperature limits, disk usage quotas, and other operational settings. Rosetta@home applications also support checkpointing, allowing computations to resume after interruptions or system restarts.
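The general idea behind checkpointing can be shown with a small Python sketch. It does not use the BOINC API or any Rosetta code: the file format, the function names (`save_checkpoint`, `load_checkpoint`, `run`), and the `stop_after` parameter that simulates an interruption are all assumptions made for illustration.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to path atomically, so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)  # swap the finished file into place


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"step": 0, "total": 0}


def run(path, n_steps, stop_after=None, every=10):
    """Run n_steps of placeholder work, checkpointing every `every` steps.

    stop_after simulates an interruption after that many steps in this
    session; progress made since the last checkpoint is then lost.
    """
    state = load_checkpoint(path)
    done_this_session = 0
    while state["step"] < n_steps:
        if stop_after is not None and done_this_session >= stop_after:
            return state  # simulated interruption, nothing more is saved
        state["total"] += state["step"]  # stand-in for real computation
        state["step"] += 1
        done_this_session += 1
        if state["step"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

Calling `run` a second time with the same path resumes from the last saved step instead of starting over. The trade-off is in the checkpoint interval: frequent checkpoints lose less work on interruption but cost more disk writes.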

Community

Rosetta@home has maintained a large and active international volunteer community since its launch in 2005. Volunteers commonly participate through BOINC teams, distributed computing forums, Reddit communities, and statistics aggregation websites such as BOINCstats and Free-DC.

The project has historically been one of the most visible and competitive projects within the BOINC ecosystem, with many volunteer teams contributing substantial computing resources during community competitions and distributed computing challenges. Historical BOINC forums and archived discussions show Rosetta@home frequently ranking among the largest volunteer computing projects of its era.

Scientific impact

Rosetta@home has contributed to major scientific advances in protein structure prediction, computational enzyme engineering, structural bioinformatics, antiviral therapeutic design, and synthetic protein development. Research performed using Rosetta methods has helped establish computational protein design as a major field within modern molecular biology.

The project achieved particular recognition through strong performances in CASP protein structure prediction competitions and through the development of novel synthetic proteins and antiviral binders. During the COVID-19 pandemic, Rosetta-related research became widely known for its work involving SARS-CoV-2 spike-protein inhibitors and de novo designed mini-proteins.

Scientific publications related to Rosetta@home and the Rosetta software suite are archived through BOINC and RosettaCommons publication databases.[14]

Rendered protein structure

Scientific publications

Rosetta-related research has produced hundreds of peer-reviewed scientific papers published in journals including Nature, Science, Proceedings of the National Academy of Sciences, Journal of Molecular Biology, and Proteins.

Selected publications include:

  • Simons, K. T. (1997). Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. Journal of Molecular Biology. DOI: 10.1006/jmbi.1997.0959.
  • Kuhlman, Brian. (2003). Design of a novel globular protein fold with atomic-level accuracy. Science. DOI: 10.1126/science.1089427.
  • Gray, Jeffrey J. (2003). Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. Journal of Molecular Biology. DOI: 10.1016/S0022-2836(03)00670-3.
  • Das, Rhiju. (2008). Macromolecular Modeling with Rosetta. Annual Review of Biochemistry. DOI: 10.1146/annurev.biochem.77.062906.171838.
  • Cao, Longxing. (2020). De novo design of picomolar SARS-CoV-2 miniprotein inhibitors. Science. DOI: 10.1126/science.abd9909.

Additional publication lists are available through the BOINC publications archive and the RosettaCommons publications database.

See also

External links

BOINC logo

References

  1. Rosetta@home.
  2. Rosetta@home.
  3. Das, Rhiju. (2008). Macromolecular Modeling with Rosetta. Annual Review of Biochemistry. pp. 363–382. DOI: 10.1146/annurev.biochem.77.062906.171838.
  4. Protein folding.
  5. Simons, K. T. (1997). Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. Journal of Molecular Biology. pp. 209–225. DOI: 10.1006/jmbi.1997.0959.
  6. Archived Rosetta@home pages.
  7. Moult, John. (2019). Critical assessment of methods of protein structure prediction (CASP): Round XIII. Proteins. pp. 1011–1020. DOI: 10.1002/prot.25823.
  8. Gray, Jeffrey J. (2003). Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. Journal of Molecular Biology. pp. 281–299. DOI: 10.1016/S0022-2836(03)00670-3.
  9. Kuhlman, Brian. (2003). Design of a novel globular protein fold with atomic-level accuracy. Science. pp. 1364–1368. DOI: 10.1126/science.1089427.
  10. Institute for Protein Design COVID-19 research.
  11. Cao, Longxing. (2020). De novo design of picomolar SARS-CoV-2 miniprotein inhibitors. Science. pp. 426–431. DOI: 10.1126/science.abd9909.
  12. r/BOINC discussions.
  13. About RosettaCommons.
  14. BOINC scientific publications.