Data freezer: Difference between revisions

From BOINC Projects
Al Piskun (talk | contribs)
Methods: update

Revision as of 18:55, 23 June 2024

BOINC project page template

[[File:{{#setmainimage:freezer.png}}|alt=logo image|center|frameless]]

Data freezer is a BOINC-based volunteer computing project that needs your help to build a data warehouse on consumer hardware.

Why Data freezer?

To bring together a group of technically minded enthusiasts who understand the idea and are inspired by it!

Goal

The main goal of the project is to build a data warehouse on consumer hardware that is not usually used for this purpose (old SSDs, old smartphones, MicroSD cards, etc.). If the project succeeds, you can either make a profit or donate the power to charity.[1]

Data freezer is intended to be a commercial venture in the future.

Methods

Data storage on distributed SSDs;

Continuous verification of data integrity by calculating cryptographic hashes of random blocks of the stored data;

Use of the calculated cryptographic hashes for data deduplication;

The first data set is useful for collecting a large volume of “key”: “value” pairs, where the key is a cryptographic hash and the value is a data block.

We can then build our own service on top of this project, or provide a commercial deduplication service.[2] (https://frostydata.com/0Kdata/forum_thread.php?id=4)
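
The methods above can be illustrated with a minimal sketch. It is not the project's actual code: the block size, the choice of SHA-256, and the names BlockStore, put, get and verify_random_blocks are all assumptions made for this example. It shows blocks stored as “key”: “value” pairs keyed by their hash, deduplication of identical blocks, and re-hashing of a random sample of blocks to check integrity.

```python
import hashlib
import random

BLOCK_SIZE = 4096  # hypothetical block size in bytes (assumption)


def block_hash(block: bytes) -> str:
    """Return the SHA-256 hex digest used as the key for a block."""
    return hashlib.sha256(block).hexdigest()


class BlockStore:
    """Toy content-addressed store: key = hash, value = data block."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> list[str]:
        """Split data into blocks, store each unique block once, return the keys."""
        keys = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            key = block_hash(block)
            # Identical blocks share one key, so they are stored only once.
            self.blocks.setdefault(key, block)
            keys.append(key)
        return keys

    def get(self, keys: list[str]) -> bytes:
        """Rebuild the original data from its ordered list of keys."""
        return b"".join(self.blocks[k] for k in keys)

    def verify_random_blocks(self, sample_size: int, rng: random.Random) -> list[str]:
        """Re-hash a random sample of blocks; return keys whose data no longer matches."""
        sample = rng.sample(sorted(self.blocks), min(sample_size, len(self.blocks)))
        return [k for k in sample if block_hash(self.blocks[k]) != k]


if __name__ == "__main__":
    store = BlockStore()
    data = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
    keys = store.put(data)
    print(len(keys), "blocks written,", len(store.blocks), "unique blocks stored")
    print("corrupted keys:", store.verify_random_blocks(10, random.Random()))
```

In this sketch three of the four input blocks are identical, so only two blocks are actually kept. A real deployment would spread the blocks across volunteer devices rather than hold them in one dictionary.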


Is it unreliable? Keeping enough replicas solves the problem. Even if all copies are lost, there is a solution! Reliable servers store data that is many orders of magnitude smaller, from which we can restore the original data!
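
One way to picture the role of replicas and of the small data kept on reliable servers is the sketch below. It assumes (this is not stated by the project) that the reliable server keeps only a SHA-256 digest per block, and that each replica is modelled as a function returning the block or None. The name fetch_verified is hypothetical.

```python
import hashlib
from typing import Callable, Iterable, Optional

# A replica is modelled as a callable that returns the block, or None if it is lost.
Replica = Callable[[], Optional[bytes]]


def fetch_verified(trusted_digest: str, replicas: Iterable[Replica]) -> Optional[bytes]:
    """Return the first replica copy whose SHA-256 digest matches the trusted one."""
    for read_replica in replicas:
        block = read_replica()
        if block is None:
            continue  # this replica has lost the block
        if hashlib.sha256(block).hexdigest() == trusted_digest:
            return block  # copy is intact
        # Otherwise the copy is corrupted; try the next replica.
    return None  # every replica was lost or corrupted


if __name__ == "__main__":
    original = b"some stored block"
    digest = hashlib.sha256(original).hexdigest()  # kept on the reliable server
    replicas = [lambda: None, lambda: b"bit rot", lambda: original]
    print(fetch_verified(digest, replicas) == original)
```

With this arrangement an unreliable device cannot pass off damaged data as valid, and one intact replica out of many is enough to serve the block.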

Let's imagine an algorithm that can restore the original data from a cryptographic hash without collisions. Is this impossible? Usually it is... But in some cases, for some data, it is still possible. And we’ll use this opportunity!
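
A simple case where this works is when the lost block is known to belong to a small set of candidates. The sketch below is only an illustration of that special case, not the project's algorithm: it assumes the block consists of a single repeated byte, so there are just 256 candidates to try. The names recover_from_hash and constant_blocks are hypothetical.

```python
import hashlib
from typing import Iterable, Iterator, Optional


def constant_blocks(block_size: int) -> Iterator[bytes]:
    """Yield every block of the given size made of one repeated byte (256 candidates)."""
    for value in range(256):
        yield bytes([value]) * block_size


def recover_from_hash(target_digest: str, candidates: Iterable[bytes]) -> Optional[bytes]:
    """Search the candidates for a block whose SHA-256 digest equals the target."""
    for candidate in candidates:
        if hashlib.sha256(candidate).hexdigest() == target_digest:
            return candidate
    return None  # the block is not in the candidate set, so it cannot be restored this way


if __name__ == "__main__":
    lost_block = b"\x00" * 4096  # e.g. an empty, zero-filled block
    digest = hashlib.sha256(lost_block).hexdigest()
    restored = recover_from_hash(digest, constant_blocks(4096))
    print(restored == lost_block)
```

For arbitrary data the candidate set is far too large to search, which is why this only helps for some data.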

Project team / Sponsors

  • Serge Stu